mirror of
https://github.com/nlohmann/json.git
synced 2025-05-12 14:11:38 +00:00
📝 add note for wstring handling
parent a54403290e
commit eb488bb4d9
@@ -1612,6 +1612,7 @@ The library supports **Unicode input** as follows:
 - Invalid surrogates (e.g., incomplete pairs such as `\uDEAD`) will yield parse errors.
 - The strings stored in the library are UTF-8 encoded. When using the default string type (`std::string`), note that its length/size functions return the number of stored bytes rather than the number of characters or glyphs.
 - When you store strings with different encodings in the library, calling [`dump()`](https://nlohmann.github.io/json/api/basic_json/dump/) may throw an exception unless `json::error_handler_t::replace` or `json::error_handler_t::ignore` are used as error handlers.
+- To store wide strings (e.g., `std::wstring`), you need to convert them to a UTF-8 encoded `std::string` first; see [an example](https://json.nlohmann.me/home/faq/#wide-string-handling).
 
 ### Comments in JSON
 
@@ -44,7 +44,7 @@ for objects.
 
 !!! question
 
-    - Can you add an option to ignore trailing commas?
+    Can you add an option to ignore trailing commas?
 
 This library does not support any feature which would jeopardize interoperability.
 
@@ -70,6 +70,45 @@ The library supports **Unicode input** as follows:
 In most cases, the parser is right to complain, because the input is not UTF-8 encoded. This is especially true for Microsoft Windows where Latin-1 or ISO 8859-1 is often the standard encoding.
 
 
+### Wide string handling
+
+!!! question
+
+    Why are wide strings (e.g., `std::wstring`) dumped as arrays of numbers?
+
+As described [above](#parse-errors-reading-non-ascii-characters), the library assumes UTF-8 as encoding. To store a wide string, you need to change the encoding.
+
+!!! example
+
+    ```cpp
+    #include <codecvt>  // codecvt_utf8
+    #include <locale>   // wstring_convert
+
+    // encoding function
+    std::string to_utf8(std::wstring& wide_string)
+    {
+        static std::wstring_convert<std::codecvt_utf8<wchar_t>> utf8_conv;
+        return utf8_conv.to_bytes(wide_string);
+    }
+
+    json j;
+    std::wstring ws = L"車B1234 こんにちは";
+
+    j["original"] = ws;
+    j["encoded"] = to_utf8(ws);
+
+    std::cout << j << std::endl;
+    ```
+
+    The result is:
+
+    ```json
+    {
+      "encoded": "車B1234 こんにちは",
+      "original": [36554, 66, 49, 50, 51, 52, 32, 12371, 12435, 12395, 12385, 12399]
+    }
+    ```
+
 ## Exceptions
 
 ### Parsing without exceptions