Support UBJSON-derived Binary JData (BJData) format (#3336)

* support UBJSON-derived Binary JData (BJData) format

* fix Codacy warning

* partially fix VS compilation errors

* fix additional VS errors

* fix more VS compilation errors

* fix additional warnings and errors for clang and msvc

* add more tests to cover the new bjdata types

* add tests for optimized ndarray, improve coverage, fix clang/gcc warnings

* gcc warn useless conversion but msvc gives an error

* fix ci_test errors

* complete test coverage, fix ci_test errors

* add half precision error test

* fix No newline at end of file error by clang

* simplify endian condition, format unit-bjdata

* remove broken test due to alloc limit

* full coverage, I hope

* move bjdata new markers from default to the same level as ubjson markers

* fix ci errors, add tests for new bjdata switch structure

* make is_bjdata const after using initializer list

* remove the unwanted assert

* move is_bjdata to an optional param to write_ubjson

* pass use_bjdata via output adapter

* revert order to avoid msvc 2015 unreferenced formal param error

* update BJData Spect V1 Draft-2 URL after spec release

* amalgamate code

* code polishing following @gregmarr's feedback

* make use_bjdata a non-default parameter

* fix ci error, remove unwanted param comment

* encode and decode bjdata ndarray in jdata annotations, enable roundtrip tests

* partially fix ci errors, add tests to improve coverage

* polish patch to remove ci errors

* fix a ndarray dim vector condition

* fix clang tidy error

* add sax test cases for ndarray

* add additional sax event tests

* adjust sax event numbering

* fix sax tests

* ndarray can only be used with array containers, discard if used in object

* complete test coverage

* disable [{SHTFNZ in optimized type due to security risks in #2793 and hampered readability

* fix ci error

* move OutputIsLittleEndian from tparam to param to replace use_bjdata

* fix ci clang gcc error

* fix ci static analysis error

* update json_test_data to 3.1.0, enable file-based bjdata unit tests

* fix stack overflow error on msvc 2019 and 2022

* use https link, update sax_parse_error after rebase

* make input_format const and use initializer

* return bool for write_bjdata_ndarray

* test write_bjdata_ndarray return value as boolean

* fix ci error
This commit is contained in:
Qianqian Fang 2022-04-29 15:17:30 -04:00 committed by GitHub
parent a6ee8bf9d9
commit ee51661481
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
12 changed files with 4877 additions and 227 deletions

View File

@ -32,7 +32,7 @@
- [Implicit conversions](#implicit-conversions)
- [Conversions to/from arbitrary types](#arbitrary-types-conversions)
- [Specializing enum conversion](#specializing-enum-conversion)
- [Binary formats (BSON, CBOR, MessagePack, and UBJSON)](#binary-formats-bson-cbor-messagepack-and-ubjson)
- [Binary formats (BSON, CBOR, MessagePack, UBJSON, and BJData)](#binary-formats-bson-cbor-messagepack-ubjson-and-bjdata)
- [Supported compilers](#supported-compilers)
- [Integration](#integration)
- [CMake](#cmake)
@ -961,9 +961,9 @@ Other Important points:
- When using `get<ENUM_TYPE>()`, undefined JSON values will default to the first pair specified in your map. Select this default pair carefully.
- If an enum or JSON value is specified more than once in your map, the first matching occurrence from the top of the map will be returned when converting to or from JSON.
### Binary formats (BSON, CBOR, MessagePack, and UBJSON)
### Binary formats (BSON, CBOR, MessagePack, UBJSON, and BJData)
Though JSON is a ubiquitous data format, it is not a very compact format suitable for data exchange, for instance over a network. Hence, the library supports [BSON](https://bsonspec.org) (Binary JSON), [CBOR](https://cbor.io) (Concise Binary Object Representation), [MessagePack](https://msgpack.org), and [UBJSON](https://ubjson.org) (Universal Binary JSON Specification) to efficiently encode JSON values to byte vectors and to decode such vectors.
Though JSON is a ubiquitous data format, it is not a very compact format suitable for data exchange, for instance over a network. Hence, the library supports [BSON](https://bsonspec.org) (Binary JSON), [CBOR](https://cbor.io) (Concise Binary Object Representation), [MessagePack](https://msgpack.org), [UBJSON](https://ubjson.org) (Universal Binary JSON Specification) and [BJData](https://neurojson.org/bjdata) (Binary JData) to efficiently encode JSON values to byte vectors and to decode such vectors.
```cpp
// create a JSON value

View File

@ -683,7 +683,7 @@ add_custom_target(ci_infer
add_custom_target(ci_offline_testdata
COMMAND mkdir -p ${PROJECT_BINARY_DIR}/build_offline_testdata/test_data
COMMAND cd ${PROJECT_BINARY_DIR}/build_offline_testdata/test_data && ${GIT_TOOL} clone -c advice.detachedHead=false --branch v3.0.0 https://github.com/nlohmann/json_test_data.git --quiet --depth 1
COMMAND cd ${PROJECT_BINARY_DIR}/build_offline_testdata/test_data && ${GIT_TOOL} clone -c advice.detachedHead=false --branch v3.1.0 https://github.com/nlohmann/json_test_data.git --quiet --depth 1
COMMAND ${CMAKE_COMMAND}
-DCMAKE_BUILD_TYPE=Debug -GNinja
-DJSON_BuildTests=ON -DJSON_FastTests=ON -DJSON_TestDataDirectory=${PROJECT_BINARY_DIR}/build_offline_testdata/test_data/json_test_data

View File

@ -1,5 +1,5 @@
set(JSON_TEST_DATA_URL https://github.com/nlohmann/json_test_data)
set(JSON_TEST_DATA_VERSION 3.0.0)
set(JSON_TEST_DATA_VERSION 3.1.0)
# if variable is set, use test data from given directory rather than downloading them
if(JSON_TestDataDirectory)

View File

@ -12,6 +12,7 @@
#include <string> // char_traits, string
#include <utility> // make_pair, move
#include <vector> // vector
#include <map> // map
#include <nlohmann/detail/exceptions.hpp>
#include <nlohmann/detail/input/input_adapters.hpp>
@ -74,7 +75,7 @@ class binary_reader
@param[in] adapter input adapter to read from
*/
explicit binary_reader(InputAdapterType&& adapter) noexcept : ia(std::move(adapter))
explicit binary_reader(InputAdapterType&& adapter, const input_format_t format = input_format_t::json) noexcept : ia(std::move(adapter)), input_format(format)
{
(void)detail::is_sax_static_asserts<SAX, BasicJsonType> {};
}
@ -118,6 +119,7 @@ class binary_reader
break;
case input_format_t::ubjson:
case input_format_t::bjdata:
result = parse_ubjson_internal();
break;
@ -129,7 +131,7 @@ class binary_reader
// strict mode: next byte must be EOF
if (result && strict)
{
if (format == input_format_t::ubjson)
if (input_format == input_format_t::ubjson || input_format == input_format_t::bjdata)
{
get_ignore_noop();
}
@ -141,7 +143,7 @@ class binary_reader
if (JSON_HEDLEY_UNLIKELY(current != std::char_traits<char_type>::eof()))
{
return sax->parse_error(chars_read, get_token_string(), parse_error::create(110, chars_read,
exception_message(format, concat("expected end of input; last byte: 0x", get_token_string()), "value"), nullptr));
exception_message(input_format, concat("expected end of input; last byte: 0x", get_token_string()), "value"), nullptr));
}
}
@ -1844,7 +1846,7 @@ class binary_reader
get(); // TODO(niels): may we ignore N here?
}
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format_t::ubjson, "value")))
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "value")))
{
return false;
}
@ -1854,52 +1856,154 @@ class binary_reader
case 'U':
{
std::uint8_t len{};
return get_number(input_format_t::ubjson, len) && get_string(input_format_t::ubjson, len, result);
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'i':
{
std::int8_t len{};
return get_number(input_format_t::ubjson, len) && get_string(input_format_t::ubjson, len, result);
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'I':
{
std::int16_t len{};
return get_number(input_format_t::ubjson, len) && get_string(input_format_t::ubjson, len, result);
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'l':
{
std::int32_t len{};
return get_number(input_format_t::ubjson, len) && get_string(input_format_t::ubjson, len, result);
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'L':
{
std::int64_t len{};
return get_number(input_format_t::ubjson, len) && get_string(input_format_t::ubjson, len, result);
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'u':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint16_t len{};
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'm':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint32_t len{};
return get_number(input_format, len) && get_string(input_format, len, result);
}
case 'M':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint64_t len{};
return get_number(input_format, len) && get_string(input_format, len, result);
}
default:
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(113, chars_read,
exception_message(input_format_t::ubjson, concat("expected length type specification (U, i, I, l, L); last byte: 0x", last_token), "string"), nullptr));
break;
}
auto last_token = get_token_string();
std::string message;
if (input_format != input_format_t::bjdata)
{
message = "expected length type specification (U, i, I, l, L); last byte: 0x" + last_token;
}
else
{
message = "expected length type specification (U, i, u, I, m, l, M, L); last byte: 0x" + last_token;
}
return sax->parse_error(chars_read, last_token, parse_error::create(113, chars_read, exception_message(input_format, message, "string"), nullptr));
}
/*!
@param[out] dim an integer vector storing the ND array dimensions
@return whether reading ND array size vector is successful
*/
bool get_ubjson_ndarray_size(std::vector<size_t>& dim)
{
std::pair<std::size_t, char_int_type> size_and_type;
size_t dimlen = 0;
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_size_type(size_and_type)))
{
return false;
}
if (size_and_type.first != string_t::npos)
{
if (size_and_type.second != 0)
{
if (size_and_type.second != 'N')
{
for (std::size_t i = 0; i < size_and_type.first; ++i)
{
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_size_value(dimlen, size_and_type.second)))
{
return false;
}
dim.push_back(dimlen);
}
}
}
else
{
for (std::size_t i = 0; i < size_and_type.first; ++i)
{
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_size_value(dimlen)))
{
return false;
}
dim.push_back(dimlen);
}
}
}
else
{
while (current != ']')
{
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_size_value(dimlen, current)))
{
return false;
}
dim.push_back(dimlen);
get_ignore_noop();
}
}
return true;
}
/*!
@param[out] result determined size
@return whether size determination completed
*/
bool get_ubjson_size_value(std::size_t& result)
bool get_ubjson_size_value(std::size_t& result, char_int_type prefix = 0)
{
switch (get_ignore_noop())
if (prefix == 0)
{
prefix = get_ignore_noop();
}
switch (prefix)
{
case 'U':
{
std::uint8_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format_t::ubjson, number)))
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
@ -1910,7 +2014,7 @@ class binary_reader
case 'i':
{
std::int8_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format_t::ubjson, number)))
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
@ -1921,7 +2025,7 @@ class binary_reader
case 'I':
{
std::int16_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format_t::ubjson, number)))
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
@ -1932,7 +2036,7 @@ class binary_reader
case 'l':
{
std::int32_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format_t::ubjson, number)))
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
@ -1943,7 +2047,7 @@ class binary_reader
case 'L':
{
std::int64_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format_t::ubjson, number)))
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
@ -1951,13 +2055,105 @@ class binary_reader
return true;
}
default:
case 'u':
{
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(113, chars_read,
exception_message(input_format_t::ubjson, concat("expected length type specification (U, i, I, l, L) after '#'; last byte: 0x", last_token), "size"), nullptr));
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint16_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
result = static_cast<std::size_t>(number);
return true;
}
case 'm':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint32_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
result = static_cast<std::size_t>(number);
return true;
}
case 'M':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint64_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
return false;
}
result = detail::conditional_static_cast<std::size_t>(number);
return true;
}
case '[':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::vector<size_t> dim;
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_ndarray_size(dim)))
{
return false;
}
if (dim.size() == 1 || (dim.size() == 2 && dim.at(0) == 1)) // return normal array size if 1D row vector
{
result = dim.at(dim.size() - 1);
return true;
}
if (!dim.empty()) // if ndarray, convert to an object in JData annotated array format
{
string_t key = "_ArraySize_";
if (JSON_HEDLEY_UNLIKELY(!sax->start_object(3) || !sax->key(key) || !sax->start_array(dim.size())))
{
return false;
}
result = 1;
for (auto i : dim)
{
result *= i;
if (JSON_HEDLEY_UNLIKELY(!sax->number_integer(static_cast<number_integer_t>(i))))
{
return false;
}
}
result |= (1ull << (sizeof(result) * 8 - 1)); // low 63 bit of result stores the total element count, sign-bit indicates ndarray
return sax->end_array();
}
result = 0;
return true;
}
default:
break;
}
auto last_token = get_token_string();
std::string message;
if (input_format != input_format_t::bjdata)
{
message = "expected length type specification (U, i, I, l, L) after '#'; last byte: 0x" + last_token;
}
else
{
message = "expected length type specification (U, i, u, I, m, l, M, L) after '#'; last byte: 0x" + last_token;
}
return sax->parse_error(chars_read, last_token, parse_error::create(113, chars_read, exception_message(input_format, message, "size"), nullptr));
}
/*!
@ -1979,8 +2175,10 @@ class binary_reader
if (current == '$')
{
std::vector<char_int_type> bjdx = {'[', '{', 'S', 'H', 'T', 'F', 'N', 'Z'}; // excluded markers in bjdata optimized type
result.second = get(); // must not ignore 'N', because 'N' maybe the type
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format_t::ubjson, "type")))
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "type") || (input_format == input_format_t::bjdata && std::find(bjdx.begin(), bjdx.end(), result.second) != bjdx.end() )))
{
return false;
}
@ -1988,13 +2186,13 @@ class binary_reader
get_ignore_noop();
if (JSON_HEDLEY_UNLIKELY(current != '#'))
{
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format_t::ubjson, "value")))
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "value")))
{
return false;
}
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(112, chars_read,
exception_message(input_format_t::ubjson, concat("expected '#' after type information; last byte: 0x", last_token), "size"), nullptr));
exception_message(input_format, concat("expected '#' after type information; last byte: 0x", last_token), "size"), nullptr));
}
return get_ubjson_size_value(result.first);
@ -2017,7 +2215,7 @@ class binary_reader
switch (prefix)
{
case std::char_traits<char_type>::eof(): // EOF
return unexpect_eof(input_format_t::ubjson, "value");
return unexpect_eof(input_format, "value");
case 'T': // true
return sax->boolean(true);
@ -2030,43 +2228,125 @@ class binary_reader
case 'U':
{
std::uint8_t number{};
return get_number(input_format_t::ubjson, number) && sax->number_unsigned(number);
return get_number(input_format, number) && sax->number_unsigned(number);
}
case 'i':
{
std::int8_t number{};
return get_number(input_format_t::ubjson, number) && sax->number_integer(number);
return get_number(input_format, number) && sax->number_integer(number);
}
case 'I':
{
std::int16_t number{};
return get_number(input_format_t::ubjson, number) && sax->number_integer(number);
return get_number(input_format, number) && sax->number_integer(number);
}
case 'l':
{
std::int32_t number{};
return get_number(input_format_t::ubjson, number) && sax->number_integer(number);
return get_number(input_format, number) && sax->number_integer(number);
}
case 'L':
{
std::int64_t number{};
return get_number(input_format_t::ubjson, number) && sax->number_integer(number);
return get_number(input_format, number) && sax->number_integer(number);
}
case 'u':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint16_t number{};
return get_number(input_format, number) && sax->number_unsigned(number);
}
case 'm':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint32_t number{};
return get_number(input_format, number) && sax->number_unsigned(number);
}
case 'M':
{
if (input_format != input_format_t::bjdata)
{
break;
}
std::uint64_t number{};
return get_number(input_format, number) && sax->number_unsigned(number);
}
case 'h':
{
if (input_format != input_format_t::bjdata)
{
break;
}
const auto byte1_raw = get();
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "number")))
{
return false;
}
const auto byte2_raw = get();
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "number")))
{
return false;
}
const auto byte1 = static_cast<unsigned char>(byte1_raw);
const auto byte2 = static_cast<unsigned char>(byte2_raw);
// code from RFC 7049, Appendix D, Figure 3:
// As half-precision floating-point numbers were only added
// to IEEE 754 in 2008, today's programming platforms often
// still only have limited support for them. It is very
// easy to include at least decoding support for them even
// without such support. An example of a small decoder for
// half-precision floating-point numbers in the C language
// is shown in Fig. 3.
const auto half = static_cast<unsigned int>((byte2 << 8u) + byte1);
const double val = [&half]
{
const int exp = (half >> 10u) & 0x1Fu;
const unsigned int mant = half & 0x3FFu;
JSON_ASSERT(0 <= exp&& exp <= 32);
JSON_ASSERT(mant <= 1024);
switch (exp)
{
case 0:
return std::ldexp(mant, -24);
case 31:
return (mant == 0)
? std::numeric_limits<double>::infinity()
: std::numeric_limits<double>::quiet_NaN();
default:
return std::ldexp(mant + 1024, exp - 25);
}
}();
return sax->number_float((half & 0x8000u) != 0
? static_cast<number_float_t>(-val)
: static_cast<number_float_t>(val), "");
}
case 'd':
{
float number{};
return get_number(input_format_t::ubjson, number) && sax->number_float(static_cast<number_float_t>(number), "");
return get_number(input_format, number) && sax->number_float(static_cast<number_float_t>(number), "");
}
case 'D':
{
double number{};
return get_number(input_format_t::ubjson, number) && sax->number_float(static_cast<number_float_t>(number), "");
return get_number(input_format, number) && sax->number_float(static_cast<number_float_t>(number), "");
}
case 'H':
@ -2077,7 +2357,7 @@ class binary_reader
case 'C': // char
{
get();
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format_t::ubjson, "char")))
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "char")))
{
return false;
}
@ -2085,7 +2365,7 @@ class binary_reader
{
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(113, chars_read,
exception_message(input_format_t::ubjson, concat("byte after 'C' must be in range 0x00..0x7F; last byte: 0x", last_token), "char"), nullptr));
exception_message(input_format, concat("byte after 'C' must be in range 0x00..0x7F; last byte: 0x", last_token), "char"), nullptr));
}
string_t s(1, static_cast<typename string_t::value_type>(current));
return sax->string(s);
@ -2104,12 +2384,10 @@ class binary_reader
return get_ubjson_object();
default: // anything else
{
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(112, chars_read,
exception_message(input_format_t::ubjson, concat("invalid byte: 0x", last_token), "value"), nullptr));
}
break;
}
auto last_token = get_token_string();
return sax->parse_error(chars_read, last_token, parse_error::create(112, chars_read, exception_message(input_format, "invalid byte: 0x" + last_token, "value"), nullptr));
}
/*!
@ -2123,6 +2401,44 @@ class binary_reader
return false;
}
// detect and encode bjdata ndarray as an object in JData annotated array format (https://github.com/NeuroJSON/jdata):
// {"_ArrayType_" : "typeid", "_ArraySize_" : [n1, n2, ...], "_ArrayData_" : [v1, v2, ...]}
if (input_format == input_format_t::bjdata && size_and_type.first != string_t::npos && size_and_type.first >= (1ull << (sizeof(std::size_t) * 8 - 1)))
{
std::map<char_int_type, string_t> bjdtype = {{'U', "uint8"}, {'i', "int8"}, {'u', "uint16"}, {'I', "int16"},
{'m', "uint32"}, {'l', "int32"}, {'M', "uint64"}, {'L', "int64"}, {'d', "single"}, {'D', "double"}, {'C', "char"}
};
string_t key = "_ArrayType_";
if (JSON_HEDLEY_UNLIKELY(bjdtype.count(size_and_type.second) == 0 || !sax->key(key) || !sax->string(bjdtype[size_and_type.second]) ))
{
return false;
}
if (size_and_type.second == 'C')
{
size_and_type.second = 'U';
}
size_and_type.first &= ~(1ull << (sizeof(std::size_t) * 8 - 1));
key = "_ArrayData_";
if (JSON_HEDLEY_UNLIKELY(!sax->key(key) || !sax->start_array(size_and_type.first) ))
{
return false;
}
for (std::size_t i = 0; i < size_and_type.first; ++i)
{
if (JSON_HEDLEY_UNLIKELY(!get_ubjson_value(size_and_type.second)))
{
return false;
}
}
return (sax->end_array() && sax->end_object());
}
if (size_and_type.first != string_t::npos)
{
if (JSON_HEDLEY_UNLIKELY(!sax->start_array(size_and_type.first)))
@ -2185,6 +2501,11 @@ class binary_reader
return false;
}
if (input_format == input_format_t::bjdata && size_and_type.first != string_t::npos && size_and_type.first >= (1ull << (sizeof(std::size_t) * 8 - 1)))
{
return false;
}
string_t key;
if (size_and_type.first != string_t::npos)
{
@ -2267,7 +2588,7 @@ class binary_reader
for (std::size_t i = 0; i < size; ++i)
{
get();
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format_t::ubjson, "number")))
if (JSON_HEDLEY_UNLIKELY(!unexpect_eof(input_format, "number")))
{
return false;
}
@ -2286,7 +2607,7 @@ class binary_reader
if (JSON_HEDLEY_UNLIKELY(result_remainder != token_type::end_of_input))
{
return sax->parse_error(chars_read, number_string, parse_error::create(115, chars_read,
exception_message(input_format_t::ubjson, concat("invalid number text: ", number_lexer.get_token_string()), "high-precision number"), nullptr));
exception_message(input_format, concat("invalid number text: ", number_lexer.get_token_string()), "high-precision number"), nullptr));
}
switch (result_number)
@ -2313,7 +2634,7 @@ class binary_reader
case token_type::literal_or_value:
default:
return sax->parse_error(chars_read, number_string, parse_error::create(115, chars_read,
exception_message(input_format_t::ubjson, concat("invalid number text: ", number_lexer.get_token_string()), "high-precision number"), nullptr));
exception_message(input_format, concat("invalid number text: ", number_lexer.get_token_string()), "high-precision number"), nullptr));
}
}
@ -2362,6 +2683,8 @@ class binary_reader
@note This function needs to respect the system's endianness, because
bytes in CBOR, MessagePack, and UBJSON are stored in network order
(big endian) and therefore need reordering on little endian systems.
On the other hand, BSON and BJData use little endian and should reorder
on big endian systems.
*/
template<typename NumberType, bool InputIsLittleEndian = false>
bool get_number(const input_format_t format, NumberType& result)
@ -2377,7 +2700,7 @@ class binary_reader
}
// reverse byte order prior to conversion if necessary
if (is_little_endian != InputIsLittleEndian)
if (is_little_endian != (InputIsLittleEndian || format == input_format_t::bjdata))
{
vec[sizeof(NumberType) - i - 1] = static_cast<std::uint8_t>(current);
}
@ -2514,6 +2837,10 @@ class binary_reader
error_msg += "BSON";
break;
case input_format_t::bjdata:
error_msg += "BJData";
break;
case input_format_t::json: // LCOV_EXCL_LINE
default: // LCOV_EXCL_LINE
JSON_ASSERT(false); // NOLINT(cert-dcl03-c,hicpp-static-assert,misc-static-assert) LCOV_EXCL_LINE
@ -2535,6 +2862,9 @@ class binary_reader
/// whether we can assume little endianness
const bool is_little_endian = little_endianness();
/// input format
const input_format_t input_format = input_format_t::json;
/// the SAX parser
json_sax_t* sax = nullptr;
};

View File

@ -23,7 +23,7 @@ namespace nlohmann
namespace detail
{
/// the supported input formats
enum class input_format_t { json, cbor, msgpack, ubjson, bson };
enum class input_format_t { json, cbor, msgpack, ubjson, bson, bjdata };
////////////////////
// input adapters //

View File

@ -2,12 +2,14 @@
#include <algorithm> // reverse
#include <array> // array
#include <map> // map
#include <cmath> // isnan, isinf
#include <cstdint> // uint8_t, uint16_t, uint32_t, uint64_t
#include <cstring> // memcpy
#include <limits> // numeric_limits
#include <string> // string
#include <utility> // move
#include <vector> // vector
#include <nlohmann/detail/input/binary_reader.hpp>
#include <nlohmann/detail/macro_scope.hpp>
@ -724,9 +726,11 @@ class binary_writer
@param[in] use_count whether to use '#' prefixes (optimized format)
@param[in] use_type whether to use '$' prefixes (optimized format)
@param[in] add_prefix whether prefixes need to be used for this value
@param[in] use_bjdata whether write in BJData format, default is false
*/
void write_ubjson(const BasicJsonType& j, const bool use_count,
const bool use_type, const bool add_prefix = true)
const bool use_type, const bool add_prefix = true,
const bool use_bjdata = false)
{
switch (j.type())
{
@ -752,19 +756,19 @@ class binary_writer
case value_t::number_integer:
{
write_number_with_ubjson_prefix(j.m_value.number_integer, add_prefix);
write_number_with_ubjson_prefix(j.m_value.number_integer, add_prefix, use_bjdata);
break;
}
case value_t::number_unsigned:
{
write_number_with_ubjson_prefix(j.m_value.number_unsigned, add_prefix);
write_number_with_ubjson_prefix(j.m_value.number_unsigned, add_prefix, use_bjdata);
break;
}
case value_t::number_float:
{
write_number_with_ubjson_prefix(j.m_value.number_float, add_prefix);
write_number_with_ubjson_prefix(j.m_value.number_float, add_prefix, use_bjdata);
break;
}
@ -774,7 +778,7 @@ class binary_writer
{
oa->write_character(to_char_type('S'));
}
write_number_with_ubjson_prefix(j.m_value.string->size(), true);
write_number_with_ubjson_prefix(j.m_value.string->size(), true, use_bjdata);
oa->write_characters(
reinterpret_cast<const CharType*>(j.m_value.string->c_str()),
j.m_value.string->size());
@ -792,14 +796,16 @@ class binary_writer
if (use_type && !j.m_value.array->empty())
{
JSON_ASSERT(use_count);
const CharType first_prefix = ubjson_prefix(j.front());
const CharType first_prefix = ubjson_prefix(j.front(), use_bjdata);
const bool same_prefix = std::all_of(j.begin() + 1, j.end(),
[this, first_prefix](const BasicJsonType & v)
[this, first_prefix, use_bjdata](const BasicJsonType & v)
{
return ubjson_prefix(v) == first_prefix;
return ubjson_prefix(v, use_bjdata) == first_prefix;
});
if (same_prefix)
std::vector<CharType> bjdx = {'[', '{', 'S', 'H', 'T', 'F', 'N', 'Z'}; // excluded markers in bjdata optimized type
if (same_prefix && !(use_bjdata && std::find(bjdx.begin(), bjdx.end(), first_prefix) != bjdx.end()))
{
prefix_required = false;
oa->write_character(to_char_type('$'));
@ -810,12 +816,12 @@ class binary_writer
if (use_count)
{
oa->write_character(to_char_type('#'));
write_number_with_ubjson_prefix(j.m_value.array->size(), true);
write_number_with_ubjson_prefix(j.m_value.array->size(), true, use_bjdata);
}
for (const auto& el : *j.m_value.array)
{
write_ubjson(el, use_count, use_type, prefix_required);
write_ubjson(el, use_count, use_type, prefix_required, use_bjdata);
}
if (!use_count)
@ -843,7 +849,7 @@ class binary_writer
if (use_count)
{
oa->write_character(to_char_type('#'));
write_number_with_ubjson_prefix(j.m_value.binary->size(), true);
write_number_with_ubjson_prefix(j.m_value.binary->size(), true, use_bjdata);
}
if (use_type)
@ -871,6 +877,14 @@ class binary_writer
case value_t::object:
{
if (use_bjdata && j.m_value.object->size() == 3 && j.m_value.object->find("_ArrayType_") != j.m_value.object->end() && j.m_value.object->find("_ArraySize_") != j.m_value.object->end() && j.m_value.object->find("_ArrayData_") != j.m_value.object->end())
{
if (!write_bjdata_ndarray(*j.m_value.object, use_count, use_type)) // decode bjdata ndarray in the JData format (https://github.com/NeuroJSON/jdata)
{
break;
}
}
if (add_prefix)
{
oa->write_character(to_char_type('{'));
@ -880,14 +894,16 @@ class binary_writer
if (use_type && !j.m_value.object->empty())
{
JSON_ASSERT(use_count);
const CharType first_prefix = ubjson_prefix(j.front());
const CharType first_prefix = ubjson_prefix(j.front(), use_bjdata);
const bool same_prefix = std::all_of(j.begin(), j.end(),
[this, first_prefix](const BasicJsonType & v)
[this, first_prefix, use_bjdata](const BasicJsonType & v)
{
return ubjson_prefix(v) == first_prefix;
return ubjson_prefix(v, use_bjdata) == first_prefix;
});
if (same_prefix)
std::vector<CharType> bjdx = {'[', '{', 'S', 'H', 'T', 'F', 'N', 'Z'}; // excluded markers in bjdata optimized type
if (same_prefix && !(use_bjdata && std::find(bjdx.begin(), bjdx.end(), first_prefix) != bjdx.end()))
{
prefix_required = false;
oa->write_character(to_char_type('$'));
@ -898,16 +914,16 @@ class binary_writer
if (use_count)
{
oa->write_character(to_char_type('#'));
write_number_with_ubjson_prefix(j.m_value.object->size(), true);
write_number_with_ubjson_prefix(j.m_value.object->size(), true, use_bjdata);
}
for (const auto& el : *j.m_value.object)
{
write_number_with_ubjson_prefix(el.first.size(), true);
write_number_with_ubjson_prefix(el.first.size(), true, use_bjdata);
oa->write_characters(
reinterpret_cast<const CharType*>(el.first.c_str()),
el.first.size());
write_ubjson(el.second, use_count, use_type, prefix_required);
write_ubjson(el.second, use_count, use_type, prefix_required, use_bjdata);
}
if (!use_count)
@ -974,7 +990,7 @@ class binary_writer
const double value)
{
write_bson_entry_header(name, 0x01);
write_number<double, true>(value);
write_number<double>(value, true);
}
/*!
@ -993,7 +1009,7 @@ class binary_writer
{
write_bson_entry_header(name, 0x02);
write_number<std::int32_t, true>(static_cast<std::int32_t>(value.size() + 1ul));
write_number<std::int32_t>(static_cast<std::int32_t>(value.size() + 1ul), true);
oa->write_characters(
reinterpret_cast<const CharType*>(value.c_str()),
value.size() + 1);
@ -1026,12 +1042,12 @@ class binary_writer
if ((std::numeric_limits<std::int32_t>::min)() <= value && value <= (std::numeric_limits<std::int32_t>::max)())
{
write_bson_entry_header(name, 0x10); // int32
write_number<std::int32_t, true>(static_cast<std::int32_t>(value));
write_number<std::int32_t>(static_cast<std::int32_t>(value), true);
}
else
{
write_bson_entry_header(name, 0x12); // int64
write_number<std::int64_t, true>(static_cast<std::int64_t>(value));
write_number<std::int64_t>(static_cast<std::int64_t>(value), true);
}
}
@ -1054,12 +1070,12 @@ class binary_writer
if (j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
{
write_bson_entry_header(name, 0x10 /* int32 */);
write_number<std::int32_t, true>(static_cast<std::int32_t>(j.m_value.number_unsigned));
write_number<std::int32_t>(static_cast<std::int32_t>(j.m_value.number_unsigned), true);
}
else if (j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
{
write_bson_entry_header(name, 0x12 /* int64 */);
write_number<std::int64_t, true>(static_cast<std::int64_t>(j.m_value.number_unsigned));
write_number<std::int64_t>(static_cast<std::int64_t>(j.m_value.number_unsigned), true);
}
else
{
@ -1107,7 +1123,7 @@ class binary_writer
const typename BasicJsonType::array_t& value)
{
write_bson_entry_header(name, 0x04); // array
write_number<std::int32_t, true>(static_cast<std::int32_t>(calc_bson_array_size(value)));
write_number<std::int32_t>(static_cast<std::int32_t>(calc_bson_array_size(value)), true);
std::size_t array_index = 0ul;
@ -1127,7 +1143,7 @@ class binary_writer
{
write_bson_entry_header(name, 0x05);
write_number<std::int32_t, true>(static_cast<std::int32_t>(value.size()));
write_number<std::int32_t>(static_cast<std::int32_t>(value.size()), true);
write_number(value.has_subtype() ? static_cast<std::uint8_t>(value.subtype()) : static_cast<std::uint8_t>(0x00));
oa->write_characters(reinterpret_cast<const CharType*>(value.data()), value.size());
@ -1249,7 +1265,7 @@ class binary_writer
*/
void write_bson_object(const typename BasicJsonType::object_t& value)
{
write_number<std::int32_t, true>(static_cast<std::int32_t>(calc_bson_object_size(value)));
write_number<std::int32_t>(static_cast<std::int32_t>(calc_bson_object_size(value)), true);
for (const auto& el : value)
{
@ -1295,20 +1311,22 @@ class binary_writer
template<typename NumberType, typename std::enable_if<
std::is_floating_point<NumberType>::value, int>::type = 0>
void write_number_with_ubjson_prefix(const NumberType n,
const bool add_prefix)
const bool add_prefix,
const bool use_bjdata)
{
if (add_prefix)
{
oa->write_character(get_ubjson_float_prefix(n));
}
write_number(n);
write_number(n, use_bjdata);
}
// UBJSON: write number (unsigned integer)
template<typename NumberType, typename std::enable_if<
std::is_unsigned<NumberType>::value, int>::type = 0>
void write_number_with_ubjson_prefix(const NumberType n,
const bool add_prefix)
const bool add_prefix,
const bool use_bjdata)
{
if (n <= static_cast<std::uint64_t>((std::numeric_limits<std::int8_t>::max)()))
{
@ -1316,7 +1334,7 @@ class binary_writer
{
oa->write_character(to_char_type('i')); // int8
}
write_number(static_cast<std::uint8_t>(n));
write_number(static_cast<std::uint8_t>(n), use_bjdata);
}
else if (n <= (std::numeric_limits<std::uint8_t>::max)())
{
@ -1324,7 +1342,7 @@ class binary_writer
{
oa->write_character(to_char_type('U')); // uint8
}
write_number(static_cast<std::uint8_t>(n));
write_number(static_cast<std::uint8_t>(n), use_bjdata);
}
else if (n <= static_cast<std::uint64_t>((std::numeric_limits<std::int16_t>::max)()))
{
@ -1332,7 +1350,15 @@ class binary_writer
{
oa->write_character(to_char_type('I')); // int16
}
write_number(static_cast<std::int16_t>(n));
write_number(static_cast<std::int16_t>(n), use_bjdata);
}
else if (use_bjdata && n <= static_cast<uint64_t>((std::numeric_limits<uint16_t>::max)()))
{
if (add_prefix)
{
oa->write_character(to_char_type('u')); // uint16 - bjdata only
}
write_number(static_cast<std::uint16_t>(n), use_bjdata);
}
else if (n <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
{
@ -1340,7 +1366,15 @@ class binary_writer
{
oa->write_character(to_char_type('l')); // int32
}
write_number(static_cast<std::int32_t>(n));
write_number(static_cast<std::int32_t>(n), use_bjdata);
}
else if (use_bjdata && n <= static_cast<uint64_t>((std::numeric_limits<uint32_t>::max)()))
{
if (add_prefix)
{
oa->write_character(to_char_type('m')); // uint32 - bjdata only
}
write_number(static_cast<std::uint32_t>(n), use_bjdata);
}
else if (n <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
{
@ -1348,7 +1382,15 @@ class binary_writer
{
oa->write_character(to_char_type('L')); // int64
}
write_number(static_cast<std::int64_t>(n));
write_number(static_cast<std::int64_t>(n), use_bjdata);
}
else if (use_bjdata && n <= (std::numeric_limits<uint64_t>::max)())
{
if (add_prefix)
{
oa->write_character(to_char_type('M')); // uint64 - bjdata only
}
write_number(static_cast<std::uint64_t>(n), use_bjdata);
}
else
{
@ -1358,7 +1400,7 @@ class binary_writer
}
const auto number = BasicJsonType(n).dump();
write_number_with_ubjson_prefix(number.size(), true);
write_number_with_ubjson_prefix(number.size(), true, use_bjdata);
for (std::size_t i = 0; i < number.size(); ++i)
{
oa->write_character(to_char_type(static_cast<std::uint8_t>(number[i])));
@ -1371,7 +1413,8 @@ class binary_writer
std::is_signed<NumberType>::value&&
!std::is_floating_point<NumberType>::value, int >::type = 0 >
void write_number_with_ubjson_prefix(const NumberType n,
const bool add_prefix)
const bool add_prefix,
const bool use_bjdata)
{
if ((std::numeric_limits<std::int8_t>::min)() <= n && n <= (std::numeric_limits<std::int8_t>::max)())
{
@ -1379,7 +1422,7 @@ class binary_writer
{
oa->write_character(to_char_type('i')); // int8
}
write_number(static_cast<std::int8_t>(n));
write_number(static_cast<std::int8_t>(n), use_bjdata);
}
else if (static_cast<std::int64_t>((std::numeric_limits<std::uint8_t>::min)()) <= n && n <= static_cast<std::int64_t>((std::numeric_limits<std::uint8_t>::max)()))
{
@ -1387,7 +1430,7 @@ class binary_writer
{
oa->write_character(to_char_type('U')); // uint8
}
write_number(static_cast<std::uint8_t>(n));
write_number(static_cast<std::uint8_t>(n), use_bjdata);
}
else if ((std::numeric_limits<std::int16_t>::min)() <= n && n <= (std::numeric_limits<std::int16_t>::max)())
{
@ -1395,7 +1438,15 @@ class binary_writer
{
oa->write_character(to_char_type('I')); // int16
}
write_number(static_cast<std::int16_t>(n));
write_number(static_cast<std::int16_t>(n), use_bjdata);
}
else if (use_bjdata && (static_cast<std::int64_t>((std::numeric_limits<std::uint16_t>::min)()) <= n && n <= static_cast<std::int64_t>((std::numeric_limits<std::uint16_t>::max)())))
{
if (add_prefix)
{
oa->write_character(to_char_type('u')); // uint16 - bjdata only
}
write_number(static_cast<uint16_t>(n), use_bjdata);
}
else if ((std::numeric_limits<std::int32_t>::min)() <= n && n <= (std::numeric_limits<std::int32_t>::max)())
{
@ -1403,7 +1454,15 @@ class binary_writer
{
oa->write_character(to_char_type('l')); // int32
}
write_number(static_cast<std::int32_t>(n));
write_number(static_cast<std::int32_t>(n), use_bjdata);
}
else if (use_bjdata && (static_cast<std::int64_t>((std::numeric_limits<std::uint32_t>::min)()) <= n && n <= static_cast<std::int64_t>((std::numeric_limits<std::uint32_t>::max)())))
{
if (add_prefix)
{
oa->write_character(to_char_type('m')); // uint32 - bjdata only
}
write_number(static_cast<uint32_t>(n), use_bjdata);
}
else if ((std::numeric_limits<std::int64_t>::min)() <= n && n <= (std::numeric_limits<std::int64_t>::max)())
{
@ -1411,7 +1470,7 @@ class binary_writer
{
oa->write_character(to_char_type('L')); // int64
}
write_number(static_cast<std::int64_t>(n));
write_number(static_cast<std::int64_t>(n), use_bjdata);
}
// LCOV_EXCL_START
else
@ -1422,7 +1481,7 @@ class binary_writer
}
const auto number = BasicJsonType(n).dump();
write_number_with_ubjson_prefix(number.size(), true);
write_number_with_ubjson_prefix(number.size(), true, use_bjdata);
for (std::size_t i = 0; i < number.size(); ++i)
{
oa->write_character(to_char_type(static_cast<std::uint8_t>(number[i])));
@ -1434,7 +1493,7 @@ class binary_writer
/*!
@brief determine the type prefix of container values
*/
CharType ubjson_prefix(const BasicJsonType& j) const noexcept
CharType ubjson_prefix(const BasicJsonType& j, const bool use_bjdata) const noexcept
{
switch (j.type())
{
@ -1458,10 +1517,18 @@ class binary_writer
{
return 'I';
}
if (use_bjdata && ((std::numeric_limits<std::uint16_t>::min)() <= j.m_value.number_integer && j.m_value.number_integer <= (std::numeric_limits<std::uint16_t>::max)()))
{
return 'u';
}
if ((std::numeric_limits<std::int32_t>::min)() <= j.m_value.number_integer && j.m_value.number_integer <= (std::numeric_limits<std::int32_t>::max)())
{
return 'l';
}
if (use_bjdata && ((std::numeric_limits<std::uint32_t>::min)() <= j.m_value.number_integer && j.m_value.number_integer <= (std::numeric_limits<std::uint32_t>::max)()))
{
return 'm';
}
if ((std::numeric_limits<std::int64_t>::min)() <= j.m_value.number_integer && j.m_value.number_integer <= (std::numeric_limits<std::int64_t>::max)())
{
return 'L';
@ -1484,14 +1551,26 @@ class binary_writer
{
return 'I';
}
if (use_bjdata && j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::uint16_t>::max)()))
{
return 'u';
}
if (j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
{
return 'l';
}
if (use_bjdata && j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::uint32_t>::max)()))
{
return 'm';
}
if (j.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
{
return 'L';
}
if (use_bjdata && j.m_value.number_unsigned <= (std::numeric_limits<std::uint64_t>::max)())
{
return 'M';
}
// anything else is treated as high-precision number
return 'H'; // LCOV_EXCL_LINE
}
@ -1525,6 +1604,118 @@ class binary_writer
return 'D'; // float 64
}
/*!
@return false if the object is successfully converted to a bjdata ndarray, true if the type or size is invalid
*/
bool write_bjdata_ndarray(const typename BasicJsonType::object_t& value, const bool use_count, const bool use_type)
{
std::map<string_t, CharType> bjdtype = {{"uint8", 'U'}, {"int8", 'i'}, {"uint16", 'u'}, {"int16", 'I'},
{"uint32", 'm'}, {"int32", 'l'}, {"uint64", 'M'}, {"int64", 'L'}, {"single", 'd'}, {"double", 'D'}, {"char", 'C'}
};
string_t key = "_ArrayType_";
auto it = bjdtype.find(static_cast<string_t>(value.at(key)));
if (it == bjdtype.end())
{
return true;
}
CharType dtype = it->second;
key = "_ArraySize_";
std::size_t len = (value.at(key).empty() ? 0 : 1);
for (const auto& el : value.at(key))
{
len *= static_cast<std::size_t>(el.m_value.number_unsigned);
}
key = "_ArrayData_";
if (value.at(key).size() != len)
{
return true;
}
oa->write_character('[');
oa->write_character('$');
oa->write_character(dtype);
oa->write_character('#');
key = "_ArraySize_";
write_ubjson(value.at(key), use_count, use_type, true, true);
key = "_ArrayData_";
if (dtype == 'U' || dtype == 'C')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::uint8_t>(el.m_value.number_unsigned), true);
}
}
else if (dtype == 'i')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::int8_t>(el.m_value.number_integer), true);
}
}
else if (dtype == 'u')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::uint16_t>(el.m_value.number_unsigned), true);
}
}
else if (dtype == 'I')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::int16_t>(el.m_value.number_integer), true);
}
}
else if (dtype == 'm')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::uint32_t>(el.m_value.number_unsigned), true);
}
}
else if (dtype == 'l')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::int32_t>(el.m_value.number_integer), true);
}
}
else if (dtype == 'M')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::uint64_t>(el.m_value.number_unsigned), true);
}
}
else if (dtype == 'L')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<std::int64_t>(el.m_value.number_integer), true);
}
}
else if (dtype == 'd')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<float>(el.m_value.number_float), true);
}
}
else if (dtype == 'D')
{
for (const auto& el : value.at(key))
{
write_number(static_cast<double>(el.m_value.number_float), true);
}
}
return false;
}
///////////////////////
// Utility functions //
///////////////////////
@ -1532,16 +1723,18 @@ class binary_writer
/*
@brief write a number to output input
@param[in] n number of type @a NumberType
@tparam NumberType the type of the number
@tparam OutputIsLittleEndian Set to true if output data is
@param[in] OutputIsLittleEndian Set to true if output data is
required to be little endian
@tparam NumberType the type of the number
@note This function needs to respect the system's endianness, because bytes
in CBOR, MessagePack, and UBJSON are stored in network order (big
endian) and therefore need reordering on little endian systems.
On the other hand, BSON and BJData use little endian and should reorder
on big endian systems.
*/
template<typename NumberType, bool OutputIsLittleEndian = false>
void write_number(const NumberType n)
template<typename NumberType>
void write_number(const NumberType n, const bool OutputIsLittleEndian = false)
{
// step 1: write number to array of length NumberType
std::array<CharType, sizeof(NumberType)> vec{};

View File

@ -3773,7 +3773,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
auto ia = detail::input_adapter(std::forward<InputType>(i));
return format == input_format_t::json
? parser(std::move(ia), nullptr, true, ignore_comments).sax_parse(sax, strict)
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia)).sax_parse(format, sax, strict);
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia), format).sax_parse(format, sax, strict);
}
/// @brief generate SAX events
@ -3788,7 +3788,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
auto ia = detail::input_adapter(std::move(first), std::move(last));
return format == input_format_t::json
? parser(std::move(ia), nullptr, true, ignore_comments).sax_parse(sax, strict)
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia)).sax_parse(format, sax, strict);
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia), format).sax_parse(format, sax, strict);
}
/// @brief generate SAX events
@ -3809,7 +3809,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
? parser(std::move(ia), nullptr, true, ignore_comments).sax_parse(sax, strict)
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia)).sax_parse(format, sax, strict);
: detail::binary_reader<basic_json, decltype(ia), SAX>(std::move(ia), format).sax_parse(format, sax, strict);
}
#ifndef JSON_NO_IO
/// @brief deserialize from stream
@ -3965,6 +3965,33 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
binary_writer<char>(o).write_ubjson(j, use_size, use_type);
}
/// @brief create a BJData serialization of a given JSON value
/// @sa https://json.nlohmann.me/api/basic_json/to_bjdata/
static std::vector<std::uint8_t> to_bjdata(const basic_json& j,
const bool use_size = false,
const bool use_type = false)
{
std::vector<std::uint8_t> result;
to_bjdata(j, result, use_size, use_type);
return result;
}
/// @brief create a BJData serialization of a given JSON value
/// @sa https://json.nlohmann.me/api/basic_json/to_bjdata/
static void to_bjdata(const basic_json& j, detail::output_adapter<std::uint8_t> o,
const bool use_size = false, const bool use_type = false)
{
binary_writer<std::uint8_t>(o).write_ubjson(j, use_size, use_type, true, true);
}
/// @brief create a BJData serialization of a given JSON value
/// @sa https://json.nlohmann.me/api/basic_json/to_bjdata/
static void to_bjdata(const basic_json& j, detail::output_adapter<char> o,
const bool use_size = false, const bool use_type = false)
{
binary_writer<char>(o).write_ubjson(j, use_size, use_type, true, true);
}
/// @brief create a BSON serialization of a given JSON value
/// @sa https://json.nlohmann.me/api/basic_json/to_bson/
static std::vector<std::uint8_t> to_bson(const basic_json& j)
@ -4000,7 +4027,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::forward<InputType>(i));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::cbor).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
return res ? result : basic_json(value_t::discarded);
}
@ -4016,7 +4043,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::move(first), std::move(last));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::cbor).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
return res ? result : basic_json(value_t::discarded);
}
@ -4043,7 +4070,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = i.get();
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::cbor).sax_parse(input_format_t::cbor, &sdp, strict, tag_handler);
return res ? result : basic_json(value_t::discarded);
}
@ -4058,7 +4085,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::forward<InputType>(i));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::msgpack, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::msgpack).sax_parse(input_format_t::msgpack, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4073,7 +4100,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::move(first), std::move(last));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::msgpack, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::msgpack).sax_parse(input_format_t::msgpack, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4097,7 +4124,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = i.get();
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::msgpack, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::msgpack).sax_parse(input_format_t::msgpack, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4112,7 +4139,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::forward<InputType>(i));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::ubjson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::ubjson).sax_parse(input_format_t::ubjson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4127,7 +4154,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::move(first), std::move(last));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::ubjson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::ubjson).sax_parse(input_format_t::ubjson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4151,10 +4178,64 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = i.get();
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::ubjson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::ubjson).sax_parse(input_format_t::ubjson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
/// @brief create a JSON value from an input in BJData format
/// @sa https://json.nlohmann.me/api/basic_json/from_bjdata/
template<typename InputType>
JSON_HEDLEY_WARN_UNUSED_RESULT
static basic_json from_bjdata(InputType&& i,
const bool strict = true,
const bool allow_exceptions = true)
{
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::forward<InputType>(i));
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bjdata).sax_parse(input_format_t::bjdata, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
/// @brief create a JSON value from an input in BJData format
/// @sa https://json.nlohmann.me/api/basic_json/from_bjdata/
template<typename IteratorType>
JSON_HEDLEY_WARN_UNUSED_RESULT
static basic_json from_bjdata(IteratorType first, IteratorType last,
const bool strict = true,
const bool allow_exceptions = true)
{
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::move(first), std::move(last));
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bjdata).sax_parse(input_format_t::bjdata, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
template<typename T>
JSON_HEDLEY_WARN_UNUSED_RESULT
static basic_json from_bjdata(const T* ptr, std::size_t len,
const bool strict = true,
const bool allow_exceptions = true)
{
return from_bjdata(ptr, ptr + len, strict, allow_exceptions);
}
JSON_HEDLEY_WARN_UNUSED_RESULT
static basic_json from_bjdata(detail::span_input_adapter&& i,
const bool strict = true,
const bool allow_exceptions = true)
{
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = i.get();
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bjdata).sax_parse(input_format_t::bjdata, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
/// @brief create a JSON value from an input in BSON format
/// @sa https://json.nlohmann.me/api/basic_json/from_bson/
template<typename InputType>
@ -4166,7 +4247,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::forward<InputType>(i));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::bson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bson).sax_parse(input_format_t::bson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4181,7 +4262,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
basic_json result;
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = detail::input_adapter(std::move(first), std::move(last));
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::bson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bson).sax_parse(input_format_t::bson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
@ -4205,7 +4286,7 @@ class basic_json // NOLINT(cppcoreguidelines-special-member-functions,hicpp-spec
detail::json_sax_dom_parser<basic_json> sdp(result, allow_exceptions);
auto ia = i.get();
// NOLINTNEXTLINE(hicpp-move-const-arg,performance-move-const-arg)
const bool res = binary_reader<decltype(ia)>(std::move(ia)).sax_parse(input_format_t::bson, &sdp, strict);
const bool res = binary_reader<decltype(ia)>(std::move(ia), input_format_t::bson).sax_parse(input_format_t::bson, &sdp, strict);
return res ? result : basic_json(value_t::discarded);
}
/// @}

File diff suppressed because it is too large Load Diff

View File

@ -72,7 +72,7 @@ endif()
if (CMAKE_CXX_COMPILER_ID STREQUAL "MSVC")
# avoid stack overflow, see https://github.com/nlohmann/json/issues/2955
json_test_set_test_options("test-cbor;test-msgpack;test-ubjson" LINK_OPTIONS /STACK:4000000)
json_test_set_test_options("test-cbor;test-msgpack;test-ubjson;test-bjdata" LINK_OPTIONS /STACK:4000000)
endif()
# disable exceptions for test-disabled_exceptions

View File

@ -10,7 +10,7 @@ CXXFLAGS += -std=c++11
CPPFLAGS += -I ../single_include
FUZZER_ENGINE = src/fuzzer-driver_afl.cpp
FUZZERS = parse_afl_fuzzer parse_bson_fuzzer parse_cbor_fuzzer parse_msgpack_fuzzer parse_ubjson_fuzzer
FUZZERS = parse_afl_fuzzer parse_bson_fuzzer parse_cbor_fuzzer parse_msgpack_fuzzer parse_ubjson_fuzzer parse_bjdata_fuzzer
fuzzers: $(FUZZERS)
parse_afl_fuzzer:
@ -27,3 +27,6 @@ parse_msgpack_fuzzer:
parse_ubjson_fuzzer:
$(CXX) $(CXXFLAGS) $(CPPFLAGS) $(FUZZER_ENGINE) src/fuzzer-parse_ubjson.cpp -o $@
parse_bjdata_fuzzer:
$(CXX) $(CXXFLAGS) $(CPPFLAGS) $(FUZZER_ENGINE) src/fuzzer-parse_bjdata.cpp -o $@

View File

@ -0,0 +1,84 @@
/*
__ _____ _____ _____
__| | __| | | | JSON for Modern C++ (fuzz test support)
| | |__ | | | | | | version 3.10.5
|_____|_____|_____|_|___| https://github.com/nlohmann/json
This file implements a parser test suitable for fuzz testing. Given a byte
array data, it performs the following steps:
- j1 = from_bjdata(data)
- vec = to_bjdata(j1)
- j2 = from_bjdata(vec)
- assert(j1 == j2)
- vec2 = to_bjdata(j1, use_size = true, use_type = false)
- j3 = from_bjdata(vec2)
- assert(j1 == j3)
- vec3 = to_bjdata(j1, use_size = true, use_type = true)
- j4 = from_bjdata(vec3)
- assert(j1 == j4)
The provided function `LLVMFuzzerTestOneInput` can be used in different fuzzer
drivers.
Licensed under the MIT License <http://opensource.org/licenses/MIT>.
*/
#include <iostream>
#include <sstream>
#include <nlohmann/json.hpp>
using json = nlohmann::json;
// see http://llvm.org/docs/LibFuzzer.html
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
try
{
// step 1: parse input
std::vector<uint8_t> vec1(data, data + size);
json j1 = json::from_bjdata(vec1);
try
{
// step 2.1: round trip without adding size annotations to container types
std::vector<uint8_t> vec2 = json::to_bjdata(j1, false, false);
// step 2.2: round trip with adding size annotations but without adding type annonations to container types
std::vector<uint8_t> vec3 = json::to_bjdata(j1, true, false);
// step 2.3: round trip with adding size as well as type annotations to container types
std::vector<uint8_t> vec4 = json::to_bjdata(j1, true, true);
// parse serialization
json j2 = json::from_bjdata(vec2);
json j3 = json::from_bjdata(vec3);
json j4 = json::from_bjdata(vec4);
// serializations must match
assert(json::to_bjdata(j2, false, false) == vec2);
assert(json::to_bjdata(j3, true, false) == vec3);
assert(json::to_bjdata(j4, true, true) == vec4);
}
catch (const json::parse_error&)
{
// parsing a BJData serialization must not fail
assert(false);
}
}
catch (const json::parse_error&)
{
// parse errors are ok, because input may be random bytes
}
catch (const json::type_error&)
{
// type errors can occur during parsing, too
}
catch (const json::out_of_range&)
{
// out of range errors may happen if provided sizes are excessive
}
// return 0 - non-zero return values are reserved for future use
return 0;
}

3355
test/src/unit-bjdata.cpp Normal file

File diff suppressed because it is too large Load Diff