json: Inconsistent Behaviour of NaN & Null Values

Description

Let us assume we have nan = std::numeric_limits<double>::quiet_NaN() and null = nlohmann::json() in our json object. Then the first is a number and NOT null, whereas the second is null and NOT a number. However, in string form, both appear as null.

So after dumping everything to a json string and parsing it back into a json object, both values have become null, and neither is a number. Trying to convert either of them back to a double then fails with type must be number, but is null.

Reproduction steps

See Minimal code example or https://godbolt.org/z/b9n7Eq9qv

Expected vs. actual results

Both std::numeric_limits<double>::quiet_NaN() and nlohmann::json() are displayed as null in a json file, so an nlohmann::json object should treat both the same at any given point in time. Two possible solutions:

  1. Either immediately convert std::numeric_limits<double>::quiet_NaN() to a true null object (which is not a number), so that the json in memory is consistent with its json string representation.
  2. Or allow null to be cast to NaN when converting to double.

Since NaN serializes to null, I expect that converting a json null object to a double yields NaN. That is, solution 2.
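
As a caller-side illustration of solution 2 (the helper name as_double_or_nan is hypothetical, not part of the library):

#include <nlohmann/json.hpp>
#include <limits>

// Hypothetical helper sketching solution 2: treat null as NaN when reading
// a double, instead of letting get<double>() throw.
double as_double_or_nan(const nlohmann::json& v)
{
    if (v.is_null())
        return std::numeric_limits<double>::quiet_NaN();
    return v.get<double>();  // still throws for strings, objects, etc.
}

// const double v2 = as_double_or_nan(jsonData2.at(1));  // NaN instead of an exception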

Minimal code example

#include <nlohmann/json.hpp>
#include <iostream>
#include <limits>

int main()
{
    using json = nlohmann::json;

    const double NaN     = std::numeric_limits<double>::quiet_NaN();
    json jsonData1       = {1.72, NaN, json()};
    const json jsonData2 = json::parse(jsonData1.dump());
    std::cout << "DATA1: " << jsonData1.dump() << "\n";
    std::cout << "DATA2: " << jsonData2.dump() << "\n\n";

    for (const auto& v : jsonData1)
        std::cout << ">> is_null=" << v.is_null() << ", is_number=" << v.is_number() << ", value='" << v << "'\n";
    std::cout << "\n";
    for (const auto& v : jsonData2)
        std::cout << ">> is_null=" << v.is_null() << ", is_number=" << v.is_number() << ", value='" << v << "'\n";

    const double v1 = jsonData1.at(1);  // OK: NaN is still a number inside jsonData1
    // const double v2 = jsonData2.at(1);  // ERROR: type must be number, but is null

    return 0;
}

Error messages

ERROR: type must be number, but is null

Compiler and operating system

Linux (gcc 12.2), Windows (msvc v19.33)

Library version

3.11.1 or 3.11.2

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

I’m not sure if I care whether it’s enabled by default, but could we at least add an optional flag to read/write NaN, Infinity, and -Infinity?

The crux of the problem is that the round-trip const json data2 = json::parse(data1.dump()); creates inconsistent behaviour between data1 and data2. Like trzeciak, I send numerical json data over the wire, so if the objects behave differently before and after serialisation, it’s a big problem. And it’s impossible for me to catch or handle every NaN.

To recap, so far 3 solutions have been proposed:

  1. Auto-convert NaN to null inside a json object
  2. Allow conversion of null to NaN via get<double>()
  3. Throw an error when NaN is set to a json object

I personally still think solution 2 has merit, because irrespective of whether a null was originally a true null or a NaN (or something else), converting null to NaN just feels more natural than throwing an error.
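
For comparison, solution 1 can already be approximated on the caller side today; the number_or_null helper below is a hypothetical sketch, not library API:

#include <nlohmann/json.hpp>
#include <cmath>

// Hypothetical helper sketching solution 1: store NaN as a true null, so the
// in-memory json already matches its serialized form.
nlohmann::json number_or_null(double d)
{
    return std::isnan(d) ? nlohmann::json(nullptr) : nlohmann::json(d);
}

// json data = {1.72, number_or_null(NaN), json()};
// json::parse(data.dump()) == data should then hold, since no NaN remains in memory.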

Nonetheless, another solution:

  1. On https://json.nlohmann.me/api/basic_json/dump it says “The function tries to mimic Python’s json.dumps() function”. And on https://docs.python.org/3.10/library/json.html#infinite-and-nan-number-values it says that

    […] By default, this module accepts and outputs Infinity, -Infinity, and NaN as if they were valid JSON number literal values.

So Python supports NaN, and it’s even enabled by default. We can control this behaviour via the allow_nan parameter. This parameter could be added to nlohmann json’s dump & parse functions:

string_t dump(int indent, char indent_char, bool ensure_ascii, error_handler_t, bool allow_nan);
basic_json parse(InputType&& i, parser_callback_t cb, bool allow_exceptions, bool ignore_comments, bool allow_nan);

Then we can write const json data2 = json::parse(data1.dump(-1, ' ', false, strict, true), nullptr, true, false, true); and we truly have data1 == data2.

Of course, listing all the parameters is a bit cumbersome. So maybe new overloads or new functions could help:

// New dump overload like string_t dump(nan_policy_t) const
// where nan_policy_t = enum class { allow, forbid };
const json data2 = json::parse(data1.dump(nan_policy_t::allow), nan_policy_t::allow);

// New "..._with_nan" functions
const json data2 = json::parse_with_nan(data1.dump_with_nan());

However, if nlohmann::json’s default for allow_nan were true, then this may be a breaking change (again, for some users). But it would follow Python’s example (as promised in the docs), and it may remove the need for the additional overloads/functions (as I suppose most people would be happy with NaN being supported by default).
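
Until something like allow_nan exists, one caller-side workaround is to round-trip NaN through a sentinel before dumping and after parsing. The encode_nan/decode_nan helpers below are a hypothetical sketch (and assume the payload never contains the literal string "NaN"):

#include <nlohmann/json.hpp>
#include <cmath>
#include <limits>
#include <string>

// Hypothetical sketch: replace NaN numbers with the string "NaN" before dump(),
// and turn that sentinel back into a double NaN after parse().
void encode_nan(nlohmann::json& j)
{
    if (j.is_number_float() && std::isnan(j.get<double>()))
        j = "NaN";
    else if (j.is_structured())
        for (auto& child : j) encode_nan(child);
}

void decode_nan(nlohmann::json& j)
{
    if (j.is_string() && j.get<std::string>() == "NaN")
        j = std::numeric_limits<double>::quiet_NaN();
    else if (j.is_structured())
        for (auto& child : j) decode_nan(child);
}

// encode_nan(data1);
// json data2 = json::parse(data1.dump());
// decode_nan(data2);  // data2 now carries NaN again where data1 did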

Hope this helps.

Understood.

However, if we have nlohmann::json obj = {1.72, NaN, null}, then the round-trip

obj == nlohmann::json::parse(obj.dump())

fails because nlohmann::json::parse(obj.dump()) gives {1.72, null, null} instead.

The json strings are equal, but the json objects are different, which seems like an inconsistency.