json: Inconsistent Behaviour of NaN & Null Values

Description

Let us assume we have nan = std::numeric_limits<double>::quiet_NaN() and null = nlohmann::json() in our json object. Then the first is a number and NOT null, whereas the second is null and NOT a number. However, in string form, both appear as null.

So after dumping everything to a json string and parsing it back into a json object, both values have become null, and neither is a number. Trying to convert either of them back to a double then fails with type must be number, but is null.

Reproduction steps

See Minimal code example or https://godbolt.org/z/b9n7Eq9qv

Expected vs. actual results

Both std::numeric_limits<double>::quiet_NaN() and nlohmann::json() are displayed as null in a json file, so an nlohmann::json object should treat both the same at any given point in time. Two possible solutions:

  1. Either immediately convert std::numeric_limits<double>::quiet_NaN() to a true null object (which is not a number), so that the json in memory is consistent with its json string representation.
  2. Or allow null to be cast to NaN when converting to double.

Since NaN serializes to null, I expect that converting a json null object to a double yields NaN. That is, solution 2.
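
As a caller-side illustration of solution 2 (the helper name as_double_or_nan is hypothetical, not part of the library):

#include <nlohmann/json.hpp>
#include <limits>

// Hypothetical helper sketching solution 2: treat null as NaN when reading
// a double, instead of letting get<double>() throw.
double as_double_or_nan(const nlohmann::json& v)
{
    if (v.is_null())
        return std::numeric_limits<double>::quiet_NaN();
    return v.get<double>();  // still throws for strings, objects, etc.
}

// const double v2 = as_double_or_nan(jsonData2.at(1));  // NaN instead of an exception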

Minimal code example

#include <nlohmann/json.hpp>
#include <iostream>
#include <limits>

int main()
{
    using json = nlohmann::json;

    const double NaN     = std::numeric_limits<double>::quiet_NaN();
    json jsonData1       = {1.72, NaN, json()};
    const json jsonData2 = json::parse(jsonData1.dump());
    std::cout << "DATA1: " << jsonData1.dump() << "\n";
    std::cout << "DATA2: " << jsonData2.dump() << "\n\n";

    for (const auto& v : jsonData1)
        std::cout << ">> is_null=" << v.is_null() << ", is_number=" << v.is_number() << ", value='" << v << "'\n";
    std::cout << "\n";
    for (const auto& v : jsonData2)
        std::cout << ">> is_null=" << v.is_null() << ", is_number=" << v.is_number() << ", value='" << v << "'\n";

    const double v1 = jsonData1.at(1);  // OK: NaN is still a number inside jsonData1
    // const double v2 = jsonData2.at(1);  // ERROR: type must be number, but is null

    return 0;
}

Error messages

ERROR: type must be number, but is null

Compiler and operating system

Linux (gcc 12.2), Windows (msvc v19.33)

Library version

3.11.1 or 3.11.2

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

I’m not sure if I care whether it’s enabled by default, but could we at least add an optional flag to read/write NaN, Infinity, and -Infinity?

The crux of the problem is that the round-trip const json data2 = json::parse(data1.dump()); creates inconsistent behaviour between data1 and data2. Like trzeciak, I send numerical json data over the wire, so if the objects behave differently before and after serialisation, it’s a big problem. And it’s impossible for me to catch or handle every NaN.

To recap, so far 3 solutions have been proposed:

  1. Auto-convert NaN to null inside a json object
  2. Allow conversion of null to NaN via get<double>()
  3. Throw an error when NaN is set to a json object

I personally still think solution 2 has merit, because irrespective of whether a null was originally a true null or a NaN (or something else), converting null to NaN just feels more natural than throwing an error.
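
For comparison, solution 1 can already be approximated on the caller side today; the number_or_null helper below is a hypothetical sketch, not library API:

#include <nlohmann/json.hpp>
#include <cmath>

// Hypothetical helper sketching solution 1: store NaN as a true null, so the
// in-memory json already matches its serialized form.
nlohmann::json number_or_null(double d)
{
    return std::isnan(d) ? nlohmann::json(nullptr) : nlohmann::json(d);
}

// json data = {1.72, number_or_null(NaN), json()};
// json::parse(data.dump()) == data should then hold, since no NaN remains in memory.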

Nonetheless, another solution:

  1. On https://json.nlohmann.me/api/basic_json/dump it says “The function tries to mimic Python’s json.dumps() function”. And on https://docs.python.org/3.10/library/json.html#infinite-and-nan-number-values it says that

    […] By default, this module accepts and outputs Infinity, -Infinity, and NaN as if they were valid JSON number literal values.

So Python supports NaN, and it’s even enabled by default. We can control this behaviour via the allow_nan parameter. This parameter could be added to nlohmann json’s dump & parse functions:

string_t dump(int indent, char indent_char, bool ensure_ascii, error_handler_t, bool allow_nan);
basic_json parse(InputType&& i, parser_callback_t cb, bool allow_exceptions, bool ignore_comments, bool allow_nan);

Then we can write const json data2 = json::parse(data1.dump(-1, ' ', false, strict, true), nullptr, true, false, true); and we truly have data1 == data2.

Of course, listing all the parameters is a bit cumbersome. So maybe new overloads or new functions could help:

// New dump overload like string_t dump(nan_policy_t) const
// where nan_policy_t = enum class { allow, forbid };
const json data2 = json::parse(data1.dump(nan_policy_t::allow), nan_policy_t::allow);

// New "..._with_nan" functions
const json data2 = json::parse_with_nan(data1.dump_with_nan());

However, if nlohmann::json’s default for allow_nan were true, then this may be a breaking change (again, for some users). But it would follow Python’s example (as promised in the docs), and it may remove the need for the additional overloads/functions (as I suppose most people would be happy with NaN being supported by default).
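
Until something like allow_nan exists, one caller-side workaround is to round-trip NaN through a sentinel before dumping and after parsing. The encode_nan/decode_nan helpers below are a hypothetical sketch (and assume the payload never contains the literal string "NaN"):

#include <nlohmann/json.hpp>
#include <cmath>
#include <limits>
#include <string>

// Hypothetical sketch: replace NaN numbers with the string "NaN" before dump(),
// and turn that sentinel back into a double NaN after parse().
void encode_nan(nlohmann::json& j)
{
    if (j.is_number_float() && std::isnan(j.get<double>()))
        j = "NaN";
    else if (j.is_structured())
        for (auto& child : j) encode_nan(child);
}

void decode_nan(nlohmann::json& j)
{
    if (j.is_string() && j.get<std::string>() == "NaN")
        j = std::numeric_limits<double>::quiet_NaN();
    else if (j.is_structured())
        for (auto& child : j) decode_nan(child);
}

// encode_nan(data1);
// json data2 = json::parse(data1.dump());
// decode_nan(data2);  // data2 now carries NaN again where data1 did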

Hope this helps.

Understood.

However, if we have nlohmann::json obj = {1.72, NaN, null}, then the round-trip

obj == nlohmann::json::parse(obj.dump())

fails because nlohmann::json::parse(obj.dump()) gives {1.72, null, null} instead.

The json strings are equal, but the json objects are different, which seems like an inconsistency.