Skip to content

gh-98306: Support JSON encoding of NaNs and infinities as null #115246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from

Conversation

mdickinson
Copy link
Member

@mdickinson mdickinson commented Feb 10, 2024

This PR:

  • adds support for allow_nan="as_null" in JSON encoding (json.dump, json.dumps, JSONEncoder.encode). This offers an encoding mode that automatically converts floating-point infinities and NaNs to JSON nulls. This matches the ECMAScript specification for JSON.stringify, so makes interoperability between JavaScript and Python somewhat easier.
  • deprecates use of any other string as a value for allow_nan, to allow space for future expansion of the protocol if necessary

Note that strictly speaking, this is a backwards incompatible change: allow_nan="null" was valid before (since the string "null" is truthy), and this PR changes its meaning. The likelihood of actual breakage from this change seems negligible. Nevertheless, per PEP 387 we would need to request a Steering Council exception before this could be merged.


📚 Documentation preview 📚: https://cpython-previews--115246.org.readthedocs.build/

@mdickinson
Copy link
Member Author

Note: Strictly speaking, this is a backwards incompatible change: allow_nan="null" was valid before (since the string "null" is truthy), and this PR changes its meaning. The likelihood of actual breakage from this change seems likely to be negligible. Nevertheless, per PEP 387 we would need to request a Steering Council exception before this could be merged.

@Dzeri96
Copy link

Dzeri96 commented Mar 3, 2024

I can't comment on the C implementation as I'm simply unfamiliar with that part of the codebase, but one semantic change I'd suggest is to use "as_null"/"convert_to_null" instead of "null". In my mind, allow_nan="as_null" makes it more obvious as to what the argument does, especially since we're re-using a kwarg with a fairly inflexible name.

Apart from that, I'd really like to avoid using stringified versions of reserved keywords in the code, even if they come from other languages. I'm sure anyone who was unfortunate enough to debug serialization errors can attest to that.

@Dzeri96
Copy link

Dzeri96 commented Mar 3, 2024

One more thing: Can we type-hint the allow_nan kwarg with a Literal[True, False, "as_null"] ?

@mdickinson
Copy link
Member Author

In my mind, allow_nan="as_null" makes it more obvious as to what the argument does, especially since we're re-using a kwarg with a fairly inflexible name.

SGTM. I've updated the PR and the description.

One more thing: Can we type-hint the allow_nan kwarg with a Literal[True, False, "as_null"] ?

Python's own std. lib. source doesn't use type hints, as a rule; this would need to be a change in typeshed.

@mdickinson
Copy link
Member Author

Nevertheless, per PEP 387 we would need to request a Steering Council exception before this could be merged.

python/steering-council#244

@mdickinson
Copy link
Member Author

mdickinson commented May 12, 2024

Adding DO-NOT-MERGE label, pending response to python/steering-council#244.

@gpshead
Copy link
Member

gpshead commented May 12, 2024

API suggestion: Do not use a str literal for the new flag value, use a new singleton json.AS_NULL constant instead.

Use of a str literal on something that also accepts boolean values allows for an anti-pattern of someone making a typo in the value. They may not notice their error as it'd just wind up True. A singleton constant prevents people from coming near a typo prone literal.

Something like this:

class _Singleton:
    def __bool__(self):
        raise RuntimeError(f"Not usable in boolean context.")

AS_NULL = _Singleton()

and just check for that using is within the code on any API where it should be allowed. (or equivalent in a C extension)

@gpshead
Copy link
Member

gpshead commented May 12, 2024

Strictly speaking, this is a backwards incompatible change: allow_nan="null" was valid before (since the string "null" is truthy)

Slightly, but not in a way that matters a whole lot. Though if you use AS_NULL instead of a literal as mentioned above, this concern goes away. People will no longer be passing a str and the minimum version supporting this API is clear by the existence of AS_NULL or not. Instead of the code behaving differently based on version when the literal is passed in.

@mdickinson
Copy link
Member Author

@gpshead I like the suggestion in principle, and it would definitely alleviate the backwards compatibility concerns to the point where an SC exception wouldn't be necessary.

The practical question: how to implement this in a way that the new json.AS_NULL constant is available to the C code? Adding a new type and its instance to _json.c is the obvious way, but it feels like substantial new code for a fairly minor thing. Defining the constant in the Python code and injecting it into the extension module seems possible, but feels fragile to me (I've done something similar in 3rd-party projects, and it generally didn't end well).

I guess there are two, independent, parts to this:

  1. Making the special value a named constant json.AS_NULL and documenting it as such seems like a no-brainer - completely agree with the rationale here. And that would work regardless of what actual concrete type json.AS_NULL happens to have.
  2. Picking a suitable under-the-hood type for json.AS_NULL.

The re module and the re.IGNORE constants and friends seem like a good example of prior art for this problem - I'll take a look at how that works.

@JelleZijlstra
Copy link
Member

@mdickinson the re module appears to just use integer flags. You may also want to look at the typing.NoDefault sentinel I added in #116129; I think a very similar implementation would work here.

@mdickinson
Copy link
Member Author

Closing. I'm not going to have the bandwidth to push this through in the forseeable future. (But if anyone wants to build on what's already here, please feel free to do so.)

@mdickinson mdickinson closed this Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants