ENH: Update scalar representations as per NEP 51 #22449

seberg · 2022-10-18T10:09:58Z

This is a WIP for implementing NEP 51/gh-22261. There are a few things that go beyond just changing the scalar representation:

It makes get_formatter() a more central/public API, to also move fmt=... to it.
- This introduces the distinction that if the dtype is known, we can e.g. represent a longdouble as '3.1' (with quotes) rather than np.longdouble('3.1').
- fmt is mainly used for fmt="r" (previous point), I think fmt="s" should be supported, but always be identical to str(arr[0]).
- There could be an argument for the fmt path to be a ufunc (it should not inspect array values, unlike the default printing).
- fmt should be used more cleanly for the __format__ work. This should be fine, I think it only means pushing the "option" adjustments lower.
arr.tofile() used repr(arr.item(...)) to print each element. This is weird/problematic:
- For longdouble, I want that repr to be np.longdouble('3.1').
- For strings, this means it uses 'string' or b'string'
Record and MA printing has to be adjusted to use the new functionality. This affects record scalars and the printing of the masked array fill_value.

After some back and forth, I think the default should just be to go back to str() (this was changed silently in BUG 4381 Longdouble from string without precision loss #6199). str(arr.item()) is not ideal for float32 for example.
This could use some thought (or should change to use the above get_formatter(fmt="s"), but I hope this can be decoupled from NEP 51.

I plan to update the NEP 51 draft to include these points. Happy to discuss here of course (will make pushing NEP 51 quicker). But for now the draft-PR exists primarily to help with the NEP 51 discussion.

TODOS::

Refguide check is failing (and may lead to fixups being necessary?)
Discuss and accept NEP 51
Decide whether this should be hidden behind a flag or if it should interact with the legacy print-mode (i.e. to allow inclusion in the next release, but only switch with NEP 50 – which would likely be a NumPy 2.0 as well)
New tests for some paths are needed. Mainly arr.tofile() is not well tested (which is why the change here should not fail tests).

tylerjereddy · 2022-10-20T00:16:04Z

@seberg so far so good for latest main of SciPy alongside this feature branch in my hands (Ubuntu Linux; i9-7900X):

python dev.py test -j 10
37669 passed, 2222 skipped, 139 xfailed, 9 xpassed in 83.59s (0:01:23)

seberg · 2022-10-21T13:11:46Z

So, I realized that the empty format string "" already has more or less the meaning of str(), and in that sense I was thinking of using the repr function itself as a singleton for "scalar repr" (rather than "r").

I don't really have an opinion, could even say that str should be used for !s formatting effectively. Or could go back, because why would anyone use {obj:r} for something that is not a repr...

EDIT: Reconsidering this, maybe "" is different, since it would be simpler if that was the same as the default formatting, which is not string. So using str and repr seems actually pretty nice?

The thing to look into now (besides test fixes), is that array float formatting doesn't match scalar float formatting, so I actually need to pass through the formatting information better (should not be hard, but needs a bit organizing/thought.)

seberg · 2022-10-24T12:56:49Z

This is starting to settle, although I am considering changing np.float64(nan) to add the quotes always (np.float64('nan') similar to Python).

@rossbar do you know why the refguide check does not seem to notice changes from 0.0 to np.float64(0.0)? In a sense, I like that, making that a followup is probably far more manageable.

h-vetinari · 2022-11-01T02:04:22Z

I saw #22261 only now, so I won't complain if this is too late, but I'd really like to get away from long, longlong, etc. I'd much prefer to just use sized integers throughout. It's a much saner model, and now that C99 is available on all platforms, we could do it (modulo the fact that LAPACK and perhaps some other venerable libs still have those types in their API 😑)

seberg · 2022-11-01T07:19:44Z

I am proposing to print the bit-sized version, even if imprecise sometimes. Yes, it would be nice to change it, but it seems like a big ABI change on the C-side?

h-vetinari · 2022-11-02T03:56:06Z

I am proposing to print the bit-sized version, even if imprecise sometimes.

Looked more closely, looks fine.

Yes, it would be nice to change it, but it seems like a big ABI change on the C-side?

I'm afraid so, but I still think it'd be worth it eventually (numpy 2.0 material?).

h-vetinari

There seems to be a mismatch between the implementation and the NEP w.r.t. whether printing is affected or not.

Based on the implementation, I think this PR has pretty high chance of disrupting services where something gets calculated with numpy, and then passed on in some serialized form to another consumer, where numpy (or even python) might not necessarily be installed.

Is there any fallback we could provide so that people can adapt such interfaces (e.g. .value or .raw or ...)?

doc/source/reference/arrays.classes.rst

ev-br · 2022-11-05T04:06:10Z

the refguide check does not seem to notice changes from 0.0 to np.float64(0.0)

It won't, as long as np.allclose(eval('np.float64(0.0)'), 0.0)

charris · 2023-02-19T18:57:57Z

Needs rebase.

pllim · 2023-07-03T13:31:06Z

@mhvk , how do you think astropy can/should deal with this in our doctest downstream? I can only think of disabling doctest for numpy-dev and then when numpy 2.0 is released, we update the docs and disable doctest for numpy 1.x. However, this comes with the risk that we won't catch any incompatibility if a feature is only tested in the doc examples. Hope you can advise. Thanks! 🙏

larsoner · 2023-07-26T21:19:33Z

Looks like we're headed to green with legacy="1.25" (set only when the NumPy version is > 1.25), so the suggested workaround is all good (when deployed correctly) 👍

seberg · 2023-07-27T08:25:27Z

I will note that just setting the legacy mode for the tests will unfortunately hide issues if you use repr(scalar) for serialization/writing to a file (or a dependency does). So it is a good hot-fix for doctests or tests which check repr explicitly, but might cause real issues to be missed.

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail. The doctest for tensor.extra_ops.Unique was failing because the output shape for the inverse indices has changed when axis is None: numpy/numpy#20638

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

github-actions bot added the 25 - WIP label Oct 18, 2022

seberg force-pushed the scalar-repr-change branch 2 times, most recently from 342b701 to ff552fc Compare October 18, 2022 11:59

seberg marked this pull request as ready for review October 21, 2022 13:14

seberg changed the title ~~WIP: Update scalar representations as per NEP 51~~ ENH: Update scalar representations as per NEP 51 Oct 21, 2022

seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Oct 21, 2022

github-actions bot added the 01 - Enhancement label Oct 21, 2022

seberg added 25 - WIP and removed 25 - WIP 01 - Enhancement labels Oct 21, 2022

seberg marked this pull request as draft October 21, 2022 13:44

seberg force-pushed the scalar-repr-change branch from 893d77d to 387e26a Compare October 24, 2022 12:34

github-actions bot added the 01 - Enhancement label Oct 24, 2022

seberg marked this pull request as ready for review October 24, 2022 12:56

seberg force-pushed the scalar-repr-change branch 3 times, most recently from ce4166f to 9a64758 Compare October 24, 2022 15:31

seberg mentioned this pull request Oct 25, 2022

NEP: Make NEP 51 to propose changing the scalar representation #22261

Merged

h-vetinari reviewed Nov 2, 2022

View reviewed changes

doc/source/reference/arrays.classes.rst Show resolved Hide resolved

seberg force-pushed the scalar-repr-change branch from decdee0 to 08c07ce Compare June 21, 2023 14:26

seberg force-pushed the scalar-repr-change branch from 08c07ce to bf92d6e Compare June 30, 2023 16:01

keewis mentioned this pull request Jul 28, 2023

⚠️ Nightly upstream-dev CI failed ⚠️ pydata/xarray#7977

Closed

This was referenced Aug 5, 2023

[TST] Upcoming dependency test failures matplotlib/matplotlib#26460

Closed

MNT: Adjust for upcoming numpy repr changes matplotlib/matplotlib#26467

Merged

This was referenced Aug 9, 2023

MNT Remove DeprecationWarning for scipy.sparse.linalg.cg tol vs rtol argument scikit-learn/scikit-learn#26814

Merged

MNT Adjust code after NEP 51 numpy scalar formatting changes scikit-learn/scikit-learn#27042

Merged

ngoldbaum mentioned this pull request Jun 17, 2024

DOC: set_printoptions doesn't mention new legacy mode 1.25 #26731

Closed

michele-riva mentioned this pull request Oct 1, 2024

Numpy types written into bad pixel file viperleed/viperleed#251

Closed

FlorianDoerr mentioned this pull request Oct 1, 2024

badpixel numpy data type fix viperleed/viperleed#252

Merged

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 14, 2025

Updated doctests

844718b

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 14, 2025

Updated doctests

ee5ce2f

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 14, 2025

Updated doctests

5c617a9

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 14, 2025

Updated doctests

1a99fe8

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 14, 2025

Updated doctests

70885b2

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 17, 2025

Updated doctests

c3a991b

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 17, 2025

Updated doctests

ea44de5

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

brendan-m-murphy added a commit to brendan-m-murphy/pytensor that referenced this pull request Feb 17, 2025

Updated doctests

45c3a01

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

ricardoV94 pushed a commit to pymc-devs/pytensor that referenced this pull request Feb 17, 2025

Updated doctests

4d74d13

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

Aarsh-Wankar pushed a commit to Aarsh-Wankar/pytensor that referenced this pull request Feb 22, 2025

Updated doctests

cabc95f

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.

felker mentioned this pull request Mar 14, 2025

Two failed regression tests PrincetonUniversity/athena#653

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: Update scalar representations as per NEP 51 #22449

ENH: Update scalar representations as per NEP 51 #22449

seberg commented Oct 18, 2022 •

edited

Loading

tylerjereddy commented Oct 20, 2022

seberg commented Oct 21, 2022 •

edited

Loading

seberg commented Oct 24, 2022

h-vetinari commented Nov 1, 2022

seberg commented Nov 1, 2022

h-vetinari commented Nov 2, 2022

h-vetinari left a comment

ev-br commented Nov 5, 2022

charris commented Feb 19, 2023

pllim commented Jul 3, 2023

larsoner commented Jul 26, 2023

seberg commented Jul 27, 2023

ENH: Update scalar representations as per NEP 51 #22449

ENH: Update scalar representations as per NEP 51 #22449

Conversation

seberg commented Oct 18, 2022 • edited Loading

tylerjereddy commented Oct 20, 2022

seberg commented Oct 21, 2022 • edited Loading

seberg commented Oct 24, 2022

h-vetinari commented Nov 1, 2022

seberg commented Nov 1, 2022

h-vetinari commented Nov 2, 2022

h-vetinari left a comment

Choose a reason for hiding this comment

ev-br commented Nov 5, 2022

charris commented Feb 19, 2023

pllim commented Jul 3, 2023

larsoner commented Jul 26, 2023

seberg commented Jul 27, 2023

seberg commented Oct 18, 2022 •

edited

Loading

seberg commented Oct 21, 2022 •

edited

Loading