-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
API: Cleaning numpy/__init__.py
and main namespace - Part 3 [NEP 52]
#24376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: Cleaning numpy/__init__.py
and main namespace - Part 3 [NEP 52]
#24376
Conversation
bd8ab0a
to
bb5b6da
Compare
fa785a5
to
1fd0839
Compare
It looks like the doctests for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments below, I didn't catch any other issues but this probably needs at least one more careful review before it can be merged.
Some I wonder if there is enough use that we shouldn't bother, but overall nothing springs up a lot. Someone at pandas recently mntiond they rely on |
It looks undocumented besides a single brief mention in https://numpy.org/devdocs/reference/arrays.dtypes.html, and it's unclear what the design of it is. I think it's worth revisiting in a follow-up PR after some of the aliases are removed, and then (assuming we keep it) documenting properly what will be in it and how users are expected to use it? Right now this seems like total chaos: >>> np.sctypeDict.keys()
dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'ulong', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'b1', 'bool8', 'i8', 'int64', 'u8', 'uint64', 'f2', 'float16', 'f4', 'float32', 'f8', 'float64', 'f16', 'float128', 'c8', 'complex64', 'c16', 'complex128', 'c32', 'complex256', 'object0', 'bytes0', 'str0', 'void0', 'M8', 'datetime64', 'm8', 'timedelta64', 'int32', 'i4', 'uint32', 'u4', 'int16', 'i2', 'uint16', 'u2', 'int8', 'i1', 'uint8', 'u1', 'complex_', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'bytes_', 'string_', 'str_', 'unicode_', 'object_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a', 'int0', 'uint0']) |
I think #24411 is a good place to refactor building process of |
@seberg This PR is only about cleaning top-level namespace: |
ffffe98
to
d065f8f
Compare
I'd be +1 for removing |
asfarray is fine, just would be nice to explain a bit why we think it is bad. tracemalloc is also good to move, might be nice to find a better place eventually, but only documenting where it is important. |
d065f8f
to
c440330
Compare
For >>> x = np.array([1.5, 2.5], dtype=np.float32)
>>> np.asfarray(x) # converts an array that is already floating-point: bad
array([1.5, 2.5])
>>> # writing it like this will preserve `float32` & co, but has other bad side-effects,
>>> # like not working with array-likes and ignoring explicitly non-float dtypes
>>> np.asfarray(x, dtype=x.dtype)
array([1.5, 2.5], dtype=float32)
>>> x = np.array([1, 2], dtype=np.int32)
>>> np.asfarray(x, dtype=x.dtype)
array([1., 2.])
>>> np.asfarray(x, dtype=bool) # one would expect this to raise an exception
array([1., 2.]) The default behavior is better written as |
@rgommers Would like me to purge |
I'd go ahead and delete it. Until now, the |
My guess: if users turn up, I would rather replace it with a more sane version that does the same thing then just hide it away. |
Are you proposing a silent behavioral change to |
The doctest failure is caused by pandas trying to import np.NaN, which is removed in this PR. That will be fixed once pandas-dev/pandas#54579 is merged and a new pandas nightly wheel gets generated. |
What is the reasoning for removing For consistency, I'd advocate either removing all four, or keeping all four. |
I think one reason is that
|
Is |
Please don't use anything from |
Thanks for the quick response! |
I don't have an answer right away. If there's no replacement then we can move |
Thanks – It would be great to figure out the correct way forward ASAP – currently JAX's nightly CI job testing against nightly numpy & scipy is broken by this with no obvious workaround. This CI run has proven valuable in the past for early flagging of potential incompatibiilties, and I'd like to get it green again. |
Hmm, I think we want to avoid breaking downstream, but I also am not sure whether we anticipated downstream users inserting things into that dictionary. @seberg, do you have a suggestion? In the short term the easiest solution for jax would either be to restore the removed item or ask them to use the private one in |
We chatted about this in this week's numpy community meeting and decided to restore We'd like to find a way to do what In the experimental dtype API, user-defined dtypes don't have any character codes at all and the vision is that users manipulate them by importing the dtype class from a python namespace rather than passing a string to If you'd like, we have another community meeting two weeks from today that you're welcome to drop in on to discuss this further. We'd also be happy to schedule a low-latency chat or find an appropriate location in Jax community spaces to have this discussion if hiding it in a merged NumPy github PR is obnoxious. |
@ngoldbaum Indeed, it's so But yes, you should probably decide if you want to support named lookups of extension types. (Also, I'll add that |
NumPy should also probably deprecate and eventually prevent registering dtype names like this, see #24699 |
I suspected/mentioned this was likely to happen and while I don't like this, I suspect we may need to just allow it. I would be fine with forcing |
Yes, at least bfloat16 did meet sympathy I think, but we also need a bit more to decide than just "would be nice" and maybe a write up, etc. Which is all not quite trivial:
Maybe all of these turn out to be details we can ignore, but the question is how to push it forward, and I suspect it needs a bit of a fatter proposal than "upstream it". Even if the implementation may end up being only that. |
Follow-up PR after #24357.
Current scope of the PR:
float_
complex_
longfloat
singlecomplex
cfloat
longcomplex
clongfloat
string_
unicode_
Inf
,Infinity
,NaN
,infty
issctype
,maximum_sctype
,obj2sctype
,sctype2char
,sctypeDict
,sctypes
,issubsctype
set_string_function
,asfarray
(removed completely from NumPy)issubclass_
tracemalloc_domain
(still available innp.lib
)mat
(alias fornp.asmatrix
)recfromcsv
andrecfromtxt
(still available undernp.lib.npyio
module)safe_eval
,deprecate
,deprecate_with_doc
(all three already deprecated)