-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
MAINT: Refactor of numpy/core/_type_aliases.py
#24651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We talked a bit about this and I tried to remove some C-side stuff and change the exposure a bit via: https://github.com/seberg/numpy/tree/typeinfo-remove (it could probably be simplified more, but at least 300 lines of weird C-code gone 🎉). That also removes the redefinition of the "integer priority" list in Python. I am a bit unsure about float128. On the one hand, I hate it and want everyone to use longdouble. OTOH, I think it may be used (unnecessarily) quite a lot. |
@seberg Thank you! I built |
It should return the dtype, you are fetching the |
In that table, I'd argue |
In fact, the table is also wrong on Linux where |
@eric-wieser thank you for your feedback! |
numpy/core/_type_aliases.py
numpy/core/_type_aliases.py
~~
numpy/core/_type_aliases.py
~~numpy/core/_type_aliases.py
Sorry for confusion! |
Hi @seberg @rgommers,
I'm starting here a new PR for reviewing as the discussion in the original one concluded.
Here I'm sharing a proposition of
numpy/core/_type_aliases.py
file refactor.While working on removing other scalar aliases I had some difficulties understanding how
allTypes/sctypeDict
dictionaries are being created. For example, I noticed that thefloat
alias is originally present insctypeDict
, then it's removed and then added again.Also, as NEP 52 suggest,
float96
andfloat128
could be removed (extended precision) and it would also cover modifying_type_aliases.py
. This refactor makes it easier to introduce new changes to the available aliases list.I decided to try to refactor this file to decompose the creation of these dictionaries into well-defined stages.
I think the purpose of this file is straightforward: Read types from the compiled
multiarray
module, and expose them as Python dictionaries, while mapping C naming conventions to Python conventions (e.g. interpretfloat
asdouble
instead ofsingle
).I identified which "group" of entries is present in each dictionary (could it be modified for NumPy 2.0?):
words
(e.g. "float", "cdouble", "int_")words+bits
(e.g. "int16", "complex64")symbols
(e.g. "p", "L")symbols+bytes
(e.g. "c8", "b1")abstracts
(e.g. "inexact", "integer")numbers
(e.g. 1, 2, 3)aliases
(e.g. "complex_", "longfloat")extra/other aliases
(e.g. "bool")Additionally, here's a decomposition of
words
andwords+bits
into three columns:In the refactored file I build each of these groups to then join them into final dicts. I do it in a loop in
def _build_dicts()
, where sizedint
&uint
aliases must take under consideration a "priority" of types - in case at least two of them are same bitsize.After that I rename and add aliases to the created groups. In the last stage I clean entries that aren't present in the original implementation (these are custom cases).
The last, and separate, stage creates
sctypes
which I copied, as the implementation was clear to me.Important to mention, this modifies a crucial piece of code that could be tricky to debug (e.g. canonical int names are aliased incorrectly on some architecture (I tested it locally only on macbook with Intel)).
I renamed the old implementation to
_expired_type_aliases.py
and added a testtest_new_vs_expired_type_aliases
that checks if dictionaries created by the new implementation are exactly the same as done by the previous code. This should flag in the CI if the new implementation fails to mimic the previous one on a specific architecture. (After some time, the original impl could be completely removed)Please share your feedback!