Skip to content

MAINT: Refactor of numpy/core/_type_aliases.py #24679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 6, 2023

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Sep 10, 2023

Hi @seberg @rgommers,

This is a successor for #24651 PR. It focuses on refactoring both C and Python code that defines types and aliases. _type_aliases.py imports typeinfo and moves types to allTypes and sctypeDict without the old logic, such as recalculating bit sizes etc.

Here, allTypes contains concrete type names that are available in the main namespace and abstract types (also placed in the main namespace). sctypeDict contains concrete type names and additional aliases that are used by np.dtype (e.g. np.dtype("long") or np.dtype("int")). sctypes is exactly the same as before.

The only thing left is the refactor of numpy/f2py test suite, as it uses original typeinfo quite extensively. The rest of the test suites pass successfully on my machine.

@mtsokol mtsokol force-pushed the typeinfo-remove-part-2 branch 2 times, most recently from 6b413f6 to dd1f9f1 Compare September 11, 2023 11:32
@mtsokol mtsokol force-pushed the typeinfo-remove-part-2 branch from 67bafb9 to 4e77ba6 Compare September 11, 2023 14:16
@mtsokol mtsokol self-assigned this Sep 12, 2023
@mtsokol
Copy link
Member Author

mtsokol commented Sep 12, 2023

Hi @seberg,

I've got two questions regarding remaining checks that failed:

  1. Should I drop np.float128 and np.complex256 usages and replace them with (c)longdouble? The new implementation of _type_aliases.py doesn't build these entries for allTypes as it only imports names from typeinfo.
  2. I'm wondering how to solve one last C dependency that causes tests to break. The numpy/f2py/tests/test_array_from_pyobj.py needs to know the C name of a given dtype here as it uses C name to import a dtype from f2py extension. Previously it worked because typeinfo was a dict with C names as keys and dtypes as values.
    As in the current implementation typeinfos keys are already proper Python names I have no way to determine "C" name to use to call "wrap" f2py extension with (line 219 is the one where assert fails).
    Is there another way to get C type name -> Python dtype mapping to use for this test? (It looks like the mapping that we intentionally removed from typeinfo in this PR).

@seberg
Copy link
Member

seberg commented Oct 4, 2023

Hmmm, I thought this was already being pushed forward/through, while I was out.

Should I drop np.float128 and np.complex256 usages and replace them with (c)longdouble? The new implementation of _type_aliases.py doesn't build these entries for allTypes as it only imports names from typeinfo.

Just use .itemsize to define float{...} if the itemsize is 96 or 128, and the same for clongdouble? I think we should at least add a warning on those names, but that is a distinct issue.

I'm wondering how to solve one last C dependency that causes tests to break

I guess we could add the NPY_INT, etc. names to the dictionary, additionally, or a different dictionary. In general, I don't mind the approach taken here to just have another dict, especially in tests where it isn't as bad to duplicate things.

@seberg
Copy link
Member

seberg commented Oct 4, 2023

I would really like this to move forward, because I started looking into changing the definition for intp, which means changing a character code here or there probably. So there are likely conflicts.

@mtsokol mtsokol force-pushed the typeinfo-remove-part-2 branch from 4e77ba6 to e318f0a Compare October 4, 2023 15:38
@mtsokol
Copy link
Member Author

mtsokol commented Oct 4, 2023

@seberg Thank you for feedback! For now I will include these extended precision sized aliases based on .itemsize.
I think that additional dict built on C side that defines C names mapped to dtypes (to be used in tests) solves this issue.

seberg and others added 4 commits October 5, 2023 11:44
This could probably be simplified more by using `np.dtypes` or
merging some of the information there, but this is a big first step
of removing logic duplication and just removing a lot of exposed
info that nobody needs.
@mtsokol mtsokol force-pushed the typeinfo-remove-part-2 branch 2 times, most recently from 681abe0 to 6998983 Compare October 5, 2023 09:49
@mtsokol
Copy link
Member Author

mtsokol commented Oct 5, 2023

Hi @seberg,

I pushed a new commit - here are my comments:

  1. I added C's NPY_* names that are used in f2py tests to get the cname -> dtype mapping that was missing.
  2. I named it c_names_dict, is it fine, or would you prefer a different name?
  3. I added extended precision sized aliases as we discussed (np.float128, np.complex256 etc.).
  4. The PR is ready from my side. One question: I see that 6 CI stages are missing - all azure-pipelines - do you know why? (it would be better to see them passing as well here)

@mattip
Copy link
Member

mattip commented Oct 5, 2023

I don't know why it is not showing up in the summary, but there is a test failure on azure.

@mtsokol
Copy link
Member Author

mtsokol commented Oct 5, 2023

I don't know why it is not showing up in the summary, but there is a test failure on azure.

Huh, the link you provided is from different PR (core to _core rename), not this one.
Maybe I can just retrigger.

@mtsokol mtsokol force-pushed the typeinfo-remove-part-2 branch from 6998983 to 3968009 Compare October 5, 2023 11:34
@mtsokol
Copy link
Member Author

mtsokol commented Oct 5, 2023

Ok, now azure stages are running - CI is green.

@ngoldbaum
Copy link
Member

I added extended precision sized aliases as we discussed

Are you OK making the bitsized longdouble aliases generate deprecation warnings in a followup?

@seberg
Copy link
Member

seberg commented Oct 5, 2023

Are you OK making the bitsized longdouble aliases generate deprecation warnings in a followup?

Are you asking me or @mtsokol, TBH, whether or not we do it, it is nice as a follow-up any case. It might be good if someone other than me can have a quick look as I wrote the C changes. I didn't have a close look yet (or at least recently), but I suspect the changes are fine.

@ngoldbaum
Copy link
Member

I was asking Mateusz since that was the only ask in the reviews that wasn't addressed. Thanks for the context though.

I looked over the C changes earlier and didn't see any issues. I'll give them another once-over before merging.

@mtsokol
Copy link
Member Author

mtsokol commented Oct 5, 2023

Sure! I will deprecate these aliases in the next PR.

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the C changes once more and didn't see any issues. Thanks for pushing this through @mtsokol @seberg!

@ngoldbaum ngoldbaum merged commit 195d85b into numpy:main Oct 6, 2023
@mtsokol mtsokol deleted the typeinfo-remove-part-2 branch October 6, 2023 14:30
@mroeschke
Copy link
Contributor

mroeschke commented Oct 9, 2023

It appears this removed some previously public working aliases in np.dtype. Is this expected?

In [1]: import numpy as np

In [2]: np.dtype("unicode")
Out[2]: dtype('<U')

In [3]: np.__version__
Out[3]: '1.24.4'
>>> import numpy as np
>>> np.dtype("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: data type 'unicode' not understood
>>> np.__version__
'2.0.0.dev0+git20231007.040ed2d'

@ngoldbaum
Copy link
Member

I think that was missed, can you open an issue?

@mtsokol
Copy link
Member Author

mtsokol commented Oct 9, 2023

unicode was defined as an extra-alias and didn't come from the C layer. I think this and one more thing are missing after refactor - let me readd it!

@mroeschke
Copy link
Contributor

mroeschke commented Oct 9, 2023

Here's an issue: #24892

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants