API: Introduce `np.isdtype` function [Array API] #25054

mtsokol · 2023-11-02T12:14:16Z

This PR adds np.isdtype mentioned in the NEP 52 and the tracking issue #23999.

I followed the description in https://data-apis.org/array-api/latest/API_specification/generated/array_api.isdtype.html.

Questions:

In the Array API standard, from available options for kind argument a flexible is missing (union of floating and complex types), but integral is present. Is it on purpose?
The standard requires dtype argument to be a dtype but in the implementation I accept anything that can be consumed by np.dtype, such as strings "int64" etc. Should I enforce dtype instances only?

mhvk

Bit of a fly-by review out of curiosity, but one small comment inline.

In favour of adding 'flexible' would seem to be that it is fairly common to have functions that work well with either float or complex, and it may just have been an oversight of the array API (which initially had no support for complex numbers).

numpy/_core/numerictypes.py

rgommers · 2023-11-02T17:44:43Z

Thanks @mtsokol!

of adding 'flexible' would seem to be that it is fairly common to have functions that work well with either float or complex, and it may just have been an oversight of the array API (which initially had no support for complex numbers).

It's not an oversight; isdtype was developed only after complex dtypes existed and therefore had complex support from the start. There's a fairly detailed discussion around this point in data-apis/array-api#425. I'll quote in particular this point from @sebert: "NumPy calls real + complex floating point types inexact (not that the name is used often). The only problem with using floating point for both is that people may just not think about complex at all, and it may be a false friend in practice."

Now I can see a case for the opposite point as well, but I think we should start with only implementing what the standard says, and only reconsidering if it becomes clear that this is cumbersome. At that point a naming discussion should be had I think, because flexible and inexact are both quite poor and very non-standard names.

One of the upsides is that it's often unclear how well complex dtypes are supported by a function. If we read:

np.isdtype(dtype, ('real floating', 'complex floating'))

that is quite clear, and easier to understand than the old numpy names for this.

but in the implementation I accept anything that can be consumed by np.dtype, such as strings "int64" etc. Should I enforce dtype instances only?

I think you should enforce the typing of the standard - so it'd either be a dtype object or a string - and if it's a string then it must be one of the strings that is a named collection of dtypes in the standard.

NumPy has always had a ton of different ways to say dtype=xxx, but I only see downsides in accepting things like dtype='f4' or dtype='int32'.

mhvk · 2023-11-02T18:02:32Z

Yes, makes sense to have a replacement for "flexible" only if it turns out there is an actual need.

mtsokol · 2023-11-02T20:50:04Z

I think you should enforce the typing of the standard - so it'd either be a dtype object or a string - and if it's a string then it must be one of the strings that is a named collection of dtypes in the standard.

NumPy has always had a ton of different ways to say dtype=xxx, but I only see downsides in accepting things like dtype='f4' or dtype='int32'.

@rgommers Sure! I updated the implementation (and tests) to match the standard exactly.

doc/release/upcoming_changes/25054.new_feature.rst

ngoldbaum · 2023-12-05T16:53:14Z

numpy/_core/numerictypes.py

+
+    """
+    # validate and preprocess arguments
+    if not isinstance(dtype, (type, ma.dtype)):


I'm not sure if there's an easy way to spell this that's more accurate than just checking for type, but this would let you do e.g. np.isdtype(np.int64, dict) and return False. I guess that is true but is perhaps against the spirit of this check. I don't know if there's an easy way to get a list of all the dtype scalar types numpy knows about at any given moment (which could include user dtypes or dtypes defined in downstream packages).

I don't know it either. Let's ask about it during today's community call.

Maybe checking issubclass(dtype, np.generic) would make snese?

In [31]: np.int64.mro() Out[31]: [numpy.int64, numpy.signedinteger, numpy.integer, numpy.number, numpy.generic, object]

We decided at the triage meeting the other day that we wanted to make this error reject all dtypes or scalar types that aren't in the array API standard

reject all dtypes or scalar types that aren't in the array API standard

That would make it harder to use while supporting, e.g., float16, and seems like an odd suggestion - what is the justification?

Looking at the jax bfloat16 type as a real-world example, I see:

In [10]: np.issubdtype(ml_dtypes.bfloat16, np.floating) Out[10]: False In [11]: ml_dtypes.bfloat16.mro() Out[11]: [ml_dtypes.bfloat16, numpy.generic, object]

Which isn't particularly useful but is correct according to the implementation of issubdtype.

So I think we might need to just not handle third-party dtypes for now, until we have a better story for registering dtypes in a type hierarchy. Does that sound reasonable?

Handling all the builtin numpy dtypes here makes sense, no worries about doing that.

So I think we might need to just not handle third-party dtypes for now, until we have a better story for registering dtypes in a type hierarchy. Does that sound reasonable?

Thanks for checking. That sounds fine to me.

I updated the implementation so now it accepts all NumPy's dtypes as inputs.

For what it's worth, the reason that np.issubdtype(ml_dtypes.bfloat16, np.floating) returns False is that in several places NumPy hard-codes the assumption that there is only a single 16-bit floating-point type, so we could not make bfloat16 a subclass of np.floating without causing collisions across the codebase. If you have suggestions for how to do better, we'd love to hear it!

I’d love to relax that assumption inside numpy, I’m sure patches that do that would be reviewed if you can get anyone to work on it. I also wasn’t privy to previous discussions about this but at this point I think numpy should probably support bfloat16 natively. Of course that’s just my opinion. Also with the new dtype API it should be much more straightforward to upstream a new dtype into numpy, there’s no need anymore to mess with all the complicated custom templating and codegen.

rgommers

Thanks @mtsokol, this is looking pretty good to me - a number of minor comments.

numpy/__init__.py

numpy/_core/numerictypes.py

rgommers · 2023-12-12T13:57:34Z

numpy/_core/numerictypes.py

+    False
+    >>> np.isdtype(np.int64, (np.uint64, "signed integer"))
+    True
+


It'd be nice to explicitly add the example in the PR discussion regarding checking for real-floating only vs. an API that supports both real and complex floating.

Do you mean in the Examples section? Or PR's description?
I added two more examples to the docstring for checking real-floating only and real and complex floating.

rgommers · 2023-12-12T15:13:26Z

numpy/_core/numerictypes.py

+    True
+
+    """
+    # validate and preprocess arguments


It would be useful to extend this comment to state what is actually happening here. The intended inputs are np.float32 & co, which aren't instances of multiarray.dtype. So I'm not sure that this validation is right.

Also, rejection of non-compliant objects may be good to do more explicitly, now the errors may be a bit obscure:

In [9]: np.isdtype(np.float32, np.ones(1)) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[9], line 1 ----> 1 np.isdtype(np.float32, np.ones(1)) File ~/code/numpy/build-install/usr/lib/python3.11/site-packages/numpy/_core/numerictypes.py:436, in isdtype(dtype, kind) 434 if isinstance(kind, ma.dtype): 435 kind = kind.type --> 436 if kind not in allTypes.values(): 437 raise TypeError( 438 "kind argument must be comprised of NumPy dtypes or " 439 f"strings only, but it is a {kind}." 440 ) 442 processed_kinds.add(kind) TypeError: descriptor '__array_wrap__' for 'numpy.generic' objects doesn't apply to a 'numpy.ndarray' object

I added a short docstring for the helper function instead.
Basically, if we have a dtype instance we need to extract type. Then we check if the dtype is in allTypes (I added a check for numpy.ndarray to avoid the error you posted).

numpy/_core/numerictypes.py

ngoldbaum · 2023-12-20T20:42:20Z

Sorry for not merging this earlier in the week, I dropped it without looking at it again accidentally.

Thanks for bringing this one home @mtsokol!

mtsokol changed the title ~~API: Introduce np.isdtype function~~ API: Introduce np.isdtype function Nov 2, 2023

github-actions bot added the 30 - API label Nov 2, 2023

mtsokol self-assigned this Nov 2, 2023

mtsokol added the Numpy 2.0 API Changes label Nov 2, 2023

mtsokol force-pushed the isdtype-function branch 2 times, most recently from 9f3a100 to 0264091 Compare November 2, 2023 13:37

jakevdp mentioned this pull request Nov 2, 2023

Tracking issue: support Array API jax-ml/jax#18353

Closed

38 tasks

mhvk reviewed Nov 2, 2023

View reviewed changes

numpy/_core/numerictypes.py Outdated Show resolved Hide resolved

mtsokol force-pushed the isdtype-function branch from 28e435e to 3e1c06e Compare November 29, 2023 10:16

mtsokol added this to the 2.0.0 release milestone Nov 29, 2023

mtsokol force-pushed the isdtype-function branch from 3e1c06e to a227f5d Compare November 29, 2023 11:37

mtsokol changed the title ~~API: Introduce np.isdtype function~~ API: Introduce np.isdtype function [Array API] Nov 30, 2023

ngoldbaum reviewed Dec 5, 2023

View reviewed changes

doc/release/upcoming_changes/25054.new_feature.rst Outdated Show resolved Hide resolved

ngoldbaum reviewed Dec 5, 2023

View reviewed changes

mtsokol force-pushed the isdtype-function branch 2 times, most recently from 7e7b550 to cd8a56c Compare December 8, 2023 14:23

mtsokol mentioned this pull request Dec 8, 2023

API: Introduce np.astype [Array API] #25079

Merged

mtsokol force-pushed the isdtype-function branch 4 times, most recently from a22def4 to c74dbb4 Compare December 11, 2023 18:09

rgommers reviewed Dec 12, 2023

View reviewed changes

mtsokol force-pushed the isdtype-function branch 2 times, most recently from f830455 to 3a3684b Compare December 12, 2023 17:05

mtsokol requested a review from rgommers December 13, 2023 09:54

mtsokol force-pushed the isdtype-function branch from 3a3684b to 151d70a Compare December 14, 2023 16:32

API: Introduce np.isdtype function

4598f0c

mtsokol force-pushed the isdtype-function branch from 151d70a to 4598f0c Compare December 15, 2023 13:14

ngoldbaum merged commit a1fbdd9 into numpy:main Dec 20, 2023

charris changed the title ~~API: Introduce np.isdtype function [Array API]~~ API: Introduce np.isdtype function [Array API] Dec 20, 2023

mtsokol deleted the isdtype-function branch December 20, 2023 21:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API: Introduce `np.isdtype` function [Array API] #25054

API: Introduce `np.isdtype` function [Array API] #25054

mtsokol commented Nov 2, 2023 •

edited

Loading

mhvk left a comment

rgommers commented Nov 2, 2023

mhvk commented Nov 2, 2023

mtsokol commented Nov 2, 2023

ngoldbaum Dec 5, 2023

mtsokol Dec 6, 2023

mtsokol Dec 6, 2023

ngoldbaum Dec 8, 2023

rgommers Dec 11, 2023

ngoldbaum Dec 11, 2023

rgommers Dec 11, 2023

mtsokol Dec 11, 2023

jakevdp Dec 12, 2023 •

edited

Loading

ngoldbaum Dec 20, 2023

rgommers left a comment

rgommers Dec 12, 2023

mtsokol Dec 12, 2023

rgommers Dec 12, 2023

mtsokol Dec 12, 2023 •

edited

Loading

ngoldbaum commented Dec 20, 2023

API: Introduce np.isdtype function [Array API] #25054

API: Introduce np.isdtype function [Array API] #25054

Conversation

mtsokol commented Nov 2, 2023 • edited Loading

mhvk left a comment

Choose a reason for hiding this comment

rgommers commented Nov 2, 2023

mhvk commented Nov 2, 2023

mtsokol commented Nov 2, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jakevdp Dec 12, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rgommers left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mtsokol Dec 12, 2023 • edited Loading

Choose a reason for hiding this comment

ngoldbaum commented Dec 20, 2023

API: Introduce `np.isdtype` function [Array API] #25054

API: Introduce `np.isdtype` function [Array API] #25054

mtsokol commented Nov 2, 2023 •

edited

Loading

jakevdp Dec 12, 2023 •

edited

Loading

mtsokol Dec 12, 2023 •

edited

Loading