Add NumPy scalar hierarchy #14

alanhdu · 2018-03-20T02:19:14Z

No description provided.

shoyer · 2018-03-20T04:51:19Z

numpy/__init__.pyi

+_Scalar = TypeVar("_Scalar", bound="generic")
+
+class generic:
+    @property


Could we define a private base class for all the shared methods/properties between generic and ndarray? Otherwise we'll have quite a bit to keep in sync.
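A minimal sketch of what such a shared base could look like, written as ordinary Python rather than stub syntax so that it runs. The names `_SharedBase`, `FakeScalar` and `FakeArray` are hypothetical and do not come from the patch; only the idea of declaring the shared members once is taken from the comment above.

```python
from typing import Tuple


class _SharedBase:
    """Hypothetical private base holding members common to scalars and arrays."""

    _shape: Tuple[int, ...] = ()

    @property
    def shape(self) -> Tuple[int, ...]:
        return self._shape

    @property
    def ndim(self) -> int:
        return len(self._shape)


class FakeScalar(_SharedBase):
    """Stand-in for a scalar type: always zero-dimensional."""


class FakeArray(_SharedBase):
    """Stand-in for an array type with an explicit shape."""

    def __init__(self, shape: Tuple[int, ...]) -> None:
        self._shape = shape
```

Both subclasses pick up `shape` and `ndim` from one definition, so there is a single place to update.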

shoyer · 2018-03-20T04:52:01Z

numpy/__init__.pyi

+class int16(signedinteger): ...
+class int32(signedinteger): ...
+class int64(signedinteger): ...
+int_ = int64


Regrettably, this is actually platform dependent, e.g., on Windows, it's int32: numpy/numpy#9464
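For illustration, the width of the native C `long` (the type behind `int_` in the NumPy versions discussed here) can be checked without NumPy using the standard-library `struct` module. This is only a sketch of the platform dependence; the helper name `c_long_bits` is made up for this example.

```python
import struct


def c_long_bits() -> int:
    """Return the width in bits of the native C long on this platform."""
    # "l" is struct's format code for a native C long.
    return struct.calcsize("l") * 8


# Typically 32 on Windows and 64 on 64-bit Linux and macOS.
print(f"C long is {c_long_bits()} bits here")
```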

shoyer · 2018-03-20T04:59:24Z

numpy/__init__.pyi

+class number(generic): ...
+class bool_(generic): ...
+class object_(generic): ...
+class datetime64(generic): ...


Don't forget timedelta64. Unfortunately it inherits from np.integer, though this is likely to change in the future: numpy/numpy#10685

shoyer · 2018-03-20T05:01:59Z

numpy/__init__.pyi

@@ -358,6 +358,187 @@ class ndarray(Iterable, Sized, SupportsInt, SupportsFloat, SupportsComplex,
    def __getattr__(self, name) -> Any: ...


+_Scalar = TypeVar("_Scalar", bound="generic")


I think we can actually define circular references in .pyi files, like I did for _DtypeLike above.

The approach in this patch is the approach recommended by PEP484

OK, in that case we should probably fix _DtypeLike :)

To be clear, this is the recommended approach for typing self

I was suggesting that bound="generic" could be written as bound=generic.

I realize. I suppose the PEP self-typing example is not within a stub file, so perhaps my point is irrelevant.

In typeshed we prefer not doing any quoting of this sort.
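The self-typing pattern under discussion can be sketched in a regular module as follows. The names `_S`, `Scalar` and `Int64` are hypothetical. The bound is quoted here because the class is defined after the TypeVar; in a `.pyi` stub, as the comments above point out, the quotes could be dropped.

```python
from typing import TypeVar

# The bound is quoted because Scalar is defined after the TypeVar.
_S = TypeVar("_S", bound="Scalar")


class Scalar:
    def copy(self: _S) -> _S:
        # Returns an instance of the caller's own class, not just Scalar.
        return type(self)()


class Int64(Scalar):
    pass
```

With this annotation a type checker infers `Int64` for `Int64().copy()`, rather than the base class.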

shoyer · 2018-03-20T05:02:49Z

numpy/__init__.pyi

+
+class generic:
+    @property
+    def real(self: _Scalar) -> _Scalar: ...


real and imag are not guaranteed to return the same scalar type -- they convert complex numbers to reals.
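Python's built-in `complex` shows the same pattern and serves here as a stand-in that needs no NumPy: the parts of a complex number are floats, so a "returns the same type as self" annotation would be wrong for complex scalars.

```python
z = complex(3.0, 4.0)

# Both parts come back as float, not complex.
real_part = z.real
imag_part = z.imag

print(type(z).__name__, type(real_part).__name__, type(imag_part).__name__)
```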

eric-wieser · 2018-03-20T05:08:11Z

numpy/__init__.pyi

+class int8(signedinteger): ...
+class int16(signedinteger): ...
+class int32(signedinteger): ...
+class int64(signedinteger): ...


Sadly this doesn't actually reflect the underlying model - in reality, the types are byte, short, intc, int_, and longlong, and the sized aliases are attached in a platform-dependent manner.

Does it really matter whether we define aliases or the real classes here? I would actually be OK with doing only these aliases. For annotations (and clearly written code in general), it's best to avoid platform dependent types like int.

Well since there are actually 5 underlying types, sometimes numpy returns a type that is not referred to by any of these four aliases (which is made worse by numpy/numpy#10151 making them look the same).

I think what you have there is an argument about removing the unsized types from numpy - which I think is valid. But I think the type annotations should reflect what actually happens, not how we want things to happen.

eric-wieser · 2018-03-20T05:09:01Z

numpy/__init__.pyi

+class float16(floating): ...
+class float32(floating): ...
+class float64(floating): ...
+class float128(floating): ...


Similarly the actual types here are half, single, double, longdouble, and similarly for complex.

Which is relevant because float128 is sometimes called float96, but both are always aliases for longdouble

alanhdu · 2018-03-20T19:07:41Z

I've factored out an _ArrayLike class (although I need to play around with the types some more to see how to get it to typecheck the way I want).

I've also moved the platform-specific types together. AFAICT though, on the Python side the platform-specific types are aliases for the sized types (e.g. for me, np.short.__name__ == 'int16'), so the problem reduces to how to decide which sized type the unsized types correspond to (mypy supports sys.platform and maybe also sys.maxsize?). Or is that not how it actually works?

eric-wieser · 2018-03-20T19:29:35Z

on the Python side the platform-specific types are aliases for the sized types

The __name__ attributes are misleading and currently non-unique (numpy/numpy#10151)

>>> np.int_.__name__  # on windows, this gives int32, so use `intc` below
'int64'
>>> np.longlong.__name__
'int64'
>>> np.int_ is np.longlong  # C long and C longlong are different types
False

There really is a unique type for each of the underlying C types, but the python-visible names are the sized name for that type. Does mypy use the stubs to print the type name in a message, or the actual __name__ attribute?
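The situation can be imitated in plain Python, as a sketch that does not touch NumPy: two classes may share a `__name__` and still be different objects. The variable names `long_like` and `longlong_like` are invented for this example.

```python
# Two separate classes that are deliberately given the same __name__.
long_like = type("int64", (), {})
longlong_like = type("int64", (), {})

# The names compare equal, but the classes are distinct objects.
print(long_like.__name__ == longlong_like.__name__)
print(long_like is longlong_like)
```

This is why identity, not `__name__`, is the reliable way to tell the underlying types apart.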

alanhdu · 2018-03-21T20:52:51Z

I believe that MyPy uses the actual type name (from a small test case):

Argument 1 to "f" has incompatible type "int64"; expected "longlong"

Is the "right" thing to do here just to have each of the platform-specific types be its own distinct type? How important is it to get all the sized aliases matched up with their underlying components, and how hard is that to do?

JelleZijlstra · 2018-03-21T20:59:12Z

Mypy definitely doesn't look at the runtime __name__ attribute; all it knows about numpy types necessarily comes from the stubs.

eric-wieser · 2018-03-21T21:18:35Z

As part of setup.py, we could import numpy and generate the alias stubs based on the actual alias values
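A rough sketch of how such a generator might work, under the assumption that aliases are detected by object identity. The function `alias_stub_lines` and the stand-in namespace are hypothetical; a real script would presumably pass the imported `numpy` module as the namespace, which is not tested here.

```python
from types import SimpleNamespace
from typing import Iterable, List


def alias_stub_lines(
    namespace: object, unsized: Iterable[str], sized: Iterable[str]
) -> List[str]:
    """Emit `unsized = sized` lines for names bound to the same object."""
    sized = list(sized)
    lines = []
    for name in unsized:
        target = getattr(namespace, name)
        for candidate in sized:
            # Compare by identity: __name__ is not reliable here.
            if getattr(namespace, candidate) is target:
                lines.append(f"{name} = {candidate}")
                break
    return lines


# Stand-in namespace: int_ and int64 are the same class, longlong is not.
_long = type("int64", (), {})
_longlong = type("int64", (), {})
fake = SimpleNamespace(int_=_long, longlong=_longlong, int64=_long)

print(alias_stub_lines(fake, ["int_", "longlong"], ["int64"]))
```

Names with no matching sized type, like `longlong` in the stand-in, produce no line and would need separate handling.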

alanhdu · 2018-03-25T18:42:03Z

I really like generating the alias stubs when the user installs them! I've opened up a follow-up issue (#15) to discuss that idea further.

For now, I've fixed the type errors and decided to leave the platform-specific types for another PR (which by default leaves them as Any types). Do you mind taking another look @shoyer @eric-wieser?

(Also sorry for the super long iteration cycles -- I unfortunately don't have a ton of time to really zero-in on this so it usually takes a couple days for me to circle back to stubbing out numpy).

eric-wieser · 2018-03-25T19:43:33Z

numpy/__init__.pyi

-    real: ndarray
-
+_ArraySelf = TypeVar("_ArraySelf", bound=_ArrayLike)
+class _ArrayLike(SupportsInt, SupportsFloat, SupportsComplex, SupportsBytes,


Not a good name - in the docs, array_like means can be converted to array.

Perhaps _ArrayOrScalarCommon?

eric-wieser · 2018-03-25T19:44:17Z

numpy/__init__.pyi

+class generic(_ArrayLike):
+    @property
+    def base(self) -> None: ...
+class _is_real(generic):


Missing newline above _is_real, I think

_real_generic would be a better name

eric-wieser · 2018-03-25T19:45:57Z

numpy/__init__.pyi

+
+class flexible(_is_real): ...
+class void(flexible): ...
+class character(_is_real): ...


This is correct, but it's also bizarre:

>>> np.bytes_(b'test').real
'test'
>>> np.bytes_(b'test').imag
''

What were we thinking here?

If I had to guess, I think imag by default returns the "0" version of the data type -- while this makes sense for numbers, it has bizarre consequences for things that aren't numbers:

In [4]: np.datetime64(dt.datetime.now()).real
Out[4]: numpy.datetime64('2018-03-26T15:22:14.672804')
In [5]: np.datetime64(dt.datetime.now()).imag
Out[5]: numpy.datetime64('1970-01-01T00:00:00.000000')

or

In [13]: np.object_(["hello world"]).real
Out[13]: array(['hello world'], dtype=object)
In [14]: np.object_(["hello world"]).imag
Out[14]: array([0], dtype=object)

Issue opened at numpy/numpy#10818

shoyer · 2018-03-29T01:20:29Z

👍 this looks great to me!

Two things that would be nice to have:

Update the return value of dtype.type to Type[generic].
If you can think of any, add a few additional smoke tests to our test file (e.g., verify that you can construct a numpy scalar and check its properties without an error from mypy)

eric-wieser · 2018-03-29T01:23:49Z

Update the return value of dtype.type to Type[generic].

~~What about object?~~

Edit: nevermind

alanhdu · 2018-03-29T22:48:50Z

@shoyer @eric-wieser I went a little overboard with the test suite and ended up writing a small test framework (I wanted to assert that certain things failed and that led me down a bit of a rabbit hole).

I put the documentation in the testing README; the important bit is:

There are three main directories of tests right now:

pass/, which contains Python files that must pass mypy checking with
no type errors

fail/, which contains Python files that must fail mypy checking
with the annotated errors

reveal/, which contains Python files that must output the correct
types with reveal_type

fail and reveal are annotated with comments that specify what error
mypy threw and what type should be revealed respectively. The format
looks like:
bad_function   # E: <error message>
reveal_type(x)   # E: <type name>
Right now, the error messages and types must be contained within
the corresponding mypy message.
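The matching rule described above could be sketched roughly like this. It is an illustration under assumptions, not the code from the PR: the helper names `expected_markers` and `unmatched` are hypothetical, and the mypy output is assumed to have already been parsed into a dict of line number to message.

```python
from typing import Dict, List


def expected_markers(source: str) -> Dict[int, str]:
    """Map 1-based line numbers to the text following a '# E:' comment."""
    markers = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "# E:" in line:
            markers[lineno] = line.split("# E:", 1)[1].strip()
    return markers


def unmatched(source: str, reported: Dict[int, str]) -> List[int]:
    """Return lines whose marker is not a substring of the reported message."""
    expected = expected_markers(source)
    return [
        lineno
        for lineno, marker in expected.items()
        if marker not in reported.get(lineno, "")
    ]
```

Substring matching keeps the annotations short, at the cost of accepting any message that merely contains the marker text.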

Running the tests

We use py.test to orchestrate our tests. You can just run:
py.test
to run the entire test suite. To run mypy on a specific file (which
can be useful for debugging), you can also run:
$ cd tests
$ MYPYPATH=.. mypy <file_path>

Is that ok? If not, I'm happy to move the test suite stuff to another PR for a longer discussion (I'm not totally sold on the # E: annotations and I'm not super happy with having to parse mypy's string output).

shoyer · 2018-03-29T22:52:41Z

The test suite bit looks pretty sweet to me! Thanks @alanhdu

alanhdu · 2018-04-03T14:13:11Z

@shoyer @eric-wieser Do you mind taking another look at this? Hopefully it's close 😆.

shoyer

A couple of minor suggestions

shoyer · 2018-04-03T15:34:20Z

tests/test_stubs.py

+            assert lineno in errors, f'Extra error "{marker}"'
+            assert marker in errors[lineno]
+        else:
+            assert "# E:" in target_line, f'Error "{errors[lineno]}" not found'


Instead of the redundant assert, can you just use pytest.fail()?

shoyer · 2018-04-03T15:37:39Z

tests/test_stubs.py

+            assert lineno in errors, f'Extra error "{marker}"'
+            assert marker in errors[lineno]
+        else:
+            assert "# E:" in target_line, f'Error "{errors[lineno]}" not found'


To guarantee quotes are escaped properly, let's cast with repr() inside the f-string, e.g., f'Error {repr(errors[lineno])} not found'.
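A small standalone illustration of the difference, using a made-up message string. The `!r` conversion is equivalent to calling `repr()` inside the f-string.

```python
message = 'Unsupported operand "x"'

# Hand-written quotes collide with quotes already inside the message.
plain = f'Error "{message}" not found'

# repr() picks a quoting style that keeps the message unambiguous.
quoted = f"Error {message!r} not found"

print(plain)
print(quoted)
```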

shoyer · 2018-04-05T06:23:36Z

OK, looks good to me.

@eric-wieser any more thoughts before I merge?

shoyer · 2018-04-10T18:37:11Z

OK, merged. Thanks @alanhdu

Add NumPy scalar hierarchy

55b39aa

shoyer reviewed Mar 20, 2018

eric-wieser reviewed Mar 20, 2018

alanhdu added 3 commits March 20, 2018 14:33

Include timedelta64

9baec88

Factor out _ArrayLike base class

6637c4f

Cluster away platform specific types

2272beb

alanhdu force-pushed the scalar branch from f3817cb to 2272beb Compare March 20, 2018 18:56

fixup! Factor out _ArrayLike base class

9e1ca28

Add _is_real type to fix real and imag typing

bcaf44b

alanhdu force-pushed the scalar branch from be07d3a to bcaf44b Compare March 25, 2018 18:17

fixup! fixup! Factor out _ArrayLike base class

2e000fa

eric-wieser reviewed Mar 25, 2018

alanhdu added 2 commits March 26, 2018 15:19

s/_is_real/_real_generic/g

4636885

s/_ArrayLike/_ArrayOrScalarCommon

f14ff62

array_like means "can be converted to array"

This was referenced Mar 29, 2018

Behaviour of np.generic.real and np.generic.imag is ... really imaginative #16

Closed

Behaviour of np.generic.real and np.generic.imag is ... really imaginative numpy/numpy#10818

Open

Update dtype.type

208e492

alanhdu force-pushed the scalar branch from 4815033 to f328ac4 Compare March 29, 2018 22:42

Add testing framework

2916e5f

alanhdu added 2 commits March 29, 2018 18:43

Fix type annotations

87e2760

Update Travis

f36e253

alanhdu force-pushed the scalar branch from f328ac4 to f36e253 Compare March 29, 2018 22:43

shoyer reviewed Apr 3, 2018

Use pytest.fail

7d3d523

shoyer merged commit f0cb9ff into numpy:master Apr 10, 2018

alanhdu deleted the scalar branch April 10, 2018 18:37

BvB93 mentioned this pull request Aug 27, 2020

ENH: Make np.complexfloating generic w.r.t. np.floating numpy/numpy#17172

Merged
