-
-
Notifications
You must be signed in to change notification settings - Fork 31
Conversation
numpy/__init__.pyi
Outdated
_Scalar = TypeVar("_Scalar", bound="generic") | ||
|
||
class generic: | ||
@property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we define a private base class for all the shared methods/properties between generic
and ndarray
? Otherwise we'll have quite a bit to keep in sync.
numpy/__init__.pyi
Outdated
class int16(signedinteger): ... | ||
class int32(signedinteger): ... | ||
class int64(signedinteger): ... | ||
int_ = int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regrettably, this is actually platform dependent, e.g., on Windows, it's int32: numpy/numpy#9464
numpy/__init__.pyi
Outdated
class number(generic): ... | ||
class bool_(generic): ... | ||
class object_(generic): ... | ||
class datetime64(generic): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget timedelta64
. Unfortunately it inherits from np.integer
, though this is likely to change in the future: numpy/numpy#10685
numpy/__init__.pyi
Outdated
@@ -358,6 +358,187 @@ class ndarray(Iterable, Sized, SupportsInt, SupportsFloat, SupportsComplex, | |||
def __getattr__(self, name) -> Any: ... | |||
|
|||
|
|||
_Scalar = TypeVar("_Scalar", bound="generic") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can actually define circular references in .pyi
files, like I did for _DtypeLike
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach in this patch is the approach recommended by PEP484
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, in that case we should probably fix _DtypeLike
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, this is the recommended approach for typing self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was suggesting that bound="generic"
could be written as bound=generic
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize. I suppose the PEP self-typing example is not within a stub file, so perhaps my point is irrelevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In typeshed we prefer not doing any quoting of this sort.
numpy/__init__.pyi
Outdated
|
||
class generic: | ||
@property | ||
def real(self: _Scalar) -> _Scalar: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
real
and imag
are not guaranteed to return the same scalar type -- they convert complex numbers to reals.
class int8(signedinteger): ... | ||
class int16(signedinteger): ... | ||
class int32(signedinteger): ... | ||
class int64(signedinteger): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly this doesn't actually reflect the underlying model - in reality, the types are byte
, short
, intc
, int_
, and longlong
, and the sized aliases are attached in a platform-dependent manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it really matter whether we define aliases or the real classes here? I would actually be OK with doing only these aliases. For annotations (and clearly written code in general), it's best to avoid platform dependent types like int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well since there are actually 5 underlying types, sometimes numpy returns a type that is not referred to by any of these four aliases (which is made worse by numpy/numpy#10151 making them look the same).
I think what you have there is an argument about removing the unsized types from numpy - which I think is valid. But I think the type annotations should reflect what actually happens, not how we want things to happen.
numpy/__init__.pyi
Outdated
class float16(floating): ... | ||
class float32(floating): ... | ||
class float64(floating): ... | ||
class float128(floating): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly the actual types here are half
, single
, double
, longdouble
, and similarly for complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is relevant because float128
is sometimes called float96
, but both are always aliases for longdouble
I've factored out an I've also moved the platform-specific types together. AFAICT though, on the Python side the platform-specific types are aliases for the sized types (e.g. for me, |
The >>> np.int_.__name__ # on windows, this gives int32, so use `intc` below
'int64'
>>> np.longlong.__name__
'int64'
>>> np.int_ is np.longlong # C long and C longlong are different types
False There really is a unique type for each of the underlying |
I believe that MyPy uses the actual type name (from a small test case):
Is the "right" thing to do here is to just to have each of the platform specific types be their own distinct type? How important is it to get all the sized aliases matched up with their underlying components and how hard is that to do? |
Mypy definitely doesn't look at the runtime |
As part of |
I really like generating the alias stubs when the user installs them! I've opened up a follow-up issue (#15) to discuss that idea further. For now, I've fixed the type errors and decided to leave the platform-specific types for another PR (which by default leaves them as (Also sorry for the super long iteration cycles -- I unfortunately don't have a ton of time to really zero-in on this so it usually takes a couple days for me to circle back to stubbing out numpy). |
numpy/__init__.pyi
Outdated
real: ndarray | ||
|
||
_ArraySelf = TypeVar("_ArraySelf", bound=_ArrayLike) | ||
class _ArrayLike(SupportsInt, SupportsFloat, SupportsComplex, SupportsBytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a good name - in the docs, array_like
means can be converted to array
.
Perhaps _ArrayOrScalarCommon
?
numpy/__init__.pyi
Outdated
class generic(_ArrayLike): | ||
@property | ||
def base(self) -> None: ... | ||
class _is_real(generic): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing newline above _is_real
, I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_real_generic
would be a better name
numpy/__init__.pyi
Outdated
|
||
class flexible(_is_real): ... | ||
class void(flexible): ... | ||
class character(_is_real): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct, but it's also bizarre:
>>> np.bytes_(b'test').real
'test'
>>> np.bytes_(b'test').imag
''
What were we thinking here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I had to guess, I think imag
by default returns the "0" version of the data type -- while this makes sense for numbers, it has bizarre consequences for things that aren't numbers:
In [4]: np.datetime64(dt.datetime.now()).real
Out[4]: numpy.datetime64('2018-03-26T15:22:14.672804')
In [5]: np.datetime64(dt.datetime.now()).imag
Out[5]: numpy.datetime64('1970-01-01T00:00:00.000000')
or
In [13]: np.object_(["hello world"]).real
Out[13]: array(['hello world'], dtype=object)
In [14]: np.object_(["hello world"]).imag
Out[14]: array([0], dtype=object)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue opened at numpy/numpy#10818
array_like means "can be converted to array"
👍 this looks great to me! Two things that would be nice to have:
|
Edit: nevermind |
@shoyer @eric-wieser I went a little overboard with the test suite and ended up writing a small test framework (I wanted to assert that certain things failed and that led me down a bit of a rabbit hole). I put the documentation in the testing readme, the important bit is that: There are three main directories of tests right now:
Is that ok? If not, I'm happy to move the test suite stuff to another PR for a longer discussion (I'm not totally sold on the |
The test suite bit looks pretty sweet to me! Thanks @alanhdu |
@shoyer @eric-wieser Do you mind taking another look at this? Hopefully it's close 😆. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor suggestions
tests/test_stubs.py
Outdated
assert lineno in errors, f'Extra error "{marker}"' | ||
assert marker in errors[lineno] | ||
else: | ||
assert "# E:" in target_line, f'Error "{errors[lineno]}" not found' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the redundant assert, can you just use pytest.fail()
?
tests/test_stubs.py
Outdated
assert lineno in errors, f'Extra error "{marker}"' | ||
assert marker in errors[lineno] | ||
else: | ||
assert "# E:" in target_line, f'Error "{errors[lineno]}" not found' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To guarantee quotes are escaped properly, let's cast with repr()
inside the f-string, e.g., f'Error {repr(errors[lineno])}" not found'
.
OK, looks good to me. @eric-wieser any more thoughts before I merge? |
OK, merged. Thanks @alanhdu |
No description provided.