
TYP: ndarray.item is defaulting to output string #28017


Closed
PranaliPatil12 opened this issue Dec 17, 2024 · 14 comments

@PranaliPatil12

Describe the issue:

In version 2.2.0, the return type of ndarray.item defaults to str. In the previous version, the function returned int.
I am not sure if it should be doing that.

I was following the documentation here but the new version 2.2.0 returns a string.

numpy/numpy/__init__.pyi

Lines 2111 to 2122 in a2012ad

@overload # special casing for `StringDType`, which has no scalar type
def item(self: ndarray[Any, dtypes.StringDType], /) -> str: ...
@overload
def item(self: ndarray[Any, dtypes.StringDType], arg0: SupportsIndex | tuple[SupportsIndex, ...] = ..., /) -> str: ...
@overload
def item(self: ndarray[Any, dtypes.StringDType], /, *args: SupportsIndex) -> str: ...
@overload # use the same output type as that of the underlying `generic`
def item(self: _HasShapeAndDTypeWithItem[Any, _T], /) -> _T: ...
@overload
def item(self: _HasShapeAndDTypeWithItem[Any, _T], arg0: SupportsIndex | tuple[SupportsIndex, ...] = ..., /) -> _T: ...
@overload
def item(self: _HasShapeAndDTypeWithItem[Any, _T], /, *args: SupportsIndex) -> _T: ...

@eendebakpt eendebakpt added the 33 - Question Question about NumPy usage or development label Dec 17, 2024
@eendebakpt
Contributor

The return type of ndarray.item depends on the contents of the array, but this is not str by default. The code you quoted is the typing specification where there is a special case for StringDType. The .item() can return all kinds of values:

import numpy as np
a = np.array('s', dtype=str)
a.item() # 's', i.e. str type
a = np.array(10)
a.item() # 10, i.e. int type

@PranaliPatil12
Author

PranaliPatil12 commented Dec 17, 2024

Yes, it runs fine, but mypy infers the return type as str rather than int.

Error:
error: Argument 2 to "XYZ" has incompatible type "str"; expected "int" [arg-type]

@eendebakpt
Contributor

Can you give a minimal example of some code where mypy behaves differently in numpy 2.2 compared to earlier versions?

@anujmittal94

anujmittal94 commented Dec 17, 2024

Can you give a minimal example of some code where mypy behaves differently in numpy 2.2 compared to earlier versions?

import numpy as np

def main() -> None:
    sample_array: np.ndarray = np.array([1])
    item: int = sample_array.item()
    print(f"{item}")

if __name__ == "__main__":
    main()

This effectively recreates a similar case:
error: Incompatible types in assignment (expression has type "str", variable has type "int") [assignment]

@eendebakpt eendebakpt added 00 - Bug and removed 33 - Question Question about NumPy usage or development labels Dec 17, 2024
@deathblade287

While I have failed to recreate this error with your code (it prints 1), my theory is that the array might be of type np.int32 or np.int64 which is not the same thing as a python int and that's why mypy throws the error.
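For context on the theory above, here is a minimal sketch (assuming a standard NumPy install) of the runtime difference between indexing and .item(). Indexing yields a NumPy scalar, while .item() converts to a built-in Python scalar, so the runtime value returned by .item() is already a plain int. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([1])

element = arr[0]    # indexing yields a NumPy integer scalar
value = arr.item()  # .item() converts to a built-in Python scalar

# The NumPy scalar is an np.integer, not a subclass of the built-in int.
print(type(element).__name__, isinstance(element, np.integer), isinstance(element, int))
# The converted value is a plain Python int.
print(type(value).__name__)
```

This only shows runtime behaviour; it says nothing about what a static type checker infers for the same expressions.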

@eendebakpt
Contributor

The issue might have been introduced in #27750 @jorenham

@jorenham jorenham changed the title BUG: ndarray.item is defaulting to output string TYP: ndarray.item is defaulting to output string Dec 19, 2024
@jorenham

This comment was marked as outdated.

@jorenham

This comment was marked as outdated.

@jorenham
Member

jorenham commented Dec 19, 2024

I can't reproduce this with the latest mypy (1.13.0):

contents of [...]/issue_28017.pyi

from typing import reveal_type

import numpy as np

a = np.array("s", dtype=str)
reveal_type(a)
reveal_type(a.item())

b = np.array(10)
reveal_type(b)
reveal_type(b.item())

output of uv run mypy [...]/issue_28017.pyi:

[...]/issue_28017.pyi:6: note: Revealed type is "numpy.ndarray[builtins.tuple[builtins.int, ...], numpy.dtype[Any]]"
[...]/issue_28017.pyi:7: note: Revealed type is "Any"
[...]/issue_28017.pyi:10: note: Revealed type is "numpy.ndarray[builtins.tuple[builtins.int, ...], numpy.dtype[Any]]"
[...]/issue_28017.pyi:11: note: Revealed type is "Any"

relevant pyproject.toml config:

dependencies = [
    "numpy>=2.2.0",
    "mypy[faster-cache]>=1.13.0",
]

[tool.mypy]
modules = ["src/*"]
plugins = ["numpy.typing.mypy_plugin"]
python_version = "3.12"
strict = true

So both a and b are inferred as npt.NDArray[Any] (which isn't optimal, but it's not a bug).
And we see that in both cases, .item() is then inferred as Any, which is the behavior you'd expect, and not a string.


What mypy version are you using exactly @PranaliPatil12?

@jorenham
Member

I confirmed that in the case where the scalar-type isn't Any, this works as intended:

from typing import reveal_type

import numpy as np
import numpy.typing as npt

c: npt.NDArray[np.int32]
reveal_type(c)
reveal_type(c.item())

mypy output

[...]/issue_28017.pyi:7: note: Revealed type is "numpy.ndarray[builtins.tuple[builtins.int, ...], numpy.dtype[numpy.signedinteger[numpy._typing._nbit_base._32Bit]]]"
[...]/issue_28017.pyi:8: note: Revealed type is "builtins.int"

pyright output:

[...]/issue_28017.pyi:7:13 - information: Type of "c" is "ndarray[tuple[int, ...], dtype[signedinteger[_32Bit]]]"
[...]/issue_28017.pyi:8:13 - information: Type of "c.item()" is "int"

@jorenham
Member

jorenham commented Dec 19, 2024

If you look at the first @overloads with return type -> str, you'll notice that these all have self: ndarray[Any, dtypes.StringDType]. That means that they will only be used if the dtype is an instance of dtypes.StringDType.

In the other cases, the static type-checker will skip them, and use the self: _HasShapeAndDTypeWithItem[Any, _T] overloads. Skipping over the details, this will cause _T to be inferred as int in the case of dtype[int32] like in my last example.
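The overload selection described above can be sketched with a toy class. This is not NumPy's actual stub: ToyArray, StringTag, and Int32Tag are hypothetical stand-ins, and the second overload returns a fixed int instead of inferring _T. It only illustrates how annotating self restricts an overload to one type parameter.

```python
from typing import Any, Generic, TypeVar, overload

D = TypeVar("D")


class StringTag:
    """Hypothetical stand-in for dtypes.StringDType."""


class Int32Tag:
    """Hypothetical stand-in for an integer dtype."""


class ToyArray(Generic[D]):
    """Toy container used for illustration; not NumPy's ndarray."""

    def __init__(self, value: Any) -> None:
        self._value = value

    @overload  # only considered when the receiver is ToyArray[StringTag]
    def item(self: "ToyArray[StringTag]") -> str: ...
    @overload  # only considered when the receiver is ToyArray[Int32Tag]
    def item(self: "ToyArray[Int32Tag]") -> int: ...
    def item(self) -> Any:
        # At runtime the overloads are ignored; the stored value is returned.
        return self._value


strings: ToyArray[StringTag] = ToyArray("s")
ints: ToyArray[Int32Tag] = ToyArray(10)

text = strings.item()  # a type checker should select the first overload (str)
number = ints.item()   # the first overload is skipped, the second applies (int)
print(type(text).__name__, type(number).__name__)
```

Adding reveal_type(text) and reveal_type(number) and running a type checker over the file is one way to see which overload was selected.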

@jorenham jorenham added the 57 - Close? Issues which may be closable unless discussion continued label Dec 19, 2024
@anujmittal94

anujmittal94 commented Dec 19, 2024

@deathblade287 The issue is related to typing and mypy; it is not a bug that makes the script fail at runtime.
@jorenham As you did, I checked with reveal_type in the example I gave. At runtime the types are handled correctly, but mypy reveals unexpected behaviour:

from typing import reveal_type  # Python 3.11+
import numpy as np

def main() -> None:
    sample_array: np.ndarray = np.array([1])
    reveal_type(sample_array)
    value = sample_array.item()
    reveal_type(value)
    item: int = value
    reveal_type(item)
    print(f"{item}")

if __name__ == "__main__":
    main()

giving output

Runtime type is 'ndarray'
Runtime type is 'int'
Runtime type is 'int'
1

However, when I run mypy on the script (1.13.0):

>mypy .      
test.py:6: note: Revealed type is "numpy.ndarray[Any, Any]"
test.py:8: note: Revealed type is "builtins.str"
test.py:9: error: Incompatible types in assignment (expression has type "str", variable has type "int")  [assignment]
test.py:10: note: Revealed type is "builtins.int"

So it may be something to do with the way mypy gets the types which I do not know about, perhaps related to the bug you mentioned earlier.

Edit: As per Pranali's comment, this case does not occur when using npt.NDArray[Any] instead. Thanks @jorenham @PranaliPatil12

@PranaliPatil12
Author

While looking at Anuj's case, I think using numpy.typing might help: annotating the array as npt.NDArray[Any] does the trick.
I am no longer getting the error about .item returning a string.

Thank you @jorenham
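A minimal sketch of that workaround, applied to the earlier example. The helper first_as_int is hypothetical and only for illustration. Per the mypy output reported in this thread, .item() on an npt.NDArray[Any] is inferred as Any rather than str, so the sketch narrows the value explicitly before treating it as an int.

```python
from typing import Any

import numpy as np
import numpy.typing as npt


def first_as_int(values: npt.NDArray[Any]) -> int:
    """Hypothetical helper: return the single element as a built-in int."""
    # Annotating with npt.NDArray[Any] instead of a bare np.ndarray is the
    # workaround from this thread; .item() is then reported as Any.
    value = values.item()
    if not isinstance(value, int):
        raise TypeError(f"expected an int, got {type(value).__name__}")
    return value


sample_array: npt.NDArray[Any] = np.array([1])
item: int = first_as_int(sample_array)
print(item)
```

If the element type is known up front, a more specific annotation such as npt.NDArray[np.int32] lets the checker infer int directly, as in the reveal_type output shown earlier in the thread.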

@PranaliPatil12 PranaliPatil12 closed this as not planned Dec 19, 2024
@jorenham
Member

However, when I run mypy on the script (1.13.0):

>mypy .      
test.py:6: note: Revealed type is "numpy.ndarray[Any, Any]"
test.py:8: note: Revealed type is "builtins.str"
test.py:9: error: Incompatible types in assignment (expression has type "str", variable has type "int")  [assignment]
test.py:10: note: Revealed type is "builtins.int"

So it may be something to do with the way mypy gets the types which I do not know about, perhaps related to the bug you mentioned earlier.

The only difference I see between your example here, and my earlier example (the one with a and b), is that in your case, the dtype itself is inferred as Any, and in my case it was inferred as numpy.dtype[Any]. That, in combination with the mypy bug python/mypy#14070, is probably indeed the reason why mypy incorrectly infers ndarray[Any, Any].item() as str.

But unfortunately I don't think that there's a workaround, or at least, not one that doesn't have severe negative side-effects (i.e. the "Never-trick" that was used in some places before numpy 2.2.0, which broke structural typing).

So I'd indeed classify this one as (yet another) "mypy bug without reasonable workaround".
If possible for you, I'd recommend switching to pyright, or even better, basedpyright (a backwards-compatible pyright fork with saner defaults, more checks, better tooling, more open-source, and less microsoft).

@jorenham jorenham removed the 57 - Close? Issues which may be closable unless discussion continued label Dec 19, 2024
5 participants