BUG: StringDType: na_object ignored in full #28157

Closed
mathause opened this issue Jan 15, 2025 · 4 comments · Fixed by #28228
Labels: 00 - Bug; component: numpy.strings (String dtypes and functions)

@mathause
mathause commented Jan 15, 2025

Describe the issue:

Creating a StringDType ndarray with na_object using full (and full_like) coerces the nan sentinel to a string.

I can work around this by assigning after creation with arr[:] = np.nan, but I think the behavior of full is unexpected.

Reproduce the code example:

import numpy as np

arr1 = np.full((1,), fill_value=np.nan, dtype=np.dtypes.StringDType(na_object=np.nan))

arr2 = np.full_like(arr1, fill_value=np.nan)

assert arr1.item() is np.nan
assert arr2.item() is np.nan

Error message:

Traceback (most recent call last):
  File "/Users/goldbaum/Documents/numpy/../numpy-experiments/test.py", line 7, in <module>
    assert arr1.item() is np.nan
           ^^^^^^^^^^^^^^^^^^^^^
AssertionError

Python and NumPy Versions:

2.2.1
3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:24:40) [GCC 13.3.0]

Runtime Environment:

[{'numpy_version': '2.2.1',
'python': '3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:24:40) '
'[GCC 13.3.0]',
'uname': uname_result(system='Linux', node='poisson', release='6.8.0-51-generic', version='#52~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Dec 9 15:00:52 UTC 2', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL',
'AVX512_SPR']}},
{'architecture': 'Haswell',
'filepath': '/home/mathause/.conda/envs/regionmask_dev/lib/libopenblasp-r0.3.28.so',
'internal_api': 'openblas',
'num_threads': 8,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.28'}]

Context for the issue:

No response

@ngoldbaum ngoldbaum added the component: numpy.strings String dtypes and functions label Jan 15, 2025
@ngoldbaum
Member

Thanks for the report, I can reproduce this on main. I agree that this is a bug.

@ngoldbaum
Member

See #28091 (comment). I think I'll wait for that PR to land before fixing this. It will be a lot easier to debug without having to trace through macros in a C debugger too...

@ngoldbaum
Member

I opened #28228

@mathause
Author

Thanks!
