BUG: Passing a stringdtype array to np.where can lose string data #26420

Closed
@ngoldbaum

Describe the issue:

Similar to #26147 and #26317, it looks like np.where should make a copy instead of taking a view for StringDType. In the example below, the 25-character string comes back as an empty string.

Reproduce the code example:

import numpy as np
a = np.array(["a"*25, "b", np.nan], dtype=np.dtypes.StringDType(na_object=np.nan))
print(repr(np.where(a, a, a)))

Result:

# prints:
array(['', 'b', nan], dtype=StringDType(na_object=nan))

# should print:
array(['aaaaaaaaaaaaaaaaaaaaaaaaa', 'b', nan],
      dtype=StringDType(na_object=nan))

Python and NumPy Versions:

Current main branch, Python 3.12.2.

Runtime Environment:

No response

Context for the issue:

No response
