Skip to content

MNT: Remove set_string_function #26611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 17, 2024

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Jun 4, 2024

Hi @ngoldbaum,

This PR removes set_string_function and makes PyArray_SetStringFunction throw a "was removed" exception when called.

__str__ in numpy.ndarray class can be still altered with set_printoptions. __repr__ can't be modified by a user with the Python or C API.

@mtsokol mtsokol self-assigned this Jun 4, 2024
@mtsokol mtsokol linked an issue Jun 4, 2024 that may be closed by this pull request
@ngoldbaum
Copy link
Member

ngoldbaum commented Jun 4, 2024

Needs a release note. You also missed spots the C function still shows up:

numpy/__init__.pxd
526:    void PyArray_SetStringFunction (object, int)

doc/source/reference/c-api/types-and-structures.rst
223:    methods can also be altered using :c:func:`PyArray_SetStringFunction`.

numpy/__init__.cython-30.pxd
611:    void PyArray_SetStringFunction (object, int)

Maybe also add a note or comment to numpy/_core/code_generators/numpy_api.py above the entry for the function that it is stubbed out and should be removed in the future.

repr can't be modified by a user with the Python or C API.

That's not true is it? You can use printoptions to override how the repr appears too:

In [1]: with np.printoptions(formatter={"all": lambda x: "hello"}):
   ...:     print(repr(np.array([1, 2, 3])))
   ...:
array([hello, hello, hello])

@mtsokol
Copy link
Member Author

mtsokol commented Jun 4, 2024

That's not true is it? You can use printoptions to override how the repr appears too:

In [1]: with np.printoptions(formatter={"all": lambda x: "hello"}):
   ...:     print(repr(np.array([1, 2, 3])))
   ...:
array([hello, hello, hello])

I think there's no need to dig into it so let's say that we can override repr/str sufficiently with printoptions.

With set_string_function, that we're removing, we can override whole repr, not only elements repr (but I guess it's a redundant feature):

In [104]: a = np.array([[1,2,3]])

In [105]: a
Out[105]: array([[1, 2, 3]])

In [106]: np.set_string_function(lambda x: "hello")

In [107]: a
Out[107]: hello

@ngoldbaum
Copy link
Member

Yeah, I think printoptions replaces all the actually useful use cases. If anyone really wants to override how we lay out printing arrays they can do that themselves.

@mtsokol
Copy link
Member Author

mtsokol commented Jun 5, 2024

Done!

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @seberg @mattip are you ok with having a stubbed-out C API function like this?

In a GitHub search I see no uses outside of copies of numpy and re-exports and listings of the numpy C API.

@ngoldbaum
Copy link
Member

I do see one downstream project trying to work around us removing the ability to completely override the print formatting for the entire array:

napari/napari#6776

They ended up deciding to set edgeitems so the array size is limited in the printed output, but it's not quite the same as what they had before.

That said, maybe printoptions could grow an option to override printing the entire array?

@mattip
Copy link
Member

mattip commented Jun 9, 2024

Since we began deprecating this in 2.0, I think we need to wait for 2.2 to remove it, no?

@ngoldbaum
Copy link
Member

The C-level function was never deprecated but the python-level function was removed from the public API in NumPy 2.0, skipping the deprecation.

It seems odd to me to have the Python function gone from the public API but to insist on a long deprecation period for the C-level function that is used much less often.

@mattip mattip added the triage review Issue/PR to be discussed at the next triage meeting label Jun 9, 2024
@seberg
Copy link
Member

seberg commented Jun 9, 2024

It seems odd to me to have the Python function gone from the public API but to insist on a long deprecation period for the C-level function that is used much less often.

FWIW, the deprecation warning should have been implemented to include the C-side, but the normal deprecation policy is 2 (or maybe even 3 in this case) releases (1 year).
I don't care too much about bending that here, it isn't a typical end-user function. But what is the reason?

@ngoldbaum
Copy link
Member

It looks like it was deprecated first, then quickly removed before the deprecation was in a release (the comment says it was deprecated in numpy 2.0). I think it was included in the general API cleanup, @mtsokol might remember the details about why it was included for removal so quickly after being deprecated. I don't remember thinking too carefully about this one during review.

@seberg
Copy link
Member

seberg commented Jun 10, 2024

It looks like it was deprecated first, then quickly removed before the deprecation was in a release

I am confused: This PR would be the one to remove it, no? Giving us a very brief deprecation period on the Python function.

I am happy to ignore the C version, and had we done it in 2.0, I would have been happy to just do it probably.
But now it does seem a bit like there is no reason to not just wait for 2.2 at least, so I am -0.0 for finalizing the deprecation so quickly?

@mtsokol
Copy link
Member Author

mtsokol commented Jun 10, 2024

It looks like it was deprecated first, then quickly removed before the deprecation was in a release (the comment says it was deprecated in numpy 2.0). I think it was included in the general API cleanup, @mtsokol might remember the details about why it was included for removal so quickly after being deprecated. I don't remember thinking too carefully about this one during review.

I think we just made a decision during one of the meetings to put this function in the "remove" group for 2.0 and removed it: #24306 (See Details).

So set_string_function was removed from the main namespace without a deprecation period (the error message says what should be used instead), and we didn't touch C's PyArray_SetStringFunction at all. This PR removes set_string_function implementation and stubs out PyArray_SetStringFunction.

That said, maybe printoptions could grow an option to override printing the entire array?

Sure, then there would be a new option override_repr or something in np.set_printoptions that accepts a callable and completely replaces how arrays are printed, replicating set_string_function functionality.

@seberg
Copy link
Member

seberg commented Jun 10, 2024

Aha, so the point is that it was effectively already removed in 2.0 because it only exists as a private function.

@mtsokol
Copy link
Member Author

mtsokol commented Jun 13, 2024

That said, maybe printoptions could grow an option to override printing the entire array?

@ngoldbaum I added override_repr option to np.printoptions that allows defining custom repr(...), which replicates:

np.set_string_function(lambda x: "FOO", repr=True)

@seberg seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 13, 2024
@seberg
Copy link
Member

seberg commented Jun 13, 2024

One thing we forgot yesterday: This needs a release note for the C-API change at least.

EDIT: Sorry, I see it's there, would be nice to put the C-API part into c_api but either way is fine.

@seberg seberg removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 13, 2024
@ngoldbaum
Copy link
Member

Sorry for taking a little while to merge this one. Let's bring this in and thanks for finishing up the work on this deprecation/removal @mtsokol!

@ngoldbaum ngoldbaum merged commit a4cddb6 into numpy:main Jun 17, 2024
67 of 68 checks passed
@seberg
Copy link
Member

seberg commented Jun 17, 2024

I can see the use of adding it, but hadn't realized it was added in this PR. If we add override_repr we should have a relase note for that, and that is also a place to give anyone a chance to bike-shed names.

@mtsokol mtsokol deleted the remove-set_string_function branch June 17, 2024 19:07
@mtsokol
Copy link
Member Author

mtsokol commented Jun 17, 2024

Sure! Let me tweak the release note separately.

charris added a commit that referenced this pull request Jun 20, 2024
@alexanderswerdlow
Copy link

Thanks for adding this functionality back with override_repr!

Just wanted to note that at least one other project that relies on this feature: lovely-numpy, and I personally also have a hackier version, so it's great that it's being left in .

Very useful to be able to quickly see array information in one glance :)

@ngoldbaum ngoldbaum added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
03 - Maintenance 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) triage review Issue/PR to be discussed at the next triage meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MNT: np.set_string_function implementation should be removed
5 participants