Skip to content

subplots() returns AxesArray #25937

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

timhoffm
Copy link
Member

subplots() used to return a numpy array of Axes, which has some drawbacks. The numpy array is mainly used as a 2D container structure that allows 2D indexing. Apart from that, it's not particularly well suited:

  • Many of the numpy functions do not work on Axes.
  • Some functions work, but have awkward semantics; e.g. len() gives the number of rows.
  • We can't add our own functionality.

AxesArray introduces a facade to the underlying array to allow us to customize the API. For the beginning, the API is 100% compatible with the previous numpy array behavior, but we deprecate everything except for a few reasonable methods.


@property
def flat(self):
return self._array.flat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could add ravel() here too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is no good reason to use flatten() over flat, but inexperienced me did so in lots of places. If we don’t want to support flatten(), it might be good to have a more detailed deprecation message saying what to use instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with adding flatten() here too. It really doesn't cost anything to support it too and should save quite a bit of churn in old codebases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. Will add flatten() and ravel() as "discouraged" but keep for maximum compatibility. I also will switch to pending deprecation. Cutting the behavior is both difficult (it's hard to overview the whole array API) and not urgent. So, let's take it really slow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there's also users of expressions such as axarr.T or axarr.ravel("F") (or similar...) that want to iterate in column-major order.

@anntzer
Copy link
Contributor

anntzer commented May 21, 2023

I think we need to decide whether we want to be normative ("there is one way to do it") or lax ("we support all ndarray methods that could make sense"). I feel like you prefer being normative here, but I'd rather be lax because AxesArray is a somewhat ad-hoc API, and I don't think we should ask anyone to actually learn it (e.g. learn that the only way to unravel it is .flat and not .ravel() or .flatten()); users should just be able treat it as a numpy array and use whatever numpy array methods they can think of as long as it makes sense.

As a more specific point, I also don't think that len() returning the number of rows is particularly bad; I've probably(?) relied on that in the past and again that's just normal numpy semantics (use .size for the total number of elements), which I'd rather not break.

@timhoffm
Copy link
Member Author

timhoffm commented May 21, 2023

I‘m in between lax and normative. We should have clear recommendations (flat is preferred over flatten or ravel). We should continue to support „common“ current operations (e.g. flatten and ravel) because we don’t want to annoy users unnecessarily changing their working code. OTOH we should not feel obliged to log-term maintain every possible array operation (technically axis.roll() should work but it is somewhat arcane and IMHO not really useful). But since I can‘t oversee what is needed and reasonable for the whole array API, I plan to by default only keep API I and others discussing here find useful. Everything else gets a pending deprecation. This gives users time to object to the deprecation (and we should make this explicit in the message: „If you think this API should continue to be supported, please open an issue about it in the matplotlib bug tracker.“)

Overall the new API will be a relatively permissive opt-in: We more or less take the existing stuff that people (we and users) still want.

@oscargus
Copy link
Member

Is the idea to support "broadcasting" of the Axes method? Like AxisArray().set_xlim etc?

@jklymak
Copy link
Member

jklymak commented May 21, 2023

I'm not 100% clear on what the planned advantages of adding a new class are, so my vote is heavily on the lax side of things. If this has a proposed use, can that be devised at the same time?

@timhoffm
Copy link
Member Author

timhoffm commented May 22, 2023

Plans / possibilities:

  • broadcasting methods (e.g. axs[:,0].set_ylabel(text))
  • remove the need for squeeze=False for Nx1 or 1xM subplots: We can support 1D and 2D indexing on them simultaneously.
  • Narrow the API to useful methods. Tab-completion currently gives a lot of non-applicable methods.
  • T.b.d.: long-term behavior changes. Iteration yields rows is a one-sided preference. We could have iterrows() / itercolumns(). And maybe make iteration go over all elements. I'm aware that'd be quite a migration, but worth discussing as a long-term strategy.

@anntzer
Copy link
Contributor

anntzer commented May 22, 2023

Personally I'm +20 for some way of removing the need for squeeze, but see #19463 for a concrete alternative proposal that achieves the same thing with much less machinery needed (TLDR: I proposed to support subplots((m, n)) where passing a tuple as a single argument explicitly means "I absolutely want a 2D array back -- I know you already commented there).

broadcasting methods

I guess I'm +0 there; the loop is not too bad. Would be worth seeing some concrete examples where this helps. Note that usually you will have a loop somewhere over the axes anyways to do the plotting over each of them, so you can just put everything in the loop; for the axs[:, 0].set_ylabel(...) case there's the label_outer() helper (which is admittedly a bit of a kludge).

Iteration yields rows is a one-sided preference. We could have iterrows() / itercolumns(). And maybe make iteration go over all elements. I'm aware that'd be quite a migration, but worth discussing as a long-term strategy.

Let's not do that; ndarray semantics seem just fine here. (Numpy has been fine for years with for x in t.ravel(): ... / for x in t.ravel("F"): ...)

@ksunden ksunden modified the milestones: v3.8.0, v3.9.0 Aug 8, 2023
subplots() used to return a numpy array of Axes, which has some
drawbacks. The numpy array is mainly used as a 2D container
structure that allows 2D indexing. Apart from that, it's not
particularly well suited:

- Many of the numpy functions do not work on Axes.
- Some functions work, but have awkward semantics; e.g. len()
  gives the number of rows.
- We can't add our own functionality.

AxesArray introduces a facade to the underlying array to allow
us to customize the API. For the beginning, the API is 100%
compatible with the previous numpy array behavior, but we
deprecate everything except for a few reasonable methods.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants