Better auto-selection of axis limits #4891

Closed
njsmith opened this issue Aug 9, 2015 · 11 comments
@njsmith

njsmith commented Aug 9, 2015

As discussed at the matplotlib 2.0 BoF, it would be good to switch the method for default selection of axis limits to something a little less magical. Right now, it does something like: pick the nearest "round number" that is outside of all the data. But this leads to all kinds of weirdnesses. For example:

  • If your data goes from 0 to 1, then your limits will be from 0 to 1. Which means that data which actually falls right at 0 or 1 may be obscured by the axis limits -- e.g., here's the spectral distribution autocorrelation function of white noise (a plot that I've actually used in papers): plt.plot(np.linspace(0, 1, 1000), [1] + [0] * 999). Or if you have a scatter plot, the scatter markers may be half off-screen. A very minimal requirement for a default plotting method is that it should show all the data.
  • If your data doesn't go between some nice round numbers (which is very common), then you can get very ugly, asymmetric graphs. E.g. because of how the FFT works, it's not uncommon when working with signals to want to e.g. plot a spectrum with 256 elements: plt.plot(np.random.randn(256)). Notice that this gives x-axis limits of (0, 300) and looks ridiculous.
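To make the second case concrete, here is a minimal sketch of a "round number" rule. It is not matplotlib's actual locator code: the function name `round_number_limits`, the `target_intervals` parameter, and the 1/2/5/10 step ladder are all assumptions, chosen only to reproduce the kind of result described above.

```python
import math

def round_number_limits(lo, hi, target_intervals=5):
    """Expand (lo, hi) outward to multiples of a 'nice' step.

    Hypothetical helper for illustration; not matplotlib's algorithm.
    """
    if hi <= lo:
        raise ValueError("need hi > lo")
    raw_step = (hi - lo) / target_intervals
    magnitude = 10 ** math.floor(math.log10(raw_step))
    # Smallest step among 1, 2, 5, 10 (times a power of ten)
    # that is at least as large as the raw step.
    step = next(m * magnitude for m in (1, 2, 5, 10) if m * magnitude >= raw_step)
    return math.floor(lo / step) * step, math.ceil(hi / step) * step

# The x positions of a 256-sample signal run from 0 to 255.
print(round_number_limits(0, 255))  # (0, 300) under these assumed settings
```

With data spanning exactly 0 to 1, the same rule returns the data range itself, which is the first case: points sitting on the limits get no breathing room.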

The easiest way to handle this is to simply ensure that the axis limits are slightly larger than necessary to cover all the data, and then leave it at that. It avoids the really nasty cases, avoids surprises, and if people want something fancier then they almost always have to tweak the limits by hand anyway. As further evidence that this works, this is what R does and it basically just works -- no-one complains.

Specifically, base R calculates default axis limits by: taking the min and max of the data, and then setting the limits 4% further beyond each of those. ggplot2 uses a similar algorithm, but uses 5% by default. (See here, semantics of expand= is (multiplicative expansion, additive expansion).)
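The padding rule described here is short enough to write out. This is a sketch of the arithmetic only; `padded_limits` and its `fraction` parameter are hypothetical names, not an R or matplotlib API.

```python
def padded_limits(values, fraction=0.04):
    """Return (lo, hi) extended by `fraction` of the data range on each side."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * fraction
    return lo - pad, hi + pad

x = range(256)
print(padded_limits(x))        # base-R-style 4% padding
print(padded_limits(x, 0.05))  # ggplot2-style 5% padding
```

The limits are always symmetric around the data and never coincide with the extreme points, whatever the range happens to be.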

This is an issue instead of a PR because I'm not sure how to make this a PR :-). At the BoF someone said that the suggested functionality was already in matplotlib as something called "margins"? But I'm not sure what that is or how to hook it up to rcParams.

@njsmith
Author

njsmith commented Aug 9, 2015

(See also)

@tacaswell tacaswell added this to the Color overhaul milestone Aug 9, 2015
@tacaswell
Member

margins already has an rcparam.

@njsmith
Author

njsmith commented Aug 9, 2015

Okay, having done a bit of reading, it sounds like: there are rcParams axes.xmargin and axes.ymargin, but, they do not actually do what is described in the original post, b/c they are applied in addition to the automagic axis limit finding code, not instead of the automagic axis limit finding code.

Specifically, plt.margins and friends do what we want if you pass tight=True, but the rcParams act as if they call plt.margins with tight=False (which is #2298).

I see two obvious options: either switch the meaning of the existing rcParams so that they imply tight=True as requested in #2298 by the original poster, or else add a second rcParam that orthogonally controls the "autoscaling".

@efiring
Member

efiring commented Aug 9, 2015

Although I dislike the ever-increasing number of rcParams, it looks to me like the simplest and cleanest solution here would be to add one to control the tight default for autoscale_view. Otherwise, axes.xmargin etc. would need to use None or some other flag to indicate the "classic" default, in place of their present default of 0. I think that might be more confusing than adding, e.g., "axes.autoscale_tight" with a new default of True.


@ellisonbg

I am in favor of less magic in picking the x and y margins (I think that is what they are called). I think the approach that ggplot2 uses (5%) is a good one.


Brian E. Granger
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@njsmith
Author

njsmith commented Aug 10, 2015

Okay, so looking at axes/_base.py it seems that we currently have the following terminology:

  • autoscale can be on or off for each axis independently, and determines whether we adjust the axes at all when new data is plotted.
  • margins can be specified independently for both x and y, and if present then unconditionally expand the data range by the specified amount when doing autoscaling
  • the scaling can be tight or not, which is confusing -- there's the attribute axes._tight which can be either True or False, and then there's the kwarg tight= accepted by the autoscale and autoscale_view methods, which is a tristate True/False/None, defaulting to None.

Then when we autoscale, the rules are:

  • Margins (if present) are always applied
  • Then we apply the "round number" heuristics (by calling locator.view_limits) unless the scaling is tight, i.e. if either of the following applies:
    • tight=False
    • (most common) self._tight is False, tight=None, and the plot contains something besides an image (!!)
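The rules can be sketched in plain Python. This is a reading of the description, not the code in axes/_base.py: it takes "tight" to mean "skip the rounding step", which is how tight=True is described for plt.margins earlier in this thread. The names `effective_tight`, `round_outward` and `autoscale` are hypothetical, and `round_outward` is only a toy stand-in for locator.view_limits.

```python
import math

def effective_tight(tight_arg, axes_tight, image_only):
    """Resolve the tristate tight= kwarg against the axes-level flag."""
    if tight_arg is None:
        # No explicit request: fall back to the axes flag, and treat
        # image-only plots as tight.
        return axes_tight or image_only
    return bool(tight_arg)

def round_outward(lo, hi, step=100):
    """Toy stand-in for locator.view_limits: snap outward to multiples of step."""
    return math.floor(lo / step) * step, math.ceil(hi / step) * step

def autoscale(lo, hi, margin=0.0, tight_arg=None, axes_tight=False, image_only=False):
    # Step 1: margins are always applied.
    pad = (hi - lo) * margin
    lo, hi = lo - pad, hi + pad
    # Step 2: round-number heuristics only when the scaling is not tight.
    if not effective_tight(tight_arg, axes_tight, image_only):
        lo, hi = round_outward(lo, hi)
    return lo, hi

print(autoscale(0, 255))                  # default: rounded outward
print(autoscale(0, 255, tight_arg=True))  # tight: data range kept as-is
print(autoscale(0, 255, margin=0.05))     # margin first, then rounded outward
```

The last call shows why a margin on its own does not give the behaviour requested in the original post: the padding is applied, and then the rounding step expands the limits further.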

I guess what we want for a default is something like: if the plot contains something besides an image, then apply the margins only; otherwise, do nothing.

This means that maintaining strict backwards compatibility is going to be tricky -- potentially people are currently using margins with images, or using margins together with custom locators with custom view_limits implementations. We may just need to have an autoscaleLikeMatplotlib1 rcParam?

@joferkington
Contributor

Just as a clarification here, we're only talking about changing what the axes.xmargin and axes.ymargin rcParams do, correct?

i.e. axes.xmargin in matplotlibrc/rcParams is proposed to behave identically to ax.margins(x=value) instead of ax.margins(x=value, tight=False) as it currently does?

ax.margins(some_percentage) will continue to default to tight=True, and autoscaling will continue to default to choosing "even" numbers?

If so, great! Of course, the caveats about actually making it work as an rcParam while allowing full backwards compatibility are tough. I do think an rcParam that's equivalent to calling ax.margins(value) would be a good thing, though.


However, I'd argue against making ax.margins-style auto-scaling the default.

In my experience, "even numbers" are desirable as axes limits about as often as a percentage padding is. In some cases you want one, in some cases the other. Given that it's roughly 50/50 split (though, again, that's just my opinion), it's best to stick with what we have. ax.margins(percentage) is nice and simple, and the alternative of calling ax.autoscale(some_new_parameter) for the other 50% of the time is worse, i.m.o.

In my opinion, the main problem is that too few people are aware of ax.margins(some_percentage) because we don't advertise it enough.

I don't think it's worth grossly breaking backwards compatibility for something that's better solved by ensuring people are aware of ax.margins.


Overall, line plots are the most common case where autoscaling with even numbers makes the most sense. (Feel free to argue the contrary there!) They're also probably the most common plot type. I don't think that it's worth breaking that expectation for something that's easy to fix in the cases where you don't want "even-number" autoscaling.

For example, the most frequently asked question I've seen about autoscaling involves bar plots. The default autoscaling is the best choice for bar in general (i.m.o.), as bar can do lots of things. However, autoscaling often results in visually poor limits in the most common case of a vertical bar plot.

This is a classic case where even-number-choosing as limits is not optimal. However, I would argue it's far easier to fix this by letting people know about ax.margins for cases where it makes more sense to use it.

For example, with a vertical bar plot, you'll often want a combination of ax.margins(0.05) and ax.set_ylim(bottom=0). Compare the top and bottom plots:

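import matplotlib.pyplot as plt

y = [0.5, 9.0, 8.5, 7, 1, 4, 6.5, 4, 3, 2]
x = range(len(y))

# Two stacked axes with the same bars; the top one keeps the default limits
fig, axes = plt.subplots(nrows=2)
for ax in axes:
    ax.bar(x, y, align='center', color='lightblue')

# Bottom axes: pad the data range by 5%, then pin the bars to y=0
axes[1].margins(0.05)
axes[1].set_ylim(bottom=0)

plt.show()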

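[Figure: the two bar plots produced by the code above; top with default autoscaling, bottom with ax.margins(0.05) and ax.set_ylim(bottom=0)]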

However, in my experience, surprisingly few people are aware that it's possible to create the bottom plot without using manually set axes limits.
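To see why the bottom plot needs both calls, the limit arithmetic can be sketched without matplotlib. `bar_limits` is a hypothetical helper, and it assumes centred bars of width 0.8 (matplotlib's default bar width) rising from y=0.

```python
def bar_limits(heights, width=0.8, margin=0.05, clamp_bottom=True):
    """Limits for centred vertical bars at x = 0, 1, 2, ... (illustrative only)."""
    # Extent of the bar rectangles themselves.
    x_lo, x_hi = -width / 2, (len(heights) - 1) + width / 2
    y_lo, y_hi = min(0.0, min(heights)), max(0.0, max(heights))
    # Percentage padding on every side, as ax.margins(0.05) requests.
    x_pad = (x_hi - x_lo) * margin
    y_pad = (y_hi - y_lo) * margin
    xlim = (x_lo - x_pad, x_hi + x_pad)
    ylim = (y_lo - y_pad, y_hi + y_pad)
    if clamp_bottom:
        # Equivalent in spirit to ax.set_ylim(bottom=0): bars sit on the axis.
        ylim = (0.0, ylim[1])
    return xlim, ylim

y = [0.5, 9.0, 8.5, 7, 1, 4, 6.5, 4, 3, 2]
print(bar_limits(y, clamp_bottom=False))  # padding alone pushes the bottom below 0
print(bar_limits(y))                      # bottom pinned back to 0
```

Padding alone leaves a gap under the bars, which is why the margin is paired with an explicit bottom limit.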

Overall, I'd argue the bigger problem is that the examples almost never show little things such as this. (I know, I know, PR's welcome!). We can better solve the problem with better advertising of existing functionality instead of breaking backwards compatibility for what's basically a 50/50 advantage/disadvantage.

Just my thoughts on the matter, anyway. I definitely think it's a good discussion to have, regardless!

@anntzer
Contributor

anntzer commented Aug 12, 2015

This is the first time I hear about plt.margins, which is certainly nice, but handles log-scaled plots incorrectly (cf. #2263, which got closed but not fixed).
My 2c. is to give up on round numbers, but fixing this issue is more important.
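The thread does not spell out what goes wrong in #2263, so the following is only one plausible illustration, under the assumption that the padding is computed in data space: a percentage pad that is fine on a linear axis can push the lower limit to zero or below, which a log axis cannot display. Padding in log space avoids that. Both helper names are hypothetical.

```python
import math

def pad_linear(lo, hi, fraction=0.05):
    """Pad by a fraction of the range measured in data units."""
    pad = (hi - lo) * fraction
    return lo - pad, hi + pad

def pad_log10(lo, hi, fraction=0.05):
    """Pad by a fraction of the range measured in decades (lo, hi must be > 0)."""
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    pad = (log_hi - log_lo) * fraction
    return 10 ** (log_lo - pad), 10 ** (log_hi + pad)

print(pad_linear(1, 1000))  # lower limit goes negative: invalid on a log axis
print(pad_log10(1, 1000))   # both limits stay positive
```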

@mdboom mdboom modified the milestones: Color overhaul, next major release (2.0) Oct 8, 2015
@njsmith
Author

njsmith commented Nov 8, 2015

@joferkington:

In my experience, "even numbers" are desirable as axes limits about as often as a percentage padding is. In some cases you want one, in some cases the other. Given that it's roughly 50/50 split (though, again, that's just my opinion), it's best to stick with what we have.

The argument against "round number" autoscaling is that sometimes it works fine, and sometimes it fails utterly and produces awful plots (see the examples cited at the top of this thread, which all involve line plots, or your bar plot example). Simple margins OTOH may or may not be the absolute prettiest, but they always work. When choosing defaults IMO the focus shouldn't be on the best cases, but on the worst cases -- if some rule produces 5% better plots 50% of the time and terrible plots the rest of the time, then it's a bad default.

@joferkington
Contributor

@njsmith - That's actually a really good point. A good default should have few downsides. By that logic, margins-style autoscaling is a better default. Consider me won over.

@mdboom
Member

mdboom commented Nov 9, 2015

@joferkington: That's what @tacaswell and I also came to at our "style summit" today, so with good certainty that's what the default will be in 2.0.

mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 11, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 16, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 16, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 17, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 23, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 25, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 27, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 27, 2015
mdboom added a commit to mdboom/matplotlib that referenced this issue Nov 30, 2015