Autoscaling has fundamental problems #7413
The first could be fixed by adding (yet more) state to the Axes in the form of an `autoscale_the_next_time_you_draw` flag? I suspect that the "autoscale in plot method" approach makes sense from the POV of interactive use, where almost every command is followed by a draw anyway. The second sounds like we need to clear a cache whenever we change the scale (and maybe the view limits?).
No, I think the second is a much more fundamental problem--more along the lines of "you can't get there from here". A constraint engine might be needed to get it right. I need to spend more time thinking about it to be sure, though.
Shouldn't the autoscaling "simply" be done at draw time (if autoscaling is on at that time, of course)? (In interactive mode, each command is followed by a draw, so it's just the same.) At that time, you know what your scale is, so you can correctly convert sizes given in screen units into data units. I think this should solve all the issues? (Not that it'd be that easy to implement such a large refactor.)
Yes, that's part of my point: draw time seems like the right time, so the strategy would be to accumulate all necessary information along the way, and then use all of it at once. I think this still requires an algorithm change, though; I'm not sure that the present one is adequate.
No, my strategy is to not accumulate any information along the way. You simply wait until draw time, when the correct `transData` is known, and do all the calculations at that point (this is assuming, of course, that the limits have not been explicitly set). You will then need an explicit inverse when converting the limits-including-margins back to data space, but that's not a problem, as scales are explicitly invertible.
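A minimal sketch of that strategy, assuming a hypothetical helper (`padded_xlimits` is illustrative, not matplotlib API): with `transData` in hand, pad each point by its pixel half-size in pixel space and map the result back through the explicit inverse.

```python
import numpy as np
import matplotlib.pyplot as plt

def padded_xlimits(ax, x, y, pad_px):
    """Hypothetical helper: x-limits that fit points padded by pad_px pixels."""
    trans = ax.transData
    pts = trans.transform(np.column_stack([x, y]))   # data -> pixel space
    lo, hi = pts.copy(), pts.copy()
    lo[:, 0] -= pad_px                               # pad left, in pixels
    hi[:, 0] += pad_px                               # pad right, in pixels
    inv = trans.inverted()                           # explicit inverse: pixel -> data
    return inv.transform(lo)[:, 0].min(), inv.transform(hi)[:, 0].max()

fig, ax = plt.subplots()
x, y = np.array([1.0, 2.0, 3.0]), np.array([1.0, 4.0, 9.0])
ax.plot(x, y, 'o')
print(padded_xlimits(ax, x, y, pad_px=15))
```

As the following comments point out, applying these limits changes `transData` itself, which is the crux of the disagreement.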
I think you are misunderstanding what I am saying, and the nature of the problem. It is a nonlinear minimization problem. Mathematically, it cannot be solved with a simple single-pass algorithm like the present one. I don't have time to write this out in more detail right now, but we can return to it.
OK, I'll wait for your writeup then.
Here is an attempt to show the structure of the problem for a single axis. It is a hybrid of math and code notation. [the math writeup was lost in extraction]

This problem needs to be solved simultaneously for all axes. I think the solution to this problem for now is to stop trying to be so ambitious. Let's disable the present non-functional attempt to solve the optimization, and instead rely on the simple mechanism of specified margins to reduce the incidence of cases where autoscaling fails to include all parts of all symbols.
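The formulas in this comment did not survive extraction. A hedged reconstruction of the constraint structure, using notation introduced here for illustration (T_{v0,v1}: the data-to-pixel transform implied by view limits (v0, v1); x_i: data positions; S_i: pixel half-sizes), based on the descriptions elsewhere in this thread:

```latex
% Reconstruction for illustration; not the original comment's notation.
% Find the tightest view limits (v_0, v_1) such that every symbol,
% padded by its pixel half-size S_i, stays inside the axes:
\begin{aligned}
  T_{v_0,v_1}(x_i) - S_i &\;\ge\; T_{v_0,v_1}(v_0) && \forall i,\\
  T_{v_0,v_1}(x_i) + S_i &\;\le\; T_{v_0,v_1}(v_1) && \forall i.
\end{aligned}
```

Because T_{v0,v1} itself depends on the unknowns (v0, v1), the constraints are nonlinear in them, which is why a single pass over the artists cannot solve the problem in general.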
I am 👍 on doing margins completely in dataspace and accepting clipped symbols. How much of this logic can we push down into the scales / is already pushed into the scales? If we could delegate all of the margin computation there, it would greatly reduce how badly we have to special-case for log / symlog / etc., and would let third-party scales (attn @phobson) deal with margins (up to just ignoring them) in a way that makes the most sense for that scale.
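A hedged sketch of what such delegation could look like; `apply_margins` is a hypothetical hook invented here for illustration, not an existing matplotlib API:

```python
import matplotlib.scale as mscale

class MarginAwareLogScale(mscale.LogScale):
    """Sketch of a scale that interprets margins in its own (log) space."""

    def apply_margins(self, vmin, vmax, margin):
        # Hypothetical hook: expand the limits in the scale's linearized
        # space, so a 5% margin means 5% of the log-transformed interval.
        trans = self.get_transform()
        tmin, tmax = trans.transform([vmin, vmax])
        pad = margin * (tmax - tmin)
        return tuple(trans.inverted().transform([tmin - pad, tmax + pad]))

scale = MarginAwareLogScale(None)               # axis argument unused by LogScale
print(scale.apply_margins(1.0, 1000.0, 0.05))   # ~ (0.708, 1412.5)
```

A scale that wanted to ignore margins entirely could simply return `(vmin, vmax)` unchanged.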
I don't think the situation is that bad. Indeed, the transform `transData` factors into the scale transform, which is fixed once the axis scale is chosen, followed by an affine transform that depends only on the view limits. Thus, let us write the constraints in that linearized space, where the limit-dependent part is affine. [the detailed equations were lost in extraction]
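The factorization claim can be checked against matplotlib's actual transform tree: `transData` is `transScale` (limit-independent, possibly nonlinear) followed by the affine `transLimits + transAxes`.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_yscale('log')
ax.set_ylim(1, 100)

# transScale is the limit-independent scale part; transLimits + transAxes
# is the affine part that depends only on the view limits and axes position.
composed = ax.transScale + ax.transLimits + ax.transAxes
for pt in [(0.5, 1.0), (0.5, 10.0)]:
    print(ax.transData.transform(pt), composed.transform(pt))  # identical pairs
```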
Note that the [remainder of this remark was lost in extraction]. To be honest, I had missed the fact that different elements could have different sizes. If we want to keep things simple, as suggested by @tacaswell, I would rather set a single common size for all elements. Let's come back to the general case of eq. (3). The expression on the right, seen as a function of the candidate limit, has a set of break points. A quick googling suggests that these break points can be found efficiently [the cited method was lost in extraction].
I agree with pushing out a "good enough" release. What about setting S = min(S)? This would, again, cover the case where all the markers have the same size (which is likely by far the most common case), with "relatively" little complexity.
That still requires introducing a new algorithm and strategy for handling the autoscaling, pushing it to the draw stage. (Probably this should be done anyway, but it's not an immediate priority.) And much of the point of scatter is to handle cases where size is variable. Same with quiver. Variable size is a common case, and it is the reason the present flawed approach was taken.
Surely not true for polar plots or projection examples or Cartopy?
I think our discussion is only applicable to scales and not projections, as per the definition at http://matplotlib.org/devel/add_new_projection.html: [quoted passage lost in extraction]
I'm not sure things can be done in a non-ad-hoc manner for arbitrary projections; for example, you'll probably generally want your polar plot to cover the entire range 0..2pi in theta even if all the data is in a smaller range (or, if you want to restrict yourself to a smaller range, it is probably a range that you explicitly put in).
Possible partial fix for 2.0: add attributes to collections to track whether and how to handle them in relim. [the sketch itself was lost in extraction]

This is just a quick thought at this point.
Looking again at this, I conclude we should do nothing about it for v2.0. Even the partial fix sketched above would be somewhat intrusive, requiring adding attributes to collections to track whether and how to handle them in relim. As far as I can see, the present algorithm has been in place for a very long time, so it is not a matter of breakage by 2.x. I will move the milestone; I think that even 2.1 is optimistic for coming up with a good, clean solution.
I closed it, but see #13639 and #13640.
#13640 tries to fix one of the issues; I'm a little befuddled about what the auto-lim behaviour of scatter is supposed to be; isn't it supposed to make the limits big enough to see the whole marker? Or is that part of the bug?

```python
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 1)
axs[0].scatter([0, 1], [0, 1], s=10)
axs[1].scatter([0, 1], [0, 1], s=3000)
plt.show()
```
@jklymak I can't say I understand the full extent of the issue, but I think the whole marker is meant to be included in the limits. But if I understand correctly, a few Monte Carlo simulations showed that enlarging the markers affects the choice of the (wrong) default limits (cf. one run with the other). Finally, I don't know how much has changed about all this in the past 2 years...
Another way of expressing the problem: if I make a marker that is 10" wide and my figure is only 5" wide, this algorithm is doomed to failure, because no adjustment of the xlim of the axes will ever get that marker to fit in the axes. If you throw a constraint solver at this, it will just say you have incompatible constraints and die.

There are presently only four collections that use autolims, so far as I can see: scatter, quiver, barb, and brokenbar. There are two separate cases here:

1. The collection offsets and shapes are drawn in data co-ordinates (quiver, barb, brokenbar).
2. The collection offsets are in data units, but the shapes are sized in physical units.

I think 1) is easy - you just find the min and max in data space and you have your lims. For 2), I think we should just assume the markers are normal sized, and allow big ones to clip (i.e. just use the offsets to get the auto limits).
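A minimal sketch of the case-2 proposal (`offsets_limits` is a hypothetical helper, not matplotlib code): derive the limits from the offsets alone and let oversized markers clip.

```python
import numpy as np
import matplotlib.pyplot as plt

def offsets_limits(collection):
    """Hypothetical helper: data limits from offsets only, ignoring marker size."""
    offsets = np.asarray(collection.get_offsets())
    return offsets.min(axis=0), offsets.max(axis=0)

fig, ax = plt.subplots()
sc = ax.scatter([0, 1], [0, 1], s=3000)
(x0, y0), (x1, y1) = offsets_limits(sc)
ax.set_xlim(x0, x1)   # markers at the ends will be partially clipped
ax.set_ylim(y0, y1)
plt.show()
```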
I think the complete marker is supposed to be part of the data limits. However, even in this case the marker is only shown completely due to the margins.
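An illustration of that diagnosis (assuming it is right, the clipping should appear when the default margins are removed):

```python
import matplotlib.pyplot as plt

fig, (ax0, ax1) = plt.subplots(1, 2)
ax0.scatter([0, 1], [0, 1], s=3000)   # default 5% margins help show the markers
ax1.scatter([0, 1], [0, 1], s=3000)
ax1.margins(0)                        # without margins, the end markers clip
plt.show()
```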
Right, and if you keep calling it, xlim keeps changing, converging on a solution if there is a solution...

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
sc = ax.scatter([0, 1], [0, 1], s=8000)
xmin = []
for i in range(20):
    fig.canvas.draw()
    ax.update_datalim(sc.get_datalim(ax.transData))
    ax.autoscale()
    print(ax.get_xlim())
    xmin += [ax.get_xlim()[0]]
```

If there is no solution (s=400000) the limits never converge...
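An editorial gloss on the loop above: each pass maps the current limits to the limits required under them, so it is a fixed-point iteration, which explains both behaviors.

```latex
% L_k = limits after pass k; g maps limits to the limits needed to contain
% the data plus pixel-sized markers under the transform those limits imply.
L_{k+1} = g(L_k), \qquad \text{convergence} \iff \exists\, L^{*} = g(L^{*}).
% With s = 400000 the markers cannot fit at any limits, so no fixed point
% exists and the iteration diverges.
```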
#6915 brings to light two problems with autoscaling:

1. It looks very inefficient: every plotting method in `_axes` adds an artist to the axes and then calls `autoscale_view`, occasionally with arguments. `autoscale_view` then does a complete autoscaling operation, going through all of the artists that have been added up to that point. Logically, it seems like the autoscaling should be done only before a draw operation, not every time an artist is added.

2. Beyond the apparent inefficiency, it doesn't work right for collections. `add_collection` calls `self.update_datalim(collection.get_datalim(self.transData))` to get `dataLim`. This uses the present `transData` to calculate the size in data units of objects that have sizes and/or positions that may be specified in screen or axes units. Then the subsequent call to `autoscale_view` uses those positions to modify the view limits. But this changes `transData`, so that the intended result cannot be achieved--when drawn, the sizes and locations in data units will not be what they were calculated to be when the view limits were set. The mismatch will grow as additional artists are added, each one potentially changing the data limits and the view limits. Usually we get away with this with no one noticing, but not always. #6915 ("plt.yscale('log') after plt.scatter() behaves unpredictably") shows that subsequently changing the scale of an axis, e.g. linear to log, can wreck the plot.
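A minimal reproduction along the lines of #6915 (a sketch; exact symptoms vary with version):

```python
import matplotlib.pyplot as plt

# The data limits for the scatter markers are computed under the *linear*
# transData in effect at scatter() time; switching the scale afterwards
# invalidates that computation, so the view limits can come out badly wrong.
fig, ax = plt.subplots()
ax.scatter([1, 10, 100, 1000], [1, 10, 100, 1000])
ax.set_yscale('log')
plt.show()
```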