FIX: get_datalim for collection #13642


Merged (2 commits, Jul 19, 2019)

Conversation

@jklymak (Member) commented Mar 10, 2019

PR Summary

EDIT 12 Mar:

Closes #13639

Closes the non-unique auto-limit problem in #7413 for collections. Collections have "offsets" and "paths", and each can be in a different coordinate frame. This PR tackles the autolim problem by consulting self.get_transform() and self.get_offset_transform() for the collection and checking whether these are ax.transData or not.

For Collection.get_datalim, there are two cases that I see in common usage:

A. offsets is None, as for LineCollection. In this case, all the vertices in paths should be included in the autolim if transform == ax.transData (this is the case for streamplot).

B. offsets is not None:

  1. If the offset coordinates (transOffset) are not the axes transData, then don't auto-lim on this collection at all. I don't know offhand of any built-in examples like this, but if the user specifies a bunch of markers in axes coordinates, they shouldn't expect the axes to change its x/y limits.

  2. If the path coordinates (self.get_transform()) are transData, then we use the existing algorithm. This is useful for paths like in quiver, where the offset defines the tail of an arrow and the path defines its length. In that case, the whole arrow is meant to be in data space, and the limits can change size accordingly.

  3. If the path coordinates are not transData, then we don't consult the paths at all, because they are in some other unit that doesn't help set the data limits. E.g. scatter markers are sized in points, and these can be arbitrarily large (even larger than the figure if you want), so setting the axes data limits based on this size is ill-defined. That led to many of the issues in "Autoscaling has fundamental problems" (#7413). Of course the offset limits should still be included (i.e. we want the centres of markers to be within the new x/y limits).
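For orientation, the cases above can be condensed into a tiny decision table. The function below is a hypothetical sketch with boolean flags standing in for the real transform checks (e.g. transform == ax.transData); it is not the actual Collection.get_datalim code:

```python
def autolim_source(has_offsets, offsets_in_data, paths_in_data):
    """Sketch: which geometry should feed the data limits.

    Flags are hypothetical stand-ins for the transform checks in the
    real code; return value names which extents get used.
    """
    if not has_offsets:
        # Case A (e.g. LineCollection/streamplot): use the path
        # vertices, but only if the paths are in data coordinates.
        return "paths" if paths_in_data else "nothing"
    if not offsets_in_data:
        # Case B.1: offsets live in some other frame, so this
        # collection should not influence the limits at all.
        return "nothing"
    if paths_in_data:
        # Case B.2 (e.g. quiver): offsets and paths are both in data
        # space, so the full extents feed the limits.
        return "paths+offsets"
    # Case B.3 (e.g. scatter, markers sized in points): offsets only.
    return "offsets"


print(autolim_source(False, False, True))   # streamplot-like: paths
print(autolim_source(True, False, False))   # offsets in axes coords: nothing
print(autolim_source(True, True, True))     # quiver-like: paths+offsets
print(autolim_source(True, True, False))    # scatter-like: offsets
```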

This required changing 14 image tests, mostly due to small axes-limit differences in scatter tests.

The major exception is test_transforms.py::test_pre_transform_plotting, which highlights a limitation of the consult-the-transforms approach used here. If the user passes a simple transform like transform = mtransforms.Affine2D().scale(10) + ax.transData, the autolimiting will not take place because transform != ax.transData. That's a cost of the approach, but I personally feel users should just transform the data before asking Matplotlib to plot it, or be willing to manually adjust the axes limits.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({'Depth': [0.2, 0.4, 0.4, 0.4, 0.4, 0.4, 0.6, 0.4, 3.2, 2.0],
                   'DateTimeUTC': [pd.Timestamp('2018-03-28 06:25:08'),
                                   pd.Timestamp('2018-03-28 06:25:49'),
                                   pd.Timestamp('2018-03-28 06:27:06'),
                                   pd.Timestamp('2018-03-28 06:32:11'),
                                   pd.Timestamp('2018-03-28 06:32:59'),
                                   pd.Timestamp('2018-03-28 06:34:02'),
                                   pd.Timestamp('2018-03-28 06:35:38'),
                                   pd.Timestamp('2018-03-28 06:37:04'),
                                   pd.Timestamp('2018-03-28 06:39:08'),
                                   pd.Timestamp('2018-03-28 06:40:52')]})

x = np.asarray(df['DateTimeUTC'])
y = np.asarray(df['Depth'])

fig, axs = plt.subplots(2, 1)

ax = axs[0]
#pp = ax.plot(x, y, '.')
ax.scatter(x, y, s=25, marker='d')
ax.set_title("using scatter")

ax = axs[1]
ax.plot(x, y)
ax.set_title("Using plot")

fig.tight_layout()
plt.show()

Before:

[image: Figure_2]

After:

[image: Figure_1]

Big Markers

Note that big markers will be clipped. This is "worse", but it's consistent...

Before:

[image: boo2]

After:

[image: boo]

Cures inconsistency

To see the inconsistency in the old way, note how the xlim changes with each re-lim:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

r = 2 * 72
sc = ax.scatter([0, 1], [0, 1], s=r * r * 3.14)

plt.show()
xmin = []
for i in range(4):
    fig.canvas.draw()
    ax.update_datalim(sc.get_datalim(ax.transData))
    ax.autoscale()
    print(ax.get_xlim())
    xmin += [ax.get_xlim()[0]]
Output before this PR (the xlim grows on every pass):

(-0.5934780784674604, 1.59347807846746)
(-0.9094406858496092, 1.9094406858496091)
(-1.15777773561647, 2.15777773561647)
(-1.3529631431489448, 2.3529631431489455)

Of course the resulting markers are still clipped anyway:

Before:

[image: boo3]

After (the limits are now stable):

(-0.05, 1.05)
(-0.05, 1.05)
(-0.05, 1.05)
(-0.05, 1.05)

[image: boo4]

Failures:

test_axes

Failures are minor: axes limits change slightly.

test_collections.py

test_barb_limits fails because it is constructed as:

    x = np.linspace(-5, 10, 20)
    y = np.linspace(-2, 4, 10)
    y, x = np.meshgrid(y, x)
    trans = mtransforms.Affine2D().translate(25, 32) + ax.transData
    plt.barbs(x, y, np.sin(x), np.cos(y), transform=trans)

and that fails the rubric above because the offset transform is not the data transform. I'm not sure how people feel about this. Personally, I'm not a fan of defining data at one location and then using the transform stack to move the object around instead of just manipulating the original data, and if folks want to do this, I'm OK with auto-limits failing for them. If you are going to manually move things around with a low-level transform stack, then you can also be expected to set your limits accordingly: flexibility comes at the cost of automation.

test_EllipseCollection and test_regularpolycollection_rotate fail somewhat badly because they draw big paths that used to get some space from the existing algorithm and no longer do. scatter_post_alpha is a very minor change.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@jklymak (Member Author) commented Mar 10, 2019

This is WIP from a still-need-to-fix-tests point of view, but I wanted to have a concrete proposal on the table for fixing part of #7413.

@anntzer (Contributor) commented Mar 10, 2019

I think the general approach is good; I haven't reviewed carefully yet.
Does this close #6915/#11198 too?
In #7413 (comment) @tacaswell argued in favor of accepting clipped markers too. Despite my comments on that thread, I now also think ignoring marker extents in scatter makes complete sense (the "larger-than-figure marker" case is quite convincing).

@jklymak (Member Author) commented Mar 10, 2019

No, it doesn't fix #6915. But I don't think it's incompatible with your PR leaving the autolim to draw time, which makes sense to me.

@jklymak (Member Author) commented Mar 11, 2019

OK, this breaks streamplots, mostly in manageable ways, but there are a couple of failures I can't figure out yet.

@@ -181,7 +181,12 @@ def get_offset_transform(self):
def get_datalim(self, transData):
transform = self.get_transform()
transOffset = self.get_offset_transform()
if not transOffset == transData:
Review comment (Contributor):

transOffset != transData

@jklymak (Member Author) commented Mar 12, 2019

So, this PR breaks autolimiting on constructs like the one in test_transforms:

times10 = mtransforms.Affine2D().scale(10)
ax.scatter(np.linspace(0, 10), np.linspace(10, 0), transform=times10 + ax.transData)

I'm OK with that, but wanted to let folks know... The new test file looks fine, but the limits have changed substantially: https://github.com/matplotlib/matplotlib/blob/d3f8447cc69b21be885c8683c942043e2903ac7f/lib/matplotlib/tests/baseline_images/test_transforms/pre_transform_data.png

@jklymak (Member Author) commented Mar 12, 2019

BTW, is there an easy way to tell if a transform has ax.transData in it? I.e. if someone has done a simple offsetTrans + ax.transData?

@anntzer (Contributor) commented Mar 12, 2019

@jklymak Have a look at contains_branch or contains_branch_separately (not exactly sure about their use, but I think they may be what you want).
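For what it's worth, contains_branch can be demonstrated standalone, with plain Affine2D transforms standing in for ax.transData (no Axes needed). This is a sketch of the behavior, not code from this PR:

```python
import matplotlib.transforms as mtransforms

# Stand-in for ax.transData; any transform works for this demo.
data_like = mtransforms.Affine2D().translate(5, 5)

# A user transform that ends in the "data" transform, analogous to
# transform=Affine2D().scale(10) + ax.transData.
composed = mtransforms.Affine2D().scale(10) + data_like

unrelated = mtransforms.Affine2D().rotate_deg(45)

print(composed == data_like)                 # False: distinct transforms
print(composed.contains_branch(data_like))   # True: its right branch is data_like
print(unrelated.contains_branch(data_like))  # False: no data_like anywhere
```

So a plain equality check misses the offsetTrans + ax.transData case, while contains_branch catches it.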

@jklymak (Member Author) commented Mar 12, 2019

Please see updated description above.

I can make an API change doc etc, but wanted feedback on the approach first. Ping @ImportanceOfBeingErnest as you have been known to make extensive use of the transform machinery. This will break some autolimits for some transforms.

@anntzer (Contributor) commented Mar 12, 2019

Instead of changing the baseline images, can you keep the old ones, add an assert that the x/ylims are what they should be per the new approach, then force the x/ylims to match the old values?

@anntzer (Contributor) commented Mar 12, 2019

I'll wait until you look into contains_branch and possibly don't change the baseline images?
But again, generally speaking I like the approach.

@jklymak (Member Author) commented Mar 12, 2019

Instead of changing the baseline images, can you keep the old ones, add an assert that the x/ylims are what they should be per the new approach, then force the x/ylims to match the old values?

Moved this discussion to gitter 😉

@jklymak (Member Author) commented Mar 12, 2019

I'll wait until you look into contains_branch...

Seems to work! The transform=times10 + ax.transData now works, though the image still changes because the scatter in that test changes its limits....

@anntzer (Contributor) commented Mar 12, 2019

Copy-pasting my reply on gitter:

  • Instead of regenerating the images, it may also be worth checking whether we could do away with some of the tests, in case they're redundant with others.
  • For example, the test that generates scatter.png could become a check_figures_equal test comparing scatter() with manually calling plot() to add one marker at a time; scatter_post_alpha too.
  • I guess colorbar_single_scatter could skip the image check and instead check that the colorbar bounds are correctly set.
  • Also, fewer baseline-image tests help mplcairo :) (as they are renderer-dependent).

@jklymak (Member Author) commented Mar 12, 2019

Fair enough. I'll look at rejiggering the tests after we get some consensus that this is a good way to go, particularly as A/B-ing the tests shows the end effects of this change for users.

@jklymak (Member Author) commented Mar 12, 2019

...also ping @efiring

y, x = np.meshgrid(y, x)
trans = mtransforms.Affine2D().translate(25, 32) + ax.transData
plt.barbs(x, y, np.sin(x), np.cos(y), transform=trans)
# trans = mtransforms.Affine2D().translate(25, 32) + ax.transData
Review comment (Contributor):

just delete this?

Reply (Member Author):

Oops, I meant to reinstate this actually. Thanks.

transform = self.get_transform()
transOffset = self.get_offset_transform()
if (not self._offsetsNone and
        not transOffset.contains_branch(transData)):
Review comment (Contributor):

I guess(?) you could get a few extra points by checking contains_branch_separately and then deciding to use the x or the y data limits if one of them does contain transData; this could be a separate PR too, as the use case is a bit obscure.

Reply (Contributor):

let's just forget this case for now

Reply (Member Author):

As discussed on gitter, I think that's best for simplicity's sake. For instance, I don't know what the datalim handling does with infs, and adding this case is a good chunk of code. As also discussed, I'm very happy to revisit if it turns out to be needed.

@anntzer (Contributor) commented Mar 12, 2019

btw, I don't mind doing the conversion to check_figures_equal if you want.

@ImportanceOfBeingErnest (Member) commented:
I will need to look at this more thoroughly. Based on a quick glance I wonder if we can't keep using mpath.get_path_collection_extents even in the cases you exclude here. It does not seem to give generally bad output, it just needs a good frozen transform to start with (maybe that is the hard part, but maybe not?), and it will then not be 100% accurate, but probably still better than cropping half the marker.

@jklymak jklymak force-pushed the fix-relim-collection branch from 1162001 to 28f63d0 Compare March 12, 2019 22:24
@jklymak (Member Author) commented Mar 12, 2019

I will need to look at this more thoroughly. Based on a quick glance I wonder if we can't keep using mpath.get_path_collection_extents even in the cases you exclude here. It does not seem to give generally bad output, it just needs a good frozen transform to start with (maybe that is the hard part, but maybe not?), and it will then not be 100% accurate, but probably still better than cropping half the marker.

We could heuristically do that when the marker size is not too big (perhaps up to a quarter of the axes size), but you still need to solve the inverse problem, which isn't necessarily trivial, particularly if you need to take into account the scale of the axes, or iterate to a solution as discussed before. Or we accept the present state, which I think we can all agree is broken.

@jklymak (Member Author) commented Mar 13, 2019

One further point: because of the default data margins, this doesn't typically cut off markers, just really big ones... which are also the ones that cause auto-lim instability.

@tacaswell tacaswell added this to the v3.2.0 milestone Mar 17, 2019
@anntzer anntzer added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label May 2, 2019
@jklymak jklymak force-pushed the fix-relim-collection branch 2 times, most recently from 41e75ec to a005562 Compare July 18, 2019 16:22
# (i.e. like scatter). We can't uniquely set limits based on
# those shapes, so we just set the limits based on their
# location.
# Finish the transform:
Review comment (Contributor):

in the same spirit as #14845 this could be

offset_to_data = transOffset - transData
offsets = offset_to_data.transform(offsets)

(you also save a transform by doing so).

Reply (Member Author):

I wasn't paying attention when #14845 went in. This requires remembering that transA - transB is the same as what I wrote here. I don't think it helps the readability/comprehensibility of all the transform magic to have arithmetic operators that do stuff versus just using explicit methods.

Reply (Contributor):

The main operator is trA + trB, which just means "perform trA, then perform trB"; once you know it, it's quite natural that trA - trB means "perform trA, then perform the inverse of trB". You could write mtransforms.composite_transform_factory(trA, trB.inverted()) if you prefer explicit method names, but the point is that it is more efficient to first multiply two small (3x3) matrices and then use the result to transform a big (N, 2) array, rather than to do two transformations in series.

Reply (Member Author):

Good point. Split the difference: (trA + trB.inverted()) seems straightforward enough. I understand the __sub__ convention, I just find it a bit obscure; the __add__ convention is fine.
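For concreteness, a quick numerical check that the three spellings agree. The transforms here are arbitrary Affine2D examples, not anything from this PR:

```python
import numpy as np
import matplotlib.transforms as mtransforms

trA = mtransforms.Affine2D().scale(10).translate(1, 2)
trB = mtransforms.Affine2D().translate(25, 32)
pts = np.array([[0.0, 0.0], [1.0, 1.0], [-3.0, 4.0]])

# "perform trA, then the inverse of trB", spelled three ways:
via_sub = (trA - trB).transform(pts)                 # __sub__ convention
via_add = (trA + trB.inverted()).transform(pts)      # __add__ + inverted()
explicit = trB.inverted().transform(trA.transform(pts))  # two passes

assert np.allclose(via_sub, via_add)
assert np.allclose(via_sub, explicit)
```

The first two forms compose the 3x3 matrices once and transform the (N, 2) array in a single pass, which is the efficiency point made above.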

@jklymak jklymak force-pushed the fix-relim-collection branch from a005562 to f19ae31 Compare July 18, 2019 18:04
@anntzer (Contributor) left a review comment:

modulo CI

@jklymak (Member Author) commented Jul 18, 2019

Thanks for the help and suggestions @anntzer, @QuLogic, and @efiring !

@QuLogic (Member) left a review comment:
Unfortunately, Axes.transData does not link (I think because it's an undocumented instance variable), but there's one more that can be fixed:

Co-Authored-By: Elliott Sales de Andrade <quantum.analyst@gmail.com>
@QuLogic QuLogic merged commit b03370a into matplotlib:master Jul 19, 2019
@jklymak jklymak deleted the fix-relim-collection branch May 8, 2021 15:50
Successfully merging this pull request may close these issues.

scatter and date auto-lim far too wide....
6 participants