Simplify normalization of multiple images #11380

timhoffm · 2018-06-05T21:43:13Z

PR Summary

In response to #11377. This adds two simpler ways for creating a common normalization for images.

In particular the linked example is quite verbose. That (at least the first) half gets significantly easier with this PR.

Pre-imshow normalization

if the data are known beforehand, one can create a Normalize instance scaled to the global min/max of the data:

import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

fields = [np.random.randn(10, 10) + 2*i for i in range(3)]

_, axes = plt.subplots(ncols=3)
norm = Normalize(data=fields)                     # New API: create Normalize
for ax, field in zip(axes, fields):
    im = ax.imshow(field, norm=norm, cmap="RdBu_r")    # use Normalize
plt.colorbar(im, ax=axes)

Post-imshow normalization

~~Alternatively, one can collect the images and create a Normalize instance from their values and apply the norm to the images:~~

~~_, axes = plt.subplots(ncols=3) images = [] for ax, field in zip(axes, fields): images.append(ax.imshow(field, cmap="RdBu_r")) Normalize.apply_autoscale(images) # New API plt.colorbar(im, ax=axes)~~

(withdrawn after discussion)

This is a proof of concept. Documentation, tests and examples are not yet complete. Feedback on the general API is much apreciated.

PR Checklist

Has Pytest style unit tests
Code is PEP 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)

ImportanceOfBeingErnest · 2018-06-05T23:43:29Z

Why is this restricted to images?
Why is this restricted to Normalize? (What about PowerNorm, LogNorm etc.?)

timhoffm · 2018-06-06T00:11:31Z

It probably shouldn't be restricted to images or Normalize. This is just a proof of concept quickly hacked together. So far, I didn't bother to check if it's straight forward to extend to other cases.

tacaswell · 2018-06-06T04:30:16Z

I think data is better than autoscale for the kwarg nam. I agree that this signature should be propagated to all of the Norm sub-classes.

I am less convinced by the class method.

jklymak · 2018-06-06T13:11:53Z

How does auto normalization work now ( I’m away from my computer) If this goes in then it might be nice to make sure we don’t have multiple auto scaling algorithms.

Overall I’m not really sure this is needed but maybe has a small enough footprint to be useful.

timhoffm · 2018-06-06T18:10:28Z

Parameter name autoscale changed to data.
Added data parameters for Normalize subclasses (where it makes sense)

timhoffm · 2018-06-06T18:22:07Z

@jklymak auto normalization works via self.norm.auto_scale_None(self._A) in AxesImage. It's called when the image is created. By using the _None variant of the function we ensure that the scaling is determined by the first image that is connected to the Normalize instance. So the __init__() methods use exactly the same auto scaling algorithm.

As for the Normalize.apply_autoscale() class method, I use the global min/max of all supplied images as vmin/vmax. For the case of one supplied image, this is equivalent to giving the image data to the constructor. The generalization is straight forward. We don't have to bother with the None part vmin/vmax cannot be supplied externally (the image data are the only criterion here).

timhoffm · 2018-06-06T18:40:01Z

Remark on if this is needed

Normalize(data=...):

Probably I want to normalize to data more often than to vmin/vmax. One could argue, that I can calculate the respective values and just pass them to the existing parameters. From a minimalistic API point of view that is correct. However, matplotlib has a lot of redundant ways of doing things. Most of them for convenience (e.g. set line properties via kwargs, fmt-string, rc_context, on the created Line2D object).

I argue that this is such a case where convenience beats minimalism. In particular, normalizing to data feels quite straight forward.

Normalize.apply_autoscale():

I'm a bit divided if this method is necessary. On the one hand one can argue that Normalize(data=...) is very convenient and thus sufficient. The only disadvantage is that all the data need to be there before the first imshow so that I can construct the norm beforehand and pass it in. If that's not the case, it gets really cumbersome (5 lines of code just to get all images to the same scale).

So the question is: Are there significant real world cases in which I cannot use Normalize(data=...)? Or is it just the example which generates the data on the fly? If there are real-world cases (and I can imagine, people are loading and plotting data within a single loop), I would welcome a function that can autoscale and apply a norm to a number of images.

jklymak · 2018-06-12T23:03:54Z

lib/matplotlib/colors.py

        """
        self.vmin = _sanitize_extrema(vmin)
        self.vmax = _sanitize_extrema(vmax)
        self.clip = clip
+        if data is not None:
+            self.autoscale_None(data)


Did you mean self.apply_autoscale here or am I misunderstanding? I'm not seeing how data can be a list of np arrays if you call autoscale_None. I don't see that you have a test of this below, i.e. you only call apply_autoscale.

No. I mean autoscale_None. autoscale_None internally calls np.asanyarray() on the data, so a list of arrays is fine. I don't think we really need to test this, but if you want I can add a test.

norm = Normalize(data=data) is really just a shortcut for

norm = Normalize() norm.autoscale(data)

i.e. it scales to the data. The autoscale_None variant is used so that you can consistently do Normalize(vmin=0, data=data) in which case only vmax will be taken from data.

apply_autoscale() serves a different purpose. While the above works on data, apply_autoscale() works on AxesImages. It creates a norm based on the data in the images and applies the norm to the images. Maybe we should better call it autoscale_images()?

jklymak · 2018-06-19T17:25:42Z

Probably I want to normalize to data more often than to vmin/vmax. One could argue, that I can calculate the respective values and just pass them to the existing parameters. From a minimalistic API point of view that is correct. However, matplotlib has a lot of redundant ways of doing things. Most of them for convenience (e.g. set line properties via kwargs, fmt-string, rc_context, on the created Line2D object).

I rarely want to normalize automatically by data (except for initial exploration), and if I do, I don't find doing it by hand too onerous. I don't see what's better about the list comprehension gather versus just a loop that checks for vmin/vmax. Worse, the gather seems memory intensive to me if I have a big data set.

Yes, we have lots of redundant ways to do the same thing. I'm not sure thats always a good thing 😉

Not 100% against this if there is really an audience, but it seems crufty to me.

timhoffm · 2018-06-19T21:34:06Z

It probably depends on your application if you want to normalize automatically. No point in arguing individual cases here (neither mine nor yours).

The "list comprehension" is actually a generator expression, and if anything less memory consuming and faster than a loop. That's actually what users should use if they have to do this themselves. 😉

Any additional feedback (positive or negative) on Normalize.apply_autoscale() from other devs would be helpful. If negative, I could remove that function to bring the merge of the data attribute part forward.

ImportanceOfBeingErnest · 2018-06-19T21:58:39Z

I think my very first comment still applies. Normalize is a general purpose class to normalize data. I'm not sure if it should handle AxesImages specifically. Or if it does, it should handle every mappable that has a get_array method (Why only images?).

Second, the norm is not updated when the images change. In that sense apply_autoscale is a one time only method. It may be more confusing than helping. Partially this may also be due to a classmethod named apply... (applying something should happen to the instance, but the instance does not store the images to apply anything.).

The new data argument seems sufficiently useful to devote an example to it (or change extend the existing one on that subject).

timhoffm · 2018-06-19T22:09:17Z

Ok, will remove apply_autoscale() later.

Unfortunately, the existing example does not benefit from the data argument, because the image data is generated on the fly while plotting. That was actually the motivation to have the apply_autoscale() class method.

The workaround would be to split the loop and first generate all the data, then create the norm and then plot all the data. But that's not an easy and natural workflow for the example either.

ImportanceOfBeingErnest · 2018-06-19T22:15:25Z

Maybe it's worth looking into why the following does not work:

import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2,2)

data = []
norm = plt.Normalize(None,None)
for i, ax in enumerate(axes.flat):
    arr = np.random.rand(10,10)+i
    ax.imshow(arr, norm=norm)
    data.append(arr)
    
norm.autoscale_None(data)

plt.show()

If it was working as expected, there wouldn't be any need for the apply_autoscale() method I guess.

timhoffm · 2018-06-20T05:37:13Z

The docstring says

If vmin or vmax is not given, they are initialized from the minimum and maximum value respectively of the first input processed.

Probably, this is done when I pass the norm to Imshow. (Didn‘t check in the code)

timhoffm · 2018-06-20T20:33:52Z

Removed Normalize.apply_autoscale()
Rewrote the example to
a) Start with a simple version using Normalize(data=...)
b) Explain the more involved version when applying the norm after creating the images

tacaswell

This needs a whats_new entry.

I am a bit concerned about the API growth

timhoffm · 2018-07-09T20:45:20Z

What's new added.

anntzer

I'd rather document norm.autoscale(data) better.
The main issue with the PR, IMO, is that we again end up with "partially overlapping kwargs", namely the behavior when both vmin/vmax and data are passed is less than optimal (IMO).

efiring · 2020-04-07T02:54:59Z

I'm not seeing enough user-level simplification here to justify the PR. The example can be simplified a bit even without the PR, but I also find there is a bug in part of the example that is not related to the PR: the colorbar is not responding properly to changes in the clim unless applied to images[0]. I will make a new issue for that.

efiring · 2020-04-07T03:33:44Z

See #17052 for the bug report promised above.

Very old and out of date.

timhoffm · 2020-07-14T19:58:50Z

Closing because there seems little approval of the idea. I'm not going to fight over it.

timhoffm added the status: work in progress label Jun 5, 2018

tacaswell added this to the v3.0 milestone Jun 6, 2018

tacaswell mentioned this pull request Jun 6, 2018

Feature Req: plt.subplots(sharev=True) #11377

Closed

timhoffm force-pushed the normalize branch from 4472a07 to 76e6b8c Compare June 6, 2018 18:08

timhoffm force-pushed the normalize branch from 76e6b8c to 04bbadc Compare June 10, 2018 09:23

timhoffm removed the status: work in progress label Jun 10, 2018

jklymak reviewed Jun 12, 2018

View reviewed changes

timhoffm force-pushed the normalize branch from 04bbadc to 67fecd5 Compare June 20, 2018 20:31

timhoffm force-pushed the normalize branch from 67fecd5 to 4c56dcb Compare June 20, 2018 21:02

timhoffm added the status: needs review label Jun 23, 2018

tacaswell previously requested changes Jul 7, 2018

View reviewed changes

tacaswell modified the milestones: v3.0, v3.1 Jul 7, 2018

Simplify normalization of multiple images

c966778

timhoffm force-pushed the normalize branch from 4c56dcb to 79e9b1e Compare July 9, 2018 20:44

jklymak added status: needs revision and removed status: needs review labels Aug 17, 2018

update multi_image example

fcba433

timhoffm force-pushed the normalize branch from 79e9b1e to fcba433 Compare October 14, 2018 09:34

anntzer requested changes Oct 14, 2018

View reviewed changes

jklymak modified the milestones: v3.1.0, v3.2.0 Feb 5, 2019

timhoffm modified the milestones: v3.2.0, v3.3.0 Aug 7, 2019

QuLogic added the status: needs comment/discussion needs consensus on next step label Apr 1, 2020

QuLogic modified the milestones: v3.3.0, v3.4.0 May 4, 2020

timhoffm closed this Jul 14, 2020

QuLogic removed this from the v3.4.0 milestone Mar 16, 2021

timhoffm deleted the normalize branch June 10, 2022 21:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify normalization of multiple images #11380

Simplify normalization of multiple images #11380

timhoffm commented Jun 5, 2018 •

edited

Loading

ImportanceOfBeingErnest commented Jun 5, 2018

timhoffm commented Jun 6, 2018

tacaswell commented Jun 6, 2018

jklymak commented Jun 6, 2018

timhoffm commented Jun 6, 2018

timhoffm commented Jun 6, 2018

timhoffm commented Jun 6, 2018 •

edited

Loading

jklymak Jun 12, 2018

timhoffm Jun 16, 2018

jklymak commented Jun 19, 2018

timhoffm commented Jun 19, 2018

ImportanceOfBeingErnest commented Jun 19, 2018

timhoffm commented Jun 19, 2018

ImportanceOfBeingErnest commented Jun 19, 2018

timhoffm commented Jun 20, 2018

timhoffm commented Jun 20, 2018 •

edited

Loading

tacaswell left a comment

timhoffm commented Jul 9, 2018

anntzer left a comment

efiring commented Apr 7, 2020

efiring commented Apr 7, 2020

timhoffm commented Jul 14, 2020

Simplify normalization of multiple images #11380

Simplify normalization of multiple images #11380

Conversation

timhoffm commented Jun 5, 2018 • edited Loading

PR Summary

Pre-imshow normalization

Post-imshow normalization

PR Checklist

ImportanceOfBeingErnest commented Jun 5, 2018

timhoffm commented Jun 6, 2018

tacaswell commented Jun 6, 2018

jklymak commented Jun 6, 2018

timhoffm commented Jun 6, 2018

timhoffm commented Jun 6, 2018

timhoffm commented Jun 6, 2018 • edited Loading

Remark on if this is needed

jklymak Jun 12, 2018

Choose a reason for hiding this comment

timhoffm Jun 16, 2018

Choose a reason for hiding this comment

jklymak commented Jun 19, 2018

timhoffm commented Jun 19, 2018

ImportanceOfBeingErnest commented Jun 19, 2018

timhoffm commented Jun 19, 2018

ImportanceOfBeingErnest commented Jun 19, 2018

timhoffm commented Jun 20, 2018

timhoffm commented Jun 20, 2018 • edited Loading

tacaswell left a comment

Choose a reason for hiding this comment

timhoffm commented Jul 9, 2018

anntzer left a comment

Choose a reason for hiding this comment

efiring commented Apr 7, 2020

efiring commented Apr 7, 2020

timhoffm commented Jul 14, 2020

timhoffm commented Jun 5, 2018 •

edited

Loading

timhoffm commented Jun 6, 2018 •

edited

Loading

timhoffm commented Jun 20, 2018 •

edited

Loading