Clip RGB data to valid range for imshow #10220


Merged (1 commit, Feb 4, 2018)

Conversation

Zac-HD (Contributor) commented Jan 10, 2018

Axes.imshow can now apply the norm or vmin and vmax arguments to RGB images, allowing easy display of high-bit-depth data, or adjustments of brightness and contrast.

Out-of-range values in un-normalised RGB and RGBA images are now clipped to the nearest bound, rather than being wrapped. This bug often hid outliers, and could make interpretation of the plotted image entirely unreliable.

Closes #9391. Closes #5382. Related to pydata/xarray#1796.

  • Has Pytest style unit tests
  • Code is PEP 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature

@tacaswell, I'd appreciate any suggestions you might have about tests.

@Zac-HD Zac-HD changed the title Imshow rgb fixes imshow fixes for RGB data: support normalisation, clip to valid range Jan 10, 2018
@tacaswell tacaswell added this to the v2.2 milestone Jan 10, 2018
tacaswell (Member) commented:

This is definitely not going into 2.1.2, as this is a new feature, not a bug-fix.

I am not convinced that this is a good idea. If the user is passing in an RGB(A) image, the assumption is that it is a ready-to-go image and we should do as little processing as possible on it. Starting to do so opens up a big can of worms (for example, this uses the same norm for all of the planes; it also seems reasonable to norm each plane independently). Once you go down this road, the range of extensions seems like it will get large, which is why this logic should live at the application level.

The Normalize methods do not return strictly [0, 1] values either; they put in sentinels for over/under/bad values, which need to be accounted for.

I think the fixes should be clarifying the docstring (which seems to have already been done), detecting when we get high-bit-depth ints and down-scaling them to 8 bit (we output N×8-bit colors, so the down-sampling is going to happen eventually), and clipping the floats.

In either case, the logic needs to be in AxesImage, not in imshow, so it applies to data later set via im.set_data.
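The handling suggested here — down-scale high-bit-depth integers to 8 bit, clip floats to [0, 1] — could be sketched roughly as follows. This is a hypothetical helper for illustration only; the function name and exact scaling are assumptions, not the code that was merged:

```python
import numpy as np

def prepare_rgb(A):
    """Hypothetical sketch: coerce an RGB(A) array to a displayable range.

    Integer images wider than 8 bits are down-scaled to uint8 (the
    renderer ultimately works with 8-bit channels); float images are
    clipped to [0, 1].
    """
    A = np.asarray(A)
    if np.issubdtype(A.dtype, np.integer):
        info = np.iinfo(A.dtype)
        if info.max > 255:
            # e.g. uint16 -> uint8: rescale so info.max maps to 255
            A = A.astype(float) * (255 / info.max)
        return np.clip(A, 0, 255).astype(np.uint8)
    return np.clip(A.astype(float), 0.0, 1.0)
```

Note that clipping (rather than letting an `astype` cast wrap) is exactly the behaviour this PR ends up adopting for out-of-range data.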

tacaswell (Member) left a review comment:

Aside from my general concerns about adding this feature, the logic needs to live in AxesImage, not in imshow, so it applies to data set via set_data, and it needs tests.

jklymak (Member) commented Jan 10, 2018

Yeah, this is the same argument as in the other issues. RGB(A) is RGB(A). If the upstream app doesn't know how to form an RGB array within proper limits, it's hard to get excited about helping them. I think a good argument can be made that we should clip instead of taking the modulo of values that are out of range, but I don't agree with allowing vmin/vmax.

anntzer (Contributor) commented Jan 10, 2018

I don't like the idea either at all, but if it does go in in some form this should probably use a machinery similar to #8738 because it'd be silly to have one mechanism for normalizing two-channel images and another for three-channel images.

dstansby (Member) commented:

I also agree on forcing RGB(A) to be between 0 and 1 (as far as I can think, all it really requires is im = im / np.max(im) if your original data isn't in that range). I think we could do better on the docstring though, and on erroring/warning on out-of-range input.

jklymak (Member) commented Jan 10, 2018

I think we can error or clip/warn/info on out-of-range for im.

I'm -1 on automatic normalization if values are greater than 1. How do we know the user's data doesn't really go above max(im) in other images that they are trying to compare?

I'm also not in favour of applying arbitrary user-supplied normalizations to each RGB(A) channel in imshow including using vmin or vmax or the non-linear Norms.

To me, and I assume whoever wrote imshow, specifying RGB(A) means the user wants that color plotted on the screen (within the limits of color gamuts etc etc).

It's clear there is a body of users who think of RGB as channel1/channel2/channel3 with arbitrary data in them, and who have developed enough intuition about what those images look like to be able to quantitatively see something in those three channels. I don't think imshow should change for that body of users. On the other hand, a new function (channelrgbimage, or whatever name makes the most sense to that community) could be easy enough to add. Then vmin/vmax could be specified as floats or length-3 arrays, and three normalizations (one for each channel) could be supplied to map into colorspace.

As @anntzer points out, #8738 was designed to map 2-channel data to arbitrary colors. That's a fair bit more general than this use case, which strictly maps to RGB.
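The per-channel idea could look something like the sketch below. No such function exists in matplotlib (channelrgbimage is only a proposed name); the helper and its signature are purely illustrative of accepting scalar or length-3 limits:

```python
import numpy as np

def normalize_channels(data, vmin, vmax):
    """Illustrative sketch only: map an (M, N, 3) array into [0, 1]
    with per-channel limits.

    vmin/vmax may be scalars or length-3 sequences, mirroring the
    suggestion that a channel-oriented function could accept floats
    or length-3 arrays (a linear norm per channel is assumed here).
    """
    data = np.asarray(data, dtype=float)
    vmin = np.broadcast_to(vmin, (3,)).astype(float)
    vmax = np.broadcast_to(vmax, (3,)).astype(float)
    out = (data - vmin) / (vmax - vmin)   # broadcasts over the last axis
    return np.clip(out, 0.0, 1.0)
```

A non-linear Normalize per channel would slot into the same loop-free structure, which is why the generalization tends to balloon once it is allowed at all.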

efiring (Member) commented Jan 10, 2018

Adding to the chorus, and going a bit farther: at most, we should error out or clip with a warning. Users who want to scale the channels are free to do so outside of mpl. I don't see any reason why that functionality should be part of mpl.

jklymak (Member) commented Jan 10, 2018

I guess what many users want is clip, but we should _log.warn so inattentive users don't get a saturated image without realizing they've lost the dynamic range of their image. If they don't like the warning, they can clip/normalize their data properly (or suppress the warning via logging).

@@ -13,6 +13,7 @@
 from math import ceil
 import os
+import warnings
Member commented on the diff:

I'd strongly suggest this is a use for logging instead of warnings.

import logging
....

_log = logging.getLogger(__name__)

....
_log.warn('Clipping...')

Zac-HD (Contributor Author) replied:

I disagree, but I'm happy to let @tacaswell or another maintainer settle it.

FWIW I did a search for each pattern and found 21 log calls (mostly for import errors or configuration state changes) and 66 warnings (mostly for invalid calls or data), so warnings seem to be the convention here (as in Numpy, Pandas, etc).

Member replied:

Logging was just added a couple of months ago.

Member replied:

Just to follow up, when we added logging, we decided to follow the guidelines at https://docs.python.org/3/howto/logging.html#logging-basic-tutorial for when to use which tool:

warnings.warn() in library code if the issue is avoidable and the client application should be
modified to eliminate the warning

logging.warning() if there is nothing the client application can do about the situation, but the
event should still be noted

We decided not to go back and check that the current instances of warnings.warn follow that pattern, but agreed to fix them as they come along.

I'd drop my "strongly" above, and mildly argue that this warning is the latter case; the particular advantage of logging is that the warnings can be turned off if the user is aware of and OK with the clipping. I don't think warnings.warn can be turned off, or at least not on a per-module basis.
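The per-module control being described works like this for a user who intends the clipping. The logger name below is an assumption for illustration (it presumes the clipping message is emitted from matplotlib.image's module-level logger):

```python
import logging

# Silence only the image-module logger; other matplotlib messages
# and Python `warnings` are unaffected.
logging.getLogger('matplotlib.image').setLevel(logging.ERROR)
```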

Zac-HD (Contributor Author) replied Jan 11, 2018:

Ah, that makes sense! The contributing guide is actually pretty good 😄

Zac-HD (Contributor Author) commented Jan 11, 2018

I'm seeing a strong consensus that matplotlib should not normalize or scale RGB or RGBA images, so I've dropped that part of the pull. However this still serves to close the linked issues (as "wontfix") and allows downstream libraries to support it without worrying about API compatibility.

I've moved the clipping logic into the set_data method where it belongs, but not added tests yet.

Zac-HD (Contributor Author) commented Jan 11, 2018

Converted from warnings.warn to _log.warn, added test.

An image comparison test also seems like a good idea, but I can't get everything to build 😕. If this is needed, would someone else be able to supply the reference images? Edit: I'm now testing this by checking the max and min of .get_array() on the image.

When `Axes.imshow` is passed an RGB or RGBA image with out-of-range
values, it now issues a warning and clips them to the valid range.
The old behaviour, wrapping back into the range, often hid outliers
and made interpreting RGB images unreliable.
Member commented:

Should we clarify what ranges are, or perhaps link to an explanation of the ranges?

Zac-HD (Contributor Author) replied:

The ranges are explained by the docs for Axes.imshow, and the reference to it here is turned into a link by the backticks IIRC.

jklymak (Member) replied Jan 16, 2018:

Sorry to drag this out. I think this really needs to be in the API changes as well, in case someone was depending on the wrap behaviour:

Log of changes to Matplotlib that affect the outward-facing API. If updating Matplotlib breaks your scripts, this list may help you figure out what caused the breakage and how to fix it by updating your code.

# - otherwise casting wraps extreme values, hiding outliers and
# making reliable interpretation impossible.
high = 255 if np.issubdtype(self._A.dtype, np.integer) else 1
if self._A.min() < 0 or high < self._A.max():
Member commented:

I wonder if we should pad this by an eps in case the 0-1 values came from some numerical computation, and we probably wouldn't want to spam the user with unnecessary warnings?

Zac-HD (Contributor Author) replied:

We have to clip to exactly [0 .. 1], and for consistency I would prefer to always warn if clipping. If users feel spammed, they can either fix their data or filter the logging output.

Member replied:

I think the suggestion might have been to silently clip if (a - vmin)/(vmax - vmin) - 1 < 1e-10, or something that would just represent bit noise that would never affect the interpretation of the data.
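A sketch of that tolerance idea, for concreteness. This is not what was merged (the PR warns on any out-of-range data); the helper name and the eps value are made up:

```python
import numpy as np

def clip_rgb_with_tolerance(a, eps=1e-10):
    """Clip to [0, 1]; report whether the excursion was mere bit noise.

    Returns (clipped, silent): `silent` is True when every value is
    within `eps` of the valid range, i.e. no warning would be needed.
    """
    a = np.asarray(a, dtype=float)
    silent = (a.min() >= -eps) and (a.max() <= 1.0 + eps)
    return np.clip(a, 0.0, 1.0), silent
```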

Zac-HD (Contributor Author) replied:

Yes, that's the question - I just think the answer is "No, we shouldn't".

Contributor replied:

Agree that we shouldn't.

Zac-HD (Contributor Author) commented Jan 12, 2018

The remaining test errors are because the pytest caplog fixture is unavailable on some Travis environments, which in turn is because they are running old versions of pytest (3.1.0, caplog was added in 3.3.0, latest is 3.3.2).

I can mark this with a skipif, or if you don't mind requiring a newer version of pytest I'll just leave it as-is.
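For reference, a caplog-based test looks roughly like this. The clip function below is a toy stand-in and the message text is invented; the real test lives in matplotlib's test suite:

```python
import logging

_log = logging.getLogger('example.image')

def clip01(values):
    """Toy stand-in for the clipping under test."""
    if min(values) < 0 or max(values) > 1:
        _log.warning('Clipping input data to the valid range for imshow')
    return [min(max(v, 0), 1) for v in values]

def test_clip_warns(caplog):
    # `caplog` captures log records; the fixture requires pytest >= 3.3
    with caplog.at_level(logging.WARNING, logger='example.image'):
        result = clip01([-0.5, 0.5, 1.5])
    assert result == [0, 0.5, 1]
    assert 'Clipping' in caplog.text
```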

anntzer (Contributor) commented Jan 12, 2018

I would adjust .travis.yml (and whatever docs) to bump the minimal required pytest version.
Given the progressively wider use of logging in mpl it makes sense to require a version of pytest that has caplog...

# making reliable interpretation impossible.
high = 255 if np.issubdtype(self._A.dtype, np.integer) else 1
if self._A.min() < 0 or high < self._A.max():
_log.warn(
Member commented:

It's _log.warning, isn't it?

Zac-HD (Contributor Author) replied:

Yep, turns out that _log.warn "is functionally identical but deprecated" (but not programmatically deprecated).

Zac-HD (Contributor Author) commented Jan 13, 2018

OK, I've responded to all the review comments, tests are present and passing, and Travis has been told to use a new version of pytest.

Travis isn't using a new version of pytest, but I think that can be fixed by blowing away some caches and re-running the job.

efiring (Member) left a review comment:

Approved subject to tests passing.

jklymak (Member) commented Jan 14, 2018

Restarted the two failing tests. Looked like something flaky, versus an actual failure of this PR.

Zac-HD (Contributor Author) commented Jan 16, 2018

@efiring, I've done all I can to get the test passing - there are two remaining problems:

  • lib/matplotlib/tests/test_rcparams.py::test_validator_invalid is failing with a null-byte issue somewhere, which has nothing to do with this pull. That's the Python 3.6 job on Travis.
  • The Travis jobs for Python 3.4 and 3.7 are still using an older (incompatible) version of pytest. If you clear the cache and restart those jobs I would expect all the tests to pass.

efiring (Member) commented Jan 16, 2018

I confess: I know how to restart the tests, but I don't know how to clear a cache on Travis. @tacaswell knows all, and needs to approve the changes (or not) in any case, so I will leave it to him.

Zac-HD (Contributor Author) commented Jan 16, 2018

FYI it's under the "More options" button on the upper right, then "caches" and pick the relevant pull request - or just blow away everything with "delete all repository caches".

Usually this is all automatic, but things just haven't been expiring as they should this year 🤷‍♂️

QuLogic (Member) commented Jan 16, 2018

TBH, I'm not sure about bumping the pytest requirement. I'm not seeing 3.3 packaged in a lot of distros just yet.

Zac-HD (Contributor Author) commented Jan 16, 2018

TBH, I'm not sure about bumping the pytest requirement. I'm not seeing 3.3 packaged in a lot of distros just yet.

Given the trouble with Travis, I'm happy to ditch the check for logging - now we just check that the array was in fact clipped to the valid range, but not that a warning was logged.

@Zac-HD Zac-HD force-pushed the imshow-rgb-fixes branch 2 times, most recently from 2236a2c to 068fa28 Compare January 16, 2018 08:17
tacaswell (Member) commented:

I am also concerned about the pytest version bump.

If the cache / version issues persist try rebasing on current master.

Zac-HD (Contributor Author) commented Jan 17, 2018

@tacaswell - I'm no longer bumping the pytest version.

@efiring - tests all passing now 🎉

efiring (Member) commented Jan 18, 2018

I'm happy, but @tacaswell still has an angry red X next to his review...

Zac-HD (Contributor Author) commented Jan 21, 2018

Ping @tacaswell - if you're happy with this it can be merged; and if not I'd appreciate tips on how to improve it 😄

jklymak (Member) left a review comment:

  • Still needs API change added.
  • Also, I don't think you need to manually edit credits.rst

@Zac-HD Zac-HD changed the title imshow fixes for RGB data: support normalisation, clip to valid range Clip RGB data to valid range for imshow Jan 26, 2018
When `Axes.imshow` is passed an RGB or RGBA image with out-of-range
values, it now logs a warning and clips them to the valid range.
The old behaviour, wrapping back into the range, often hid outliers and
made interpreting RGB images unreliable.
Zac-HD (Contributor Author) commented Jan 26, 2018

Ah, sorry - I'd missed that comment. Now with API changes documented; and since it's been a while I also bumped the dates and rebased on master.

The history of credits.rst suggests that the list was generated from Git metadata (by @efiring), but that it's not automatically updated. Unless someone knows how else it will be added, I'll therefore leave my name on the list 😄

(and ping @tacaswell again; I think we're just waiting for your approval now)

Zac-HD (Contributor Author) commented Jan 29, 2018

Ping @jklymak and @tacaswell; I've made all the changes you wanted and would love to get this merged in time for v2.2 - the due date is tomorrow!

@jklymak jklymak added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Feb 2, 2018
@tacaswell tacaswell merged commit 605fd3c into matplotlib:master Feb 4, 2018
tacaswell (Member) commented:

Thanks! Sorry I have been slow on review recently.

@QuLogic QuLogic modified the milestones: needs sorting, v2.2.0 Feb 12, 2018
@Zac-HD Zac-HD deleted the imshow-rgb-fixes branch May 26, 2018 14:02