Automatic downsampling of images. #5602

Closed
anntzer wants to merge 2 commits

Conversation

@anntzer (Contributor) commented Dec 2, 2015

This patch adds automatic downsampling of images whose dimensions are larger
than the number of physical pixels on which they will be displayed, in order to
allow displaying very large images. The idea is to patch AxesImage to replace
calls to self._A[x0:x1, y0:y1] with calls to self._A[x0:x1:xstep, y0:y1:ystep],
with a well-chosen step.

Because this directly patches AxesImage's code, it is simpler than
e.g. @ChrisBeaumont's ModestImage, which must modify self._A to "pretend" that
the image is something other than what it really is, and which doesn't handle
the "extent" kwarg.
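
Roughly, the stride selection looks like this (a minimal standalone sketch --
the pixel counts and the helper name are made up for illustration; this is not
the actual patch):

import numpy as np

def decimate_view(A, x0, x1, y0, y1, nx_pixels, ny_pixels):
    # Pick strides so that the slice keeps at most about one data point
    # per physical pixel in each direction.
    xstep = max(1, (x1 - x0) // max(1, nx_pixels))
    ystep = max(1, (y1 - y0) // max(1, ny_pixels))
    return A[x0:x1:xstep, y0:y1:ystep]

A = np.random.rand(10000, 10000)
small = decimate_view(A, 0, 10000, 0, 10000, 800, 600)
print(small.shape)  # roughly (800, 600); here (834, 625)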

What still needs to be done:

  • agree on a kwarg/rcParam to control the behavior, and possibly the
    downsampling level (perhaps one wants e.g. up to 5 data points per pixel
    instead of a single one),
  • handle image caches: currently I just disabled rgbacache, which should be
    fine as to_rgba (which is what hung matplotlib on huge images so far) will
    only be called on reasonably-sized arrays, but perhaps there's a better option?
  • tests.

@mdboom (Member) commented Dec 2, 2015

First, a big thank you for taking this on. This is an important part of matplotlib that needs improvement.

One concern: This is effectively decimating (i.e. ignoring data), not downsampling (i.e. averaging across neighboring values), correct? I understand that might be acceptable in some domains, but probably not in others (particularly in astronomy I know this would almost never be the right thing). That's not a reason to exclude it, but it should be documented clearly that that is what this feature does.
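
(A toy example of the difference, not from the patch: a single bright pixel --
say an isolated star -- is silently dropped by decimation but still contributes
to a block average.)

import numpy as np

img = np.zeros((8, 8))
img[3, 3] = 1.0                      # one isolated bright pixel

decimated = img[::4, ::4]            # decimation keeps only rows/cols 0 and 4
averaged = img.reshape(2, 4, 2, 4).mean(axis=(1, 3))   # 4x4 block averages

print(decimated.max())   # 0.0    -- the bright pixel was skipped entirely
print(averaged.max())    # 0.0625 -- it still shows up, just diluted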

The bigger problem matplotlib has currently is that images are processed by:

  • converting floating point values to RGB (colormapping)
  • interpolating in RGB space

It's done this way historically because we're using Agg, which has a bunch of interpolation algorithms in place that were easy to grab and use, and these only supported RGB. With the latest SVN version of Agg, this is reportedly no longer the case, but I spent some time a few months ago looking at this and it seems to not work as advertised.

But assuming someone can do the work so instead we do:

  • interpolate in floating point
  • colormap floating point -> RGB

then what you've done here just becomes one of many interpolation algorithms (and probably the fastest one).

Are you interested in helping with this?

Some code cleanup is thrown in at the same time, but it should be clear which
part is the interesting one.

Compare

    import numpy as np
    from matplotlib import pyplot as plt
    x, y = np.ogrid[:10000, :10000]
    plt.imshow(np.sin(x + y), interpolation="none")

with the downsampling and averaging methods to see the difference (one
shows severe aliasing, the other doesn't).  In either case one can zoom in
interactively to check that at high zooms, individual pixels are correctly
plotted.
@anntzer (Contributor, Author) commented Dec 2, 2015

See latest commit, which allows the choice of downsampling method. The commit message includes an example where decimation vs. e.g. averaging produces significantly different results. (I wrote the decimating code first mostly for simplicity (with speed as a secondary benefit) but index_tricks comes to the rescue as usual.)
I think this issue is pretty different from interpolation (and I certainly don't intend to rewrite all of agg's interpolation methods :-)); here, instead, we're grouping multiple values into a single one.
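
The grouping can be pictured roughly like this (a simplified sketch that
assumes the shape is an exact multiple of the steps; the actual commit uses
index_tricks to build the blocked view):

import numpy as np

A = np.random.rand(1000, 1200)
ystep, xstep = 4, 3

# View the array as blocks of shape (ystep, xstep) on the two trailing axes;
# any reduction over those axes is then a downsampling method.
blocks = A.reshape(A.shape[0] // ystep, ystep, A.shape[1] // xstep, xstep)
blocks = blocks.transpose(0, 2, 1, 3)        # shape (250, 400, ystep, xstep)

decimated = blocks[:, :, 0, 0]               # keep one value per block
averaged = blocks.mean(axis=(-1, -2))        # average each block
medianed = np.median(blocks, axis=(-1, -2))  # or take the median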

@tacaswell tacaswell added this to the proposed next point release (2.1) milestone Dec 3, 2015
@mdboom (Member) commented Dec 3, 2015

I think long term we really want to do proper interpolation on the input (greyscale) image, and this is just a half measure. I've already solved the C++ side of it in matplotlib/aggravate, now we just need to make the Python side use it. I think that will ultimately be more flexible than this and also be the same speed when interpolation == nearest. Would you like to help with that effort?

@anntzer (Contributor, Author) commented Dec 3, 2015

To be honest I don't care much, or at least have very little intuition, about interpolation (I only believe in interpolation="none" :-)). More importantly, I still fail to see the connection between the two cases -- except when you have too many data points per pixel in one direction (where you need to downsample) and not enough in the other (where you need to interpolate), but then you can always do the two steps sequentially.
A quick look (... limited by the quality of the docs but heh) at Agg's resampling algorithms suggests that they do not support downsampling, only upsampling, right?

@mdboom (Member) commented Dec 3, 2015

A quick look (... limited by the quality of the docs but heh) at Agg's resampling algorithms suggests that they do not support downsampling, only upsampling, right?

Nope, they address both. It's really just generic "resampling" where each pixel in the output is projected onto the input, and the average of that rectangle taken (modulo the filtering kernel).
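
(For intuition, a crude pure-NumPy version of that projection-and-average --
box filter only, ignoring the filtering kernel, and not Agg's actual code:)

import numpy as np

def box_resample(img, out_shape):
    # Each output pixel is the mean of the input rectangle it projects onto
    # (assumes the output is no larger than the input in either direction).
    hi, wi = img.shape
    ho, wo = out_shape
    row_starts = (np.arange(ho) * hi) // ho
    col_starts = (np.arange(wo) * wi) // wo
    sums = np.add.reduceat(np.add.reduceat(img, row_starts, axis=0),
                           col_starts, axis=1)
    areas = np.outer(np.diff(np.append(row_starts, hi)),
                     np.diff(np.append(col_starts, wi)))
    return sums / areas

x, y = np.ogrid[:2000, :2000]
print(box_resample(np.sin(x + y), (400, 300)).shape)  # (400, 300)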

@anntzer (Contributor, Author) commented Dec 3, 2015

But do we always want to use averaging? I'd say this is pretty application-dependent; certainly averaging, geometric averaging, the median, and possibly decimation all have legitimate uses there.

@jenshnielsen (Member) commented:

How does this connect with the new image infrastructure in #5718?

@anntzer (Contributor, Author) commented Jan 4, 2016

@mdboom suggests that #5718 handles this issue as a special case, but I haven't really followed what is happening on that issue.

@tacaswell (Member) commented:

This is a special case of #5718 (see my pathological example).

@mdboom (Member) commented Jan 5, 2016

This is basically a performance improvement for plotting very large images through decimation (which only really does the right thing if interpolation=='nearest').

To test, I'm using the following:

from matplotlib import pyplot as plt
import numpy as np

Z = np.random.rand(10000, 10000)
plt.imshow(Z, interpolation='nearest')
plt.savefig("test.png")

(If you want to reproduce the numbers, make sure you run it twice and ignore the first run to remove the effects of regenerating the font cache).

I get:

master: 7.24s
#5718: 3.61s
#5602: 2.22s

So, #5718 is much better than master already, by virtue of downsampling first, and then colormapping something much smaller. I'm a bit surprised by the remaining advantage of this over #5718, since it should boil down to effectively the same thing computationally... I'll need to do some further investigation. I'd still like to find a more general solution than this if possible.

@mdboom (Member) commented Jan 5, 2016

As best I can tell from profiling, the difference in time is mostly in norm. This PR runs norm on the decimated data, which unfortunately isn't correct, since it could miss local maxima and minima in the data if they fall in the cracks. I think we have to norm (or at least find min and max) over the entire original data, which obviously takes considerably more memory bandwidth and is, I believe, what accounts for the difference in timing.
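
(Concretely, the order would have to be something like the following sketch --
not code from either PR: take vmin/vmax from the full array once, then run the
per-pixel normalization only on the reduced view.)

import numpy as np
import matplotlib.colors as mcolors

Z = np.random.rand(10000, 10000)
step = 16  # hypothetical decimation factor

# The limits must come from the *full* data, or decimation can miss extrema...
norm = mcolors.Normalize(vmin=Z.min(), vmax=Z.max())
# ...but the expensive per-pixel normalization can run on the small view only.
scaled = norm(Z[::step, ::step])
print(scaled.shape)  # (625, 625)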

@anntzer (Contributor, Author) commented Jan 5, 2016

Two quick notes:

  • In its current state, the patch fairly easily allows downsampling methods other than decimation:
                # This is where downsampling happens.
                A = A[:, :, 0, 0] # or A = A.mean(axis=(-1, -2)), or etc.

(image.py, lines 239-240). Not that it matters much, because of the following point:

  • You are correct that normalization needs to happen on the whole image before resampling (typically: if you use a power-law norm, the proper way to average your values is probably a geometric mean, so you may as well normalize to a log scale first and do an arithmetic mean afterwards). So only the decimation method can benefit from normalization-after-resampling. (Also, I guess the difference in timing isn't just from computing the min and max, but also from doing the normalization itself on the whole array vs. the downsampled one.)
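
A small demonstration that the order matters for a nonlinear norm (numbers made
up for illustration):

import numpy as np
import matplotlib.colors as mcolors

norm = mcolors.PowerNorm(gamma=0.5, vmin=0, vmax=1)
block = np.array([0.0, 0.04, 0.16, 0.8])  # one block of data, flattened

print(norm(block).mean())   # normalize, then average: ~0.37
print(norm(block.mean()))   # average, then normalize: 0.5 -- a different pixel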

@efiring (Member) commented Jan 5, 2016

@mdboom, @anntzer, I'm not following this in detail, but here are a couple thoughts that might be relevant to the discussion:

  1. The computation of min and max is going through ma.min and ma.max, which probably has two sources of inefficiency: first, ma operations tend to be slow for a variety of reasons; and second, it unnecessarily makes two passes instead of calculating min and max in a single pass. It's unfortunate that numpy has ptp(), but not a function that returns min and max from a single pass. We could use our own C++ code for this (handling the possible mask at the same time).
  2. I haven't checked, but in almost anything related to color handling in floating point, if we are using double precision we are wasting memory. I think there is a place where we push things toward single precision, but I don't know whether we are enforcing single precision.
    Sorry not to be able to look into this more closely for now.
    Eric

@mdboom (Member) commented Jan 5, 2016

  1. The computation of min and max is going through ma.min and ma.max, which probably has two sources of inefficiency: first, ma operations tend to be slow for a variety of reasons; and second, it unnecessarily makes two passes instead of calculating min and max in a single pass. It's unfortunate that numpy has ptp(), but not a function that returns min and max from a single pass. We could use our own C++ code for this (handling the possible mask at the same time).

Yeah -- this seems like something that Numpy really should grow. I wonder if they'd be amenable to a pull request on that. We'd still have to have our own answer in the meantime, though.
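
Something along these lines could serve as the interim answer (my sketch, not
an existing NumPy or matplotlib function; it doesn't handle masks): walk the
array in blocks so both reductions are updated while each block is in cache,
rather than making two full passes.

import numpy as np

def minmax(a, blocksize=2 ** 20):
    # Single sweep over the (flattened) data, updating min and max together.
    flat = np.asarray(a).ravel()
    vmin, vmax = np.inf, -np.inf
    for start in range(0, flat.size, blocksize):
        block = flat[start:start + blocksize]
        vmin = min(vmin, block.min())
        vmax = max(vmax, block.max())
    return vmin, vmax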

  2. I haven't checked, but in almost anything related to color handling in floating point, if we are using double precision we are wasting memory. I think there is a place where we push things toward single precision, but I don't know whether we are enforcing single precision.

In the common case of single-plane data that is colormapped, we colormap to 8-bit ints (the colormap code can also output floats, but in practice that code path is rarely used). #5718 improves things further when the input data is already color: in that case, if the input data is 8-bit int, we no longer require passing through float, but can preserve it as 8-bit all the way down.
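
(For reference, that 8-bit path is reachable from the Python side as well:
Colormap instances can return uint8 RGBA directly via bytes=True. A small
example of my own, just to show the dtype:)

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

data = np.random.rand(4, 4)
norm = mcolors.Normalize()                   # autoscales to the data limits
rgba = plt.get_cmap('viridis')(norm(data), bytes=True)
print(rgba.dtype, rgba.shape)                # uint8 (4, 4, 4)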

@WeatherGod (Member) commented:

There has been talk of an np.minmax() function for years, but it has never
moved beyond that. I think they would be amenable to such a PR.

Something else that is even worse about np.ma.min() and friends: they make
copies of the array (sans the masked parts)! I was certain there was a GitHub
issue filed for that, but I can't find it at the moment.


@efiring (Member) commented Jan 5, 2016

The following shows where some major slowdowns are coming from the masked array support. It certainly would be nice to get all that sped up within numpy...

In [1]: import numpy as np
In [2]: xx = np.random.randn(10000, 10000)
In [3]: timeit(xx.min())
10 loops, best of 3: 47 ms per loop
In [4]: timeit(xx.ptp())
10 loops, best of 3: 94.5 ms per loop
In [7]: yy = np.ma.masked_greater(xx, 2)
In [8]: timeit(yy.min())
1 loops, best of 3: 508 ms per loop
In [9]: timeit np.nanmin(yy.filled(np.nan))
1 loops, best of 3: 577 ms per loop
In [10]: timeit yy.filled(np.nan)
1 loops, best of 3: 457 ms per loop
In [12]: timeit np.ma.max(xx)
10 loops, best of 3: 47.1 ms per loop

It looks like the implementation of ptp() is not very efficient.
Masked arrays are slow only when there are masked values, so it doesn't look like min and max are responsible for the slow norming in your test case--correct?

@mdboom (Member) commented Jan 5, 2016

Masked arrays are slow only when there are masked values, so it doesn't look like min and max are responsible for the slow norming in your test case--correct?

Yeah -- in my test case, there's nothing masked out...

@mdboom (Member) commented Jan 6, 2016

(Also, I guess the difference in timing isn't just from computing the min and max, but also from doing the normalization itself on the whole vs the downsampled array.)

Yes -- there may be a way to take advantage of that in the context of #5718. The tricky bit is that normalization happens in Python/Numpy (through a nice plug-in API where different normalization functions can be written and interchanged), while resampling happens in C++, so plugging normalization into the resampling step could be awkward. #5718 already cuts the run time in half -- we know from this PR that there's probably more that could be done with regard to speed, but I'm not certain we should let that hold up #5718.

@anntzer (Contributor, Author) commented May 12, 2016

How does the new image machinery play with this? I have noticed it seems to perform quite well even with very large images. Should this PR be closed?

@mdboom (Member) commented May 12, 2016

Thanks for pinging this. Indeed, the image refactor was in part a response to this -- to try to do this faster without resorting to decimation. I think we should probably close this unless further testing reveals new problems with the new code.

@tacaswell tacaswell removed this from the 2.1 (next point release) milestone Oct 3, 2017
@anntzer anntzer restored the auto-image-downsampling branch March 20, 2019 14:20
@anntzer (Contributor, Author) commented Mar 20, 2019

Reopening per the discussion at https://gitter.im/matplotlib/matplotlib?at=5c92437e49fdaa0d81e71ec9.
(Obviously the PR needs a big update, this is just to track the issue.)

@anntzer anntzer reopened this Mar 20, 2019
@anntzer anntzer added this to the v3.2.0 milestone Mar 20, 2019
@jklymak (Member) commented Mar 21, 2019

Image taken from http://www.imagemagick.org/Usage/filter/nicolas/rings_sm_orig.gif The article is also quite useful: http://www.imagemagick.org/Usage/filter/nicolas/

import matplotlib.pyplot as plt
import numpy as np

a = plt.imread('rings_sm_orig.gif')

print(np.shape(a))
# 200x200
fig, ax = plt.subplots(figsize=(4, 4))
ax.imshow(a, cmap='gray')
ax.set_position([0, 0, 1, 1])
fig.savefig('dpi50.png', dpi=50)
fig.savefig('dpi30.png', dpi=30)


fig, ax = plt.subplots(figsize=(4, 4))
ax.imshow(a, cmap='gray', interpolation='lanczos', filterrad=(2*50/30))
ax.set_position([0, 0, 1, 1])
fig.savefig('filtdpi50.png', dpi=50)
fig.savefig('filtdpi30.png', dpi=30)

dpi50.png and dpi30.png (no filter): [images not shown]

With filter:

filtdpi50.png and filtdpi30.png (lanczos filter): [images not shown]

Conclusion:

Agg seems to do the right thing if we give it the appropriate filter. The question is whether we should apply this filter automatically when we know we are downsampling...
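
For example, "automatically" could look something like this (imshow_auto is a
hypothetical wrapper, not matplotlib API; it only looks at the axes size at the
figure's current dpi, so a later savefig at a different dpi isn't accounted
for):

import matplotlib.pyplot as plt

def imshow_auto(ax, data, **kwargs):
    # Compare data pixels to the axes size in screen pixels and add an
    # anti-aliasing filter when the image would be downsampled.
    ny, nx = data.shape[:2]
    bbox = ax.get_window_extent()
    ratio = max(nx / bbox.width, ny / bbox.height)
    if ratio > 1:
        kwargs.setdefault('interpolation', 'lanczos')
        kwargs.setdefault('filterrad', 2 * ratio)  # same heuristic as above
    return ax.imshow(data, **kwargs)

With something like this, the second figure above could call
imshow_auto(ax, a, cmap='gray') and get the lanczos filter only when the image
is actually being downsampled.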

@anntzer (Contributor, Author) commented May 2, 2019

The issue is better described in #13724 now, so I'll close this.

@anntzer anntzer closed this May 2, 2019
@anntzer anntzer deleted the auto-image-downsampling branch May 2, 2019 10:27