
matplotlib.pyplot.imsave colormaps some grayscale images before saving them #3657


Closed
jni opened this issue Oct 17, 2014 · 20 comments · Fixed by #24320

@jni
Contributor

jni commented Oct 17, 2014

Some people may be using matplotlib as an image IO library. Those people may also be scientists that care about their data being accurately saved on disk. They will be disappointed, however, to find that imsave does nothing of the sort:

In [8]: mpl.pyplot.imsave('io4.png', np.array([[0, 1], [0, 4196]], dtype=np.uint16))

In [10]: im4 = mpl.pyplot.imread('io4.png')

In [11]: im4.shape
Out[11]: (2, 2, 4)

In [12]: np.rollaxis(im4, -1) # display the three channels + alpha of the array
Out[12]: 
array([[[ 0.        ,  0.        ],
        [ 0.        ,  0.49803922]],

       [[ 0.        ,  0.        ],
        [ 0.        ,  0.        ]],

       [[ 0.49803922,  0.49803922],
        [ 0.49803922,  0.        ]],

       [[ 1.        ,  1.        ],
        [ 1.        ,  1.        ]]], dtype=float32)

What is happening here is that imsave has encountered a data type it's not happy with (uint16, a commonly used data format in microscopy and probably other scientific imaging), and decided to silently colormap the values and save them as a uint8 RGBA image. (See #3616 for a longer discussion.) Adding insult to injury, the color mapping is to jet.

In summary:

  • Observed behaviour: imsave colormaps certain images before saving as uint8 RGBA.
  • Desired behaviour: either save the data exactly as given, or raise a ValueError, or at the very least raise a warning.
@tacaswell tacaswell added this to the v1.5.x milestone Oct 17, 2014
@tacaswell
Member

Thanks, tagged as 1.5 as this is an api break to fix, but one I agree we should probably make.

On the other hand, should we also be discouraging people from using mpl as an image I/O library?

@efiring
Member

efiring commented Oct 18, 2014

@jni Is there a misunderstanding here? imsave is doing exactly what the docstring says it will do. It is just as happy with uint16 as it would be with any other number type; if it sees a 2-D array, it colormaps it. If you want uint16, or uint8, to be understood by imsave as a 2-D array of gray values, then you need to add a kwarg to tell it this is what you want, and modify the code and docstring accordingly. None of your suggested "desired behaviours" is consistent with imsave. I don't see any reason to break the API here.
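
To make the "if it sees a 2-D array, it colormaps it" step concrete, here is a minimal sketch of what colormapping amounts to: scale the values to [0, 1], then look each one up in a colormap. It assumes the autoscaling uses the data minimum and maximum and that the colormap is jet, as reported at the top of the thread; it illustrates the idea and is not a copy of imsave's internals.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

arr = np.array([[0, 1], [0, 4196]], dtype=np.uint16)

# Step 1: scale the data to [0, 1].
# Assumption: with no vmin/vmax given, the data min and max are used.
norm = Normalize(vmin=float(arr.min()), vmax=float(arr.max()))

# Step 2: look up each scaled value in the colormap, returning 8-bit RGBA.
rgba = plt.get_cmap("jet")(norm(arr), bytes=True)

print(rgba.shape, rgba.dtype)
```

The result has a trailing axis of length 4 and an 8-bit dtype, so the original 16-bit values cannot be read back directly from it.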

@jni
Contributor Author

jni commented Oct 18, 2014

@efiring

First, where does it say this on the docstring? I'm on 1.4 and I get: Save an array as in image file. (Let's ignore the grammatical error.) Keyword arguments are enumerated later and these include vmin/vmax and cmap, but the behaviour, which is unexpected, is not clearly explained, nor do I see anywhere how to make it do the sane thing which is to save my array.

Second, even if the docstring was clear, it's still wonky behaviour, and should not be the default. If scatter silently log-converted your data before plotting, but said so clearly in the docstring, that would still be worth revisiting! This is not quite as bad but it's close! If you took a room full of programmers and asked them whether they would expect imsave(fn, ar); ar = imread(fn) to be a no-op, I predict you'd get a pretty overwhelming majority to say yes.
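
The no-op expectation can be checked with a short round trip. This is only a sketch: it writes to an in-memory buffer instead of a file, and the variable names are illustrative.

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

original = np.array([[0, 1], [0, 4196]], dtype=np.uint16)

# Save to an in-memory PNG, then read it straight back.
buf = io.BytesIO()
plt.imsave(buf, original, format="png")
buf.seek(0)
restored = plt.imread(buf, format="png")

# A true no-op would preserve both shape and dtype.
same_shape = restored.shape == original.shape
same_dtype = restored.dtype == original.dtype
print(original.shape, original.dtype, "->", restored.shape, restored.dtype)
```

If the pair were inverses, both flags would be true; with a 2-D input the array that comes back is RGBA floats instead.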

@tacaswell you are correct that mpl may not be the library of choice for handling scientific data. On the other hand, it is extremely ingrained into the scientific Python ecosystem. For example, ipython --pylab populates the root namespace with mpl's imsave. Therefore, it's worthwhile ensuring it treats people's data with respect.

@efiring
Member

efiring commented Oct 18, 2014

Save an array as in image file.

The output formats available depend on the backend being used.

Arguments:
  *fname*:
    A string containing a path to a filename, or a Python file-like object.
    If *format* is *None* and *fname* is a string, the output
    format is deduced from the extension of the filename.
  *arr*:
    An MxN (luminance), MxNx3 (RGB) or MxNx4 (RGBA) array.

That last line, together with the vmin/vmax and cmap kwargs, is intended to mean that an MxN array is colormapped; and that is what it always has meant. Perhaps it could be worded better. I take it your interpretation is that MxN uint8 or uint16 should be treated as a gray scale image. That's a reasonable alternative. There is precedent for handling integer data differently from floating point; in fact, I introduced it in the Colormap.__call__() method. What I think is your alternative could be implemented with a kwarg to avoid breaking the API, as I suggested above.
Please don't cast this as a matter of lack of respect for users' data, or unsuitability for scientific data. That's not the issue at all. Scientific use is central to mpl, and we have consistently tried to provide the means to handle and plot as accurately and carefully as possible.

@jni
Contributor Author

jni commented Oct 19, 2014

@efiring in turn, I did not mean any disrespect to the mpl team. God knows (as does GitHub =P), I've made more than my share of bad API choices. However, I still think imsave/imread should combine to be a no-op, and I feel strongly about this. (I think that came across. =P) I understand API changes are a big deal, but some are worth it. I'd like to see more of the mpl team weigh in on this one.

@efiring
Member

efiring commented Oct 19, 2014

@jni I'm getting a better picture of what you mean now; yes, I see the logic in having imsave/imread be inverses (provided a lossless format is used for both) in a data information sense. I still maintain that this can be done without breaking the API, by having the handling of a 2-D array determined by a kwarg. Whether the kwarg is necessary depends on whether there is user code that depends on uint8 and uint16 2-D arrays being color-mapped, as they are at present. The safest thing is to assume there is; I don't know if that is necessary in this case.
Implementation looks non-trivial. I don't think our present code can write gray-scale pngs, though it can read them.

@WeatherGod
Member

Just thinking out loud, should the behavior be similar for imshow/imsave
with respect to color handling?

@efiring
Member

efiring commented Oct 19, 2014

@WeatherGod, I don't understand; would you elaborate, please?

@WeatherGod
Member

Essentially, should imsave() produce a similar image as one would get if
they did "imshow(); savefig()"? Obviously, the latter would appear as an
axes plot with ticks and a frame, but should that be the only difference?
Maybe a more apt analogy would be "figimage(); savefig()"?

In other words, should the input handling/processing for
imsave()/imshow()/figimage() all be normalized as much as possible?


@efiring
Member

efiring commented Oct 19, 2014

On 2014/10/19, 4:25 AM, Benjamin Root wrote:

Maybe a more apt analogy would be "figimage(); savefig()"?

That's basically how it is implemented.

@Fadel87

Fadel87 commented May 21, 2015

thanx my friend ... since two days Im trying to solve this myth :D .. thanx a lot :)
http://stackoverflow.com/questions/30370351/converting-raw-images-to-tiff-by-using-rawpy-module-in-python/30377835#30377835

@sanjay-1

sanjay-1 commented Jul 5, 2017

I've had recent real-world experiences with matplotlib.pyplot.imsave() and the issues discussed above.

First, I'm relatively new to Python programming (just a few weeks), though I've been programming in other languages (C++, Java, C, others including many Assembly languages years ago) -- so I'm experienced with programming and APIs. I was trying to save multiple numpy arrays (each containing either a 2D color or 2D grayscale image). Each array was read from a file using (what I thought was its complement function), imread().

I encountered the exact problem described here, whereby my saved image was being saved not as grayscale but with some color-mapping. Since I'm new to Python and one never assumes a professional API/library has major bugs (or inconsistencies), I thus assumed my code and my newness to Python was causing this color-mapping. I investigated for over an hour, and eventually came to the conclusion my code was not causing this. More exploring, searching/reading... eventually I asked: "how can I be sure imsave() is doing what I want, maybe there's a missing parameter".

So I researched further for info on imsave(), and after another hour stumbled onto this issue thread. I understand the point made by both @jni and @efiring. However, while being inexperienced with pyplot (and Python) though not a complete newbie, I feel imsave() and imread() should be reverse complements of each other (my problem arose because I assumed they were). This is a reasonable assumption for programmers to make. Most programmers, especially experienced programmers who are used to working with professional / curated libraries, will assume so.

The ideal solution, one that causes minimal "breakage" to the existing API, would be to modify the imsave() function to first detect whether the image/matrix is 2D grayscale (i.e. its shape has no third dimension), and in that case only to internally force/default the appropriate "cmap" parameter and, optionally, a "dtype" parameter. This would make minimal if any change to its API definition, and so should affect little or no existing code.
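
As a rough sketch of that suggestion, the wrapper below defaults 2-D arrays to a gray colormap. The name imsave_gray is hypothetical and not part of matplotlib, and scaling integer data by the dtype's full range is an assumption made here for illustration.

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt


def imsave_gray(fname, arr, **kwargs):
    """Hypothetical wrapper: default 2-D arrays to a gray colormap."""
    arr = np.asarray(arr)
    if arr.ndim == 2:
        kwargs.setdefault("cmap", "gray")
        if np.issubdtype(arr.dtype, np.integer):
            # Assumption: scale by the dtype's full range, not the data min/max.
            info = np.iinfo(arr.dtype)
            kwargs.setdefault("vmin", info.min)
            kwargs.setdefault("vmax", info.max)
    plt.imsave(fname, arr, **kwargs)


buf = io.BytesIO()
imsave_gray(buf, np.array([[0, 64], [128, 255]], dtype=np.uint8), format="png")
buf.seek(0)
out = plt.imread(buf, format="png")
print(out.shape)
```

The file written this way is still an 8-bit RGBA image, so the picture looks gray but the result is not a true round trip, and uint16 data would still lose precision.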

This problem cost me significant time (over 2 hours already). Worst of all, it makes me realize that Python library(ies?) I will import in future code may similarly be not necessarily consistent/well-designed. This I feel is the worst aspect -- for me and possibly for most developers -- to be left with the belief/realization that a specific library/package is "dangerous" or "tricky" to use. It's counter to why we create libraries/packages. I do not mean to disparage or cast doubts on the people or development community who maintain this/other libraries -- but these are side-effects of such problems existing.

I do not presume to know if there are other issues/constraints around this issue (of whether to change imsave() or not), so I offer this only as one firm data point of one user's recent experiences with this exact issue.

Personally I would prefer to see imread() and imsave() to together act as a no-op, for a given image-type. At a minimum, the doc pages for both functions should be updated to elucidate the problem.


If I may make a related suggestion (for the related doc pages)... the "cmap" parameter is not explained well on the [matplotlib.org's imread() / imsave()](http://matplotlib.org/api/image_api.html#matplotlib.image.imread) doc page. It should better explain the possible (or at least common) values for "cmap". It says this param defaults to the "rc image.cmap value", however even this value is not explained. I looked for this info in many places, starting with colors.Colormap, but found it explained nowhere. Only a runtime error msg from the Python interpreter gave me info I needed to proceed.

@efiring
Member

efiring commented Jul 5, 2017

We would be happy to see two PRs, one simple one to improve the documentation as you suggest immediately above, and a second to add the ability to write grayscale pngs as directed by a new kwarg.

@jni
Contributor Author

jni commented Jul 5, 2017

@sanjay-1 my recommendation would be to use scikit-image or imageio for IO. Now that --pylab is essentially deprecated, and scikit-image uses PIL by default, there's not much reason to use mpl as an image IO library. Having said that, if you want to put together a PR as suggested by @efiring, I'm sure future generations would be grateful!
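
For comparison, here is a minimal sketch of a lossless round trip with a dedicated image library. It uses Pillow directly (the PIL mentioned above) rather than scikit-image or imageio, assumes Pillow is installed, and writes to an in-memory buffer instead of a file.

```python
import io
import numpy as np
from PIL import Image

original = np.array([[0, 1], [0, 4196]], dtype=np.uint16)

# Write the array as a 16-bit grayscale PNG, with no colormapping step.
buf = io.BytesIO()
Image.fromarray(original).save(buf, format="PNG")
buf.seek(0)

# Read the pixel values back into a NumPy array.
restored = np.asarray(Image.open(buf))

print(np.array_equal(restored, original))
```

The check compares values only; the dtype of the array read back may depend on the Pillow version.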

@WeatherGod
Member

WeatherGod commented Jul 6, 2017 via email

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.2 (next next feature release) Oct 3, 2017
@jklymak
Member

jklymak commented Jul 16, 2020

Just pinging the dev team @matplotlib - what is the point of imsave at this point? I think a documentation update would help folks realize it's not a straight image dump as you'd get from PIL and close this issue.

So far as I can tell, it's the same as imshow, but done directly on the data buffer, and not placed in axes etc.

@jklymak jklymak modified the milestones: needs sorting, v3.4.0 Jul 16, 2020
@jklymak jklymak self-assigned this Jul 16, 2020
@QuLogic QuLogic modified the milestones: v3.4.0, v3.5.0 Jan 27, 2021
@QuLogic QuLogic modified the milestones: v3.5.0, v3.6.0 Sep 25, 2021
@QuLogic QuLogic removed this from the v3.6.0 milestone Sep 14, 2022
@QuLogic QuLogic added this to the v3.6.1 milestone Sep 14, 2022
@QuLogic QuLogic modified the milestones: v3.6.1, v3.6.2 Oct 6, 2022
@QuLogic QuLogic modified the milestones: v3.6.2, v3.6.3 Oct 27, 2022
@tacaswell tacaswell modified the milestones: v3.6.3, future releases Oct 31, 2022
tacaswell added a commit to tacaswell/matplotlib that referenced this issue Oct 31, 2022
tacaswell added a commit to tacaswell/matplotlib that referenced this issue Oct 31, 2022
@tacaswell
Member

Took 8 years, but we got there @jni !

@QuLogic QuLogic modified the milestones: v3.6.3, v3.6.2 Nov 1, 2022
@jni
Contributor Author

jni commented Nov 17, 2022

😂 😲 👏 👏 👏

also 👋! 😃

melissawm pushed a commit to melissawm/matplotlib that referenced this issue Dec 19, 2022
@ebalogun01

Encountered issue with imsave adding a fourth channel to my images and had to read through this thread. This behavior is rather frustrating.

@rcomer
Member

rcomer commented Oct 2, 2023

@ebalogun01 if you are experiencing problems with the latest version of Matplotlib, please open a new issue with details.


10 participants