-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Begin warning on modifying global state of colormaps #16991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/matplotlib/cm.py
Outdated
_cmap_d = _gen_cmap_d() | ||
locals().update(_cmap_d) | ||
# This is no longer considered public API | ||
cmap_d = _cmap_d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure how to warn on read/write here. Do I need to create a new Dictionary subclass that overwrites the getters/setters with the warn_deprecated call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess something like #10735 (simpler if Py3.7+) would be needed to fully warn on access (and then you don't need to warn on getitem/setitem I guess)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a nice helper! I think that may be very helpful for deprecating plt.cm usage. I decided to go the quick route and just subclass dict and add in warning on get/set only.
@timhoffm, just wanted to add you here to get your feedback since I can't explicitly request you as a reviewer. Happy to hear your thoughts on this. |
Actually, I started searching around a bit to see where/how others are using cmap_d and I think deprecating We explicitly use it in this example: There is this SO post with the first answer stating to use There is also mention of accessing plt.cm.datad! So, the |
Maybe removal should be scheduled for 4.0 or whatever sounds like "a long time from now". |
@timhoffm, Did you have any thoughts on this? I'd like to try to get some of the warnings in for the 3.3 release to get the deprecation cycle started. No API code is being changed yet.
|
👍 for pushing the deprecations into 3.3. |
cc95a3a
to
4c26048
Compare
Thanks, Tim! Great suggestions. I just pushed up a commit with all of those suggestions in it. I also decided to take your advice for more descriptive names and apply it to the private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to use _cmap_registry
instead of _cmap_d
. Sometimes you overlook the obvious.
@@ -104,7 +146,8 @@ def register_cmap(name=None, cmap=None, data=None, lut=None): | |||
raise ValueError("Arguments must include a name or a " | |||
"Colormap") from err | |||
if isinstance(cmap, colors.Colormap): | |||
cmap_d[name] = cmap | |||
cmap._global = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document in the docstring that if passing a Colormap instance, that instance will become immutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added in a Notes section to both the get_cmap
and register_cmap
functions now. Let me know if you want any different wording there. It may be odd to talk about future behavior there considering you can't do anything to get that future behavior yet? But I can't think of any great wording off the top of my head.
d5a8b91
to
17bb516
Compare
lib/matplotlib/colors.py
Outdated
"a copy of the requested colormap before modifying it. ", | ||
alternative="To modify a colormap without overwriting the " | ||
"global state, you can make a copy of the colormap " | ||
f"first. cmap = copy.copy(mpl.cm.get_cmap({cmap.name})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty verbose alternative for something that I expect to catch a fair number of users. I am in favor of adding the copy=True
kwarg to get_cmap
in the same move as we add this warning so we can provide a less verbose (if less back-compatible) option to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to just change the wording here for now, and put .copy()
instead of copy.copy()
. I think the wording was just a little long, so I tidied it up. The alternative is actually less verbose than copy=True
. I also don't really know if we should add a keyword for one version that will just be removed a version later when it is unnecessary. Let me know what you think...
I labeled as release critical as I think this is very close and do not want it to fall through the cracks as we do the final push to 3.3. |
👍 Good comments on the rest of it. I'll try and push up changes to address them tomorrow. |
Can you try rebasing on current master? We are very cranky about making sure all of the internal references actually work (with a list of allowed errors). Sphinx made some internal changes that affected how references to the nested classes work so we re-generated the allowed list to the new version. Unlike travis which runs against the shadow merge commit GH makes to determine if your branch is mergable, circle runs against your actually commit so if you have not rebased on master from when we fixed that circle will be broken. |
17bb516
to
877cfe5
Compare
@@ -38,7 +38,7 @@ | |||
a = np.vstack((a, a)) | |||
|
|||
# List of all colormaps; skip reversed colormaps. | |||
maps = sorted(m for m in plt.cm.cmap_d if not m.endswith("_r")) | |||
maps = sorted(m for m in plt.colormaps() if not m.endswith("_r")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a remark (not necessarily to be acted upon in this PR):
By making the cmap registry private, we don't have an official way to get the registered colormaps from the cm
module, where I would expect such functionality. This is something we should take care of in followup PRs.
I'm not even sure plt.colormaps()
is needed. First, It's essentially a lengthy description of colormaps, which should be somewhere in the docs but not in a pyplot docstring. Second, I don't see providing the colormaps as a core functionality of pyplot, but since there's a lot of historic colormap stuff in pyplot, that's maybe ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same thought: #16991 (comment)
I'm not sure what the best step is to expose the access, but would agree that I don't want to have to go to pyplot if not necessary. Maybe rather than a cmap_d
, we could do something like your MutableMapping suggestion and then override the get/set methods to ensure copy/immutable return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if this is a bit unstructured. I just stated typing and ideas unfolded along the way.
There would be two ways:
-
The
cm
module currently uses simple functionsget_cmap()
andregister_cmap()
. In that spirit, you could add a functionget_cmap_names()
(naming t.b.d.) but we have to make a distinction betweenget_cmap
returning an instance and that function (probably?) returning strings. -
The alternative would be to expose a
ColormapRegistry
object, which provides all necessary functionality. That would likely be subclassingMutableMapping
to look dict-like but still perform all relevant checks (maybe it would alternatively only implementMapping
and provide an explicit register method if we feel the mutation should be more special (i.e. we don't wantdel registry['jet']
orregistry['jet'] = Colormap(...)
).
This would also replace the module level functions; at least their implementation. Whether they will additionally stay for backward compatibility is a separate topic.
In the end 2 is the more complex approach. However, I have a slight preference for it because it properly encapsulates the colormap handling. On the downside it changes the API from calling simple cm functions to an object. But then again, we could move the actual instance to a variable matplotlib.colormaps
, which may make more sense and users would usually not need to import matplotlib.cm anymore; and pyplot.colormaps
could later even point to the same instance (with some API compatibility because the keys of that mapping would then colormap names, so for name in plt.colormaps()
would still work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with pretty much everything you're saying. I also don't think 2 would be that hard to implement. An issue with naming will be found with: #11413 which is already proposing to rename cm to colormaps. This discussion is really more a part of the MEP: https://matplotlib.org/3.2.1/devel/MEP/MEP21.html
Maybe we should do some larger work on an overarching vision there and then decide on how best to get there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we provide colormap registry access from matplotlib
directly, cm.get_cmap
and cm.register_cmap
become obsolete. There won't be a need for the end user to import cm
, which in turn reduces the need for renaming (#11413). But we could still find other names for that.
It would be reasonable to detail the overall vision.
lib/matplotlib/tests/test_colors.py
Outdated
@@ -70,6 +70,73 @@ def test_register_cmap(): | |||
cm.register_cmap() | |||
|
|||
|
|||
@pytest.mark.skipif(matplotlib.__version__ < "3.5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant adding tests for future intended behavior. We don't know for sure if this will be implemented for the version and with the behavior we anticipate now.
Additionally, the test will be ignored during the complete development phase (master version targeting 3.3 is currently evaluated to 3.2.1.post2103+gb94812c60
). The check would only hit very late during the release.
You may keep this, but it needs explanation. If we are hit by the failing test in a year or so, we need a clear way of finding out why this is here, why the tested functionality is desired and what we have to do to implement it. Therefore, please add a comment describing the intention and referring to the modification roadmap discussed on github.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is unfortunate that this wouldn't actually hit until near the release. Is there a way to test against something else? My thought is that leaving these there will make sure this doesn't get lost along the way and it isn't hard to implement these since I've already got the code in another linked PR. I put a link to the gh issue number in as a comment now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other tests could be earlier but would still hit at some other time and break CI, when we maybe don't want to implement this right now.
Either leave it like this or remove it here and put the test just into your PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to put the test in the PR where the change is actually made.
There are better ways to document a future test than a skipped test (I don't think this would actually get run until we tagged the rc.
As a side note, In general comparing versions via text is not the best way as it sorts pre-releases as being later than the final releases, distuitls.version.LooseVersion
is usually the better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that suggestion of comparison with distutils. I just removed all of these future tests. I'll hope it doesn't drop off my radar.
877cfe5
to
8268c4e
Compare
self._warn_deprecated() | ||
return self._cmap_registry.__getitem__(key) | ||
|
||
def __iter__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not warn for list(mpl.cm.cmap_d)
but does warn for list(mlp.cm.cmap_d.items())
. Was this an oversight that __iter__
does not warn or a decision that because it is only giving a list of keys that could then go to get_cmap
it should not wark?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oversight, the iter still warns because of the getitem, but I didn't think of people just listing the colormaps and using it manually themselves elsewhere. I just added it here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for __len__
below? Sorry for the thousand paper cuts at the end...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, easy enough to fix. Should have just done it myself ;)
8268c4e
to
7d23616
Compare
7d23616
to
692b83c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the right direction to go in terms of removing a foot-cannon, but I am still a bit worried about the amount of users this is going to affect. Hopefully I am over estimating it and in the users we do hit are mostly people we are saving from confusing bugs later.
👍 to merge modulo CI green.
Just wanted to say, bravo for the super clear deprecation message including specific code suggestions for how to update downstream code. I can see that took extra effort to put together and it's much appreciated. 👏 |
I wish I'd followed this better. I did cmap = cm.RdBu_r
cmap.set_over('yellow') and was told I had to do:
Thats pretty user hostile. When will this get fixed? |
I guess it depends on what you mean by fixed. We could update the message pretty easily too if you have any suggestions there. |
I guess I mean an alternative that doesn't require an import and copy call. If that is #18503 we should get that in ASAP. I don't understand why a deprecation was deemed necessary for that PR to go in though. I'd actually vote to revert this for V3.3.3, but maybe I'm not understanding some subtlety. |
This was to enable people who may have been (ab)using the ability to modify the global colormap to change before we remove the capability entirely. In your example, if you keep running the program, or go to some other Jupyter notebook cell and want to get the RdBu colormap again, it would still have the yellow over color, which is probably not desired. cmap = cm.RdBu_r
cmap.set_over('yellow')
# Later on you get the built-in colormap again
cmap2 = cm.get_cmap('RdBu_r')
cmap2(np.inf) # yellow I agree it is somewhat annoying to import copy for now. We had discussed bringing in a |
Sure, I agree with the over all strategy, just not the tactics. As of 3.3 there is no other way to get the colormap except to do |
Maybe this is just a poorly worded message then. The only reason to use copy here is if you need to re-get the original RdBu colormap at some later time. You can certainly keep using the My vote would be to leave this as is for now and make sure 3.4 is returning a copy. I really don't think this is that egregious, more of a minor annoyance along the deprecation path. |
fwiw I guess #14645 may help too? |
The message would be useful if people had to change their code. But so far as I can tell we are going to allow The only users who would have to change behaviour are those who were for some reason depending on being able to access the colormap from the global state. cmap = cm.RdBu
cmap.set_over_color([0, 1., 1., 1.])
.....
cmap=cm.RdBu
assert cmap.get_over_color() == [0, 1., 1., .1.] which I imagine is exceedingly rare. The rest get a misleading Deprecation message that makes it sound like they won't be able to call |
Did we decide to go down the route of module properties (which is py37 so we have it on main)? I do not remember any discussion of it and am not finding it in the text of this issue. If we do not, then we can't make |
Good question. I think that was brought up as a possibility and an alternative would be to make the cmap object's immutable (Tim's proposal). I think we could get away with something like defining our own getattr at the module level for the "return a copy" case: def __getattr__(name):
return ColormapRegistry[name] All of this discussion leads me to think we should really have a clear end-goal in mind and lay that out. Then, we can start shimming the deprecations in. There is already MEP21, which looks to be stagnant. Mind if I revive that MEP with some new text and then we can start linking issues/fixes to a more central location? |
... or use the Projects feature... |
Making attribute access return new (mutable) copies seems rather contrary to standard python semantics. In particular this means that something like from matplotlib import cm
cm.jet.set_bad(...)
plt.imshow(..., cmap=cm.jet) would silently "fail", whereas from matplotlib.cm import jet
jet.set_bad(...)
plt.imshow(..., cmap=jet) would succeed... but I doubt that having both snippets behave differently is a good idea. |
I agree with @anntzer , while cute there are too many weird cases. Just because you can hijack and change something in Python does not mean you should. We need to decide if we only want to deprecate mutating the global state when getting the colormap through I think when we wrote the warning we were not thinking about the attribute access case at all. Do we document / recommend that usage anywhere? |
I find it useful in these cases to think about what the ideal situation would be and only then figure out how we would get there from where we are (if possible). So, how would this look if we were to start over? |
My 50c would be that we should just make all colormaps immutable (after merging the with_extremes PR (or similar functionality) and possibly with a very long deprecation period, given that we are basically breaking everyone who way setting over/under/bad before with essentially no way to write code that compatible both with old and new Matplotlib versions, as with_extremes did not exist before). Especially if we are going to push in the direction of encouraging the use of the global cmap registry (via # in foo.py
foo_cmap = LinearSegmentedColormap(...)
register_cmap("foo", foo_cmap)
...
# in module2.py
imshow(..., cmap="foo")
...
# in module3.py
from foo import foo_cmap
foo_cmap.set_over(...) # oops, this will affect module2.py if run before it. |
It seems to me that the original problem was defining globals like Given the state we are in, it seems sensible to me that if you modify a global like It would be nice if TLDR: it would be nice if we could return copies whenever possible, because I don't think it is at all clear to users that |
Actually, now that I look at it again, I guess a solution for the "sneaky global" problem I mentioned just above would be for |
@anntzer it's rarely the problem that one registers a colormap instance and then modifies the instance, thus modifying the registered colormap. While that's a theoretical issue I'm not seriously worried about that. Few people register colormaps themselves, and even fewer will hold on to instances some times and use the registry other times. The far more common problem is obtaining a reference to the registered colormap and modifying that, without knowing that it affects the registered state. |
I'd guess making a copy at registration time is simple enough to be worth considering, but won't insist on that either. |
The module attribute access has been with us from at least 2004 (e34a333). I think clawing those back is off the table. So if we were to make the attribute access return a copy: Pro: same behavior as I think the case of
is enough to take that option out of consideration (even if we could make it work) If we back off making the attribute access immutable: Pro: leave the foot-cannon that people are using in place
which is a variation on the issue with making It also will make If we make the attribute access immutable (which is the status quo) Pro: avoids all of the cons above Backing off of making the global versions of the cmap objects immutable is also an option, but I think that the action-at-a-distance and general global state issues are worth cleaning up for all of the reasons we had when we started this. On net, I think we should go with making the attribute access immutable and not try to return a copy. It may not be ideal, but I think it is the least-bad of the options we have. I agree about the copy in cmap_register (I checked, we do not currently do that but we do flag them as global). |
It!'s reasonable to do. I'm just clarifying that it does not solve our main issue. |
At some point @efiring wanted to change the name of this module anyhow. Is yet another possibility just writing a new way to do this from scratch and leaving |
PR Summary
This is the start of deprecating global state access for built-in colormaps. It doesn't implement any solutions yet, just begins to warn users about what will happen.
Returning a copy of colormaps is implemented in #16943
Per the dev call, the tentative plan forward is:
3.3
3.4:
3.5:
PR Checklist