-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Add asinh axis scaling (*smooth* symmetric logscale) #21178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening your first PR into Matplotlib!
If you have not heard from us in a while, please feel free to ping @matplotlib/developers
or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.
You can also join us on gitter for real-time discussion.
For details on testing, writing docs, and our review process, please see the developer guide
We strive to be a welcoming and open project. Please follow our Code of Conduct.
A couple of quick comments.
|
Thanks, Jody, for taking a look at this. In answer to your questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need a bit of organization.
I'm not sure the new scale will see much use if it is not a clear drop-in replacement for symlog
. In particular, I would have to do some math to calculate what a0 is and what an appropriate value for my dataset should be, whereas I readily understand what threshold
means.
Similarly I think folks will want nice log-scales on there.
Finally, this will need some tests if you want to pursue it.
I just wanted to drop in and say that I think this is a worthwhile addition - for displaying signed data over a large range it has advantages and disadvantages to the symlog scaling. re. interpreting a0, for |
Great! The docs should explicitly say that. |
Thanks for picking this up @rwpenney! I agree that the name |
Thanks, all, for very helpful comments so far. For @greglucas 's point, I'm not sure where there may be a missing factor of two. Agreed, the documentation needs to make explicit what arcsinh transformation I'm using, but I believe I've arranged that in the region |x|<<a0, the transformation is x -> 1*x+O(x^3), which seems like the simplest natural choice. Also, I'm still reluctant to rename I'll aim to get the most of the code updates done later this week (after a few more urgent distractions elsewhere). |
I don't mind not calling it |
If you're using |
Its |
I think I've now addressed all the suggestions raised so far:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is getting there. Still not sure about the locator/formatter not giving decades. Ticks like 20, 100, 600 are not as common as 10^1, 10^2, 10^3, but maybe other have opinions?
if (ymin * ymax) < 0 and min(zero_dev) > 1e-6: | ||
# Ensure that the zero tick-mark is included, | ||
# if the axis stradles zero | ||
ys = np.hstack([ys[(zero_dev > 0.5 / self.numticks)], 0.0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you need a test for this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is already covered by the test_linear_values()
and test_wide_values()
methods, which involve axis-ranges that are symmetric around the origin. During development, some of those tests did fail without this check on min(zero_dev)
. Visual inspection has also confirmed that various demonstration plots are less likely to have tick-collisions near zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep this Locator, I still see some tick collisions around zero (asinh demo and colormap normalization), so I wonder if this should be "widened" or set to some threshold based on linear_width to try and remove some more ticks even.
@rwpenney I've added the "needs discussion" label, which means we may discuss at our (open to anyone!) developer call on Thursday. Questions to me are 1) do we want another scale like this? 2) is there general support for the non-decade locator/formatter? |
Thanks @jklymak - it would be very welcome to hear others' views on whether a new axis-scale would be useful to the matplotlib community. A few further thoughts on the non-decade formatter:
|
Just a few comments here:
For the example, I find it hard to see the differences other than the abrupt change. What about making one axis Click to expand code
import numpy as np
import matplotlib.pyplot as plt
# Prepare sample values for variations on y=x graph:
x = np.linspace(-5, 5, 100)
# Compare "symlog" and "asinh" behaviour on sample y=x graph:
fig1 = plt.figure(constrained_layout=True)
ax0, ax1 = fig1.subplots(1, 2)
ax0.plot(x, x)
ax0.axline((0, 0), (5, 5), color="black", linestyle=(0, (5, 5)),
transform=ax0.transAxes)
ax0.set_xscale('asinh')
ax0.set_yscale('symlog')
ax0.grid()
ax0.set_xlabel('asinh')
ax0.set_ylabel('symlog')
lims = np.min(x), np.max(x)
ax0.set_xlim(lims)
ax0.set_ylim(lims)
# Now use logspace to go out multiple decades logarithmically
x = np.logspace(-3, 3, 100)
ax1.plot(x, x)
ax1.axline((0, 0), (5, 5), color="black", linestyle=(0, (5, 5)),
transform=ax1.transAxes)
ax1.set_xscale('asinh')
ax1.set_yscale('symlog')
ax1.grid()
ax1.set_xlabel('asinh')
ax1.set_ylabel('symlog')
lims = np.min(x), np.max(x)
ax1.set_xlim(lims)
ax1.set_ylim(lims)
plt.show() |
I think re the ticker, I was only discussing the exponential ranges, and arguing that they should tick on powers of 10 (or powers of an optional |
Thanks for the various updates. I certainly agree that the choice of axis-scaling, and almost every other aesthetic choice about a plot, is very strongly dependent on one's data. There are definitely some cases where
I'd assume that there's no reason why a user couldn't use the |
Hi @rwpenney, this was brought up on the weekly dev call today for discussion. People were generally in agreement that the transform and scale are nice mathematical alternatives to symlog and would be OK additions. There was some confusion about the locator though, due to the ticking not being symmetric about 0 and not being on decades at the large ranges. So, there was a general preference to use the current There is also the option to add this as a separate package if you'd like, which is currently done with For the documentation, I really like your examples and descriptions in the Finally, addressing your last comment about colormapping:
I would guess you're right, but likely less noticeable than on a lineplot. It might be worth adding this new norm to the SymLogNorm colormapping example too, demoing the alternative there as well. |
Thanks, @greglucas for discussing this at the weekly dev call, and for your suggestions on a way forward. I was mistaken in assuming that we could simply use the Thanks for the comments about the documentation - I'll be adding some links to the I'll need to add a few more tests for the new tick-locator, and hope to have something ready for more formal review within the next few days. |
I believe I've now addressed all the points raised in @greglucas 's most recent message:
An example of the colorscale plot in the revised As ever, I'd welcome others' thoughts on how we proceed from here. |
This PR is affected by a re-writing of our history to remove a large number of accidentally committed files see discourse for details. To recover this PR it will need be rebased onto the new default branch (main). There are several ways to accomplish this, but we recommend (assuming that you call the matplotlib/matplotlib remote git remote update
git checkout main
git merge --ff-only upstream/main
git checkout YOUR_BRANCH
git rebase --onto=main upstream/old_master
# git rebase -i main # if you prefer
git push --force-with-lease # assuming you are tracking your branch If you do not feel comfortable doing this or need any help please reach out to any of the Matplotlib developers. We can either help you with the process or do it for you. Thank you for your contributions to Matplotlib and sorry for the inconvenience. |
Thanks @greglucas for going through the code, and for your many helpful comments. There are a few points that perhaps warrant a bit more discussion:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor text edits at this point.
One additional suggestion relating to the use of SymmetricalLogLocator and how to leverage that in the Scale. I think that is a nice re-use of code that is already there, but I won't hold this up on it either. I think this is a really welcome addition and really well done PR by @rwpenney!
examples/images_contours_and_fields/colormap_normalizations_symlognorm.py
Outdated
Show resolved
Hide resolved
examples/images_contours_and_fields/colormap_normalizations_symlognorm.py
Outdated
Show resolved
Hide resolved
axis.set(major_locator=AsinhLocator(self.linear_width, | ||
base=self._base), | ||
minor_locator=AsinhLocator(self.linear_width, | ||
base=self._base, | ||
subs=self._subs), | ||
minor_formatter=NullFormatter()) | ||
if self._base > 1: | ||
axis.set_major_formatter(LogFormatterSciNotation(self._base)) | ||
else: | ||
axis.set_major_formatter('{x:.3g}'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my suggestion to leverage the SymmetricalLogLocator, I think something like the below suggestion looks pretty good in general (the *2
seems to get rid of some of the overlapping near-zero tick texts). This would remove the need for the AsinhLocator and tests altogether by leveraging the SymLog work that finds the decades, but still keeps the subs info you put in to the scale. I do understand the hesitancy of using the older "threshold" and trying to get away from that, but I still think that the linear_width and threshold are very closely related ideas, so it isn't all that crazy to relate the two either.
axis.set(major_locator=AsinhLocator(self.linear_width, | |
base=self._base), | |
minor_locator=AsinhLocator(self.linear_width, | |
base=self._base, | |
subs=self._subs), | |
minor_formatter=NullFormatter()) | |
if self._base > 1: | |
axis.set_major_formatter(LogFormatterSciNotation(self._base)) | |
else: | |
axis.set_major_formatter('{x:.3g}'), | |
axis.set(major_locator=SymmetricalLogLocator( | |
base=self._base, | |
linthresh=2 * self.linear_width), | |
major_formatter=LogFormatterSciNotation(self._base), | |
minor_locator=SymmetricalLogLocator( | |
base=self._base, | |
linthresh=2 * self.linear_width, | |
subs=self._subs), | |
minor_formatter=NullFormatter()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @greglucas , for going through this again and for another set of helpful refinements.
The proposed removal of the AsinhLocator
is obviously a more major decision, so I'd be interested in others' views before committing the last of your proposed changes. I'm obviously not impartial in my loyalty to AsinhLocator
, or my reservations about the discontinuities that are an intrinsic part of the SymmetricalLogLocator
that led to the creation of AsinhScale
. There are imperfections and compromises in both of these approaches. Overall, I prefer the mathematical cleanliness of asinh
for both the Scale
and Locator
.
lib/matplotlib/ticker.py
Outdated
The fractional threshold beneath which data which covers | ||
a range that is approximately symmetric about zero | ||
will have ticks that are exactly symmetric. | ||
base : int, default: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the base 0 a bit odd. I'd put default: 10 and then special case the 10 below?
if (ymin * ymax) < 0 and min(zero_dev) > 1e-6: | ||
# Ensure that the zero tick-mark is included, | ||
# if the axis stradles zero | ||
ys = np.hstack([ys[(zero_dev > 0.5 / self.numticks)], 0.0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep this Locator, I still see some tick collisions around zero (asinh demo and colormap normalization), so I wonder if this should be "widened" or set to some threshold based on linear_width to try and remove some more ticks even.
Co-authored-by: Greg Lucas <greg.m.lucas@gmail.com>
Minor tweaks to resolve flake8 issues
@rwpenney, I apologize for letting this drop off my radar! I just came across issue #17402 that describes some ticking related issues with symlog and short ranges. I just tried out this PR and it looks like it suffers some of the same issues. I'm wondering what your thoughts would be about including some of the ideas from this comment into your locator? |
Thanks, @greglucas for the link to issue #17402 - interesting to see that we haven't yet converged on an ideal solution to this problem. Also notable is that many of the example plots in that issue-report seem to very nicely emphasise the discontinuous gradients that were a big incentive to revisit the assumptions behind symlog. I'll give the ideas in #17402 some more thought, and see what we might be able to merge into the arcsinh mechanisms. |
The question of whether there is a much better, or even "correct", design for the tick-locator in these symlog-like scenarios seems rather intractable. We have a number of options at the moment, including
Although the way the current API separates the coordinate transformation from the tick-location from the tick marking is elegant, this seems to be at the expense of allowing us to have major tick locations marked on the axis without also having the accompanying textual label. What symlog and asinh transformation seem to be highlighting is that there are situations where that API is itself imposing constraints that affect the aesthetics of the axis labelling. Certainly, it's unlikely to be worth making drastic changes to the API to accommodate these non-homogeneous coordinate transformations, but I fear it does mean that we're unlikely to avoid some compromises with what we can achieve with our axis labelling. So, overall, I think the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for letting this languish again @rwpenney. I appreciate you doing the investigation of all the various ticking/formatting options here, and thank you for your patience with my questions surrounding all of the various options.
I'm happy to put this in with the locators that are present here (even easier to justify with the experimental note), they are quite good for general use. More generally, I think we've found out that an automatic locator for all use-cases is really hard for this type of scale :) So, if you want something better, place the ticks yourself ;)
examples/images_contours_and_fields/colormap_normalizations_symlognorm.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but a few more "provisionals" would be in order...
Co-authored-by: Jody Klymak <jklymak@gmail.com>
Co-authored-by: Jody Klymak <jklymak@gmail.com>
examples/images_contours_and_fields/colormap_normalizations_symlognorm.py
Outdated
Show resolved
Hide resolved
Thanks @rwpenney! Congratulations on your first PR to Matplotlib 🎉 We hope to hear from you again. |
Summary
When plotting quantities that span many orders of magnitude, it is standard practice to apply log-scaling, and Matplotlib already provides the
symlog
scaling which allows log-scaling to be applied to quantities that are both positive and negative. However,symlog
introduces discontinuous gradients in its transformation because it involves grafting together separate linear and logarithmic transformations. This pull-request proposes a new transformation, based onnumpy.arcsinh
, which is a smooth transformation that is asymptotically logarithmic for large sample values and has a quasi-linear region of controllable width near the origin.This pull-request introduces a new
asinh
option toAxis.set_yscale
which applies an inverse hyperbolic sine transformation and also generates tick locations that are roughly uniformly spaced over the given axis, while being rounded sensibly on a base-10 scale. The plot below shows how this behaves when plotting y=m*x with different settings of the scale parametera0
(which sets the width of the linear region neary=0
).A previous pull request (#16639) had proposed changing the behaviour of
symlog
itself. This P/R sits alongsidesymlog
and should have no impact on code currently dependent on the behaviour ofsymlog
.As a simple example of use of the new
asinh
scaling:PR Checklist
pytest
passes).flake8
on changed files to check).flake8-docstrings
and runflake8 --docstring-convention=all
).doc/users/next_whats_new/
(follow instructions in README.rst there).doc/api/next_api_changes/
(follow instructions in README.rst there).