Add asinh axis scaling (*smooth* symmetric logscale) #21178

Merged
merged 39 commits into from
Feb 16, 2022

Conversation

rwpenney
Contributor

Summary

When plotting quantities that span many orders of magnitude, it is standard practice to apply log-scaling, and Matplotlib already provides the symlog scaling which allows log-scaling to be applied to quantities that are both positive and negative. However, symlog introduces discontinuous gradients in its transformation because it involves grafting together separate linear and logarithmic transformations. This pull-request proposes a new transformation, based on numpy.arcsinh, which is a smooth transformation that is asymptotically logarithmic for large sample values and has a quasi-linear region of controllable width near the origin.
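The gradient claim above can be checked numerically. The sketch below is not the code from this pull request: `symlog_like` is a hypothetical piecewise linear/log map written in the style of symlog (its constants are an assumption), and `asinh_map` assumes the transformation has the form a0 * asinh(x / a0). It compares one-sided slopes at the join point.

```python
import numpy as np

def symlog_like(x, linthresh=1.0, base=10.0, linscale=1.0):
    """Hypothetical piecewise linear/log map in the style of symlog."""
    x = np.asarray(x, dtype=float)
    adj = linscale / (1.0 - 1.0 / base)
    # Clamp |x| so the log branch is defined everywhere; values inside the
    # threshold are replaced by the linear branch in np.where below.
    safe = np.maximum(np.abs(x), linthresh)
    log_part = np.sign(x) * linthresh * (adj + np.log(safe / linthresh) / np.log(base))
    return np.where(np.abs(x) <= linthresh, x * adj, log_part)

def asinh_map(x, a0=1.0):
    """Assumed form of the smooth transformation: a0 * asinh(x / a0)."""
    return a0 * np.arcsinh(np.asarray(x, dtype=float) / a0)

def one_sided_slopes(f, x0, h=1e-6):
    """Finite-difference slopes just below and just above x0."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return float(left), float(right)

symlog_left, symlog_right = one_sided_slopes(symlog_like, 1.0)
asinh_left, asinh_right = one_sided_slopes(asinh_map, 1.0)
print(f"symlog-like slopes at the threshold: {symlog_left:.3f} vs {symlog_right:.3f}")
print(f"asinh slopes at x = a0:              {asinh_left:.3f} vs {asinh_right:.3f}")
```

The piecewise map is continuous at the threshold, but its slope jumps there, whereas the asinh map has matching slopes on both sides.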

[Image: symlog-asinh]

This pull-request introduces a new asinh option to Axis.set_yscale which applies an inverse hyperbolic sine transformation and also generates tick locations that are roughly uniformly spaced over the given axis, while being rounded sensibly on a base-10 scale. The plot below shows how this behaves when plotting y=m*x with different settings of the scale parameter a0 (which sets the width of the linear region near y=0).

[Image: asinh-scales]

A previous pull request (#16639) had proposed changing the behaviour of symlog itself. This P/R sits alongside symlog and should have no impact on code currently dependent on the behaviour of symlog.

As a simple example of use of the new asinh scaling:

import numpy
import matplotlib.pyplot as plt

# Prepare sample values for variations on y=x graph:
x = numpy.linspace(-3, 6, 100)

# Compare "symlog" and "asinh" behaviour on sample y=x graph:
fig1 = plt.figure()
ax0, ax1 = fig1.subplots(1, 2, sharex=True)

ax0.plot(x, x)
ax0.set_yscale('symlog')
ax0.grid()
ax0.set_title('symlog')

ax1.plot(x, x)
ax1.set_yscale('asinh')
ax1.grid()
ax1.set_title(r'$\sinh^{-1}$')

plt.show()

PR Checklist

  • [N/A] Has pytest style unit tests (and pytest passes).
  • [*] Is Flake 8 compliant (run flake8 on changed files to check).
  • [*] New features are documented, with examples if plot related.
  • [*] Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • [*] Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • [*] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [N/A] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

@github-actions github-actions bot left a comment

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a while, please feel free to ping @matplotlib/developers or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@jklymak
Member

jklymak commented Sep 26, 2021

A couple of quick comments.

  • Can a0 be scaled to mean something in terms of the asymptotes? Folks are used to controlling the width of the linear region with threshold kwarg
  • I think the locator and formatter should basically be the same as for symlog.
  • I think by choosing a higher odd power of sinh you can make the transition more abrupt?

@rwpenney
Contributor Author

rwpenney commented Sep 26, 2021

Thanks, Jody, for taking a look at this. In answer to your questions:

  • The asymptotes behave like (a0*log(x) + O(1)), so a0 can be thought of as controlling the "base" of the logarithm, as well as the width of the quasi-linear region. However, this base obviously gets scaled away by the final affine translation before rendering on the screen. Also the word "threshold" very much implies an abrupt change between linear and logarithmic regimes, and that's something I deliberately don't want to imply.
  • I'm not convinced that the locator and formatter should be the same as for symlog, because the latter seems very strongly tied to there being distinct regions of the transformation. Also, the default settings for symlog's locator seem only to put major ticks on the powers of ten, which is not very helpful for many plots. The example plots above hopefully show that the new formatter & locator put many more guides that should help the viewer get an intuition for how the data points are distributed. It's not obvious to me whether there's a simple way of getting both regularly spaced major ticks and neatly regular numerical values (as can be done for the plain log-scale), so my current solution is the best I've found so far that fits in a small number of lines of code.
  • Interesting idea about higher powers of asinh, although that's obviously not implemented in this patch. The risk in offering this facility would be that one has even more subtlety for the viewer to interpret - we might have a region near zero where the graph was approximately cubic and the asymptotes would then be O(log(x)^3), for example.

Member

@jklymak jklymak left a comment

This will need a bit of organization.

I'm not sure the new scale will see much use if it is not a clear drop-in replacement for symlog. In particular, I would have to do some math to calculate what a0 is and what an appropriate value for my dataset should be, whereas I readily understand what threshold means.

Similarly I think folks will want nice log-scales on there.

Finally, this will need some tests if you want to pursue it.

@dstansby
Member

This will need a bit of organization.

I'm not sure the new scale will see much use if it is not a clear drop-in replacement for symlog. In particular, I would have to do some math to calculate what a0 is and what an appropriate value for my dataset should be, whereas I readily understand what threshold means.

I just wanted to drop in and say that I think this is a worthwhile addition - for displaying signed data over a large range it has advantages and disadvantages to the symlog scaling. re. interpreting a0, for x <~ a0 the scale is approximately linear, for x >~ a0 the scaling is approximately logarithmic, so I think that there's a similar interpretation as a0 in both the existing norm and the new one in this PR.

@jklymak
Member

jklymak commented Sep 28, 2021

interpreting a0, for x <~ a0 the scale is approximately linear, for x >~ a0

Great! The docs should explicitly say that.

@greglucas
Contributor

Thanks for picking this up @rwpenney!

I agree that the name a0 could be more descriptive, I think linthresh would be a better choice to keep the similarity to symlog. Also, I believe there is a factor of 2 missing for the linear region approximation? The other linked PR had the base change, but I am not sure that is strictly necessary here, the other PR had it because that was trying to replace symlog. However, it wasn't a difficult change either, just adding another scaling parameter in, so that can be left up to you.

@rwpenney
Contributor Author

Thanks, all, for very helpful comments so far.

For @greglucas 's point, I'm not sure where there may be a missing factor of two. Agreed, the documentation needs to make explicit what arcsinh transformation I'm using, but I believe I've arranged that in the region |x|<<a0, the transformation is x -> 1*x+O(x^3), which seems like the simplest natural choice.

Also, I'm still reluctant to rename a0 to anything linked to the word "threshold" - the whole purpose of this new axis scaling is that there isn't a threshold that separates the nearly linear from asymptotically logarithmic regions.
Is there a clearer shorthand for this, perhaps transition_scale or tx_scale?

I'll aim to get most of the code updates done later this week (after a few more urgent distractions elsewhere).

@jklymak
Member

jklymak commented Sep 28, 2021

I don't mind not calling it threshold and agree with your argument, but a0 is too mysterious. transition_scale is OK, maybe linear_scale? Though I hesitate on either of those because this is defining the scale of part of a "scale", which is a bit confusing. Is transition_width possible? Maybe that's where the factor of two came from before...

@dstansby
Member

Agreed, the documentation needs to make explicit what arcsinh transformation I'm using, but I believe I've arranged that in the region |x|<<a0, the transformation is x -> 1*x+O(x^3), which seems like the simplest natural choice.

If you're using asinh(ax), that's ax + O(x^3) for |x| < 1/a right? How about calling it linear_gradient or linear_slope in that case?

@jklymak
Member

jklymak commented Sep 29, 2021

It's a0 asinh(x / a0), so I think it's x + a0 O((x/a0)^3) for |x| < a0.
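Both limits can be checked with a few lines of NumPy. This is only a sketch, assuming the transformation is a0 * asinh(x / a0); the value of `a0` and the helper name `asinh_scale` are illustrative. It compares the transformation with its cubic Taylor expansion for small |x| and with the logarithmic asymptote a0 * log(2x / a0) for large x.

```python
import numpy as np

a0 = 2.0  # hypothetical width of the quasi-linear region

def asinh_scale(x, a0):
    """Assumed form of the transformation: a0 * asinh(x / a0)."""
    return a0 * np.arcsinh(x / a0)

# Small |x|: compare with the cubic Taylor expansion x - x**3 / (6 * a0**2).
x_small = 0.1 * a0
cubic = x_small - x_small**3 / (6 * a0**2)
small_err = abs(asinh_scale(x_small, a0) - cubic)

# Large x: compare with the logarithmic asymptote a0 * log(2 * x / a0).
x_large = 1000.0 * a0
log_asymptote = a0 * np.log(2 * x_large / a0)
large_err = abs(asinh_scale(x_large, a0) - log_asymptote)

print(f"error of cubic expansion at x = 0.1 * a0:  {small_err:.2e}")
print(f"error of log asymptote at x = 1000 * a0:   {large_err:.2e}")
```

Both errors are small compared with a0, which is consistent with reading a0 as the scale separating the near-linear and near-logarithmic regimes rather than as a sharp threshold.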

@rwpenney
Contributor Author

rwpenney commented Oct 1, 2021

I think I've now addressed all the suggestions raised so far:

  • the AsinhLocator class has been relocated to ticker.py
  • the a0 parameter has been renamed to linear_width
  • the documentation now makes explicit what formula is used for the arcsinh transformation
  • various unit-tests have been added for the transformations and tick-locator
  • handling of tick-marks near zero has been improved to avoid very closely spaced ticks
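As a rough illustration of the renamed parameter, the sketch below plots y=x with three values of linear_width. It assumes a Matplotlib release that includes this feature (3.6 or later, going by the milestone on this PR); the specific widths and figure size are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-50, 50, 201)

# One panel per width; the y-axes are deliberately not shared, because
# shared axes would also share a single scale.
fig, axes = plt.subplots(1, 3, figsize=(9, 3))
for ax, width in zip(axes, (0.2, 1.0, 5.0)):
    ax.plot(x, x)
    ax.set_yscale("asinh", linear_width=width)
    ax.set_title(f"linear_width={width}")
    ax.grid(True)

fig.canvas.draw()  # force tick computation; call plt.show() in interactive use
```

Larger values of linear_width stretch the region around zero that looks linear, so the curve bends towards logarithmic behaviour further from the origin.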

Member

@jklymak jklymak left a comment

Looks like this is getting there. Still not sure about the locator/formatter not giving decades. Ticks like 20, 100, 600 are not as common as 10^1, 10^2, 10^3, but maybe others have opinions?

if (ymin * ymax) < 0 and min(zero_dev) > 1e-6:
    # Ensure that the zero tick-mark is included,
    # if the axis straddles zero
    ys = np.hstack([ys[(zero_dev > 0.5 / self.numticks)], 0.0])
Member

Looks like you need a test for this case?

Contributor Author

I believe this is already covered by the test_linear_values() and test_wide_values() methods, which involve axis-ranges that are symmetric around the origin. During development, some of those tests did fail without this check on min(zero_dev). Visual inspection has also confirmed that various demonstration plots are less likely to have tick-collisions near zero.

Contributor

If we keep this Locator, I still see some tick collisions around zero (asinh demo and colormap normalization), so I wonder if this should be "widened" or set to some threshold based on linear_width to try and remove some more ticks even.

@rwpenney
Contributor Author

rwpenney commented Oct 1, 2021

As an illustration of the latest version of the demonstration plots, attached is the comparison of symlog with asinh on y=x.
[Image: asinh_demo]
The improved logic for handling ticks near zero has tidied up the nearly overlapping '0.1' and '0' markers shown in the original P/R message. I've also added some circles that show the points where the symlog transformation has discontinuous gradients. Given that these circles are themselves mapped through the same coordinate transformation, these egg-shaped annotations further emphasise the differences between the two axis scales.

@jklymak jklymak added this to the v3.6.0 milestone Oct 5, 2021
@jklymak jklymak added the status: needs comment/discussion needs consensus on next step label Oct 5, 2021
@jklymak
Member

jklymak commented Oct 5, 2021

@rwpenney I've added the "needs discussion" label, which means we may discuss at our (open to anyone!) developer call on Thursday. Questions to me are 1) do we want another scale like this? 2) is there general support for the non-decade locator/formatter?

@rwpenney
Contributor Author

rwpenney commented Oct 5, 2021

Thanks @jklymak - it would be very welcome to hear others' views on whether a new axis-scale would be useful to the matplotlib community.

A few further thoughts on the non-decade formatter:

  • In any non-linear axis transformation, the crucial requirement is to give the viewer enough clues about the coordinate system throughout all regions of the coordinate transformation. Having tick-marks that coincide neatly with powers of ten, or other "regular" benchmarks is of secondary importance, and not always achievable.
  • Transformations such as (ordinary) log-scaling, have a regularity that makes it easy to achieve uniform tick marks across each decade - neither symlog nor asinh have this mathematical structure.
  • The various plots I've shared in this pull-request, hopefully, make it clear that the default settings in symlog, even for very simple datasets, do not produce anything like enough tick-marks to help the viewer understand the various different regions of the coordinate transformation.

@greglucas
Contributor

Just a few comments here:

  • I think the reason I was thinking there was a factor of 2 off, was because in the other PR I was trying to replace the symlog scale, which has a default of 2 on linthresh. So, for this PR I think it makes sense to keep a default of 1 like you are and ignore my previous comment.
  • For the tickers, I think it depends on each user's problem as to which one they would prefer. I personally only used symlog to try and get log-like scaling on positive/negative quantities on a colormapped image, so the different regions didn't really matter to me as much as it was nice to have +/- decades. However, I understand if you want to view the data in the linear region, then having a different ticker there makes sense.

For the example, I find it hard to see the differences other than the abrupt change. What about making one axis symlog and the other axis asinh? The left image has data closer to the linear region and asinh has nicer ticks, but the right image has more dynamic range and I find the symlog ticker does better for that case.

[Image: Figure_1]

import numpy as np
import matplotlib.pyplot as plt

# Prepare sample values for variations on y=x graph:
x = np.linspace(-5, 5, 100)

# Compare "symlog" and "asinh" behaviour on sample y=x graph:
fig1 = plt.figure(constrained_layout=True)
ax0, ax1 = fig1.subplots(1, 2)

ax0.plot(x, x)
ax0.axline((0, 0), (5, 5), color="black", linestyle=(0, (5, 5)),
           transform=ax0.transAxes)
ax0.set_xscale('asinh')
ax0.set_yscale('symlog')
ax0.grid()

ax0.set_xlabel('asinh')
ax0.set_ylabel('symlog')
lims = np.min(x), np.max(x)
ax0.set_xlim(lims)
ax0.set_ylim(lims)

# Now use logspace to go out multiple decades logarithmically
x = np.logspace(-3, 3, 100)

ax1.plot(x, x)
ax1.axline((0, 0), (5, 5), color="black", linestyle=(0, (5, 5)),
           transform=ax1.transAxes)
ax1.set_xscale('asinh')
ax1.set_yscale('symlog')
ax1.grid()

ax1.set_xlabel('asinh')
ax1.set_ylabel('symlog')
lims = np.min(x), np.max(x)
ax1.set_xlim(lims)
ax1.set_ylim(lims)

plt.show()

@jklymak
Member

jklymak commented Oct 5, 2021

I think re the ticker, I was only discussing the exponential ranges, and arguing that they should tick on powers of 10 (or powers of an optional base). I won't argue that symlog's current behaviour in the inner range is at all good!

@rwpenney
Contributor Author

rwpenney commented Oct 6, 2021

Thanks for the various updates. I certainly agree that the choice of axis-scaling, and almost every other aesthetic choice about a plot, is very strongly dependent on one's data. There are definitely some cases where symlog offers very useful functionality, but I think there are other circumstances where I'd definitely prefer to have alternative tools available.

[Image: asinh-cauchy]
@greglucas 's plots further highlight the trade-offs between these two axis scales. There was a demonstration plot in examples/scales/asinh_demo.py that shows a scatterplot of some fat-tailed circularly-symmetric random deviates. That plot (as attached) seems to show quite noticeable horizontal artifacts due to the symlog discontinuity. I'm wondering whether similar artifacts might lurk within the colorscales that you're basing on symlog?

I'd assume that there's no reason why a user couldn't use the SymmetricalLogLocator with the AsinhScale, if that works best for their particular scenario?

@greglucas
Contributor

Hi @rwpenney, this was brought up on the weekly dev call today for discussion. People were generally in agreement that the transform and scale are nice mathematical alternatives to symlog and would be OK additions. There was some confusion about the locator though, due to the ticking not being symmetric about 0 and not being on decades at the large ranges. So, there was a general preference to use the current SymmetricalLogLocator with this transform and scale.

There is also the option to add this as a separate package if you'd like, which is currently done with mpl-probscale https://github.com/matplotlib/mpl-probscale.

For the documentation, I really like your examples and descriptions in the asinh area. I'm wondering if we could also add some text and alternatives in the SymLog examples pointing to this new option and explaining why someone might want to use it instead. Going in the reverse direction as well because SymmetricLog is easier to search/find than asinh if you don't know what you're looking for.

Finally, addressing your last comment about colormapping:

I'm wondering whether similar artifacts might lurk within the colorscales that you're basing on symlog

I would guess you're right, but likely less noticeable than on a lineplot. It might be worth adding this new norm to the SymLogNorm colormapping example too, demoing the alternative there as well.
https://64066-1385122-gh.circle-artifacts.com/0/doc/build/html/gallery/images_contours_and_fields/colormap_normalizations_symlognorm.html

@rwpenney
Contributor Author

Thanks, @greglucas for discussing this at the weekly dev call, and for your suggestions on a way forward.

I was mistaken in assuming that we could simply use the SymmetricalLogLocator directly, having forgotten that it is quite strongly coupled to the assumed axis-scale transformation (as indeed is AsinhLocator). So, I'm in the process of revising AsinhLocator to support rounding to an arbitrary number base, so that the default behaviour of AsinhScale will be to put major ticks on powers of ten, and minor ticks at the adjacent multiples of 2, 5.

Thanks for the comments about the documentation - I'll be adding some links to the SymmetricalLogScale pages to give some signposts to the tradeoffs between these two facilities.

I'll need to add a few more tests for the new tick-locator, and hope to have something ready for more formal review within the next few days.

@rwpenney
Contributor Author

rwpenney commented Oct 16, 2021

Illustration of new tick-locator, which now supports arbitrary number bases as well as the "base-0" variant that tries to fit neatly rounded ticks at roughly uniform spacing.
[Image: asinh-bases]
The plots show base-2 on the left, "base-0" in the middle, and base-3 on the right.
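For readers without the images, the effect of the base can be inspected by asking the locator for its tick values directly. This is a sketch assuming a Matplotlib release that ships AsinhLocator in matplotlib.ticker (3.6 or later) with the linear_width and base parameters discussed in this thread; the data range is an arbitrary choice.

```python
import numpy as np
from matplotlib.ticker import AsinhLocator

# Tick values over a range symmetric about zero, for two number bases.
ticks_base10 = AsinhLocator(linear_width=1.0, base=10).tick_values(-100.0, 100.0)
ticks_base2 = AsinhLocator(linear_width=1.0, base=2).tick_values(-100.0, 100.0)

print("base 10:", ticks_base10)
print("base 2: ", ticks_base2)
```

Printing the arrays makes it easy to see which powers of the base are selected and that a tick at zero is kept when the range straddles the origin.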

@rwpenney
Contributor Author

rwpenney commented Oct 17, 2021

I believe I've now addressed all the points raised in @greglucas 's most recent message:

  • The default axis marker now shows powers of ten, and supports arbitrary number bases;
  • The AsinhLocator will automatically select minor tick spacings for common number bases (e.g. 10->(2,5), 16->(2, 4, 8), etc.), as well as allowing user customization;
  • Test coverage has been further improved;
  • There are documentation cross-references between the examples/scales/asinh_demo.py and examples/scales/symlog_demo.py to make it easier for users to assess the tradeoffs between these two transformations;
  • There is a new AsinhNorm colorscale, analogous to SymLogNorm;
  • The SymLogNorm contour-plot demonstration has been reworked so that its synthetic dataset is more consistent with its description and now shows a comparison with the AsinhNorm colorscale.
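A minimal sketch of the new colorscale in use, assuming a Matplotlib release that provides matplotlib.colors.AsinhNorm (3.6 or later). The synthetic Cauchy-distributed data, the colormap and the limits are illustrative choices, not taken from the reworked gallery example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_cauchy(size=(64, 64))  # synthetic fat-tailed, signed values

# Near-linear colour mapping for |value| well below linear_width,
# near-logarithmic mapping far beyond it.
norm = mcolors.AsinhNorm(linear_width=1.0, vmin=-100.0, vmax=100.0)

fig, ax = plt.subplots()
image = ax.imshow(data, norm=norm, cmap="RdBu_r")
fig.colorbar(image, ax=ax)
fig.canvas.draw()

# With limits symmetric about zero, zero maps to the middle of the colormap.
midpoint = float(norm(0.0))
print(midpoint)
```

Choosing vmin and vmax symmetric about zero keeps the neutral colour of a diverging colormap aligned with zero.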

An example of the colorscale plot in the revised examples/images_contours_and_fields/colormap_normalizations_symlognorm.py is shown below:
[Image: symlog-colorscale]

As ever, I'd welcome others' thoughts on how we proceed from here.

@tacaswell
Member

This PR is affected by a re-writing of our history to remove a large number of accidentally committed files see discourse for details.

To recover this PR it will need to be rebased onto the new default branch (main). There are several ways to accomplish this, but we recommend (assuming that you call the matplotlib/matplotlib remote "upstream"):

git remote update
git checkout main
git merge --ff-only upstream/main
git checkout YOUR_BRANCH
git rebase --onto=main upstream/old_master
# git rebase -i main # if you prefer
git push --force-with-lease   # assuming you are tracking your branch

If you do not feel comfortable doing this or need any help please reach out to any of the Matplotlib developers. We can either help you with the process or do it for you.

Thank you for your contributions to Matplotlib and sorry for the inconvenience.

@rwpenney
Contributor Author

rwpenney commented Oct 26, 2021

Thanks @greglucas for going through the code, and for your many helpful comments.

There are a few points that perhaps warrant a bit more discussion:

  • I'll take a closer look at the tick-generation in interactive mode - I think I missed a get_view_interval() in my Locator.__call__(), which I'll patch-up.
  • I've deliberately chosen the axis ranges in some of the demos to have asymmetric ranges simply to confirm that the tick-locators behave sensibly in these cases. Naturally, for asymmetric data extent, the ticks may not themselves be symmetrical around the origin, especially when the coordinate transformation is non-linear.
  • For the 3**4 tick-formatting, again that was a deliberate choice in one of the demos - the default is to use powers of ten. Perhaps that demo could use powers of 2 and powers of 10 just to give a hint of the common use-cases?
  • I absolutely share the reluctance to create a new tick-locator for each new axis scale, but I'm afraid I don't see any clean solution here. The SymmetricalLogLocator makes assumptions about the coordinate transform that simply aren't appropriate for the AsinhScale, and I'm struggling to see how even the SymmetricalLogLocator achieves much reuse of existing tick-locators. The way I've implemented the AsinhLocator could, possibly, provide an outline through which an arbitrary monotonic coordinate transformation could be used to produce roughly uniformly spaced ticks, but that doesn't seem easy to generalize within the current API provided by Locator and would certainly require much wider changes to other Locator subclasses. Creating three separate data regions that are handled differently by the Locator again seems to be taking us back to the artifacts associated with discontinuities between the different regions in symlog, and may well require a lot of fine-tuning to cope with different settings of linear_width.

Contributor

@greglucas greglucas left a comment

Mostly minor text edits at this point.

One additional suggestion relating to the use of SymmetricalLogLocator and how to leverage that in the Scale. I think that is a nice re-use of code that is already there, but I won't hold this up on it either. I think this is a really welcome addition and really well done PR by @rwpenney!

Comment on lines +560 to +569
axis.set(major_locator=AsinhLocator(self.linear_width,
                                    base=self._base),
         minor_locator=AsinhLocator(self.linear_width,
                                    base=self._base,
                                    subs=self._subs),
         minor_formatter=NullFormatter())
if self._base > 1:
    axis.set_major_formatter(LogFormatterSciNotation(self._base))
else:
    axis.set_major_formatter('{x:.3g}'),
Contributor

For my suggestion to leverage the SymmetricalLogLocator, I think something like the below suggestion looks pretty good in general (the *2 seems to get rid of some of the overlapping near-zero tick texts). This would remove the need for the AsinhLocator and tests altogether by leveraging the SymLog work that finds the decades, but still keeps the subs info you put in to the scale. I do understand the hesitancy of using the older "threshold" and trying to get away from that, but I still think that the linear_width and threshold are very closely related ideas, so it isn't all that crazy to relate the two either.

Suggested change
- axis.set(major_locator=AsinhLocator(self.linear_width,
-                                     base=self._base),
-          minor_locator=AsinhLocator(self.linear_width,
-                                     base=self._base,
-                                     subs=self._subs),
-          minor_formatter=NullFormatter())
- if self._base > 1:
-     axis.set_major_formatter(LogFormatterSciNotation(self._base))
- else:
-     axis.set_major_formatter('{x:.3g}'),
+ axis.set(major_locator=SymmetricalLogLocator(
+              base=self._base,
+              linthresh=2 * self.linear_width),
+          major_formatter=LogFormatterSciNotation(self._base),
+          minor_locator=SymmetricalLogLocator(
+              base=self._base,
+              linthresh=2 * self.linear_width,
+              subs=self._subs),
+          minor_formatter=NullFormatter())
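A similar combination can be tried from user code without changing the scale's defaults. The sketch below is not the suggested change itself: it assumes a Matplotlib release that includes the asinh scale (3.6 or later) and simply installs SymmetricalLogLocator instances on an asinh axis after the scale has been set. The factor of 2 on linthresh mirrors the suggestion above; the subs value (2, 5) is an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import NullFormatter, SymmetricalLogLocator

linear_width = 1.0
x = np.linspace(-1000, 1000, 401)

fig, ax = plt.subplots()
ax.plot(x, x)

# Set the scale first: doing so installs the scale's default locators,
# which are then overridden on the next lines.
ax.set_yscale("asinh", linear_width=linear_width)
ax.yaxis.set_major_locator(
    SymmetricalLogLocator(base=10, linthresh=2 * linear_width))
ax.yaxis.set_minor_locator(
    SymmetricalLogLocator(base=10, linthresh=2 * linear_width, subs=(2, 5)))
ax.yaxis.set_minor_formatter(NullFormatter())

fig.canvas.draw()  # force tick computation; call plt.show() in interactive use
```

This keeps the smooth asinh transformation for the data while placing major ticks on decades, at the cost of reintroducing a threshold-like parameter for the ticks only.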

Contributor Author

Thanks, @greglucas , for going through this again and for another set of helpful refinements.

The proposed removal of the AsinhLocator is obviously a more major decision, so I'd be interested in others' views before committing the last of your proposed changes. I'm obviously not impartial in my loyalty to AsinhLocator, or my reservations about the discontinuities that are an intrinsic part of the SymmetricalLogLocator that led to the creation of AsinhScale. There are imperfections and compromises in both of these approaches. Overall, I prefer the mathematical cleanliness of asinh for both the Scale and Locator.

The fractional threshold beneath which data which covers
a range that is approximately symmetric about zero
will have ticks that are exactly symmetric.
base : int, default: 0
Contributor

I find the base 0 a bit odd. I'd put default: 10 and then special case the 10 below?


Co-authored-by: Greg Lucas <greg.m.lucas@gmail.com>
@QuLogic QuLogic added the status: needs workflow approval For PRs from new contributors, from which GitHub blocks workflows by default. label Nov 20, 2021
@greglucas
Contributor

@rwpenney, I apologize for letting this drop off my radar! I just came across issue #17402 that describes some ticking related issues with symlog and short ranges. I just tried out this PR and it looks like it suffers some of the same issues. I'm wondering what your thoughts would be about including some of the ideas from this comment into your locator?

@rwpenney
Contributor Author

Thanks, @greglucas for the link to issue #17402 - interesting to see that we haven't yet converged on an ideal solution to this problem. Also notable is that many of the example plots in that issue-report seem to very nicely emphasise the discontinuous gradients that were a big incentive to revisit the assumptions behind symlog.

I'll give the ideas in #17402 some more thought, and see what we might be able to merge into the arcsinh mechanisms.

@rwpenney
Contributor Author

rwpenney commented Dec 28, 2021

The question of whether there is a much better, or even "correct", design for the tick-locator in these symlog-like scenarios seems rather intractable. We have a number of options at the moment, including SymmetricalLogLocator, the proposed AsinhLocator and the mechanisms of #17402, and I think all of these are trying to balance some mutually incompatible goals when we have a non-linear axis scale which inevitably has varying lengthscales across its extent:

  • The location of powers of ten (or any other base) is obviously entirely fixed by the coordinate transformation - the only freedom we have is whether or not to mark that location in some way.
  • The tick-marks are most necessary where the lengthscale of the coordinate transformation is changing rapidly (e.g. near the "linear" region of the arcsinh or the linear/logarithmic discontinuity of symlog)
  • If a tick-mark is needed to help the viewer interpret the changing lengthscales, then there is no guarantee that the text label attached to that tick mark won't overlap nearby tick labels.

Although the way the current API separates the coordinate transformation from the tick-location from the tick marking is elegant, this seems to be at the expense of allowing us to have major tick locations marked on the axis without also having the accompanying textual label. What symlog and asinh transformation seem to be highlighting is that there are situations where that API is itself imposing constraints that affect the aesthetics of the axis labelling. Certainly, it's unlikely to be worth making drastic changes to the API to accommodate these non-homogeneous coordinate transformations, but I fear it does mean that we're unlikely to avoid some compromises with what we can achieve with our axis labelling.

So, overall, I think the AsinhLocator is about the best I can see how to achieve for the AsinhScale - its tick-marks are well-aligned to powers of the base; reasonably well-spaced across the visible range of the plot; sensibly placed if the axis range happens to be asymmetric around zero; and avoids abrupt discontinuities that aren't explicitly labelled within the SymmetricalLogLocator. I'm not sure I can see a way forward unless there's some fundamentally different, and mathematically self-consistent solution that's possible within the existing Transform/Locator APIs.

Contributor

@greglucas greglucas left a comment

Sorry for letting this languish again @rwpenney. I appreciate you doing the investigation of all the various ticking/formatting options here, and thank you for your patience with my questions surrounding all of the various options.

I'm happy to put this in with the locators that are present here (even easier to justify with the experimental note), they are quite good for general use. More generally, I think we've found out that an automatic locator for all use-cases is really hard for this type of scale :) So, if you want something better, place the ticks yourself ;)
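For anyone taking that route, placing ticks by hand on an asinh axis works the same way as on any other scale. A minimal sketch, assuming a Matplotlib release that includes the asinh scale (3.6 or later); the tick positions are an illustrative choice, not a recommendation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-200, 200, 401)
manual_ticks = [-100, -10, -1, 0, 1, 10, 100]  # illustrative positions

fig, ax = plt.subplots()
ax.plot(x, x)
ax.set_yscale("asinh", linear_width=1.0)

# Fixed tick positions and labels replace the automatic locator and formatter.
ax.set_yticks(manual_ticks)
ax.set_yticklabels([str(t) for t in manual_ticks])

fig.canvas.draw()  # call plt.show() in interactive use
```

The ticks must be set after set_yscale, because changing the scale reinstalls that scale's default locator and formatter.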

Member

@jklymak jklymak left a comment

Looks good, but a few more "provisionals" would be in order...

@QuLogic QuLogic merged commit 98184f4 into matplotlib:main Feb 16, 2022
@QuLogic
Member

QuLogic commented Feb 16, 2022

Thanks @rwpenney! Congratulations on your first PR to Matplotlib 🎉 We hope to hear from you again.

@QuLogic QuLogic removed status: needs comment/discussion needs consensus on next step status: needs workflow approval For PRs from new contributors, from which GitHub blocks workflows by default. labels Feb 16, 2022
7 participants