Improve tick subsampling in LogLocator. #29698

anntzer · 2025-03-02T22:15:55Z

Rewrite the logic for subsampling log-scale ticks, so that 1) there's as many ticks as possible without going over numticks (don't draw a single tick when two would fit), and 2) the ticks are spaced as much as possible (when two ticks are drawn, put them on the lowest and highest decade, not somewhere in the middle). See comments in implementation for details. For the backcompat-oriented "classic-mode", remove the "stride >= numdec" check (which is a much later addition).

Test changes:

TestLogLocator changes because tick_values() no longer returns major ticks outside of the (vmin, vmax) range.
test_small_range_loglocator was likely previously wrong (the assert did not cover the values of ticks, and all iterations of the loop ended up testing the same condition). Rewrite it fully (in particular to cover the new implementation).

Closes #29694.

Edit: there's still some more test failures that need to be investigated (I suspect most of them are indeed changes "for the better", but I'll go through them one at a time).

PR summary

PR checklist

"closes #0000" is in the body of the PR description to link the related issue
new and changed code is tested
Plotting related features are demonstrated in an example
New Features and API Changes are noted with a directive and release note
Documentation complies with general and docstring guidelines

lib/matplotlib/tests/test_ticker.py

timhoffm · 2025-03-03T10:24:16Z

lib/matplotlib/ticker.py

    def tick_values(self, vmin, vmax):
        if self.numticks == 'auto':
            if self.axis is not None:
-                numticks = np.clip(self.axis.get_tick_space(), 2, 9)
+                nt = np.clip(self.axis.get_tick_space(), 2, 9)


can we keep the long names numticks and numdecs? The abbreviations are IMHO harder to read and rember.

The problem is that the names numticks and numdec are actually not that nice either, and I suspect they were actually a bit a source of confusion. For example, I think it is easier to reason in terms of "the number of decade ticks without striding" (i.e. emax - emin + 1: there's two ticks for an axis from 1 to 10), but numdecs looks more like "the number of decades" to me (i.e. emax - emin: there's one decade between 1 at 10). Likewise there's different "number of ticks" lying around (desired number of ticks, total number of decade ticks, maximum actually realizable number of ticks).
Overall I found it more confusing to write the logic using names that "suggest" a wobbly semantic and easier to do so using more symbolic names.

I think the actually relevant quantity are intervals (i.e. the numbers -1), because that's what you try to match.

If you have 9 decades / 8 intervals (nd=9) and a stride of s=3, then no matter the offset you always end up with 3 subsampled ticks. If you have 10 decades then you can have either 3 or 4 subsampled ticks. In fact the number of subsampled ticks is either floor(nd/s) or ceil(nd/s) (which are equal if s divides nd), which makes me think that nd and s are indeed the "nicer" variables to work with.

Trying to underestand the logic below, I kept having to flick back and forth to where nd and nt were defined to remind myself what they were. So I'm very pro giving them more descriptive names. Perhaps nt could become nticks_requested, and nd become nd become nticks_max?

rcomer · 2025-03-03T13:02:26Z

This also fixes #25896

anntzer · 2025-03-04T13:31:29Z

Looking at this again, I suspect that a possible improvement to the algorithm would be to put ticks on multiples of the stride, if it is possible to do so without decreasing the number of ticks: for example, if the choice is between ticks at 10^-10, 10^-7, 10^-4, 10^-1 or 10^-9, 10^-6, 10^-3, 10^0, then the latter probably looks better (on average...)

timhoffm · 2025-03-04T17:14:28Z

Agreed, multiples of strides is a good secondary criterion, because the exponents are part of a non-offsetted sequence (4, 8, 12) feels more natural than 5, 9, 13.

lib/matplotlib/ticker.py

anntzer · 2025-03-06T10:27:12Z

@rcomer: With the last changes, #25896 is again no longer fixed; the underlying issues are actually slightly different. The patch (once this PR goes in) is, first, something like

diff --git i/lib/matplotlib/ticker.py w/lib/matplotlib/ticker.py
index 5a229fd8e7..1420e4536a 100644
--- i/lib/matplotlib/ticker.py
+++ w/lib/matplotlib/ticker.py
@@ -2519,10 +2519,12 @@ class LogLocator(Locator):
 
         if (len(subs) > 1
                 and stride == 1
-                and ((vmin <= ticklocs) & (ticklocs <= vmax)).sum() <= 1):
+                and (len(decades) - 2  # major
+                     + ((vmin <= ticklocs) & (ticklocs <= vmax)).sum())  # minor
+                     <= 1):
             # If we're a minor locator *that expects at least two ticks per
             # decade* and the major locator stride is 1 and there's no more
-            # than one minor tick, switch to AutoLocator.
+            # than one major or minor tick, switch to AutoLocator.
             return AutoLocator().tick_values(vmin, vmax)
         else:
             return self.raise_if_exceeds(ticklocs)

(this PR is needed because it makes it consistent to always have exactly one extra major tick on both sides (hence the -2 in the tick count); note that I guess there could still be inconsistencies if the major and minor locators have been configured to have different numticks but I don't think I can do anything about that...)

Then there is also something wrong with coeff = round(b ** (fx - exponent)) in LogFormatter.__call__, as that misinterprets "12" (coming from AutoLocator) as "10" (due to the rounding of the mantissa); this also needs to be fixed (although the first fix above will simply eliminate the tick at 12 so the second issue will become hidden).

Overall I'd rather keep that issue separate for clarity.

timhoffm · 2025-03-07T21:08:46Z

@anntzer this looks good to me. Since it's still in draft mode, are you still planning changes?

anntzer · 2025-03-07T23:07:27Z

There's just a few remaining test failures (due to the actual changes in the algorithm) for which I still need to decide on the best path forward.

anntzer · 2025-03-10T12:50:33Z

@timhoffm This is ready for review now. See commit message (8a80e1b) for discussion about changes to test suite. The remaining test failures are spurious.

timhoffm

On a side-note: Not strictly related to the changes here, but can we run into issues with the assumption that we always have one out-of-limit tick on each side? Say we have a setup such that LogLocator(numticks=2).tick_values(1e300, 1e305). I would then need an extra tick at 1e315, which is outside the float64 range.

anntzer · 2025-03-10T18:57:14Z

I don't like the "out-ticking" behavior and would like to get rid of it, but (as detailed in the commit message) contour() currently relies on it. It can indeed lead to overflow, but this was already present before this PR, e.g. with LogLocator(numticks=2).tick_values(1e300, 1e308).

dstansby

This is a nice improvement. It took me quite a while to understand all the logic though, which doesn't bode well for maintainability. I think better variable names would help a lot - here's what I ended up with when I edited the code locally so I could understand it.

I also think it would be nice to extract the new logic into a function, 1) for readability and 2) because I can imagine this being used in the future in other non-log contexts where one wants nice ticks at only integer locations. But that could always be done after this PR.

diff --git a/lib/matplotlib/ticker.py b/lib/matplotlib/ticker.py
index 5a229fd8e7..f732d07e09 100644
--- a/lib/matplotlib/ticker.py
+++ b/lib/matplotlib/ticker.py
@@ -2414,11 +2414,11 @@ class LogLocator(Locator):
     def tick_values(self, vmin, vmax):
         if self.numticks == 'auto':
             if self.axis is not None:
-                nt = np.clip(self.axis.get_tick_space(), 2, 9)
+                nticks_request = np.clip(self.axis.get_tick_space(), 2, 9)
             else:
-                nt = 9
+                nticks_request = 9
         else:
-            nt = self.numticks
+            nticks_request = self.numticks
 
         b = self._base
         if vmin <= 0.0:
@@ -2436,10 +2436,10 @@ class LogLocator(Locator):
         efmin, efmax = self._log_b([vmin, vmax])
         emin = math.ceil(efmin)
         emax = math.floor(efmax)
-        nd = emax - emin + 1  # Total number of decade ticks available.
+        nticks_avail = emax - emin + 1  # Total number of decade ticks available.
 
         if isinstance(self._subs, str):
-            if nd >= 10 or b < 3:
+            if nticks_avail >= 10 or b < 3:
                 if self._subs == 'auto':
                     return np.array([])  # no minor or major ticks
                 else:
@@ -2453,49 +2453,55 @@ class LogLocator(Locator):
         # Get decades between major ticks.  Include an extra tick outside the
         # lower and the upper limit: QuadContourSet._autolev relies on this.
         if mpl.rcParams["_internal.classic_mode"]:  # keep historic formulas
-            stride = max(math.ceil((nd - 1) / (nt - 1)), 1)
+            stride = max(math.ceil((nticks_avail - 1) / (nticks_request - 1)), 1)
             decades = np.arange(emin - stride, emax + stride + 1, stride)
         else:
             # Find the largest number of ticks, no more than the requested
-            # number, that can actually be drawn (e.g., with 9 decades ticks,
-            # no stride yields 7 ticks).  For a given value of the stride *s*,
-            # there are either floor(nd/s) or ceil(nd/s) ticks depending on the
-            # offset.  Pick the smallest stride such that floor(nd/s) < nt,
-            # i.e. nd/s < nt+1, then re-set nt to ceil(...) if acceptable, else
-            # to floor(...) (which must then equal the original nt, i.e. nt is
-            # kept unchanged).
-            stride = nd // (nt + 1) + 1
-            nt1 = math.ceil(nd / stride)
-            if nt1 <= nt:
-                nt = nt1
+            # number, that can actually be drawn. For a given value of the stride *s*,
+            # there are either floor(nticks_avail/s) or ceil(nticks_avail/s) ticks
+            # within the limits, depending on the offset.
+            #
+            # Pick the smallest stride such that the total ticks [floor(nticks_avail/s)]
+            # is less than the requested ticks [nticks_request],
+            # i.e. nticks_avail/s < nticks_request+1, then re-set nticks_avail
+            # to ceil(...) if acceptable, else to floor(...) (which must then equal
+            # to the original nticks_request, i.e. nticks_request is kept unchanged).
+            stride = 1 + nticks_avail // (nticks_request + 1)
+            nt1 = math.ceil(nticks_avail / stride)
+            if nt1 <= nticks_request:
+                nticks_request = nt1
             else:
-                assert nt1 == nt + 1
-            if nt == 0:
+                assert nticks_request == nt1 - 1
+
+            if nticks_request == 0:
+                # Return two ticks outside limits
                 decades = [emin - 1, emax + 1]
                 stride = decades[1] - decades[0]
-            elif nt == 1:  # Draw single tick close to center.
+            elif nticks_request == 1:
+                # Draw a single tick close to center.
                 mid = round((efmin + efmax) / 2)
                 stride = max(mid - (emin - 1), (emax + 1) - mid)
                 decades = [mid - stride, mid, mid + stride]
             else:
-                # Pick the largest s that yields this actual nt (e.g., with 15
+                # Calculate the stride
+                #
+                # Pick the largest s that yields this reqested nticks (e.g., with 15
                 # decades, strides of 5, 6, or 7 *can* yield 3 ticks; picking a
-                # larger stride minimizes unticked space at the ends).  First
-                # try for
-                #     ceil(nd/s) == nt, i.e. nd/nt <= s < nd/(nt-1),
-                # else fallback to
-                #     floor(nd/s) == nt, i.e. nd/(nt+1) < s <= nd/nt.
+                # larger stride minimizes unticked space at the ends).
+                #
                 # One of these cases must have an integer solution (given the
                 # choice of nt above).
-                stride = (nd - 1) // (nt - 1)
-                if stride < nd / nt:  # fallback to second case
-                    stride = nd // nt
-                # For a given s, once including an offset off (0 <= off < s),
-                # the actual number of ticks is ceil((nd - off) / s), which
-                # must be equal to nt.  This leads to olo <= off < ohi, with
-                # the values defined below.
-                olo = max(nd - stride * nt, 0)
-                ohi = min(nd - stride * (nt - 1), stride)
+                stride = (nticks_avail - 1) // (nticks_request - 1)
+                if stride < nticks_avail / nticks_request:  # fallback to second case
+                    stride = nticks_avail // nticks_request
+                # Calculate a good offset
+                #
+                # For a given stride, once including an offset off (0 <= off < stride),
+                # the actual number of ticks is ceil((nticks_avail - off) / s), which
+                # must be equal to nticks_request. This leads to olo <= off < ohi,
+                # with the values defined below.
+                olo = max(nticks_avail - stride * nticks_request, 0)
+                ohi = min(nticks_avail - stride * (nticks_request - 1), stride)
                 # Try to see if we can pick an offset so that ticks are at
                 # integer multiples of the stride while satisfying the bounds
                 # above; if not, fallback to the smallest acceptable offset.
@@ -2508,7 +2514,7 @@ class LogLocator(Locator):
         # anything other than 1.
         is_minor = len(subs) > 1 or (len(subs) == 1 and subs[0] != 1.0)
         if is_minor:
-            if stride == 1 or nd <= 1:
+            if stride == 1 or nticks_avail <= 1:
                 # Minor ticks start in the decade preceding the first major tick.
                 ticklocs = np.concatenate([
                     subs * b**decade for decade in range(emin - 1, emax + 1)])

dstansby · 2025-03-12T19:53:53Z

lib/matplotlib/tests/test_ticker.py

+
+
+@mpl.style.context('default')
+def test_loglocator_properties():


It's not super obvious to me reading the code what this is testing - would be good to get a comment explaining at the top here.

dstansby · 2025-03-12T19:54:40Z

lib/matplotlib/tests/test_ticker.py

+        if not (decades % stride == 0).all():
+            for offset in range(0, stride):
+                alt_decades = range(lo + offset, hi + 1, stride)
+                assert len(alt_decades) < len(decades) or len(alt_decades) > numticks


I think I worked out what everything above was testing, but I'm stumped by this - what behaviour is this testing?

dstansby · 2025-03-14T21:07:05Z

lib/matplotlib/ticker.py

    def tick_values(self, vmin, vmax):
        if self.numticks == 'auto':
            if self.axis is not None:
-                numticks = np.clip(self.axis.get_tick_space(), 2, 9)
+                nt = np.clip(self.axis.get_tick_space(), 2, 9)


Trying to underestand the logic below, I kept having to flick back and forth to where nd and nt were defined to remind myself what they were. So I'm very pro giving them more descriptive names. Perhaps nt could become nticks_requested, and nd become nd become nticks_max?

anntzer · 2025-03-15T11:27:44Z

Renamed to n_request and n_avail (which seems like a decent compromise vs. the longer "nticks_foo"), and extended the comments in the tests.

dstansby

Thanks for updating the variable names 👍

timhoffm · 2025-03-18T14:25:03Z

This could get a short what's new note. While not being configurable, we may well advertize improvements. We also prevent possible confusion through "matplotlib behaves differently, but I don't see any relevant changes in the release notes".

Rewrite the logic for subsampling log-scale ticks, so that 1) there's as many ticks as possible without going over numticks (don't draw a single tick when two would fit); 2) the ticks are spaced as much as possible (when two ticks are drawn, put them on the lowest and highest decade, not somewhere midway); 3) if possible, draw ticks on integer multiples of the stride (0, 3, 9 is better than 1, 4, 7). See comments in implementation for details. For the backcompat-oriented "classic-mode", remove the "stride >= numdec" check (which is a much later addition). Test changes: - test_small_range_loglocator was likely previously wrong (the assert did not cover the values of ticks, and all iterations of the loop ended up testing the same condition). Rewrite it fully (in particular to cover the new implementation). - changes because tick_values() now returns exactly one major ticks outside of the (vmin, vmax) range on both sides: TestLogLocator, test_axes::test_auto_numticks_log, test_colorbar::test_colorbar_renorm. - test_colorbar::test_colorbar_autotickslog's tested tick values didn't match what actually got drawn, because rendering the colorbar (here simulated via draw_without_rendering) would make the (first) colorbar slightly smaller (via _ColorbarAxesLocator) and change the available tick space. Furthermore, the second colorbar was wrongly drawn with 3 ticks at [-12, 0, +12] when there's only space for 2; this arose because np.log(1e±12)/np.log(10) = ±11.999... which led to a wrong stride calculation; the new code explicitly uses log10, which is more accurate, when possible, and correctly yields ticks at [-12, 12]. - test_contour::test_contourf_log_extension: Log-normed contour extension remains a bit iffy, because it is implemented by asking a LogLocator for ticks within data range, and then including an extra tick below and above (see also the comment regarding _autolev in LogLocator.tick_values()). This is not really optimal, e.g. that extra tick can be quite far below or above the actual data limits (depending on the stride). Perhaps a better solution would be to just extend the range to the nearest decade, or even just accept that colorbar ticks (like axes ticks) don't necessarily reach the very end of the data range but can stop slightly below. In the meantime, I chose to simply force the contour levels so that the test results matches the old output.

anntzer · 2025-03-18T23:23:37Z

Sure, added whatsnew.

anntzer added the topic: ticks axis labels label Mar 2, 2025

anntzer force-pushed the logticks branch from b35786f to 73bb542 Compare March 2, 2025 22:16

timhoffm reviewed Mar 3, 2025

View reviewed changes

anntzer marked this pull request as draft March 3, 2025 12:21

rcomer linked an issue Mar 3, 2025 that may be closed by this pull request

[Bug]: LogFormatter minor ticks with minor_thresholds of (0,0) does not behave as documented #25896

Open

anntzer force-pushed the logticks branch from 73bb542 to 34fa95b Compare March 3, 2025 13:35

timhoffm reviewed Mar 4, 2025

View reviewed changes

lib/matplotlib/ticker.py Outdated Show resolved Hide resolved

anntzer force-pushed the logticks branch from 34fa95b to 58bba7b Compare March 6, 2025 10:20

rcomer removed a link to an issue Mar 6, 2025

[Bug]: LogFormatter minor ticks with minor_thresholds of (0,0) does not behave as documented #25896

Open

anntzer force-pushed the logticks branch from 58bba7b to 667abd1 Compare March 7, 2025 16:47

anntzer force-pushed the logticks branch from 667abd1 to 8a80e1b Compare March 10, 2025 10:39

anntzer marked this pull request as ready for review March 10, 2025 10:39

timhoffm approved these changes Mar 10, 2025

View reviewed changes

anntzer mentioned this pull request Mar 11, 2025

[Bug]: LogFormatter minor ticks with minor_thresholds of (0,0) does not behave as documented #25896

Open

dstansby reviewed Mar 14, 2025

View reviewed changes

anntzer force-pushed the logticks branch from 8a80e1b to c86320e Compare March 15, 2025 11:28

dstansby approved these changes Mar 17, 2025

View reviewed changes

anntzer force-pushed the logticks branch from c86320e to bc94215 Compare March 18, 2025 23:23

timhoffm merged commit c887ecb into matplotlib:main Mar 19, 2025
38 of 41 checks passed

QuLogic added this to the v3.11.0 milestone Mar 19, 2025

anntzer deleted the logticks branch March 19, 2025 06:52



		@mpl.style.context('default')
		def test_loglocator_properties():

Uh oh!

Improve tick subsampling in LogLocator. #29698

Improve tick subsampling in LogLocator. #29698

Uh oh!

Conversation

anntzer commented Mar 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR summary

PR checklist

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rcomer commented Mar 3, 2025

Uh oh!

anntzer commented Mar 4, 2025

Uh oh!

timhoffm commented Mar 4, 2025

Uh oh!

Uh oh!

anntzer commented Mar 6, 2025

Uh oh!

timhoffm commented Mar 7, 2025

Uh oh!

anntzer commented Mar 7, 2025

Uh oh!

anntzer commented Mar 10, 2025

Uh oh!

timhoffm left a comment

Choose a reason for hiding this comment

Uh oh!

anntzer commented Mar 10, 2025

Uh oh!

dstansby left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

anntzer commented Mar 15, 2025

Uh oh!

dstansby left a comment

Choose a reason for hiding this comment

Uh oh!

timhoffm commented Mar 18, 2025

Uh oh!

anntzer commented Mar 18, 2025

Uh oh!

Uh oh!

Uh oh!

anntzer commented Mar 2, 2025 •

edited

Loading