ENH: new date formatter #10841

jklymak · 2018-03-19T17:27:57Z

PR Summary

Update 22 Sep 2018: New localization mechanism.

This can now be localized. It's a bit verbose because there are basically 3*7 = 21 formatting strings: 7 for the normal ticks, 7 for the ticks that are major time divisions (e.g. if ticks are 22 Mar, 1 Apr, 8 Apr, the tick labels will be "22", "Apr", "8"). I think the default is good, and ISO compliant. The verbose lists of strings allow localization with a bit of work. I include an example of how to create a trivial subclass to encapsulate that customization. Maybe someone more clever than me can come up with a better way to do this.

It's described in detail at https://13776-1385122-gh.circle-artifacts.com/0/home/circleci/project/doc/build/html/gallery/ticks_and_spines/date_concise_formatter.html#sphx-glr-gallery-ticks-and-spines-date-concise-formatter-py
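As a rough sketch of what such a subclass might look like: the example below assumes the API as it eventually shipped in matplotlib (3.1 and later), where `ConciseDateFormatter` accepts `formats`, `zero_formats`, and `offset_formats` keyword lists. In released versions each list has six entries (years through seconds), not the seven described at this stage of the PR. The class name `DayFirstConciseFormatter` and the specific format strings are made up for illustration.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt


class DayFirstConciseFormatter(mdates.ConciseDateFormatter):
    """Hypothetical subclass bundling day-before-month format strings."""

    def __init__(self, locator, **kwargs):
        super().__init__(
            locator,
            # ordinary tick at each level: year, month, day, hour, minute, second
            formats=["%Y", "%b", "%d", "%H:%M", "%H:%M", "%S.%f"],
            # tick that falls on a larger time division (1 Jan, 1st of month, midnight, ...)
            zero_formats=["", "%Y", "%b", "%d %b", "%H:%M", "%H:%M"],
            # text shown once in the offset region of the axis
            offset_formats=["", "%Y", "%b %Y", "%d %b %Y", "%d %b %Y",
                            "%d %b %Y %H:%M"],
            **kwargs,
        )


fig, ax = plt.subplots()
locator = mdates.AutoDateLocator()
formatter = DayFirstConciseFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
ax.set_xlim(datetime.datetime(2009, 3, 20), datetime.datetime(2009, 4, 10))
fig.canvas.draw()  # forces the tick labels to be computed
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels, formatter.get_offset())
```

Wrapping the lists in a subclass keeps the call site as short as the default case, at the cost of one class per locale or house style.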

[x] still needs tests

ping @anntzer and @ImportanceOfBeingErnest who were both interested in how this would localize.

Update 4 July 2018

Renamed ConciseDateFormatter and included examples of how to use it.

I think mechanisms for, or discussion of, whether it should be the default can be deferred. Certainly it's nice to be able to set the date format manually, as pointed out by @ImportanceOfBeingErnest below, and the current method allows this, whereas this Formatter has no localization like that.

...

This is a proposed new date formatter. I find the Automatic one less than useful, though I appreciate the functional flexibility of the default one.

I'd argue the default should be something like this, but appreciate we are conservative on such matters.

This isn't ready for serious review attention yet, but thoughts on the API or the result are welcome...

I'll note a couple of flaws (in my opinion) of the default locator. As per #9801, interval_multiples=True should be the default. Secondly, it would be nice if the locator only located the 1st and 15th of the month rather than 1, 15, 29.
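To make the "1st and 15th only" idea concrete, here is a small standard-library sketch that generates such tick dates for a range. It is not the locator code; the helper name `first_and_fifteenth` is hypothetical. In matplotlib itself, `mdates.DayLocator(bymonthday=[1, 15])` should be one existing way to request ticks like these explicitly.

```python
import datetime


def first_and_fifteenth(start, end):
    """Return every 1st and 15th of the month between start and end, inclusive."""
    ticks = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        for day in (1, 15):
            candidate = datetime.date(year, month, day)
            if start <= candidate <= end:
                ticks.append(candidate)
        # advance to the next calendar month
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return ticks


ticks = first_and_fifteenth(datetime.date(2009, 1, 5), datetime.date(2009, 3, 21))
print([t.isoformat() for t in ticks])
# ['2009-01-15', '2009-02-01', '2009-02-15', '2009-03-01', '2009-03-15']
```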

EDIT: added extra info in the "offset" region of the axes.

import datetime
import numpy as np
import matplotlib
matplotlib.use('Qt5Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


fig, axs = plt.subplots(2, 3, figsize=(8,8), constrained_layout=True)
axs = axs.flatten()
if False:  # flip to True to plot with the new ConciseDateFormatter (saves 'New.png')
    for ax in axs:
        locator = mdates.AutoDateLocator(interval_multiples=True)
        formatter = mdates.ConciseDateFormatter(locator)
        ax.yaxis.set_major_locator(locator)
        ax.yaxis.set_major_formatter(formatter)
    fname = 'New.png'
else:
    for ax in axs:
        locator = mdates.AutoDateLocator(interval_multiples=True)
        formatter = mdates.AutoDateFormatter(locator)
        ax.yaxis.set_major_locator(locator)
        ax.yaxis.set_major_formatter(formatter)
    fname = 'Old.png'

ax = axs[0]

t0 = datetime.datetime(2009, 1, 20)
tf = datetime.datetime(2009, 1, 21)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(days=5),
            tf + datetime.timedelta(days=5))

ax.set_title('Days', loc='Right')

ax = axs[1]
t0 = datetime.datetime(2009, 1, 20)
tf = datetime.datetime(2009, 3, 21)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(days=15),
            tf + datetime.timedelta(days=15))

ax.set_title('2 Months', loc='Right')

ax = axs[2]
t0 = datetime.datetime(2008, 8, 20)
tf = datetime.datetime(2010, 3, 21)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(days=105),
            tf + datetime.timedelta(days=105))

ax.set_title(' Months/Y', loc='Right')


ax = axs[3]
t0 = datetime.datetime(2005, 8, 20)
tf = datetime.datetime(2015, 3, 21)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(days=1005),
            tf + datetime.timedelta(days=1005))

ax.set_title(' Years', loc='Right')

ax = axs[4]
t0 = datetime.datetime(2009, 8, 20)
tf = datetime.datetime(2009, 8, 21)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(hours=3),
            tf + datetime.timedelta(hours=3))

ax.set_title(' Hours', loc='Right')

ax = axs[5]
t0 = datetime.datetime(2009, 8, 20, 9)
tf = datetime.datetime(2009, 8, 20, 10)
ax.axhspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_ylim(t0 - datetime.timedelta(minutes=3),
            tf + datetime.timedelta(minutes=3))

ax.set_title(' Minutes', loc='Right')


fig.savefig(fname)
plt.show()

Before:

After:

PR Checklist

Has Pytest style unit tests
Code is PEP 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

lib/matplotlib/dates.py

ImportanceOfBeingErnest · 2018-03-20T15:24:50Z

It may be that I misunderstand something here, but shouldn't the date format be determined through the rcParams date.autoformatter.xyz parameters?

jklymak · 2018-03-20T15:56:32Z

@ImportanceOfBeingErnest

The current formatters are:

# date.autoformatter.year     : %Y
# date.autoformatter.month    : %Y-%m
# date.autoformatter.day      : %Y-%m-%d
# date.autoformatter.hour     : %m-%d %H
# date.autoformatter.minute   : %d %H:%M
# date.autoformatter.second   : %H:%M:%S
# date.autoformatter.microsecond   : %M:%S.%f

and which one gets used is set by logic inside dates.py that chooses which interval is most appropriate. But only one formatter gets used for every tick once that formatter is chosen.
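The "one format for every tick" behaviour can be sketched in a few lines of plain Python. This is a simplified illustration, not the logic in dates.py: the format strings are the defaults quoted above, while the spacing thresholds and the names `FORMAT_BY_SPACING` and `single_format_labels` are assumptions.

```python
import datetime

# (minimum tick spacing in seconds, format), coarsest first. The format
# strings are the rcParams defaults quoted above; the thresholds are assumed.
FORMAT_BY_SPACING = [
    (365 * 86400, "%Y"),
    (30 * 86400, "%Y-%m"),
    (86400, "%Y-%m-%d"),
    (3600, "%m-%d %H"),
    (60, "%d %H:%M"),
    (1, "%H:%M:%S"),
]
MICROSECOND_FORMAT = "%M:%S.%f"


def single_format_labels(ticks):
    """Pick one format from the tick spacing and apply it to every tick."""
    spacing = (ticks[1] - ticks[0]).total_seconds()
    fmt = next((f for limit, f in FORMAT_BY_SPACING if spacing >= limit),
               MICROSECOND_FORMAT)
    return [t.strftime(fmt) for t in ticks]


days = [datetime.datetime(2009, 1, d) for d in (19, 20, 21)]
print(single_format_labels(days))
# ['2009-01-19', '2009-01-20', '2009-01-21']
```

Every label repeats the same year and month prefix, which is the clutter the new formatter is meant to remove.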

This PR takes into account surrounding ticks and changes the format accordingly. For instance, it's extra info to repeat 2009-01- for all the ticks in the "Days" example above, or the date in the "Hours" example. Taking into account the surrounding ticks to minimize visual clutter is what this PR does.
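A toy version of that idea, using only the standard library: find the finest date field that varies across the ticks, label ordinary ticks with that field alone, and give ticks that sit on a larger time division (the 1st of a month, midnight, and so on) a coarser label instead. This is a simplified sketch under assumed format lists, not the code in this PR; `concise_labels`, `FORMATS`, and `ZERO_FORMATS` are illustrative names.

```python
import datetime

# Levels: 0 year, 1 month, 2 day, 3 hour, 4 minute, 5 second.
FORMATS = ["%Y", "%b", "%d", "%H:%M", "%H:%M", "%S"]          # ordinary tick
ZERO_FORMATS = ["", "%Y", "%b", "%b %d", "%H:%M", "%H:%M"]     # tick on a larger division
ZERO_VALUES = (None, 1, 1, 0, 0, 0)  # field value that marks a larger division


def concise_labels(ticks):
    fields = [(t.year, t.month, t.day, t.hour, t.minute, t.second) for t in ticks]
    # The finest field that differs between ticks sets the labelling level.
    level = 0
    for candidate in range(5, -1, -1):
        if len({f[candidate] for f in fields}) > 1:
            level = candidate
            break
    labels = []
    for tick, f in zip(ticks, fields):
        on_division = level > 0 and f[level] == ZERO_VALUES[level]
        fmt = ZERO_FORMATS[level] if on_division else FORMATS[level]
        labels.append(tick.strftime(fmt))
    return labels


ticks = [datetime.datetime(2009, 3, 22),
         datetime.datetime(2009, 4, 1),
         datetime.datetime(2009, 4, 8)]
print(concise_labels(ticks))
# ['22', 'Apr', '08']  (%d zero-pads the day)
```

The year and month that no longer appear on each tick are the kind of information that would move to the offset region of the axis.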

It would make sense for this PR to have a way to specify the format of the individual date elements - there are already placeholders for that. i.e.

self._yearfmt = '%Y'
self._monthfmt = '%b'
self._dayfmt = '%d'
# etc...

but I've not added RC params or kwargs yet.

It would of course be necessary to get the old behaviour back, either by leaving it the default, or by leaving instructions:

locator = mdates.LegacyAutoDateLocator()
formatter = mdates.LegacyAutoDateFormatter(locator)
ax.yaxis.set_major_locator(locator)
ax.yaxis.set_major_formatter(formatter)

ImportanceOfBeingErnest · 2018-03-20T16:16:09Z

I see. Just note that it is very convenient to set the preferred format once through the rcParams and have every plot the way you want it. So if that is going to change, introducing a parameter which reverts back would definitely be useful, date.autoformatter.use : old or similar.

jklymak · 2018-03-20T16:22:37Z

@ImportanceOfBeingErnest That makes sense to me.

It sounds like we are also moving to a "startup-file" like startup sequence, so folks will be able to put slightly more sophisticated recipes in their new matplotlibrc.py, like changing the default date handler. However I'm not sure if the architecture of that file is going to just affect rcParams, or arbitrary code. I think the former, so some way for rcParams to specify the DateFormatter is what will be needed, i.e. rcParams['dates.locator']=mdates.LegacyAutoDateLocator

QuLogic · 2018-03-21T01:37:45Z

> This PR takes into account surrounding ticks and changes the format accordingly. For instance, its extra info to repeat 2009-01- for all the ticks in the "Days" example above, or the date in the "Hours" example. Taking into account the surrounding ticks to minimize visual clutter is what this PR does.

This seems to be something used downstream in ObsPy (e.g., on waveform plots). I have not looked at or compared either implementation though.

jklymak · 2018-03-21T13:28:11Z

@QuLogic I think ObsPy just makes the first tick have more info, but doesn't try to be intelligent about which tick gets the extra info. This PR makes, for instance, midnight be labeled by the day, or second zero labeled by HH:MM, or 1 Jan by YYYY, etc. Note I've mildly improved the plot above...

lib/matplotlib/dates.py

story645 · 2018-03-26T19:57:07Z

Working through the example, I think my question may be more of what resolution does the formatter pick up?

fig, ax = plt.subplots()
t0 = datetime.datetime(2009, 1, 20)
tf = datetime.datetime(2009, 1, 21)
ax.axvspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_xlim(t0 - datetime.timedelta(days=5), tf + datetime.timedelta(days=5))
ax.vlines([datetime.datetime(2009,1,17,5) +
           datetime.timedelta(hours=x) for x in range(0,200,25)], 0, 1)
ax.set_ylim(0,1)
plt.show()

jklymak · 2018-03-26T20:11:15Z

@story645 It returns:

Note this doesn't change the locator (which chooses the ticks), it sets the formatter.

anntzer · 2018-03-26T20:19:50Z

Actually, coming back to the comment about localization mentioned during the call:

In the "hours" example, can "Aug-21" be replaced by "Aug 21"? (looks better even in English) (and in general, perhaps just get rid of the dashes?)
in French, that gives "août-21" and "2009-août-20" ("minutes" example), both of which are quite weird in French (I'd say "21 août", "21 août 2009").

A quick search did not yield any way to obtain localized separators or formats (especially "date without year") though :/ Not saying this should be a blocker, but just raising the issue.

jklymak · 2018-03-26T20:30:04Z

I'm fine w/o the dashes. Easy change.

I'm personally against any formats that don't go year month day. As someone who writes handwritten logs, nothing drives me nuttier than seeing "12-11-10" in a log book I know was written in November 2010, even though I would totally say 12 November 2010 in speech; written and spoken speech and scientific notation on a plot can/should be different.
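The ambiguity is easy to demonstrate: the same string parses to three different dates depending on which field order the reader assumes. The snippet below is only an illustration using the standard library.

```python
import datetime

stamp = "12-11-10"
readings = {
    "day-month-year": "%d-%m-%y",
    "month-day-year": "%m-%d-%y",
    "year-month-day": "%y-%m-%d",
}
# Parse the same text under each assumed field order.
parsed = {name: datetime.datetime.strptime(stamp, fmt).date()
          for name, fmt in readings.items()}
for name, date in parsed.items():
    print(f"{name}: {date.isoformat()}")
# day-month-year: 2010-11-12
# month-day-year: 2010-12-11
# year-month-day: 2012-11-10
```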

story645 · 2018-03-26T20:42:13Z

@jklymak thanks! This only works with the major locator? (And I know that's probably a question about the autoformatter in general...)

jklymak · 2018-03-26T20:46:25Z

Ummm... I think you can pass whatever formatter you want:

ax.yaxis.set_minor_formatter(formatter)

I think to make nicely formatted labels for minor date ticks would be a challenge given that they are already too big for normal axes sizes!

I think having a decent minor locator would be a fun challenge, but one thing at a time ;-)

jklymak · 2018-03-27T04:35:00Z

This was largely met w/ approval on the call to turn this into AutoDateFormatter, and move the current AutoDateFormatter to a Mpl22DateFormatter so folks could still access it. But some key folks weren't on the call. @tacaswell and @efiring, or anyone else want to weigh in before I do all the work of changing it?

@ImportanceOfBeingErnest I'll still try to figure out the best API for getting the old behaviour back, and ping you for your opinion.

ImportanceOfBeingErnest · 2018-03-27T09:56:42Z

Since my opinion on this is being asked for: I would not change the name. AutoDateFormatter is the automatic formatter which labels the ticks according to the rules given in the date.autoformatter.xyz rcParams. I would leave it at this, because this is how people use it now and will still want to use it. There are all kinds of useful as well as weird standards for datetime formats and indeed as @anntzer commented, different language backgrounds need different formatting capabilities. They might have set their rcParams up to do the plots they like that way and I think one should not require them to change anything more than possibly add a single additional rcParam to get the old behaviour back.

In view of that I could imagine to allow the rcParam date.formatter : auto to get the AutoDateFormatter.

Concerning the new formatter: I'm not sure if I grasp this correctly. Does it not allow to specify any custom formats for individual components like days, minutes, years etc, or is it still not decided? As the overall aim of the new formatter seems to be to have nicely looking plots I could imagine to name it PrettyDateFormatter. Setting this would then be done via the rcParam date.formatter : pretty, which would also be the default if that is what people agree upon.
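One way to picture the proposed switch is a lookup from a single setting to a formatter. The sketch below is purely hypothetical: the `date.formatter` key and its `auto` / `pretty` values are the names floated in this comment, not existing rcParams, and the two stand-in functions only mimic the old and new labelling styles.

```python
import datetime


def auto_labels(ticks):
    """Stand-in for the existing behaviour: one full format on every tick."""
    return [t.strftime("%Y-%m-%d") for t in ticks]


def pretty_labels(ticks):
    """Stand-in for the proposed behaviour: short labels only."""
    return [t.strftime("%d") for t in ticks]


# Hypothetical registry keyed by the value of a 'date.formatter' setting.
FORMATTERS = {"auto": auto_labels, "pretty": pretty_labels}


def formatter_from_settings(settings):
    name = settings.get("date.formatter", "pretty")  # proposed default
    if name not in FORMATTERS:
        raise ValueError(f"unknown date.formatter value: {name!r}")
    return FORMATTERS[name]


ticks = [datetime.datetime(2009, 1, d) for d in (19, 20)]
print(formatter_from_settings({"date.formatter": "auto"})(ticks))
# ['2009-01-19', '2009-01-20']
print(formatter_from_settings({})(ticks))
# ['19', '20']
```

With this shape, users who have tuned their date.autoformatter.xyz settings would need only one extra line to keep their current output.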

jklymak · 2018-03-27T16:54:33Z

Thanks - I'll certainly make it so the current behaviour is achievable with rcParams.

As written the new formatter does not allow you to pass different formats for the elements. It'd be nice if it did. My tendency would be to try to make it fit into one rcParam versus six or seven, but I'd have to think about whether parsing a format string is the best way to do that or passing a dict. Kind of needs @anntzer's new rcParam methodology in place to see how that'll work 😉

QuLogic · 2018-11-25T23:31:03Z