ENH: add variable epoch #15008

jklymak · 2019-08-08T04:57:57Z

PR Summary

EDIT: 9 April 2020:

The default epoch is now "1970-01-01T00:00:00". This has microsecond resolution for dates within 70 years on either side of 1970. A more recent epoch would likely be better, but this is the UNIX standard, and is hence defensible. (per @efiring: #15008 (review))

There is a set_epoch function and an rcParam date.epoch, both of which take datetime64-compatible strings. set_epoch must be called before the epoch is used; otherwise a sentinel is tripped to prevent confusion (per @ImportanceOfBeingErnest and @tacaswell #15008 (comment)).

Who will get tripped up by this? Anyone who stored their data as floats in the old epoch. But that is relatively easy to fix:

neword = oldord + mdates.date2num(np.datetime64('0000-12-31'))
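The size of that offset can be checked without Matplotlib. The sketch below is only an illustration using the standard library; it assumes old-style datenums follow the `datetime.date.toordinal` convention (0001-01-01 is day 1, so the old epoch is 0000-12-31), and the helper name `old_to_new_datenum` is hypothetical.

```python
import datetime

# Days from the old epoch (0000-12-31) to the new default epoch (1970-01-01).
# toordinal() counts 0001-01-01 as day 1, matching the old datenum convention.
OLD_TO_NEW_OFFSET = -datetime.date(1970, 1, 1).toordinal()  # -719163


def old_to_new_datenum(oldord):
    """Shift a float stored relative to the old epoch onto the 1970 epoch."""
    return oldord + OLD_TO_NEW_OFFSET


# Noon on 2000-01-01 expressed in the old convention.
old = datetime.date(2000, 1, 1).toordinal() + 0.5
print(old_to_new_datenum(old))  # 10957.5 days after 1970-01-01
```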

EDIT: 14 Oct 2019:

So, big question is if we want to change the default epoch, and how we can do that w/o breaking many users....

EDIT: Updated 9 Aug, 2019

Allow a variable epoch via a mdates.set_epoch('2001-01-01') so matplotlib dates can be relative to that instead of 0001-01-01. This allows folks who insist on driving down to micro-seconds a bit more leeway to do so precisely.

Note I'm also deprecating epoch2num and num2epoch. I don't know why they are there, and they seem to refer to 1970 epochs, which is of course not the only epoch you can have, and isn't terribly relevant to python dates.

[To come: I think it's totally possible to change the locators to get down to whatever resolution the user wants. How the formatters handle that is another issue, but using datetime64 and a close enough epoch we can get pretty small. But this PR still improves things for the current situation]

i.e.

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np 

x = np.arange('2000-01-01T00:00:00.0', '2000-01-01T00:00:00.000100',
              dtype='datetime64[us]')
y = np.arange(0, len(x))
fig, ax = plt.subplots(constrained_layout=True)
ax.plot(x, y)
ax.set_title('Epoch: ' + mdates.get_epoch())
plt.setp(ax.xaxis.get_majorticklabels(), rotation=40)
plt.show()

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np 

mdates.set_epoch('1999-01-01')

x = np.arange('2000-01-01T00:00:00.0', '2000-01-01T00:00:00.000100',
              dtype='datetime64[us]')
y = np.arange(0, len(x))
fig, ax = plt.subplots(constrained_layout=True)
ax.plot(x, y)
ax.set_title('Epoch: ' + mdates.get_epoch())
plt.setp(ax.xaxis.get_majorticklabels(), rotation=40)
plt.show()

Addresses: #7138

Note, this also gets rid of the restriction that matplotlib times must be greater than 1, because it no longer uses the datetime.toordinal function, but rather just specifies the epoch in numpy.datetime64. Of course this might mean folks will plot data far off where they think they want it to go, but...
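To illustrate where the old restriction came from, here is a small standard-library sketch. It is a stand-in only: the PR itself uses numpy.datetime64, whereas this example uses `datetime` plus `timedelta`, and the names `EPOCH` and `num_to_datetime` are hypothetical.

```python
import datetime

# Ordinals start at 1 (0001-01-01), so 0 and below are rejected.
try:
    datetime.date.fromordinal(0)
    ordinal_zero_ok = True
except ValueError:
    ordinal_zero_ok = False

# Offsetting from an explicit epoch has no such lower bound near zero.
EPOCH = datetime.datetime(1970, 1, 1)


def num_to_datetime(days):
    """Interpret a float as days relative to EPOCH (may be zero or negative)."""
    return EPOCH + datetime.timedelta(days=days)


print(ordinal_zero_ok)        # False
print(num_to_datetime(0.5))   # 1970-01-01 12:00:00
print(num_to_datetime(-1.0))  # 1969-12-31 00:00:00
```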

PR Checklist

Has Pytest style unit tests
Code is Flake 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

examples/ticks_and_spines/date_precision_and_epochs.py

lib/matplotlib/dates.py

ImportanceOfBeingErnest · 2019-08-09T01:25:50Z

I find it pretty dangerous that depending on when you set epoch you may get totally different outcomes and potentially inconsistent results. Especially if you think about some package that you import setting the epoch - a nightmare to debug.

jklymak · 2019-08-09T01:41:46Z

You shouldn't get "totally different outcomes", unless you mix datetime with floats in your code (and you change the epoch). Of course you get different outcomes if one of them is broken, but that's what this is meant to fix.

You can always check the epoch with mdates.get_epoch() if something funky starts going on.

If we want sub-millisecond plotting (which you kind of want for zooming) then this is the way to do it.

The ultimate problem is that the original choice was to make the epoch year 0001 which is too far away for modern dates to get much floating point resolution.
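The resolution argument can be made concrete with a rough sketch. It assumes dates are stored as float64 days since the epoch and uses `math.ulp` (Python 3.9+) to get the spacing between adjacent floats; the helper name `resolution_us` is made up for this illustration.

```python
import math

US_PER_DAY = 86400 * 1_000_000  # microseconds in one day


def resolution_us(days_since_epoch):
    """Gap between adjacent float64 values at this day count, in microseconds."""
    return math.ulp(float(days_since_epoch)) * US_PER_DAY


# 2000-01-01 as a day count from the old epoch and from 1970-01-01.
old_epoch_days = 730120.0
unix_epoch_days = 10957.0

print(resolution_us(old_epoch_days))   # about 10 microseconds
print(resolution_us(unix_epoch_days))  # well under 1 microsecond
```

With the year-0001 style epoch, two samples a few microseconds apart in 2000 can land on the same float, which is consistent with the broken axis in the first example plot.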

ImportanceOfBeingErnest · 2019-08-09T02:41:46Z

Clearly, it's not the way to do it; it's one way to do it. Pandas uses a totally different approach, which is also non-ideal and subject to a lot of confusion. But e.g. the case from above works out of the box in pandas.

import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange('2000-01-01T00:00:00.0', '2000-01-01T00:00:00.000100',
              dtype='datetime64[us]')
y = np.arange(0, len(x))

df = pd.DataFrame({"XX" : x, "YY" : y})
fig, ax = plt.subplots(constrained_layout=True)
df.plot(x="XX", y="YY", ax=ax)  # draw on the axes created above
plt.show()

Concerning totally different outcomes, what I mean is that you may e.g. (re)set epoch too early, like

x = np.arange('2000-01-01T00:00:00.0', '2000-01-01T00:00:00.000100',
              dtype='datetime64[us]')
y = np.arange(0, len(x))

mdates.set_epoch('1999-01-01')
fig, ax = plt.subplots(constrained_layout=True)
ax.plot(x, y)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S.%f\n%Y-%m-%d"))

mdates.reset_epoch()

plt.show()

resulting in the axes showing dates from when Jesus was born,

In this case it's rather obvious, but as said, some module you import in cell 32 of your notebook may (re)set epoch and it would still affect the plot you started in cell 2 and show in cell 35.

Similarly you may read in some data without having set an epoch, and only set it later before plotting,

x,y = np.loadtxt(..., 
                 converters={0: mdates.datestr2num})

#lots of other stuff

mdates.set_epoch('1999-01-01')
plt.plot(x,y)

tacaswell · 2019-08-09T04:07:39Z

I had the same thought as @ImportanceOfBeingErnest walking home from the train. We probably need to bind the epoch to the axes somehow?

jklymak · 2019-08-09T04:36:49Z

x,y = np.loadtxt(..., 
                 converters={0: mdates.datestr2num})

#lots of other stuff

mdates.set_epoch('1999-01-01')
plt.plot(x,y)

is an interesting one. I didn't know people did that. Binding the epoch to the axes won't help there (and is in fact why the epoch needs to be a global). Is there an advantage to the user using datestr2num instead of one of the standard libraries and keeping their dates as datetime?

What does Pandas do?

I guess one approach would be to sentinel the epoch so that once it's used anywhere, set_epoch throws an error (or a warning and doesn't change the epoch), i.e. you can really only call it before you use anything from dates.py.

jklymak · 2019-08-09T04:59:07Z

I guess also to be clear, this is really not meant to be used willy-nilly. But we do get issues where people complain about microseconds not plotting correctly, and it's kind of a shame because we have chosen a poor epoch for 99% of the people who want microsecond accuracy (not many folks were tracking microseconds back in year 0001). I really expect this to be a seldom used feature.

anntzer · 2019-08-09T08:00:27Z

In #7138 (comment) I suggested making this callable only once, and only before any call to date2num (if a library calls it before you, that's too bad -- we could even explicitly discourage other libraries to call it, suggesting that the use should be reserved to end users). Perhaps having a per-axes epoch would be even better, but that's likely much harder to do properly...

timhoffm · 2019-08-09T11:22:52Z

Perhaps having a per-axes epoch would be even better, but that's likely much harder to do properly...

If you go that way, probably a per-axis epoch ...

jklymak · 2019-08-09T13:27:56Z

A sentinel is easy enough, except it makes writing the tests and docs really hard because I need to change the epoch, which would then expose to users the private method to reset the epoch.

A per-axis epoch is basically a new locator/formatter. Then date2num etc. could become methods on the locator, as would the epoch. We could easily go that route but then the new locators are not backwards compatible with the old ones and we have the rigamarole of making the new ones standard. I.e. I vastly prefer ConciseDateFormatter to the default AutoDateFormatter, but it's hard to see how we will ever make it the default.

anntzer · 2019-08-09T13:36:28Z

The tests could use a private method to reset it, I agree it's less nice for the docs :/

examples/ticks_and_spines/date_precision_and_epochs.py

jklymak · 2019-08-09T16:42:34Z

This new version has a sentinel. So:

import matplotlib.dates as mdates
import numpy as np 
import matplotlib.pyplot as plt 

x = np.arange('2000-01-01T00:00:00.0', '2000-01-01T00:00:20.000100',
              dtype='datetime64[s]')
y = np.arange(0, len(x))
x2 = np.arange('2001-01-01T00:00:00.0', '2001-01-01T00:00:20.000100',
              dtype='datetime64[s]')
y2 = np.arange(0, len(x2))
fig, ax = plt.subplots()
ax.plot(x, y)
mdates.set_epoch('1990-01-01')
ax.plot(x2, y2)

fails w/

  File "testepoch.py", line 13, in <module>
    mdates.set_epoch('1990-01-01')
  File "/Users/jklymak/matplotlib/lib/matplotlib/dates.py", line 253, in set_epoch
    raise RuntimeError('set_epoch must be called before dates plotted.')
RuntimeError: set_epoch must be called before dates plotted.

The docs need to call _reset_epoch, and just have a huge caveat about doing so.

jklymak · 2019-08-11T19:37:57Z

ping @efiring, as having some interest in dates

efiring · 2019-08-11T21:19:19Z

There is a lot here, and I appreciate the work. I am sympathetic to the idea of providing a way to retreat from the unfortunate choices of epoch used by mpl and by Matlab, for that matter. This is a bit like a convention that I started very long ago for my work with shipboard ADCP data: using floating-point time in days (which I call "decimal days") relative to the start of a specified year (which I call the "yearbase"). It also fits in with many satellite-based data products which typically give time in days since the start of 1950. Other data products use other years. Unix uses seconds since the start of 1970. In my understanding, the "epoch" in such a case is the start of the specified year, so noon on January 1 of the epoch year is 0.5 days. I think this is consistent with the usage in https://en.wikipedia.org/wiki/Epoch, for example.

In mpl datenums, noon on January 1 of the year 1 is 1.5, so I would have to say that the real epoch for mpl is December 31 on 1 BCE (since there is no year zero). But this conflicts with the way you are using "epoch", claiming that the mpl default is January 1 of the year 1. This is a recipe for confusion.

Putting aside the question of what the epoch really is, some of the motivations for this change seem contrived, though. Is there a real use case for zooming from dates to microseconds, especially given that all of the date-time formats in question are locally incorrect at the 1-second level because of leap-seconds?

My experience with my decimal day and yearbase also makes me cautious here. I found that I had to be very careful to keep track of the yearbase, adding complexity to the code. There were valid reasons for the design when it was developed, but if I were starting now I would probably pick a single epoch and stick with it.

jklymak · 2019-08-12T00:50:03Z

Well, the other application is people who have floats in seconds since 1970 (for instance). In order to get Matplotlib yeardays they have to use datetime anyway to figure out how many days to add to that number. Or we could just do it for them by letting them set the epoch. The same goes for the example you just gave of yeardays relative to 2017: right now they need to know the offset.
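As a rough illustration of that use case: if the epoch is 1970-01-01 and dates are floats in days, then seconds-since-1970 data only needs a unit change and no offset lookup. The helper name `unix_seconds_to_days` is hypothetical, and leap seconds are ignored here, as they are in Unix time.

```python
import datetime

SECONDS_PER_DAY = 86400


def unix_seconds_to_days(seconds):
    """Convert seconds since 1970-01-01 UTC to fractional days since the same epoch."""
    return seconds / SECONDS_PER_DAY


noon = datetime.datetime(2000, 1, 1, 12, tzinfo=datetime.timezone.utc)
print(unix_seconds_to_days(noon.timestamp()))  # 10957.5
```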

jklymak · 2019-10-01T20:10:27Z

Discussed this briefly on the 1 Oct call. This seems a reasonable way forward.

While I agree that going to sub-seconds loses absolute accuracy, it's the precision you usually want at such small scales.

jklymak · 2019-10-14T20:42:24Z

I don't think the test failures are from this PR, but could be misunderstanding...

jklymak · 2020-04-20T20:44:45Z

Putting back in draft mode to better understand #15008 (comment)

examples/ticks_and_spines/date_precision_and_epochs.py

matplotlibrc.template

timhoffm · 2020-04-26T10:33:55Z

lib/matplotlib/mpl-data/stylelib/classic.mplstyle

@@ -229,6 +229,7 @@ date.autoformatter.hour   : %H:%M:%S
 date.autoformatter.minute : %H:%M:%S.%f
 date.autoformatter.second : %H:%M:%S.%f
 date.autoformatter.microsecond : %H:%M:%S.%f
+date.epoch                : 0000-12-31T00:00:00  # old epoch


I would not add date.epoch to styles because I see it as a functional property not a style property.

Btw. what would happen if I change the style later after having already used date operations?

Styles are just rcParams. The tests are run using the classic style. This is how you tell all the tests to run with the old epoch. The alternative is changing every date test which I'm fine with doing, but there will be a lot of changes.

matplotlib/lib/matplotlib/testing/conftest.py

Line 33 in f091a4d

def mpl_test_settings(request):

That is where we do the setup / clean up if we want to force this in the tests.

I think I agree that this is not something that should be in the style sheets (and probably should be added to the banned list).

OK, blacklisted and changed the tests to use the new epoch.

lib/matplotlib/dates.py

QuLogic

I don't really know too much about datetime in Matplotlib, so these comments are not really about it.

lib/matplotlib/rcsetup.py

lib/matplotlib/tests/test_axes.py

lib/matplotlib/tests/test_dates.py

matplotlibrc.template

lib/matplotlib/dates.py

QuLogic · 2020-04-28T05:41:48Z

Why does the first plot in the example get a broken Axis?

jklymak · 2020-04-28T14:33:03Z

@QuLogic The axis in that example is one of the things this PR is addressing. It gets broken by roundoff error. Please see the example at the top of this PR as well.

QuLogic · 2020-04-28T20:39:31Z

doc/api/api_changes_3.3/deprecations.rst

@@ -487,3 +487,13 @@ of all renderer classes is deprecated.

 `.transforms.AffineDeltaTransform` can be used as a replacement.  This API is
 experimental and may change in the future.
+
+``ColorbarBase`` parameters will become keyword-only


This addition seems accidental.

hmmm. Happened during rebase. Not sure why...

timhoffm

Looks good now. 🎉

Just needs a final rebase.

Co-Authored-By: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>

jklymak · 2020-04-29T14:24:53Z

Rebased. Thanks for everyone's help with this!

tacaswell · 2020-04-29T16:12:44Z

🎉 Thank you everyone!

…ib/matplotlib#15008).

* unpin proj, move cartopy pin forward
* Explicit resolution of coastlines in tests - see SciTools/cartopy#1105.
* Removed gdal optional requirement.
* New target image for test_plot_custom_aggregation due to auto-sizing coastlines in SciTools/cartopy#1105. Twin commit: SciTools/test-iris-imagehash@9b4e50e.
* New acceptable images for various tests due to minor changes in Matplotlib 3.3. Twin commit: SciTools/test-iris-imagehash@3babde5.
* New target images for test plots affected by the gridline spacing change in SciTools/cartopy@2f5e568. Twin commit: SciTools/test-iris-imagehash@f071f2c.
* New acceptable images to allow for minute colormap range changes in Matplotlib 3.3. Twin commit: SciTools/test-iris-imagehash@9f4b04e.
* Improvement to quickplot time axis labelling and accompanying graphics test target changes. Twin commit: SciTools/test-iris-imagehash@3036a6f
* Made quickplot time axis label sensitive to MPL version (see matplotlib/matplotlib#15008).
* More target images following from 41b3b2a. Twin commit: SciTools/test-iris-imagehash@804ff68.
* More target images following from 27ea2f2. Twin commit: SciTools/test-iris-imagehash@f559a36.
* Mirroring _draw_2d_from_points() use of mpl date2num in _draw_2d_from_bounds(). Twin commit: SciTools/test-iris-imagehash@3c582dc.
* Re-instated all valid image targets for TestPlotCoordinatesGiven.test_tx.
* New target image for TestSimple.test_bounds following change to plot axis labelling. Twin commit: SciTools/test-iris-imagehash@770dc92.
* modify raster tests to handle new gdal behaviour
* modify raster tests to handle new gdal naming behaviour
* modify tests to reflect new PROJ behaviour
* modify parts of test_project to use Transverse Mercator
* revert test_project to use Robinson, add warning to docstring
* keep results consistent

Co-authored-by: Martin Yeo <martin.yeo@metoffice.gov.uk>

jklymak added the status: work in progress label Aug 8, 2019

jklymak added this to the v3.3.0 milestone Aug 8, 2019

jklymak added the topic: date handling label Aug 8, 2019

jklymak force-pushed the enh-add-variable-epoch branch from 12f8f99 to 8419417 Compare August 8, 2019 20:03

jklymak removed the status: work in progress label Aug 8, 2019

jklymak changed the title ~~WIP: ENH: add variable epoch~~ ENH: add variable epoch Aug 8, 2019

jklymak marked this pull request as ready for review August 8, 2019 20:12

anntzer reviewed Aug 8, 2019

examples/ticks_and_spines/date_precision_and_epochs.py

anntzer reviewed Aug 8, 2019

lib/matplotlib/dates.py

anntzer reviewed Aug 9, 2019

examples/ticks_and_spines/date_precision_and_epochs.py

jklymak mentioned this pull request Aug 9, 2019

Discuss: Date handling issues and ideas #15018

Closed

jklymak mentioned this pull request Oct 14, 2019

FIX: allow zero and one as dates via wrapping #15416

Closed

6 tasks

jklymak force-pushed the enh-add-variable-epoch branch 2 times, most recently from dd0f7ea to a1ec8a1 Compare October 14, 2019 19:10

pganssle reviewed Apr 20, 2020

examples/ticks_and_spines/date_precision_and_epochs.py

jklymak marked this pull request as ready for review April 20, 2020 22:08

jklymak force-pushed the enh-add-variable-epoch branch from f91b603 to fd4fc70 Compare April 21, 2020 21:06

timhoffm reviewed Apr 26, 2020

jklymak force-pushed the enh-add-variable-epoch branch 2 times, most recently from 1095ba5 to 2ae42f5 Compare April 28, 2020 00:49

QuLogic reviewed Apr 28, 2020

jklymak force-pushed the enh-add-variable-epoch branch 3 times, most recently from 77df9ac to e9fa873 Compare April 28, 2020 19:46

QuLogic reviewed Apr 28, 2020

timhoffm approved these changes Apr 29, 2020

ENH: add ability to change matplotlib epoch

a3fbc49

Co-Authored-By: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>

jklymak force-pushed the enh-add-variable-epoch branch from 1234d70 to a3fbc49 Compare April 29, 2020 14:23

timhoffm merged commit e8c7579 into matplotlib:master Apr 29, 2020

jklymak deleted the enh-add-variable-epoch branch April 29, 2020 15:58

jklymak removed status: needs comment/discussion needs consensus on next step status: needs review labels Apr 29, 2020

QuLogic mentioned this pull request Apr 29, 2020

misplaced spines in dates plot #7138

Closed

QuLogic mentioned this pull request May 7, 2020

Datetime plot fails with 'Agg' backend in interactive mode #15409

Closed

TomAugspurger mentioned this pull request Jul 20, 2020

BUG: register_matplotlib_converters leads to wrong datetime interpretation with matplotlib 3.3 pandas-dev/pandas#35350

Closed

3 tasks

jklymak mentioned this pull request Jul 20, 2020

FIX: undeprecate and update num2epoch/epoch2num #17983

Merged

6 tasks

trexfeathers added a commit to stephenworsley/iris that referenced this pull request Aug 26, 2020

Made quickplot time axis label sensitive to MPL version (see matplotl…

75dfe94

…ib/matplotlib#15008).

stephenworsley mentioned this pull request Sep 2, 2020

Unpin Matplotlib and proj SciTools/iris#3762

Merged


timhoffm Apr 26, 2020

jklymak Apr 27, 2020

tacaswell Apr 27, 2020

jklymak Apr 27, 2020
