Description
Currently matplotlib defaults to using pytz for time zone support. I think that this should be dropped in favor of using dateutil.tz, which supports everything pytz does and more.
Case for dateutil
For starters, dateutil is already a dependency, so this removes rather than replaces a dependency. Additionally, the time zones dateutil provides properly implement the tzinfo protocol, while pytz uses a non-standard API that is famously confusing.
Additionally, as of dateutil==2.6.0, dateutil has support for the fold attribute introduced in PEP 495. It is currently recommended in the Python 3.6 documentation (scroll up a bit, that's the closest anchor) for this reason. It is unlikely that pytz will implement fold support, as that's not really how pytz works.
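To make the fold point concrete, here is a minimal sketch of how an ambiguous wall time can be disambiguated with a dateutil zone. It assumes Python 3.6+ (for the fold keyword on datetime) and dateutil 2.6.0 or later; the variable names are just for illustration.

```python
from datetime import datetime

from dateutil import tz

NYC = tz.gettz('America/New_York')

# 01:30 on 2016-11-06 happens twice in New York, because clocks fall
# back from 02:00 EDT to 01:00 EST. fold selects which occurrence is meant.
first = datetime(2016, 11, 6, 1, 30, tzinfo=NYC)           # fold=0: still EDT
second = datetime(2016, 11, 6, 1, 30, fold=1, tzinfo=NYC)  # fold=1: now EST

print(tz.datetime_ambiguous(first))
# True
print(first)
# 2016-11-06 01:30:00-04:00
print(second)
# 2016-11-06 01:30:00-05:00

# Converting both to UTC shows they are one real hour apart.
gap = second.astimezone(tz.tzutc()) - first.astimezone(tz.tzutc())
print(gap)
# 1:00:00
```

The same tzinfo object is attached in both cases; only the fold attribute on the datetime differs.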
I'll also say that most people seem to think that pytz already works like dateutil does, which causes all kinds of problems. An example:
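import pytz
from dateutil import tz
from datetime import datetime, timedelta

NYC = tz.gettz('America/New_York')
NYC_pytz = pytz.timezone('America/New_York')

# dateutil zones can be passed straight to the datetime constructor
dt_du = datetime(2016, 4, 2, tzinfo=NYC)
# pytz zones cannot: this silently attaches the zone's LMT offset
dt_pt_bad = datetime(2016, 4, 2, tzinfo=NYC_pytz)  # Wrong!
# pytz requires localize() instead
dt_pt = NYC_pytz.localize(datetime(2016, 4, 2))

print(dt_du)
# 2016-04-02 00:00:00-04:00
print(dt_pt_bad)
# 2016-04-02 00:00:00-04:56
print(dt_pt)
# 2016-04-02 00:00:00-04:00

# Arithmetic across a DST boundary: dateutil picks up the new offset
print(dt_du - timedelta(days=60))
# 2016-02-02 00:00:00-05:00
# pytz keeps the stale EDT offset until normalize() is called
print(dt_pt - timedelta(days=60))
# 2016-02-02 00:00:00-04:00
# normalize() fixes the offset, but shifts the wall time by an hour
print(NYC_pytz.normalize(dt_pt - timedelta(days=60)))
# 2016-02-01 23:00:00-05:00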
Generally matplotlib handles all this correctly internally, so this change only breaks backwards compatibility insofar as users are actually taking the time zone they get from matplotlib and using it for something. Those users are currently getting an object with a famously confusing API (every time I talk about this I get people saying they've been doing it wrong, and even people who know how this works in general often don't understand the details well).
Case against dateutil
In favor of pytz, I'll say that it's a very "light" dependency in the sense that it's extremely widely used and is a hard dependency of other libraries like pandas that matplotlib users are likely to be installing anyway.
Additionally, pytz is currently faster than dateutil, and I think it does a (somewhat) better job at memoizing function calls. This is something that I'm actively working on in dateutil, and I've already closed the gap in many areas (particularly dateutil.tz.tzutc(), which is very useful).
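For anyone who wants to check the speed difference on their own installation, here is a rough timing sketch. It is not a rigorous benchmark: the bench helper and the choice of operations are my own assumptions for illustration, and the numbers will depend on the installed versions of both libraries.

```python
import timeit
from datetime import datetime

import pytz
from dateutil import tz

NYC_du = tz.gettz('America/New_York')
NYC_pt = pytz.timezone('America/New_York')

dt_du = datetime(2016, 4, 2, tzinfo=NYC_du)
dt_pt = NYC_pt.localize(datetime(2016, 4, 2))


def bench(func, number=1000):
    """Hypothetical helper: total seconds for `number` calls of func."""
    return timeit.timeit(func, number=number)


results = {
    # Zone construction, where memoization matters most
    'dateutil gettz': bench(lambda: tz.gettz('America/New_York')),
    'pytz timezone': bench(lambda: pytz.timezone('America/New_York')),
    # Offset lookup on an already-constructed aware datetime
    'dateutil utcoffset': bench(dt_du.utcoffset),
    'pytz utcoffset': bench(dt_pt.utcoffset),
}

for name, seconds in results.items():
    print(f'{name}: {seconds:.4f} s')
```

No expected output is shown because the timings vary by machine and library version.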
Summary:
In favor of dateutil:
- Standard tzinfo interface
- Support for fold
- Already a dependency
In favor of pytz:
- Faster
- Wouldn't break backwards compatibility to keep it.
I'll also note that I'm only suggesting that matplotlib stop using pytz as its default timezone provider, not that all support for pytz be dropped. Users should still be able to supply datetime objects with any valid tzinfo and have matplotlib work properly.
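As a sketch of what "any valid tzinfo" means in practice: code that relies only on the standard tzinfo protocol (here just utcoffset()) works with any provider. The to_utc_naive helper and FixedOffset class below are hypothetical and are not matplotlib's actual conversion code; they only use the standard library.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical user-defined zone implementing the tzinfo protocol."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return self._name


def to_utc_naive(dt):
    """Hypothetical helper: convert any aware datetime to naive UTC."""
    offset = dt.utcoffset()  # None for naive datetimes
    if offset is None:
        raise ValueError('expected a timezone-aware datetime')
    return (dt - offset).replace(tzinfo=None)


stdlib_zone = timezone(timedelta(hours=-5), 'EST')
custom_zone = FixedOffset(-5, 'EST')

print(to_utc_naive(datetime(2016, 2, 2, 12, 0, tzinfo=stdlib_zone)))
# 2016-02-02 17:00:00
print(to_utc_naive(datetime(2016, 2, 2, 12, 0, tzinfo=custom_zone)))
# 2016-02-02 17:00:00
```

Because the helper never calls anything provider-specific such as localize() or normalize(), it does not care whether the tzinfo came from pytz, dateutil, or the user.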