Drop use of pytz dependency in next major release #10443

Closed
@pganssle

Currently matplotlib defaults to using pytz for time zone support. I think this should be dropped in favor of dateutil.tz, which supports everything pytz does and more.

Case for dateutil

For starters, dateutil is already a dependency, so this removes rather than replaces a dependency. Additionally, the time zones dateutil provides properly implement the tzinfo protocol, while pytz uses a non-standard API that is famously confusing.

As of dateutil==2.6.0, dateutil also supports the fold attribute introduced in PEP 495, and the Python 3.6 documentation currently recommends it for this reason. It is unlikely that pytz will implement fold support, as that's not really how pytz works.
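To make the fold point concrete, here is a minimal sketch, assuming Python 3.6+ and dateutil>=2.6.0. The date and variable names are chosen only for illustration; tz.enfold and tz.datetime_ambiguous are dateutil helpers for working with ambiguous wall-clock times.

```python
from datetime import datetime, timedelta
from dateutil import tz

NYC = tz.gettz('America/New_York')

# 01:30 on 2016-11-06 occurs twice in New York, because DST ends that night.
first = datetime(2016, 11, 6, 1, 30, tzinfo=NYC)  # fold=0: first occurrence (EDT)
second = tz.enfold(first, fold=1)                 # fold=1: second occurrence (EST)

print(tz.datetime_ambiguous(first))
# True

print(first.utcoffset() == timedelta(hours=-4))
# True

print(second.utcoffset() == timedelta(hours=-5))
# True

# The two wall-clock-identical times are one hour apart in UTC.
gap = second.astimezone(tz.tzutc()) - first.astimezone(tz.tzutc())
print(gap)
# 1:00:00
```

With pytz, the same disambiguation goes through the is_dst argument of localize rather than through the datetime's own fold attribute.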

I'll also say that most people seem to think that pytz already works like dateutil does, which causes all kinds of problems. An example:

import pytz
from dateutil import tz
from datetime import datetime, timedelta

NYC = tz.gettz('America/New_York')
NYC_pytz = pytz.timezone('America/New_York')

dt_du = datetime(2016, 4, 2, tzinfo=NYC)
dt_pt_bad = datetime(2016, 4, 2, tzinfo=NYC_pytz)   # Wrong!
dt_pt = NYC_pytz.localize(datetime(2016, 4, 2))

print(dt_du)
# 2016-04-02 00:00:00-04:00

print(dt_pt_bad)
# 2016-04-02 00:00:00-04:56

print(dt_pt)
# 2016-04-02 00:00:00-04:00

print(dt_du - timedelta(days=60))
# 2016-02-02 00:00:00-05:00

print(dt_pt - timedelta(days=60))
# 2016-02-02 00:00:00-04:00

print(NYC_pytz.normalize(dt_pt - timedelta(days=60)))
# 2016-02-01 23:00:00-05:00

Generally matplotlib handles all this correctly internally, so this change only breaks backwards compatibility insofar as users are actually taking the time zone object they get from matplotlib and using it for something. Those users are getting an object with a famously confusing API (every time I talk about this, people tell me they've been doing it wrong, and even people who know how pytz works in general often don't understand the details well).

Case against dateutil

In favor of pytz, I'll say that it's a very "light" dependency in the sense that it's extremely widely used and is a hard dependency of other libraries like pandas that matplotlib users are likely to be installing anyway.

Additionally, pytz is currently faster than dateutil, and I think it does a (somewhat) better job of memoizing function calls. This is something that I'm actively working on in dateutil, and I've already closed the gap in many areas (particularly dateutil.tz.tzutc(), which is very useful).

Summary:

In favor of dateutil:

  1. Standard tzinfo interface
  2. Support for fold
  3. Already a dependency

In favor of pytz:

  1. Faster
  2. Keeping it wouldn't break backwards compatibility

I'll also note that I'm only suggesting that matplotlib stop using pytz as its default timezone provider, not that all support for pytz be dropped. Users should still be able to supply datetime objects with any valid tzinfo and have matplotlib work properly.
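As a rough illustration of that last point, the sketch below assumes matplotlib.dates.date2num converts timezone-aware datetimes to a common UTC-based number regardless of which tzinfo implementation they carry. The chosen instant and variable names are only for illustration.

```python
from datetime import datetime, timezone
from dateutil import tz
from matplotlib.dates import date2num

NYC = tz.gettz('America/New_York')

# The same instant expressed with two different tzinfo implementations.
local = datetime(2016, 4, 2, 12, 0, tzinfo=NYC)         # 12:00 EDT (dateutil)
utc = datetime(2016, 4, 2, 16, 0, tzinfo=timezone.utc)  # 16:00 UTC (stdlib)

# Both should map to the same position on a matplotlib date axis.
same_instant = bool(date2num(local) == date2num(utc))
print(same_instant)
```

A pytz-localized datetime for the same instant would be expected to behave the same way, since only the UTC offset of each datetime matters here.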
