Skip to content

FIX: improve date performance regression #18756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 29, 2020

Conversation

jklymak
Copy link
Member

@jklymak jklymak commented Oct 17, 2020

PR Summary

Closes #18743

date2num was slowed down for long lists of datetime. This pr simply tries to convert them to datetime64 before we convert to a num which seems fast.

I think that if we had a list of date times with timezones it would still be quite slow, but this fixes the naive-datetime case. Open to better suggestions..

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.units
import numpy as np
import datetime
import time

# region # synthetic ...
def datetime_range(startDate, endDate, deltaDate):
    current = startDate
    while current < endDate:
        yield current
        current += deltaDate
    return

listOfDates = [date for date in datetime_range(datetime.datetime(2020, 1, 1, 0, 1),
                              datetime.datetime(2020, 12, 31, 23, 59),
                              datetime.timedelta(minutes=1))]

start = time.perf_counter()
mdates.date2num(listOfDates)
stop = time.perf_counter()
print(f'{stop-start:1.2f} s')

Master

11.79 s

This PR:

1.99 s

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and pydocstyle<4 and run flake8 --docstring-convention=all).
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

@jklymak
Copy link
Member Author

jklymak commented Oct 17, 2020

If I add a timezone,

listOfDates = [date for date in datetime_range(datetime.datetime(2020, 1, 1, 0, 1, tzinfo=timezone(-timedelta(hours=2))),
                              datetime.datetime(2020, 12, 31, 23, 59, tzinfo=timezone(-timedelta(hours=2))),
                              datetime.timedelta(minutes=1))]

then its a bit slower: 3.29 s or so, which seems an acceptable slow down.

@jklymak
Copy link
Member Author

jklymak commented Oct 17, 2020

... I don't understand the codecov error - this PR didn't touch the tests...

@jklymak
Copy link
Member Author

jklymak commented Oct 17, 2020

@QuLogic is it possible to add datetime handling to the performance tests?

tzi = getattr(d[0], 'tzinfo', None)
if tzi is not None:
# make datetime naive:
d = [dt.astimezone(UTC).replace(tzinfo=None) for dt in d]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this one statement after @tacaswell's approval (sorry). There were two list comprehensions, and I realized it could just as easily be one!

@jklymak
Copy link
Member Author

jklymak commented Oct 23, 2020

Note, as promised, that there is now a benchmark PR in mpl-benchmark

@QuLogic QuLogic merged commit 9244eca into matplotlib:master Oct 29, 2020
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Oct 29, 2020
@jklymak jklymak deleted the fix-improve-dateconversion branch October 29, 2020 21:41
QuLogic added a commit that referenced this pull request Oct 31, 2020
…756-on-v3.3.x

Backport PR #18756 on branch v3.3.x (FIX: improve date performance regression)
@Alexia-I
Copy link

Hi @QuLogic, I'm intrigued by how the meeseeksmachine was utilized for the backporting process in this PR. Can you please shed some light on how it's triggered? Is it through specific labels or comments, or does it run automatically upon merging?

@rcomer
Copy link
Member

rcomer commented Feb 25, 2024

@Alexia-I here is our documentation on the automatic backports
https://matplotlib.org/devdocs/devel/pr_guide.html#automated-backports

In this case it was triggered by the milestone but it can also be triggered by tagging the bot in a comment.

@Alexia-I
Copy link

@rcomer Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

from mpl 3.2.2 to 3.3.0 enormous increase in creation time
5 participants