-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
WIP: ENH: autodecode pandas timestamps #10638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: ENH: autodecode pandas timestamps #10638
Conversation
lib/matplotlib/dates.py
Outdated
@@ -430,6 +430,9 @@ def date2num(d): | |||
For details see the module docstring. | |||
""" | |||
|
|||
if hasattr(d, "to_pydatetime"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment that this is to give pandas a special treatment here? (and below)
lib/matplotlib/dates.py
Outdated
@@ -430,6 +430,9 @@ def date2num(d): | |||
For details see the module docstring. | |||
""" | |||
|
|||
if hasattr(d, "to_pydatetime"): | |||
d = d.to_pydatetime() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are unnecessarily losing efficiency here; to avoid that, use the same 2-part conditional here as you use in get_converter. That way, if the input is basically datetime64 it can be handled by your nice, efficient, fully vectorized converter instead of making a round-trip through the clunky datetime.datetime machinery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can pandas go to datetime64? Ity my understanding that the underlying data type is datetime64[ns], but I don't know what attribute to test for to get that to come out...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, duh, I see...
caa6de5
to
333b2ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
I don't have a PR or branch yet (was planning to do that this evening). I was thinking to go the way to check for the dtype
, but anyway, I think there are multiple options that are not necessarily better or worse.
lib/matplotlib/dates.py
Outdated
if hasattr(d, "values"): | ||
d = d.values | ||
if hasattr(d, "to_pydatetime"): | ||
d = d.to_pydatetime() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this one is needed, as date2num
already handles Timestamp objects (them being datetime.datetime subclasses):
In [82]: date2num(pd.Timestamp('2012-01-01'))
Out[82]: 734503.0
In [85]: date2num(datetime.datetime(2012, 1, 1))
Out[85]: 734503.0
752e403
to
16ac8b7
Compare
lib/matplotlib/units.py
Outdated
@@ -154,6 +154,10 @@ def get_converter(self, x): | |||
# DISABLED cached = self._cached.get(idx) | |||
# DISABLED if cached is not None: return cached | |||
|
|||
if hasattr(x, "values"): | |||
# this unpacks pandas datetime objects... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is actually much more general than that; it is returning the underlying ndarray from a pandas Series or DataFrame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
16ac8b7
to
cecf1f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's give it a try.
I do wonder if this should be in 2.2, but I guess pandas is eager to drop matplotlib as an unconditional dependency? |
It changes behavior for base mpl only in the case where an object with a ".values" attribute is supplied. The only danger I see there is that someone might have an ndarray subclass with a ".values" attribute that does something other than return the ndarray. So, it's fairly safe for mpl. It could be made even safer by checking to ensure .values really has returned an ndarray, but I doubt this will be necessary, at least for now. I can imagine that it might have an undesired effect when pandas is in use, by always calling .values instead of simply passing the pandas object on, possibly to be handled by a registered converter that would be looking for the original object. It would be good to have a pandas person look at it from that standpoint. I don't know whether pandas does, or would, register such a converter. |
cecf1f5
to
873addf
Compare
Oooops, yes, the old version will bypass the pandas converter, beacuse I put the check in the wrong spot (not where you told me to). I think that was a mistake born of another mistake. This newly pushed version still works w/ the above example, and has the converter after pandas should have chosen a converter... |
873addf
to
59bbccb
Compare
I will take a closer look tomorrow from the pandas side. But, if pandas registers converters, it registers the same converters for both pandas.Timestamp as datetime.datetime and np.datetime64. So I don't think we have to worry about bypassing the pandas converter, if it is registered. It might be good to add a few tests for this (I suppose you can directly use pandas there to actually test it?) |
Yes, we can use pandas, though test suggestions welcome. |
@jorisvandenbossche Though this has been merged, it really should get some tests. happy to help with a PR to get those in. |
Backport PR #10638 on branch v2.2.x
PR Summary
attempt at #10533 after @efiring 's suggestion.
The following now works. (needs a recent pandas for the deregister step to work).
ping @jorisvandenbossche - if you have a more complete or canonical PR, feel free to push instead of this....
PR Checklist