-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
ENH: new date formatter #10841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: new date formatter #10841
Conversation
It may be that I misunderstand something here, but shouldn't the date format be determined though the rcParams |
The current formatters are:
and which one gets used is set by logic inside This PR takes into account surrounding ticks and changes the format accordingly. For instance, its extra info to repeat It would make sense for this PR to have a way to specify the format of the individual date elements - there are already placeholders for that. i.e. self._yearfmt = '%Y'
self._monthfmt = '%b'
self._dayfmt = '%d'
# etc... but I've not added RC params or kwargs yet. It would of course be necessary to get the old behaviour back, either by leaving it the default, or by leaving instructions: locator = mdates.LegacyAutoDateLocator()
formatter = mdates.LegacyAutoDateFormatter(locator)
ax.yaxis.set_major_locator(locator)
ax.yaxis.set_major_formatter(formatter) |
I see. Just note that it is very convenient to set the preferred format once through the rcParams and have every plot the way you want it. So if that is going to change, intruducing a parameter which reverts back would definitely be useful, |
@ImportanceOfBeingErnest That makes sense to me. It sounds like we are also moving to a "startup-file" like startup sequence, so folks will be able to put slightly more sophisticated recipes in their new |
This seems to be something used downstream in ObsPy (e.g., on waveform plots). I have not looked at or compared either implementation though. |
@QuLogic I think ObPy just makes the first tick have more info, but doesn't try to be intelligent about |
abe1ea1
to
d99a23c
Compare
Working through the example, I think my question may be more of what resolution does the formatter pick up? fig, ax = plt.subplots()
t0 = datetime.datetime(2009, 1, 20)
tf = datetime.datetime(2009, 1, 21)
ax.axvspan(t0, tf, facecolor="blue", alpha=0.25)
ax.set_xlim(t0 - datetime.timedelta(days=5), tf + datetime.timedelta(days=5))
ax.vlines([datetime.datetime(2009,1,17,5) +
datetime.timedelta(hours=x) for x in range(0,200,25)], 0, 1)
ax.set_ylim(0,1)
plt.show() |
@story645 It returns: Note this doesn't change the locator (which chooses the ticks), it sets the formatter. |
Actually, coming back to the comment about localization mentioned during the call:
A quick search did not yield any way to obtain localized separators or formats (especially "date without year") though :/ Not saying this should be a blocker, but just raising the issue. |
I'm fine w/o the dashes. Easy change. I'm personally against any formats that don't go year month day. As someone who writes hand written logs, nothing drives me nuttier than seeing "12-11-10" in a log book I know was written in November 2010, even though I would totally say 12 November 2010 in speech; written and spoken speech and scientific notation on a plot can/should be different. |
@jklymak thanks! This only works with the major locator? (And I know that's probably a question about the autoformatter in general...) |
Ummm... I think you can pass whatever formatter you want: ax.yaxis.set_minor_formatter(formatter) I think to make nicely formatted labels for minor date ticks would be a challenge given that they are already too big for normal axes sizes! I think having a decent minor locator would be a fun challenge, but one thing at a time ;-) |
This was largely met w/ approval on the call to turn this into @ImportanceOfBeingErnest I'll still try to figure out the best API for getting the old behaviour back, and ping you for your opinion. |
Since my opinion on this is being asked for: I would not change the name. In view of that I could imagine to allow the rcParam Concerning the new formatter: I'm not sure if I grasp this correctly. Does it not allow to specify any custom formats for individual components like days, minutes, years etc, or is it still not decided? As the overall aim of the new formatter seems to be to have nicely looking plots I could imagine to name it |
Thanks - I'll certainly make it so the current behaviour is achievable with rcParams. As written the new formatter does not allow you to pass different formats for the elements. It'd be nice if it did. My tendancy would be to try to make it fit into one rcParam versus six or seven, but I'd have to think about whether parsing a format string is the best way to do that or passing a dict. Kind of needs @anntzer new rcParam methodology in place to see how that'll work 😉 |
8d3014d
to
3718266
Compare
26ff858
to
c59b5a4
Compare
in future versions of Matplotlib. Please report any issues or | ||
suggestions for improvement to the github repository or mailing list. | ||
|
||
First, this is the default formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this line be put after imports&data and right before the figure? It seems a bit odd here by itself.
|
||
############################################################################# | ||
# If all calls to axes that have dates are to be made using this converter, | ||
# it is probably most convenient to use the units reigistry where you do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo:
# it is probably most convenient to use the units reigistry where you do | |
# it is probably most convenient to use the units registry where you do |
|
||
############################################################################# | ||
# The default date formater is quite verbose, so Matplotlib has an | ||
# alternative that can be chosen with a few extra lines of code. Note, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should mention what the alternative is here; it may be in the code, but that doesn't mean it's obvious what's important.
# manipulating one of three lists of strings. | ||
# | ||
# The ``formatter.formats`` list of formats is for the normal tick labels, | ||
# There are seven levels, years, months, days, hours, minutes, seconds, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# There are seven levels, years, months, days, hours, minutes, seconds, and | |
# There are seven levels: years, months, days, hours, minutes, seconds, and |
# ticks that are "zeros". These are tick values that are either the first of | ||
# the year, month, or day of month, or the zeroth hour, minute, or second. | ||
# These are usually the same as the format of | ||
# the ticks a level above. e.g. if the axis limts mean the ticks are mostly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reflow.
Also, "e.g." cannot start a sentence; use "For example".
typo: limts
self._formatter = DateFormatter(fmt, self._tz) | ||
tickdate = np.array([num2date(tick).timetuple()[:6] | ||
for tick in ticks]) | ||
tickdatetime = [num2date(tick) for tick in ticks] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you flip the order of these two lines, you won't need to call num2date
twice and create duplicate datetime
objects, I think?
tickdatetime = [num2date(tick) for tick in ticks] | |
tickdatetime = [num2date(tick) for tick in ticks] | |
tickdate = np.array([tick.timetuple()[:6] | |
for tick in tickdatetime]) |
lib/matplotlib/dates.py
Outdated
else: | ||
fmt = fmts[level] | ||
else: | ||
# special handling for sconds + microseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# special handling for sconds + microseconds | |
# special handling for seconds + microseconds |
# now loop through and decide the actual ticklabels | ||
zerovals = [0, 1, 1, 0, 0, 0, 0] | ||
ticknew = [] | ||
for nn in range(len(tickdate)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These could be zip
d, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, you don't seem to need nn
: for something, something_else in zip(tickdate, tickdatetime):
(I'm just not sure what to name those two.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, but I find that actually makes me have to read the code more than what this does. However, to make things better, I got rid of the append
and pre-allocate the list of string first (and hence need the index to know which string to populate).
Return the :class:`~matplotlib.units.AxisInfo` for *unit*. | ||
|
||
*unit* is a tzinfo instance or None. | ||
The *axis* argument is required but not used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, well... Thats just copied over from DateConverter
😉
return sts | ||
|
||
d1 = datetime.datetime(1997, 1, 1) | ||
results = ([datetime.timedelta(weeks=52 * 200), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using test parametrization seems the better option here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but all the other similar tests are written this way, and I', not going to change all of them, particularly given my lack of comfort with parameterized tests...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a fairly straightforward rearrangement of the for
-loop, essentially:
@pytest.mark.parametrize(
't_delta expected',
# Here put the list/tuple of things that was in `results`.
)
def test_concise_formatter(t_delta, expected):
d1 = datetime.datetime(1997, 1, 1)
d2 = d1 + t_delta
# And then put the rest of `_create_auto_date_locator`, but without the function.
You could do more fancy things with ids
or pytest.param
, but it's not really required here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't seem any more readable to me. I guess there is some advantage to parameterizing tests from the point of view of running them all rather than having them fail sequentially, but....
c59b5a4
to
ec5fc34
Compare
Thanks @QuLogic - one of your comments I didn't grok, and a couple of them I think can be left to future efforts (parameterizing the tests).... |
9b7966b
to
211fc06
Compare
Pinging for this to be merged... |
locator_unit_scale = 1.0 | ||
ticks = self._locator() | ||
if pos is not None: | ||
if not np.array_equal(ticks, self._oldticks): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I get this correctly, _oldticks
/_oldlabels
is an optimization to avoid recomputing all the ticks when the formatter gets called with pos=0, pos=1, pos=2, etc.
But that also means that if one assigns to formats
/offset_formats
/zero_formats
, the labels don't get invalidated, so the assignment has no effect... until one changes the axes limits enough to make the ticks change.
I guess the options are either to make formats
/... properties such that assigning to them invalidates _oldlabels
, or also keep track of _oldformats
, or use some helper that goes through functools.lru_cache(1)
, or just not worry about the performance cost of doing the formatting multiple times.
(FWIW I think the formatter API is not optimal here; transforming tick positions to tick labels should be done as a single method call that takes all tick positions at the same time and returns all labels.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it works w properties. OTOH, the API is not what I'd have expected. The "setter" doesn't get called if I do something like:
formatter.format[0] = '%y'
and instead you must do:
fmt = formatter.format
fmt[0] = '%y'
formatter.format = fmt
which I guess makes sense, but is not what I think would happen if it was just a bare class variable.
So, should I just track the old formats? Thats a bit ugly, and at that point I'm not convinced its not just as fast to reformat everything for each call
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I failed to notice the problem with mutability, which means you need to track the old formats (in a local copy, as the lists can get modified in place...).
I agree with just redoing the formatting n times for now and change that if someone ever comes up with a use case where the performance cost matters...
The setter doesn't get called as foo.bar[0] = baz
is equivalent to tmp = foo.bar; tmp[0] = baz
so you call the getter, not the setter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the other thing we could do is add set_formats
, get_formats
etc and make self.formats
private. Then it will be clear that you have to pass the whole list of formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that just seems to make the API more awkward to work around what may not even be a real performance issue.
On 2018/11/29 10:05 AM, Antony Lee wrote:
I agree with just redoing the formatting /n/ times for now and change
that if someone ever comes up with a use case where the performance cost
matters...
That strikes me as unspeakably ugly... Surely there is a clean and
simple way to do it once.
|
It's not particularly hard (keep track of the various foo_formats in copies and see whether things have been invalidated); the question is just whether the whole thing is worth it. Of course the datetime formatters already break the underlying assumption that formatters operate independently of locators and format each value independently of the others: the formatter proposed by this PR needs to call (n times... even with the fix above) |
ab328bd
to
46305a8
Compare
Added the check. It wasn't the hardness of it that I was worried about, it was just the performance. But the algorithm itself is a bit of a pig, so I think its worth adding the checks. |
2c13f9f
to
3cd7b6c
Compare
FIX: allow non-unit aware tickers FIX: make str list length more strict DOC FIX: add kwargs to ConciseDateConverter FIX: fix cursor formatter FIX: cleanup loop logic FIX: cleanup append
3cd7b6c
to
a76c70f
Compare
Rebased. I think this is OK to merge assuming it all passes CI |
This has two approvals - would someone do the honours and merge it? |
Let's roll. |
Thanks @anntzer (ahem, particularly as you don't use date times very much 😉) |
PR Summary
Update 22 Sep 2018: New localization mechanism.
This can now be localized. Its a bit verbose because there are basically 3*7 = 21 formatting strings. 7 for the normal ticks, 7 for the ticks that are major time divisions (i.e. if ticks are 22 Mar, 1 Apr, 8 Apr, the tick labels will be "22", "Apr", "8". ). I think the default is good, and ISO compliant. The verbose lists of strings allow localization with a bit of work. I include an example of how to create a trivial subclass to encapsulate that customization. Maybe someone more clever than me can come up with a better way to do this.
..its described in detail at https://13776-1385122-gh.circle-artifacts.com/0/home/circleci/project/doc/build/html/gallery/ticks_and_spines/date_concise_formatter.html#sphx-glr-gallery-ticks-and-spines-date-concise-formatter-py
[x] still needs tests
ping @anntzer and @ImportanceOfBeingErnest who were both interested in how this would localize.
Update 4 July 2018
Renamed
ConciseDateFormatter
and included examples of how to use it.I think mechanisms or dicussion of whether it shoudl be the defautl can be deferred. Certainly its nice to be able to set the date format manually as pointed out by @ImportanceOfBeingErnest below, and the current method allows this, whereas this Formatter has no localization like that.
...
This is a proposed new date formatter. I find the Automatic one less than useful, though I appreciate the functional flexibility of the default one.
I'd argue the default should be something like this, but apprecaite we are conservative on such matters.
This isn't ready for serious review attention yet, but thoughts on the API or the result are welcome...
I'll note a couple fo flaws (in my opinion) of the default locator. As per #9801,
interval_multiples=True
should be the default. Secondly, it would be nice if the locator only located the first and 15th of the month rather than 1, 15, 29.EDIT: added extra info in the "offset" region of the axes.
Before:
After:
PR Checklist