float128s everywhere for dates? #7139

anntzer · 2016-09-19T20:53:32Z

The bad roundings that appear in #7138 (loss of microsecond precision for dates around today) make me wonder whether we should just switch to using float128s everywhere for date-related data.

Time is encoded as seconds since 0001-01-01 UTC, so float64s (53 bits) can keep microsecond precision for a timedelta up to 2**53 / 1e6 / 3.154e7 ~ 285 years (3.154e7s/y), which does not go up to today. On the other hand float128s offer 113 bits of precision, i.e. ~3e20 years -- that should be safe, even if we switch to nanosecond precision :-)
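The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `years_resolvable` is made up for illustration, and the 3.154e7 s/year figure and the "seconds since origin" encoding are taken from the comment as stated, not checked against matplotlib's source.

```python
SECONDS_PER_YEAR = 3.154e7  # figure used in the comment above


def years_resolvable(significand_bits, resolution_s):
    """Span (in years) over which every multiple of `resolution_s`
    is still exactly representable with the given significand width."""
    return 2.0 ** significand_bits * resolution_s / SECONDS_PER_YEAR


float64_us = years_resolvable(53, 1e-6)     # float64, microseconds
binary128_us = years_resolvable(113, 1e-6)  # IEEE binary128, microseconds
binary128_ns = years_resolvable(113, 1e-9)  # IEEE binary128, nanoseconds

print(f"float64,   1 us: {float64_us:.1f} years")
print(f"binary128, 1 us: {binary128_us:.2e} years")
print(f"binary128, 1 ns: {binary128_ns:.2e} years")
```

The 113-bit figure assumes a true IEEE binary128 type, which is not necessarily what numpy's float128 provides on a given platform.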

On the other hand I don't actually ever use dates in plots so I don't know if it's really worth it.

Kojoley · 2016-09-19T20:58:34Z

Should we convert the dates to another precision at all? It seems logical to me to keep them in whatever format the user provided to the plot. I previously had a lot of pain figuring out the whole num2date/date2num kitchen when I made my custom 'cursor'.

anntzer · 2016-09-19T21:06:26Z

The problem I can see is that an innocuous date such as 2000-01-01 at 00:00:00.000001 (1 microsecond after midnight), which is perfectly supported by datetime, loses its precision due to matplotlib's internal representation as seconds since 0001-01-01.
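A small round trip with the standard library shows the effect. It is a sketch that assumes the encoding described in this thread (float64 seconds since 0001-01-01) and does not call matplotlib itself; `math.ulp` needs Python 3.9 or later.

```python
import math
from datetime import datetime, timedelta

ORIGIN = datetime(1, 1, 1)               # origin assumed from the thread
d = datetime(2000, 1, 1, 0, 0, 0, 1)     # 1 microsecond after midnight

# Encode as float64 seconds since the origin.
seconds = (d - ORIGIN).total_seconds()

# Spacing between adjacent float64 values at this magnitude.
spacing = math.ulp(seconds)

# Decode back to a datetime.
recovered = ORIGIN + timedelta(seconds=seconds)

print("encoded seconds :", repr(seconds))
print("float64 spacing :", spacing, "s")
print("original        :", d.isoformat())
print("recovered       :", recovered.isoformat())
print("round trip ok   :", recovered == d)
```

At this magnitude adjacent float64 values are several microseconds apart, so the single microsecond is rounded away during encoding.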

But again I'll leave it to people who actually use dates to figure this out.

tacaswell · 2016-09-19T21:25:55Z

The way the unit handling works is that the input data is converted from unitful -> float (which we know how to plot) and then from float -> unitful for the tick labels and mouse-over. None of the backends know what to do with a DateTime object or arrays thereof.

IIRC float128 does not exist on every platform, and even where it does it may not actually be 128 bits.

As the pandas folks can tell you dates get painful fast...

efiring · 2016-09-19T21:26:05Z

Numpy float128 is a strange beast, and I think we should stay away from it. We have more to lose than to gain by changing our datenum internal representation, which is similar to Matlab's (but has a different origin). What we need to do eventually is handle numpy datetime64 in addition to our original datenum, even though numpy datetime64 has its own problems--as does the python standard library datetime.

efiring · 2016-09-19T21:29:26Z

As far as I can see, Pandas made a painful choice in their decision to adopt numpy datetime64[ns], so they have absurd precision and a very small range of years. All this is with a datetime that doesn't know about leap seconds.
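For a sense of how small that range is, here is a rough sketch of the arithmetic. It assumes a signed 64-bit count of nanoseconds relative to 1970-01-01 and uses only the standard library, so the end points are truncated to microseconds; the 3.154e7 s/year figure is the one used earlier in this thread.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 3.154e7

# Reach on either side of the epoch for a signed 64-bit tick count.
ns_years = INT64_MAX / 1e9 / SECONDS_PER_YEAR   # nanosecond ticks
us_years = INT64_MAX / 1e6 / SECONDS_PER_YEAR   # microsecond ticks, for comparison

# End points of the nanosecond range, truncated to whole microseconds
# because datetime cannot hold nanoseconds.
reach = timedelta(microseconds=INT64_MAX // 1000)
latest_ns = EPOCH + reach
earliest_ns = EPOCH - reach

print(f"ns ticks: +/- {ns_years:.0f} years, {earliest_ns} .. {latest_ns}")
print(f"us ticks: +/- {us_years:.0f} years")
```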

matthew-brett · 2016-09-19T21:52:52Z

Yeah, float128 in numpy on Intel is an 80-bit extended-precision float and, as far as I remember, can be operated on in 64-bit precision. On other platforms it can be a double-double (PPC) or real IEEE 128-bit (SPARC, I believe). So best avoided completely.


anntzer · 2016-09-19T22:04:15Z

FWIW 64-bit precision (from 80-bit extended-precision floats) is 2048x (2**11) more than the 53 bits we have now, so it should still be enough (but not for nanoseconds).

matthew-brett · 2016-09-19T22:10:18Z

Sorry - by 64-bit precision I mean 53 bits in the significand. I believe that is what you get for float128 on Windows. I seem to remember it's 80-bit precision on OSX and Linux.
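One way to check what a given machine actually provides is to ask numpy directly. This is a sketch, not a portability test: it uses `np.longdouble` rather than `np.float128` because the latter name is not defined on every platform, and the printed values will differ between platforms.

```python
import numpy as np

for name, dtype in [("float64", np.float64), ("longdouble", np.longdouble)]:
    info = np.finfo(dtype)
    # finfo reports mantissa bits as nmant; adding 1 gives the precision
    # in the sense used in this thread (53 for float64).
    precision_bits = info.nmant + 1
    storage_bits = np.dtype(dtype).itemsize * 8
    print(f"{name:10s} storage={storage_bits:3d} bits  "
          f"precision={precision_bits:3d} bits  eps={info.eps}")

longdouble_bits = np.finfo(np.longdouble).nmant + 1
```

Storage size and precision are reported separately here because the two need not match: a type can occupy 128 bits in memory while carrying far fewer significand bits.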

efiring · 2016-09-20T05:15:23Z

The answer to the question raised in the title is clearly "no"--float128 is a non-starter. A more sensible origin (like the unix epoch) for our legacy datenum would be nice, but I'm afraid the back-compatibility problems are prohibitive.
I think that substantially improving this aspect of our date and time handling is low-priority, and requires a MEP-level design process, if it is worth doing at all. It might involve optionally using Astropy's time, or numpy datetime64, or something else; but that crazy quasi-Matlab float64 datenum is going to be around for a long time.
In the interest of trying to get the number of open issues down I am going to close this now. Anyone who disagrees is of course welcome to reopen it.

tacaswell added the topic: date handling label Sep 19, 2016

tacaswell added this to the 2.0.1 (next bug fix release) milestone Sep 19, 2016

tacaswell modified the milestones: 2.1 (next point release), 2.0.1 (next bug fix release) Sep 19, 2016

efiring closed this as completed Sep 20, 2016

Kojoley removed this from the 2.1 (next point release) milestone Sep 20, 2016

jklymak mentioned this issue Aug 9, 2019

Discuss: Date handling issues and ideas #15018

Closed
