float128s everywhere for dates? #7139

Closed
anntzer opened this issue Sep 19, 2016 · 9 comments

Comments

@anntzer
Contributor

anntzer commented Sep 19, 2016

The bad roundings that appear in #7138 (loss of microsecond precision for dates around today) make me wonder whether we should just switch to using float128s everywhere for date-related data.

Time is encoded as seconds since 0001-01-01 UTC, so float64s (53 bits) can keep microsecond precision for a timedelta up to 2**53 / 1e6 / 3.154e7 ~ 285 years (3.154e7s/y), which does not go up to today. On the other hand float128s offer 113 bits of precision, i.e. ~3e20 years -- that should be safe, even if we switch to nanosecond precision :-)
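The arithmetic in the comment above can be reproduced with a short sketch. The helper name `years_at_resolution` is made up for illustration, and the year length is the same 3.154e7 s approximation used in the comment.

```python
SECONDS_PER_YEAR = 3.154e7  # approximation taken from the comment above


def years_at_resolution(significand_bits, resolution_s):
    """Span in years over which steps of `resolution_s` seconds remain
    exactly representable with the given number of significand bits."""
    return 2**significand_bits * resolution_s / SECONDS_PER_YEAR


for bits in (53, 113):
    us = years_at_resolution(bits, 1e-6)
    ns = years_at_resolution(bits, 1e-9)
    print(f"{bits} bits: {us:.3g} years at 1 us, {ns:.3g} years at 1 ns")
```

With 53 bits the microsecond span comes out at roughly 285 years, far short of the ~2000 years between the origin and present-day dates.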

On the other hand I don't actually ever use dates in plots so I don't know if it's really worth it.

@Kojoley
Member

Kojoley commented Sep 19, 2016

Should we ever convert the dates to another precision? It seems logical to me to keep them in whatever format the user provided to the plot. I previously had a lot of pain figuring out the whole num2date/date2num machinery when I made my custom 'cursor'.

@anntzer
Contributor Author

anntzer commented Sep 19, 2016

The problem I can see is that an innocuous date such as 2000-01-01 at 00:00:00.000001 (1 microsecond after midnight), which is perfectly supported by datetime, loses its precision due to matplotlib's internal representation as seconds since 0001-01-01.
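A minimal sketch of that loss, using only the standard library. It assumes the encoding described in this thread (float64 seconds since 0001-01-01), which may not match matplotlib's actual internal representation. The helpers `to_float_seconds` and `from_float_seconds` are hypothetical stand-ins, not matplotlib functions.

```python
from datetime import datetime, timedelta

ORIGIN = datetime(1, 1, 1)  # origin as described in the comment above


def to_float_seconds(dt):
    """Encode a datetime as float64 seconds since ORIGIN."""
    delta = dt - ORIGIN
    total_us = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
    # int / int is rounded once, to the nearest representable float64
    return total_us / 10**6


def from_float_seconds(x):
    """Decode float64 seconds since ORIGIN back into a datetime."""
    return ORIGIN + timedelta(seconds=x)


midnight = datetime(2000, 1, 1)
just_after = midnight + timedelta(microseconds=1)
roundtrip = from_float_seconds(to_float_seconds(just_after))
print(just_after, "->", roundtrip)
```

Near that date, adjacent float64 values are several microseconds apart, so the single microsecond is rounded away and the round trip lands back on midnight.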

But again I'll leave it to people who actually use dates to figure this out.

@tacaswell
Member

The way the unit handling works is that the input data is converted from unitful -> float (which we know how to plot) and then from float -> unitful for the tick labels and mouse-over. None of the backends know what to do with a DateTime object or arrays thereof.
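The unitful -> float -> unitful pipeline can be sketched in plain Python. This is an illustration of the idea only, not matplotlib's converter code: the origin, the day-based scale, and the names `to_float_days` and `tick_label` are all assumptions made for the example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # arbitrary origin chosen for this sketch


def to_float_days(dt):
    """unitful -> float: days since EPOCH, the only thing a backend sees."""
    return (dt - EPOCH) / timedelta(days=1)


def tick_label(x):
    """float -> unitful: turn a tick position back into a readable date."""
    return (EPOCH + timedelta(days=x)).strftime("%Y-%m-%d %H:%M")


dates = [datetime(2016, 9, 19, 0, 0), datetime(2016, 9, 20, 12, 0)]
xs = [to_float_days(d) for d in dates]  # plain floats handed to the plot
labels = [tick_label(x) for x in xs]    # strings shown on the axis
print(xs, labels)
```

Any precision lost in the first step cannot be recovered in the second, which is why the choice of float representation matters.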

IIRC float128 does not exist on every platform, and even where it does it may not actually be 128 bits.

As the pandas folks can tell you dates get painful fast...

@tacaswell tacaswell added this to the 2.0.1 (next bug fix release) milestone Sep 19, 2016
@efiring
Member

efiring commented Sep 19, 2016

Numpy float128 is a strange beast, and I think we should stay away from it. We have more to lose than to gain by changing our datenum internal representation, which is similar to Matlab's (but has a different origin). What we need to do eventually is handle numpy datetime64 in addition to our original datenum, even though numpy datetime64 has its own problems--as does the python standard library datetime.

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.0.1 (next bug fix release) Sep 19, 2016
@efiring
Member

efiring commented Sep 19, 2016

As far as I can see, Pandas made a painful choice in their decision to adopt numpy datetime64[ns], so they have absurd precision and a very small range of years. All this is with a datetime that doesn't know about leap seconds.
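The "very small range of years" follows from datetime64 storing a signed 64-bit integer count of its unit. A rough sketch of that arithmetic, with a made-up helper name and the same 3.154e7 s/year approximation used earlier in the thread:

```python
SECONDS_PER_YEAR = 3.154e7  # same approximation as earlier in the thread
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}


def half_range_years(unit):
    """Approximate reach, in years either side of the origin, of a
    signed 64-bit integer counting the given time unit."""
    return 2**63 * UNIT_SECONDS[unit] / SECONDS_PER_YEAR


for unit in ("ns", "us", "ms", "s"):
    print(f"datetime64[{unit}]: about +/- {half_range_years(unit):.3g} years")
```

For nanoseconds this gives about 292 years on either side of 1970, i.e. roughly the years 1678 to 2262.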

@matthew-brett
Contributor

Yeah, float128 in numpy on Intel is 80 bit precision floats, and, as far as I remember, can be operated on in 64-bit precision. On other platforms it can be a double float (PPC) or real IEEE 128 bit (SPARC I believe). So best avoided completely.
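One way to see what the extended type provides on a given machine is to ask NumPy directly. This sketch uses `np.longdouble`, which is available on every build, whereas the `np.float128` alias exists only on some. The printed values depend on the platform, so none are shown here.

```python
import numpy as np

info = np.finfo(np.longdouble)

# nmant is the number of mantissa bits NumPy reports; adding one gives
# the effective precision (53 for a plain double, 64 for x87 extended).
precision_bits = info.nmant + 1
storage_bits = np.dtype(np.longdouble).itemsize * 8

print("storage bits:  ", storage_bits)
print("precision bits:", precision_bits)
print("same as float64:", precision_bits == np.finfo(np.float64).nmant + 1)
```

Storage width and precision can differ: a type that occupies 128 bits in memory may carry far fewer significant bits.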


@anntzer
Contributor Author

anntzer commented Sep 19, 2016

FWIW 64-bit precision (from 80-bit extended precision floats) is 2048x more than what we have now (64 vs. 53 significand bits), so it should still be enough (but not for nanoseconds).

@matthew-brett
Contributor

Sorry - by 64-bit precision I mean 53 bits in the significand. I believe that is what you get for float128 on Windows. I seem to remember it's 80-bit precision on OSX and Linux.

@efiring
Member

efiring commented Sep 20, 2016

The answer to the question raised in the title is clearly "no"--float128 is a non-starter. A more sensible origin (like the unix epoch) for our legacy datenum would be nice, but I'm afraid the back-compatibility problems are prohibitive.
I think that substantially improving this aspect of our date and time handling is low-priority, and requires a MEP-level design process, if it is worth doing at all. It might involve optionally using Astropy's time, or numpy datetime64, or something else; but that crazy quasi-Matlab float64 datenum is going to be around for a long time.
In the interest of trying to get the number of open issues down I am going to close this now. Anyone who disagrees is of course welcome to reopen it.
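One effect of the origin choice can be illustrated by comparing float64 spacing for the same instant measured from two origins. This is a sketch under the thread's seconds-based description, not a statement about matplotlib's internals, and the `spacing` helper is made up for the example.

```python
import math
from datetime import datetime


def spacing(x):
    """Gap between x and the next larger float64 (for normal, positive x)."""
    _, exponent = math.frexp(x)  # x == m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 53)


when = datetime(2016, 9, 19)
since_year_1 = (when - datetime(1, 1, 1)).total_seconds()
since_unix = (when - datetime(1970, 1, 1)).total_seconds()

print("origin 0001-01-01:", spacing(since_year_1), "s between floats")
print("origin 1970-01-01:", spacing(since_unix), "s between floats")
```

Measured from 1970, adjacent floats around this date are less than a microsecond apart; measured from year 1 they are several microseconds apart.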

@efiring efiring closed this as completed Sep 20, 2016
@Kojoley Kojoley removed this from the 2.1 (next point release) milestone Sep 20, 2016