np.float128 is a confusing name #10288


Open

eric-wieser opened this issue Dec 28, 2017 · 35 comments

Comments

@eric-wieser (Member) commented Dec 28, 2017

It implies the IEEE 754R 128-bit float, but in practice is typically whatever long double is on the platform, which #10281 shows can sometimes be other types.

@seberg (Member) commented Dec 28, 2017

I wouldn't be against adding a DeprecationWarning and then making it a VisibleDeprecationWarning a bit later, without any real plan to remove it, just to educate users that longdouble (and possibly checking its size) is really much more to the point (mostly for education, but at some point it might open up other things, or somewhat free up the name for a drop-in replacement in some sense).

Even then, if we do it, we might want to help out/check upstream a bit before doing it (or at least when they complain/notice it in their test suites).

@njsmith (Member) commented Dec 28, 2017 via email

@ahaldane (Member) commented:

There's even np.float80, which we carefully account for in many places. However, I'm not yet aware of any platform where it can actually exist; Intel has only ever used float96 (i386) or float128 (x86_64) for storage, as I understand.

@charris (Member) commented Dec 28, 2017

> Intel has only ever used float96 (i386) or float128 (x86_64) for storage, as I understand.

It comes down to 4- or 8-byte alignment for the 10-byte number. The first is on 32-bit systems, the second on 64-bit systems. I think that is pretty standard.

EDIT: (3 × 4 bytes) × 8 = 96 bits > 80; (2 × 8 bytes) × 8 = 128 bits > 80.

@mhvk (Contributor) commented Dec 30, 2017

I've been fooled by that. It would seem better if the number of bits in the name reflected the number actually used in the computation rather than the storage size (which one can already get via dtype.itemsize or dtype.alignment). Even better would be to actually have a true float128, even if slow.
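For concreteness, a minimal check of that mismatch (assuming an x86-64 Linux build, where np.float128 is the 80-bit Intel extended type padded to 16 bytes; the exact numbers are platform-dependent):

import numpy as np

ld = np.dtype(np.longdouble)            # the type spelled np.float128 on this platform
print(ld.itemsize)                      # 16 -> the "128" in the name is storage bits
print(ld.alignment)                     # 16 on x86-64 (4 on i386, where itemsize is 12)
print(np.finfo(np.longdouble).nmant)    # 63 -> only an 80-bit format is actually used
print(np.finfo(np.longdouble).eps)      # ~1.08e-19, vs ~1.9e-34 for IEEE binary128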

@aragilar (Contributor) commented Jan 9, 2018

A suggestion would be to explicitly reserve the float(32,64,128,...) names for the corresponding IEEE 754 floating point types (there appears to be work to add this to C, which gcc and glibc support). It would make sense to add a visible generic warning (not deprecation) using float128/96 when it corresponds to a non-IEEE 754 type.

I've got a branch https://github.com/aragilar/numpy/tree/add_FloatN_new which adds the binary128 type (a.k.a. float128 or quad), using the new C types or libquadmath (it includes binary32/64 so I could check whether problems were due to conflicts between binary128 and longdouble or to introducing new types; I plan on dropping them before submitting the PR). Support for non-gcc/glibc systems (or for those who want to avoid libquadmath) will need to be added (it's on my todo list, but unlikely to happen before May), and there are some bugs with complex support which I haven't fixed (I don't need complex support for the project I'm working on, which is why I haven't fixed the bugs). I plan to submit the PR and send out an email on the mailing list about the change in May (as I'm submitting my PhD in April), but if someone wants to finish cleaning up the code they can go ahead. I'll submit some of the helper code that I wrote as separate PRs, but I suspect that because of how much code adding quad touches, it'll need to be one big patch.

@charris (Member) commented Jan 9, 2018

@aragilar Good to hear. The coming of true float128 is going to happen, or rehappen -- VAX Fortran and SPARC had it -- and it would be good to get ahead of the curve. What is missing from the high precision types is BLAS and LAPACK support, but I suspect that will also come along at some point. Hopefully the system libraries will support the usual sin, cos, and so forth.

I would also like to be able to use true float128 for time; it would solve a lot of problems ...

@charris (Member) commented Jan 9, 2018

There is an old discussion (but not the oldest) here. One of the suggestions there is to rename the extended float types float80_96 and float80_128. That omits IBM double double, however. I think we may just want to introduce a new quad type to avoid the hassle of deprecating current float128 uses.

@njsmith (Member) commented Jan 9, 2018

The guarantee for float128 has always been "well, it's something, it definitely takes 128 bits of memory, but who can say what the precision is?". So mayyyyybe we could just swap it to IEEE754 quad?

@matthew-brett (Contributor) commented:

I've hated the name for ages, as y'all may know, but there are surely people using float128 out there, and not suffering a massive performance penalty, because in fact they're getting 80-bit float, and maybe even 64 bit float computation. If we switch them so they are getting full IEEE 128 bit without due warning, that may cause some shocks.

My vote is still to deprecate float128 (if that's possible), and give IEEE 128-bit another name, maybe float128ieee or float128ie3. I worry that quad sounds compiler-dependent - I guess it isn't? I'd be perfectly happy to drop the float96 / float128 names in favor of longdouble.

@seberg (Member) commented Jan 9, 2018

Give it a FutureWarning for at least a release, better a year, and then make the "just switch it because there was no real guarantee" argument, maybe. Just switching right away on the basis of "there was no guarantee" seems unnecessary.
IIRC FutureWarnings are always visible, so it would also serve as a UserWarning for people expecting IEEE 754 quad.
If we implement the dtype earlier (than the finished FutureWarning cycle), we will have the quad name as an alias in any case and can switch over later.

EDIT: I guess float96 should be deprecated, though it can be given more time and is probably much less used anyway.
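A hypothetical sketch of how such an alias-level FutureWarning could be wired up with a module-level __getattr__ (PEP 562); this is not actual NumPy code, and the mapping below is assumed purely for illustration:

import warnings
import numpy as np

_ALIASES = {"float128": np.longdouble}   # hypothetical mapping for the legacy name

def __getattr__(name):
    # Warn whenever the legacy alias is looked up, but keep returning it.
    if name in _ALIASES:
        warnings.warn(
            f"{name} is the platform long double, not IEEE binary128; "
            "consider np.longdouble instead",
            FutureWarning, stacklevel=2)
        return _ALIASES[name]
    raise AttributeError(f"module has no attribute {name!r}")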

@matthew-brett (Contributor) commented:

About the guarantee - float128 / float96 has always been longdouble - so no guarantee what that is across platforms / compilers, but an implicit, never-broken assumption within a platform / compiler. And even across compilers, in practice it's fixed for a given platform (there was a difference between MSVC and certain mingw builds for a while, but those mingw builds are really uncommon and have been for a while).

Following the rule that we are allowed to break old code with suitable warnings, but not change the behavior of old code silently (unless it was a bug) - I think we should not re-use float128 for something different. OK, it's a precision change, but it may also be a huge performance change, which will be surprising.

@pv (Member) commented Jan 9, 2018 via email

@mhvk (Contributor) commented Jan 9, 2018

The numpy names are relatively easy, but what do we do with the corresponding type strings? Ideally, we can use dtype('f16') to always give quad precision.
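For reference, on current NumPy the 'f16' typestring simply resolves to the platform longdouble (a quick check, assuming a build where long double is stored in 16 bytes):

import numpy as np

print(np.dtype('f16'))                              # dtype('float128')
print(np.dtype('f16') == np.dtype(np.longdouble))   # True on such a build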

@ahaldane (Member) commented Jan 9, 2018

The more official names of the IEEE types are "binary32", "binary64", "binary128". I don't see the word "quad" used in the IEEE 754 doc: http://ieeexplore.ieee.org/document/4610935/.

By the way, in my draft code to fix the IBM float128 printing (which I haven't put up yet) I'm currently using suffixes like "IEEE_binary64" for these types. For other types I am using "Intel_extended96", "Intel_extended128", "Motorola_extended96", "IBM_double_double" and so on. I may need to be more precise for the IBM types, since IBM also supports something that might be called "IBM_hexadecimal64", and "IBM_hexadecimal_double_double".

@charris (Member) commented Jan 9, 2018

> what do we do with the corresponding type strings?

I think that the one letter typestrings will need to be replaced at some point.

@eric-wieser (Member, author) commented:

Could perhaps add an np.floats namespace to contain all the various float types, rather than cluttering the main namespace.

@eric-wieser (Member, author) commented:

> what do we do with the corresponding type strings?

f8[ieee] or similar? There's already precedent for parametrized typestrings (datetime64 uses 'M8[ns]'), and having the size in the typestring is useful for inspecting struct offsets.
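To illustrate why keeping the byte size visible in the typestring matters, a small sketch (the field names here are arbitrary):

import numpy as np

# With the size in the typestring, struct offsets are easy to read off:
dt = np.dtype([('a', '<f4'), ('b', '<f16')])
print(dt.itemsize)           # 20 -> 4 + 16 bytes, packed
print(dt.fields['b'][1])     # 4  -> offset of 'b' follows directly from '<f4'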

@njsmith (Member) commented Jan 9, 2018

Type strings may have also ended up in .npy files. What happens right now if you make an .npy file with np.float128 values in it?

@mhvk (Contributor) commented Jan 10, 2018

np.save('a.npy', np.arange(2., dtype=np.float128))
!head -1 a.npy
...{'descr': '<f16', 'fortran_order': False, 'shape': (2,), }
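So the typestring does end up in the .npy header, and '<f16' there means "this platform's 16-byte float", not necessarily IEEE binary128. For completeness, a sketch of reading the header back with NumPy's public format helpers (assuming the version 1.0 format written above):

import numpy as np

with open('a.npy', 'rb') as f:
    version = np.lib.format.read_magic(f)                            # e.g. (1, 0)
    shape, fortran_order, dtype = np.lib.format.read_array_header_1_0(f)
print(dtype)   # dtype('float128') on the machine that wrote the file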

@mhvk (Contributor) commented May 30, 2019

Cross-linking to #7647, which asked explicitly for float128 support.

@aragilar - above you wrote you had an implementation. I'd love to see that used!

The comments above seem to have been mostly about names. My 2¢ is to just replace float128 by the version that fully uses that precision - the main argument against is that some code may become slow, which I think is a small price to pay for ending the confusion we have right now.

@aragilar (Contributor) commented:

I had half of an implementation (in that it supported Linux, not Windows): I wrapped quadmath/glibc's _Float128, which needs to be rebased. I looked at finding support for other compilers (icc is really poorly documented in this respect); there is http://www.jhauser.us/arithmetic/SoftFloat.html which provides different precisions and is BSD licensed. It doesn't provide trig or special functions though, so those will require implementation (there exists code to do this floating around on the internet, e.g. http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Website/articles/CUJ/1996/9602/prince/prince.htm, but the bigger challenge will be finding implementations licensed in such a way that numpy can use them).

My PhD is still ongoing, so I won't have time in the foreseeable future to get this merged. But if people want to make it easier to get quad precision (or other floating-point types, such as double-double or quad-double) working in numpy, a good first step would be moving/rewriting/rearranging the type code so that there's a clear divide between native types (float, double, long double) and size/format-based types (float32, float64), and between where each is used (they're currently mixed everywhere); that mixing made conversion and type checking more difficult than it needed to be.

@seberg (Member) commented May 31, 2019

Frankly, if this is tricky to get working on most platforms, or to get fairly feature complete, it might be a better option to start developing such a dtype outside of numpy proper. This should already be reasonably possible with the currently available framework, which will hopefully get more powerful in the foreseeable future.

@mhvk (Contributor) commented May 31, 2019

@seberg - I fear you may well be right, but sadly that probably will mean we won't have it for another decade...

This is partially why I was suggesting replacing float128 outright - i.e., stick with the inconsistent state we're in, but, for one compiler and processor/OS at a time, move towards treating them as proper quad precision. This is much easier than working outside numpy, as one can use the infrastructure already in place. It also doesn't break anybody's code, as it just increases precision, though quite possibly it does make code slower.

And obviously if done as carelessly as I would do it, this would end up getting rid of the long doubles stored in 16 bytes, which one may or may not consider a loss (I guess possibly it could be a compile-time option).

@charris (Member) commented May 31, 2019

I think I started complaining about this around 2008 :) With modern hardware and compiler support coming online, it seems the time is ripe to figure out a solution. My own preference is a quad precision type for the IEEE standard, something like quad128, that is only available when there is support. Theoretically we only support IEEE floats, but I think practicality also requires double double, so maybe add ibm128 as well. Then float128 can simply be considered indeterminate, but will probably settle down to IEEE extended precision over the long term. I think SPARC quad precision is compatible with the IEEE standard, but am not sure about that.

@charris (Member) commented May 31, 2019

One problem will be with our type numbering and single-letter form ("q"), so if we retain that, the type will need more information attached. This case might be worth exploring as part of the new type system design.

EDIT: The problem is not when using float128 on a single machine, it is using it across platforms and in pickled data.

@seberg (Member) commented May 31, 2019

I think the main question right now should be what we do with the current float128 name (or if we do anything with it). We could also just put a DeprecationWarning around everything that spells float128 pointing to the alternative spellings and pointing out that quad128 may be what the user wants.

Developing things outside numpy does not seem too bad to me. It might be slower, but there is no real issue with it (yes the kind char is annoying, also because it only makes much sense for our own types right now in any case). And even if it has quirks, all the ufunc/casting code, the biggest chunk of work, could be merged into numpy at any time.

Of course in either case the question is about priorities and time...

@matthew-brett (Contributor) commented:

My vote would be:

  • float128 with a deprecation warning
  • float80_96 and float80_128 for 80-bit Intel floats, as permanent names
  • quad128 for 128 bit IEEE type
  • float128 removed from namespace at some distant time.

@mhvk (Contributor) commented May 31, 2019

I like the naming suggestions, especially as one goes from np.single to np.double to np.quad (where we could introduce the latter only when almost all systems support it).

That still leaves the string versions to be decided: 'f4' and 'f8' are clear, but what should 'f16' point to? Eventually, or immediately, to quad?

@charris (Member) commented May 31, 2019

This is sounding like the start of an NEP where some of the details can be ironed out.

@seberg (Member) commented May 31, 2019

It also leaves open how tricky it will be to deprecate the actual name ;). I think we could possibly pull off simply not having any f16, at least until true quad becomes standard. These shorthands are nice, but they are not strictly necessary?

Yeah, Chuck is right, we should make this an NEP if we want to continue down the line...

@aarchiba (Contributor) commented:

I have looked into this in the past but have not had the resources to figure out numpy's type system. But the quaternions and the initial 16-bit float implementations demonstrate that you can have new dtypes in importable modules. So a quick start would be to implement IEEE binary128 in a separate package, which would then be available for use immediately.

Such a package could also implement double-double precision based on one of the adequately-licensed libraries that are out there: these are usually faster than software binary128. Some of the libraries also offer a quad-double (that is, the implicit sum of four doubles) for even higher precision at modest additional cost. I suspect double-doubles (or maybe even quad-singles) would be faster on GPU hardware too.
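For a flavor of what double-double means, here is a minimal pure-Python sketch of the error-free "two-sum" building block (the value is kept as an unevaluated sum of two float64s; real libraries build full arithmetic on top of this):

def two_sum(a, b):
    # Knuth's two-sum: s is the rounded float64 sum, err the exact rounding error,
    # so s + err represents a + b exactly.
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

hi, lo = two_sum(1.0, 1e-20)
print(hi, lo)   # 1.0 1e-20 -- the low word keeps the bits float64 drops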

The stumbling block is really to understand how to implement a new dtype in numpy so that it can work without too many surprises. (For example, losing precision on using np.cos, or on printing.)

I can also add, having tested it on a Raspberry Pi (armhf), that when C long doubles are the same as doubles, np.double==np.longdouble and np.float96 and np.float128 simply don't exist. (I can't guarantee that this is also true on exotic platforms like MSVC, as I have no access to them.)
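A quick way to check which situation a given build is in (a sketch; the printed values depend on the platform ABI):

import numpy as np

print(np.dtype(np.longdouble).itemsize)   # 8 where long double == double, 12 or 16 elsewhere
print(np.finfo(np.longdouble).nmant)      # 52 there, 63 for Intel extended, 112 for IEEE quad
print(hasattr(np, 'float128'))            # False on builds where long double is 8 bytes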

@charris (Member) commented Sep 15, 2019

AArch64 (ARM64) has quad precision, so I expect it will become more common in the not too distant future.

@aarchiba (Contributor) commented:

> AArch64 (ARM64) has quad precision, so I expect it will become more common in the not too distant future.

If I understand correctly, on aarch64 (which apparently the Raspberry Pi has the hardware to run so there are going to be users out there) long double is 128 bits, so (again if I understand correctly) np.longdouble will actually be quad precision without any action on numpy's part. I'm not clear on whether there is hardware support.

@charris (Member) commented Sep 15, 2019

AArch64 has 16 quad precision registers, but I don't know the details of the implementation. It also has support for half precision floats (float16). Looks like Power9 also has support for quad precision.
