
gh-130317: fix PyFloat_Pack/Unpack[24] for NaN's with payload #130452


Merged: 26 commits, Apr 28, 2025

@skirpichev skirpichev commented Feb 22, 2025

Previously, (1) NaN payloads were ignored entirely in PyFloat_Pack2()/PyFloat_Unpack2(), and (2) type conversions (float <-> double) in PyFloat_Pack4()/PyFloat_Unpack4() broke the pack-unpack round trip for sNaNs.
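
One way to observe the behaviour described above is to push a NaN bit pattern through the struct module, whose standard-size float formats are, as far as I understand, built on the same pack/unpack routines. This is only a sketch: the helper name `roundtrip` is hypothetical, and whether the bytes come back unchanged depends on the interpreter version and platform, so no output is claimed here.

```python
import struct

def roundtrip(fmt: str, data: bytes) -> bytes:
    """Unpack `data` as a single float with `fmt`, then pack it again."""
    (value,) = struct.unpack(fmt, data)
    return struct.pack(fmt, value)

# Big-endian quiet NaN patterns carrying a non-default payload.
samples = {
    ">e": bytes.fromhex("7ed6"),              # binary16
    ">f": bytes.fromhex("7fd85072"),          # binary32
    ">d": bytes.fromhex("7ffa68f81b552ac1"),  # binary64
}

for fmt, data in samples.items():
    out = roundtrip(fmt, data)
    status = "preserved" if out == data else "changed"
    print(f"{fmt}: {data.hex()} -> {out.hex()} ({status})")
```

A "changed" line for ">e" or ">f" would point at the payload loss this PR addresses.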


@skirpichev skirpichev force-pushed the fix-nan-packing/130317 branch from a6e99f9 to 971fd98 Compare February 23, 2025 03:15
@skirpichev skirpichev marked this pull request as ready for review February 23, 2025 03:53
vstinner

This comment was marked as outdated.

@skirpichev

This comment was marked as outdated.

@skirpichev skirpichev marked this pull request as draft February 25, 2025 06:04
@skirpichev skirpichev changed the title from "gh-130317: fix PyFloat_Pack2/Unpack2 for NaN's with payload" to "gh-130317: fix PyFloat_Pack/Unpack[24] for NaN's with payload" Feb 25, 2025
@skirpichev skirpichev force-pushed the fix-nan-packing/130317 branch from 139501d to e6c1c12 Compare February 25, 2025 08:24
@skirpichev skirpichev marked this pull request as ready for review February 25, 2025 09:12
@skirpichev
Member Author

Ok, I've added new tests in test_capi.test_float (in principle, test_struct's tests are redundant).

The win32 behavior (you can't unset the "quiet" bit for a NaN) looks like a bug to me.

@skirpichev skirpichev requested a review from vstinner February 25, 2025 09:16
@vstinner
Member

The win32 behavior (you can't unset the "quiet" bit for a NaN) looks like a bug to me.

Any idea why there is a bug only on Windows?

@skirpichev
Member Author

Any idea why there is a bug only on Windows?

Only on win32. Though I suspect the situation could be worse. Maybe I should revert the win32 workaround and test this with some buildbots?

C17 says: "This specification does not define the behavior of signaling NaNs." But I don't think this means you can't flip the appropriate bit in a float/double to make a signaling NaN.

@skirpichev skirpichev marked this pull request as ready for review April 27, 2025 13:10
@skirpichev
Member Author

Ok, I think it's ready for review.

The Windows test failed in 32-bit mode (the signaling NaN type is not preserved in the round trip); that seems to be a known issue: https://developercommunity.visualstudio.com/t/155064 (sNaN returned from function becomes qNaN). See also https://developercommunity.visualstudio.com/t/903305 and https://en.delphipraxis.net/topic/12198-delphi-win32-quiets-signaling-nan-on-function-return/.

In principle, it's possible to work around this for the struct module, but not for the C API (PyFloat_Unpack*).

@skirpichev skirpichev requested a review from vstinner April 27, 2025 15:18
@skirpichev skirpichev requested a review from vstinner April 28, 2025 11:57
Member

@vstinner vstinner left a comment

LGTM. I just have some minor comments.

def test_pack_unpack_roundtrip_for_nans(self):
    pack = _testcapi.float_pack
    unpack = _testcapi.float_unpack
    for _ in range(1000):
Member

100 tests should be enough to validate the implementation, no?

Suggested change
-    for _ in range(1000):
+    for _ in range(100):

1000 tests might be a little bit too slow, I don't think that it's worth it.

Member Author

Up to you; the test looks instantaneous on my system: 0.3 sec vs 0.03 sec. Where is the threshold?

Member

I don't think speed is really an issue here, but having the potential for 6000 failed test reports seems like major overkill. I think 10 would actually be plenty for this loop.
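
For context, the 6000 figure presumably comes from 1000 iterations times three sizes times two byte orders. Below is a minimal sketch of how random NaN patterns for such a loop could be generated. It is not the test from this PR: `LAYOUT` and `random_nan_bytes` are hypothetical names, and only integer arithmetic is used so the result does not depend on the platform's floating-point behaviour.

```python
import random

# (exponent bits, mantissa bits) for IEEE 754 binary16/32/64, keyed by size in bytes.
LAYOUT = {2: (5, 10), 4: (8, 23), 8: (11, 52)}

def random_nan_bytes(size: int, signaling: bool, rng: random.Random) -> bytes:
    """Return a random big-endian NaN pattern of the given size."""
    exp_bits, man_bits = LAYOUT[size]
    quiet_bit = 1 << (man_bits - 1)
    sign = rng.getrandbits(1)
    if signaling:
        # Quiet bit clear; the remaining bits must be nonzero,
        # otherwise the pattern would encode an infinity.
        mantissa = rng.randrange(1, quiet_bit)
    else:
        mantissa = quiet_bit | rng.randrange(0, quiet_bit)
    exponent = (1 << exp_bits) - 1  # all ones marks NaN/infinity
    bits = (sign << (exp_bits + man_bits)) | (exponent << man_bits) | mantissa
    return bits.to_bytes(size, "big")

rng = random.Random(0)
iterations, sizes, endians = 1000, (2, 4, 8), 2
print(iterations * len(sizes) * endians)  # subtests per run: 6000
print(random_nan_bytes(4, True, rng).hex())
```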

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner vstinner merged commit 6157135 into python:main Apr 28, 2025
42 checks passed
@vstinner
Member

Merged, thank you.

@vstinner
Member

Should we backport this change? It's unclear to me if it's a new feature or a bugfix. The issue is reported as a bug report.

@skirpichev skirpichev deleted the fix-nan-packing/130317 branch April 28, 2025 13:29
@skirpichev
Member Author

The issue is reported as a bug report.

Strictly speaking, it's a bug, though maybe a very minor one (that's why there was no label for 3.13 backport).

From IEEE 754 (2008), 6.2.3: "Conversion of a quiet NaN from a narrower format to a wider format in the same radix, and then back to the same narrower format, should not change the quiet NaN payload in any way except to make it canonical."

E.g., in C a NaN's payload is preserved in widening conversions (e.g. float -> double) or partially preserved in narrowing ones (e.g. double -> float).
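
To make "preserved" and "partially preserved" concrete, here is a bit-level model of the two conversions for NaN patterns. It is a sketch under my own assumptions, not what a C compiler or CPython actually executes: the function names are hypothetical, and unlike a real hardware conversion this model does not quiet a signaling NaN.

```python
def widen_nan32_to_nan64(bits32: int) -> int:
    """Map a binary32 NaN pattern to binary64, keeping sign and mantissa bits."""
    sign = (bits32 >> 31) & 1
    mantissa = bits32 & 0x7FFFFF  # quiet bit plus 22-bit payload
    # The 23 mantissa bits land in the top of the 52-bit field.
    return (sign << 63) | (0x7FF << 52) | (mantissa << 29)

def narrow_nan64_to_nan32(bits64: int) -> int:
    """Map a binary64 NaN pattern to binary32; the low 29 mantissa bits are dropped."""
    sign = (bits64 >> 63) & 1
    mantissa = (bits64 & 0xFFFFFFFFFFFFF) >> 29
    if mantissa == 0:
        # Only the dropped bits were set: force a quiet NaN
        # rather than emit the bit pattern of an infinity.
        mantissa = 1 << 22
    return (sign << 31) | (0xFF << 23) | mantissa

for pattern in (0x7F985072, 0x7FD85072):  # one sNaN and one qNaN, same payload
    wide = widen_nan32_to_nan64(pattern)
    print(f"{pattern:08x} -> {wide:016x} -> {narrow_nan64_to_nan32(wide):08x}")
```

Widening and then narrowing gives back the original 32-bit pattern in this model, which is the round trip the quoted IEEE 754 clause recommends for quiet NaNs.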

@vstinner
Member

test_capi.test_float fails on x86 Debian Installed with X 3.x: https://buildbot.python.org/#/builders/1244/builds/5176

test.pythoninfo:

CC.version: gcc (Debian 12.2.0-14) 12.2.0

builtins.float.double_format: IEEE, little-endian
builtins.float.float_format: IEEE, little-endian

platform.architecture: 32bit ELF
platform.freedesktop_os_release[NAME]: Debian GNU/Linux
platform.freedesktop_os_release[VERSION]: 12 (bookworm)
platform.libc_ver: glibc 2.36
platform.platform: Linux-6.1.0-33-686-pae-i686-with-glibc2.36

sys.float_info: sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
sys.float_repr_style: short

sys.maxsize: 2147483647

There are failures on 16, 32 and 64 bit floats (size: 2, 4, 8). Examples:

FAIL: test_pack_unpack_roundtrip_for_nans (test.test_capi.test_float.CAPIFloatTest.test_pack_unpack_roundtrip_for_nans)
(data=b'\x7f\x98Pr', size=4, endian=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_capi/test_float.py", line 213, in test_pack_unpack_roundtrip_for_nans
    self.assertEqual(data1, data2)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
AssertionError: b'\x7f\x98Pr' != b'\x7f\xd8Pr'

======================================================================
FAIL: test_pack_unpack_roundtrip_for_nans (test.test_capi.test_float.CAPIFloatTest.test_pack_unpack_roundtrip_for_nans) 
(data=b'|\xd6', size=2, endian=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-debian-x86.installed/build/target/lib/python3.14/test/test_capi/test_float.py", line 213, in test_pack_unpack_roundtrip_for_nans
    self.assertEqual(data1, data2)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
AssertionError: b'|\xd6' != b'~\xd6'

======================================================================
FAIL: test_pack_unpack_roundtrip_for_nans (test.test_capi.test_float.CAPIFloatTest.test_pack_unpack_roundtrip_for_nans) 
(data=b'\x7f\xf2h\xf8\x1bU*\xc1', size=8, endian=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-debian-x86.installed/build/target/lib/python3.14/test/test_capi/test_float.py", line 213, in test_pack_unpack_roundtrip_for_nans
    self.assertEqual(data1, data2)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
AssertionError: b'\x7f\xf2h\xf8\x1bU*\xc1' != b'\x7f\xfah\xf8\x1bU*\xc1'
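
The three failures above can be decoded by hand, but a small helper makes the pattern easier to see. This is an illustrative sketch (the names `MANTISSA_BITS` and `describe_nan` are hypothetical) that works on the raw bytes only, so it behaves the same on any platform. It assumes the byte strings in the report are big-endian, which matches their leading 0x7f/0x7c bytes.

```python
# Mantissa widths for IEEE 754 binary16/32/64, keyed by size in bytes.
MANTISSA_BITS = {2: 10, 4: 23, 8: 52}

def describe_nan(data: bytes) -> tuple[bool, int]:
    """Return (is_signaling, payload) for a big-endian NaN pattern."""
    man_bits = MANTISSA_BITS[len(data)]
    exp_bits = 8 * len(data) - 1 - man_bits
    bits = int.from_bytes(data, "big")
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)
    if exponent != (1 << exp_bits) - 1 or mantissa == 0:
        raise ValueError("not a NaN pattern")
    quiet_bit = 1 << (man_bits - 1)
    return (mantissa & quiet_bit) == 0, mantissa & (quiet_bit - 1)

# (expected, actual) pairs taken from the assertion errors above.
failures = [
    (b"\x7f\x98Pr", b"\x7f\xd8Pr"),
    (b"|\xd6", b"~\xd6"),
    (b"\x7f\xf2h\xf8\x1bU*\xc1", b"\x7f\xfah\xf8\x1bU*\xc1"),
]
for expected, actual in failures:
    print(describe_nan(expected), describe_nan(actual))
```

In each pair the payload is identical and only the signaling flag differs, i.e. the quiet bit was set somewhere during the round trip.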

@vstinner
Member

I can reproduce the issue on Fedora 42 by building Python using gcc -m32:

./configure CFLAGS="-m32" LDFLAGS="-m32"
sed -i -e 's!#define PY_HAVE_PERF_TRAMPOLINE 1!#undef PY_HAVE_PERF_TRAMPOLINE!g' pyconfig.h
make
./python -m test -v test_capi.test_float

@skirpichev
Member Author

Hmm, again, in all cases the sNaN seems to be transformed to a qNaN.

I think the underlying reason is the same, and we should apply the workaround for 32-bit modes.

@vstinner
Member

I think the underlying reason is the same, and we should apply the workaround for 32-bit modes.

I replaced if signaling and sys.platform == 'win32': with if signaling and True: in test_capi.test_float, but it didn't work around the issue :-( Maybe _testcapi.float_set_snan() doesn't work on x86.

@vstinner
Member

Aha, just copying a double to another double clears the sNaN flag:

(gdb) n
33	    double d = ((PyFloatObject *)obj)->ob_fval;
(gdb) n
35	    switch (size)
(gdb) p ((PyFloatObject *)obj)->ob_fval
$3 = nan(0x5e5454683683)
(gdb) p d
$4 = nan(0x85e5454683683)

d is not equal to ob_fval.

@vstinner
Member

The sNaN flag is easily lost. I added some debug output of the uint64_t value in different functions. Between float_pack() and PyFloat_Pack4(), the sNaN flag is lost:

float_set_snan(): v=fffbce9e80000000 (before)
float_set_snan(): v=fff3ce9e80000000 (sNaN)
float_pack():     v=fff3ce9e80000000
PyFloat_Pack4():  v=fffbce9e80000000

PyFloat_Pack4() is called with the wrong value :-(
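
For readers following the hex dumps, the only difference between the "before" and "sNaN" values is bit 51, the quiet bit of a binary64. A minimal integer-only model of that bit flip is sketched below. The name `set_snan_bits` is hypothetical, and the zero-payload check is my own addition rather than something taken from the helper in `_testcapi`.

```python
QUIET_BIT = 1 << 51  # most significant mantissa bit of a binary64

def set_snan_bits(v: int) -> int:
    """Clear the quiet bit of a binary64 NaN pattern given as an integer."""
    result = v & ~QUIET_BIT
    if result & 0xFFFFFFFFFFFFF == 0:
        # With an empty payload the pattern would encode an infinity, not a NaN.
        raise ValueError("payload is zero: clearing the quiet bit gives an infinity")
    return result

before = 0xFFFBCE9E80000000  # value from the debug output above
after = set_snan_bits(before)
print(f"{before:016x} -> {after:016x}")
print(f"flipped bit: {before ^ after:#x}")
```

The value arriving in PyFloat_Pack4() has that bit set again, which is the same quieting seen in the gdb session earlier in the thread.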

@skirpichev
Member Author

Maybe _testcapi.float_set_snan() doesn't work on x86.

Hmm, indeed. In the end, I should memcpy() directly into the created PyFloat's ob_fval, then return it.

Does it work:

diff --git a/Modules/_testcapi/float.c b/Modules/_testcapi/float.c
index 2feeb205d8..b382883fa8 100644
--- a/Modules/_testcapi/float.c
+++ b/Modules/_testcapi/float.c
@@ -181,9 +181,12 @@ _testcapi_float_set_snan(PyObject *module, PyObject *obj)
     }
     uint64_t v;
     memcpy(&v, &d, 8);
-    v &= ~(1ULL << 51); /* make sNaN */
-    memcpy(&d, &v, 8);
-    return PyFloat_FromDouble(d);
+    if (v & (1ULL << 51)) {
+        v -= (1ULL << 51); /* make sNaN */
+    }
+    PyObject *ret = PyFloat_FromDouble(0.0);
+    memcpy(&(((PyFloatObject *)ret)->ob_fval), &v, 8);
+    return ret;
 }
 
 static PyMethodDef test_methods[] = {

?

@vstinner
Member

I wrote #133150 to fix 3 bugs, but it's not enough to fix the test on x86.

4 participants