struct (un)packing of half-precision nan floats is non-invertible #130317

Closed
@seh-dev

Bug report

Bug description:

I noticed that chaining struct.unpack() and struct.pack() for IEEE 754 half-precision floats (format e) does not round-trip for nan: the repacked bytes can differ from the original bytes. E.g.:

import struct

original_bytes = b'\xff\xff'

unpacked_float = struct.unpack('e', original_bytes)[0]  # nan
repacked_bytes = struct.pack('e', unpacked_float)  # b'\x00\xfe'  != b'\xff\xff'

IEEE 754 nans aren't unique (any bit pattern with an all-ones exponent and a non-zero significand is a nan), so this isn't that surprising. However, I found it curious that the float (f) and double (d) formats do not behave the same way: for those, every original bit pattern I tested could be recovered from the unpacked nan object.
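To make the difference between the two byte strings concrete, here is a small sketch that splits each one into its binary16 fields using plain integer operations. The helper name decode_half is hypothetical, and it assumes little-endian byte order, which matches the output shown above on typical x86 machines.

```python
def decode_half(raw: bytes, byteorder: str = "little") -> dict:
    """Split a 2-byte binary16 value into its sign, exponent, and significand fields."""
    bits = int.from_bytes(raw, byteorder)
    return {
        "sign": bits >> 15,               # 1 bit
        "exponent": (bits >> 10) & 0x1F,  # 5 bits; 31 (all ones) means inf or nan
        "significand": bits & 0x3FF,      # 10 stored bits; non-zero means nan
    }


original = decode_half(b"\xff\xff")
repacked = decode_half(b"\x00\xfe")
print(original)  # {'sign': 1, 'exponent': 31, 'significand': 1023}
print(repacked)  # {'sign': 1, 'exponent': 31, 'significand': 512}
```

Both values are nans with the same sign; only the significand (the nan payload) differs. The repacked value keeps just the most significant significand bit.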

Is this by design?

Here's a quick pytest script that checks a broad range of nan/inf/-inf bit patterns for each encoding format.

# /// script
# requires-python = ">=3.11"
# dependencies = ["pytest"]
# ///
import struct
import pytest


# Floating Point Encodings Based on IEEE 754 per https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats
# binary 16 (half precision) - 1 bit sign, 5 bit exponent, 10 bit significand
# binary 32 (single precision) - 1 bit sign, 8 bit exponent, 23 bit significand
# binary 64 (double precision) - 1 bit sign, 11 bit exponent, 52 bit significand


MAX_TEST_CASES = 100000  # limit number of bit patterns being sampled so we aren't waiting too long


@pytest.mark.parametrize(["precision_format", "precision", "exponent_bits"], [("f", 32, 8), ("d", 64, 11), ("e", 16, 5)])
@pytest.mark.parametrize("sign_bit", [0, 1])
@pytest.mark.parametrize("endianness", ["little", "big"])
def test_struct_floats(precision_format: str, precision: int, exponent_bits: int, sign_bit: int, endianness: str):
    significand_bits = precision - exponent_bits - 1

    n_tests = min(MAX_TEST_CASES, 2**significand_bits)

    # Zero-pad every pattern to the full significand width so the sign and
    # exponent bits stay in their correct positions.
    significand_patterns = [significand_bits * "0", significand_bits * "1"] + [
        f"{i:0{significand_bits}b}" for i in range(1, 2**significand_bits, 2**significand_bits // n_tests)
    ]

    if endianness == "big":
        fmt = ">" + precision_format
    elif endianness == "little":
        fmt = "<" + precision_format
    else:
        raise NotImplementedError()

    for significand in significand_patterns:
        # An all-ones exponent encodes inf (zero significand) or nan (non-zero significand).
        binary = str(sign_bit) + "1" * exponent_bits + significand

        test_bytes = int(binary, base=2).to_bytes(precision // 8, endianness)

        unpacked = struct.unpack(fmt, test_bytes)
        assert len(unpacked) == 1

        repacked = struct.pack(fmt, unpacked[0])

        assert (
            repacked == test_bytes
        ), f"struct pack/unpack was not invertible for format {fmt} with raw value: {test_bytes} -> unpacks to {unpacked[0]}, repacks to {repacked}"

if __name__ == "__main__":
    pytest.main([__file__])
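Since half precision has only 2048 bit patterns with an all-ones exponent, the e format can also be checked exhaustively without pytest. The following is a minimal sketch along those lines; the function name half_round_trip_failures is hypothetical, and the number of failures it reports may depend on the CPython version in use.

```python
import struct


def half_round_trip_failures(byteorder: str = "<") -> list[tuple[bytes, bytes]]:
    """Return (original, repacked) pairs for every binary16 inf/nan pattern that does not round-trip."""
    int_byteorder = "little" if byteorder == "<" else "big"
    failures = []
    for sign in (0, 1):
        for significand in range(2**10):
            # All-ones exponent: inf when the significand is zero, nan otherwise.
            bits = (sign << 15) | (0x1F << 10) | significand
            raw = bits.to_bytes(2, int_byteorder)
            value = struct.unpack(byteorder + "e", raw)[0]
            repacked = struct.pack(byteorder + "e", value)
            if repacked != raw:
                failures.append((raw, repacked))
    return failures


failures = half_round_trip_failures()
print(f"{len(failures)} of 2048 inf/nan bit patterns did not round-trip")
```

The returned pairs can be inspected directly to see which nan payloads are changed by the pack/unpack cycle.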

CPython versions tested on:

3.13, 3.11, 3.12

Operating systems tested on:

Linux, Windows

Labels

extension-modules (C modules in the Modules dir)
type-bug (An unexpected behavior, bug, or error)
