BUG: BitGenerator.seed_seq doesn't round-trip through pickle #26234

@levbishop

Describe the issue:

The seed_seq of a BitGenerator that has been round-tripped through pickle dump/load does not match that of the source. This is despite the fact that BitGenerator.state (and hence the sequence of variates generated) does round-trip, as does a bare SeedSequence.

This also shows up as a deviation in the child bit generators produced by BitGenerator.spawn().

I'm assuming this is an oversight, since the pickle bytestream generated from a BitGenerator doesn't mention the seed_seq element at all. I'm also assuming the seed_seq gets regenerated by the SeedSequence default initializer fetching fresh entropy from the OS, which presumably adds a few µs of delay to the unpickle path.

Reproduce the code example:

import numpy as np
import pickle

bg = np.random.PCG64(np.random.SeedSequence(1))
bg2 = pickle.loads(pickle.dumps(bg))

assert bg.seed_seq == bg2.seed_seq

Error message:

No response

Python and NumPy Versions:

1.26.4
3.11.8 (main, Feb 26 2024, 15:43:17) [Clang 14.0.6 ]

Runtime Environment:

[{'numpy_version': '1.26.4',
  'python': '3.11.8 (main, Feb 26 2024, 15:43:17) [Clang 14.0.6 ]',
  'uname': uname_result(system='Darwin', node='LSB-MAC.local', release='22.6.0', version='Darwin Kernel Version 22.6.0: Mon Feb 19 19:48:53 PST 2024; root:xnu-8796.141.3.704.6~1/RELEASE_X86_64', machine='x86_64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Haswell',
  'filepath': '/usr/local/Caskroom/miniconda/base/envs/rng-seeding/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 4,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.23.dev'}]

Context for the issue:

The new NumPy 1.25 functionality that allows spawning directly from generators (introduced in #23195), rather than needing to pass around a SeedSequence, could simplify some multiprocessing/multithreading workflows, but we lose reproducibility due to this bug.
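
For comparison, here is a sketch of the older pattern that still works reproducibly: spawn child SeedSequence objects in the parent and send those across the pickle boundary, relying on the fact that a bare SeedSequence does round-trip. This is an illustration rather than a fix proposed in this issue. The worker function is hypothetical and stands in for whatever a multiprocessing worker would do after unpickling its argument.

```python
import pickle

import numpy as np

root = np.random.SeedSequence(1)
children = root.spawn(3)  # one child SeedSequence per worker


def worker(payload):
    """Hypothetical worker: rebuild a generator from a pickled SeedSequence."""
    child = pickle.loads(payload)
    return np.random.Generator(np.random.PCG64(child)).random(2)


# Simulate sending each child to a worker through pickle.
remote = [worker(pickle.dumps(c)) for c in children]

# Build the same generators directly in the parent for comparison.
local = [np.random.Generator(np.random.PCG64(c)).random(2) for c in children]

reproducible = all(np.array_equal(a, b) for a, b in zip(remote, local))
print("workers reproduce the parent's streams:", reproducible)
```

The cost is exactly the bookkeeping that spawning from generators was meant to remove: the SeedSequence objects have to be created and passed around explicitly.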
