Lack of clarity in RandomState compatibility guarantee #8771

Closed
@mdickinson

The docs for numpy.random.RandomState say:

Compatibility Guarantee A fixed seed and a fixed series of calls
to ‘RandomState’ methods using the same parameters will always
produce the same results up to roundoff error [...]

Question: in this context, does "always" mean that this guarantee should apply across platforms and machines, or just that it should apply across runs on a single machine?

If the latter, then the "up to roundoff error" is probably unnecessary, which leads me to believe that the intent is for the guarantee to apply across platforms. But then I fail to see how such a guarantee is possible: some of the sample generation methods use the rejection method, and so consume a number of random samples that is not known in advance. The number of samples actually consumed may depend on floating-point and libm variations. A good example is the zipf distribution:

long rk_zipf(rk_state *state, double a)
{
    double T, U, V;
    long X;
    double am1, b;

    am1 = a - 1.0;
    b = pow(2.0, am1);
    do
    {
        U = 1.0-rk_double(state);
        V = rk_double(state);
        X = (long)floor(pow(U, -1.0/am1));
        /* The real result may be above what can be represented in a signed
         * long. It will get casted to -sys.maxint-1. Since this is
         * a straightforward rejection algorithm, we can just reject this value
         * in the rejection condition below. This function then models a Zipf
         * distribution truncated to sys.maxint.
         */
        T = pow(1.0 + 1.0/X, am1);
    } while (((V*X*(T-1.0)/(b-1.0)) > (T/b)) || X < 1);
    return X;
}

Here, the number of calls to rk_double for a given a and random state may change depending on tiny floating-point differences in, for example, the result of pow.
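
To make the concern concrete, here is a rough Python transcription of the loop above. It is a sketch, not NumPy's actual code path: `zipf_sketch`, `pow_one_ulp_low`, `scripted` and the hard-coded stream of uniforms are all hypothetical names and values introduced for illustration, and the "other platform" is simulated by nudging every `pow` result down by one ulp (using `math.nextafter`, which needs Python 3.9+). The inputs are deliberately chosen so that `pow(U, -1.0/am1)` lands exactly on an integer boundary.

```python
import math

def zipf_sketch(next_double, a, pow_fn=math.pow):
    """Transcription of the rk_zipf loop that also counts uniforms consumed."""
    am1 = a - 1.0
    b = pow_fn(2.0, am1)
    draws = 0
    while True:
        U = 1.0 - next_double()
        V = next_double()
        draws += 2
        X = math.floor(pow_fn(U, -1.0 / am1))
        if X < 1:
            # The C code rejects this case through the "X < 1" clause;
            # checking early avoids a ZeroDivisionError in Python.
            continue
        T = pow_fn(1.0 + 1.0 / X, am1)
        if not (V * X * (T - 1.0) / (b - 1.0) > T / b):
            return X, draws

def pow_one_ulp_low(x, y):
    """Hypothetical libm whose pow result is one ulp below math.pow."""
    return math.nextafter(math.pow(x, y), 0.0)

def scripted(values):
    """Stand-in for rk_double: replays a fixed list of uniforms in [0, 1)."""
    it = iter(values)
    return lambda: next(it)

stream = [0.5, 0.9, 0.75, 0.1]  # same "seed" for both runs
exact = zipf_sketch(scripted(stream), 2.0)
perturbed = zipf_sketch(scripted(stream), 2.0, pow_one_ulp_low)
print(exact, perturbed)  # (4, 4) (1, 2): different sample, different draw count
```

With identical inputs, the unperturbed run rejects its first candidate and consumes four uniforms, while the perturbed run accepts immediately after two. Every later call on the same state would then start from a different position in the stream, so the divergence is not limited to roundoff in a single returned value.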

Should the guarantee be restricted to some subset of the RandomState methods?

Related: #6180 (where the wording explicitly includes "regardless of platform"), #6405 (where it doesn't).
