Lack of clarity in RandomState compatibility guarantee #8771

Closed
mdickinson opened this issue Mar 11, 2017 · 9 comments

@mdickinson (Contributor)

The docs for numpy.random.RandomState say:

Compatibility Guarantee A fixed seed and a fixed series of calls
to ‘RandomState’ methods using the same parameters will always
produce the same results up to roundoff error [...]

Question: in this context, does "always" mean that this guarantee should apply across platforms and machines, or just that it should apply across runs on a single machine?

If the latter, then the "up to roundoff error" qualifier is probably unnecessary, which leads me to believe that the intent is for the guarantee to apply across platforms. But then I fail to see how such a guarantee can be made: some of the sample-generation methods use rejection sampling, and so consume an (unknown in advance) number of random samples. The number of samples actually consumed may depend on floating-point and libm variations. A good example is the zipf distribution:

long rk_zipf(rk_state *state, double a)
{
    double T, U, V;
    long X;
    double am1, b;

    am1 = a - 1.0;
    b = pow(2.0, am1);
    do
    {
        U = 1.0 - rk_double(state);
        V = rk_double(state);
        X = (long)floor(pow(U, -1.0/am1));
        /* The real result may be above what can be represented in a signed
         * long. It will get casted to -sys.maxint-1. Since this is
         * a straightforward rejection algorithm, we can just reject this value
         * in the rejection condition below. This function then models a Zipf
         * distribution truncated to sys.maxint.
         */
        T = pow(1.0 + 1.0/X, am1);
    } while (((V*X*(T-1.0)/(b-1.0)) > (T/b)) || X < 1);
    return X;
}

Here, the number of calls to rk_double for a given a and random state may change depending on tiny floating-point differences in the result of pow (for example).
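To make the failure mode concrete, here is a minimal standalone sketch (not NumPy code) of how a one-ulp difference in a pow result, which is exactly the kind of variation seen between libm implementations, flips the floor() that computes X in rk_zipf. Once X differs, T differs, the accept/reject decision can flip, and every rk_double call after that point is shifted:

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Two hypothetical libm results for the same pow(U, -1.0/am1) call,
     * one ulp apart: exactly on an integer vs. a hair below it. */
    double exact = 3.0;
    double one_ulp_low = nextafter(exact, 0.0);

    /* floor() turns the one-ulp difference into a different sample X,
     * which then feeds the rejection condition through T. */
    printf("X (exact)       = %ld\n", (long)floor(exact));       /* prints 3 */
    printf("X (one ulp low) = %ld\n", (long)floor(one_ulp_low)); /* prints 2 */
    return 0;
}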

Should the guarantee be restricted to some subset of the RandomState methods?

Related: #6180 (where the wording explicitly includes "regardless of platform"), #6405 (where it doesn't).

@seberg (Member)

seberg commented Mar 11, 2017

@rkern probably knows this best. I guess you are probably right, but I doubt there is much you can do about it (except trying to document details). I don't know if e.g. pow has enough round off difference for any problems to be possible (even then I guess they are extremely unlikely), at least on IEEE-conforming implementations? There are some others, such as multivariate normal, which also depend on the linear algebra routines (though I doubt that can cause changes in the stream later on).

@charris (Member)

charris commented Mar 11, 2017

The sequences of floats, integers, etc. produced by the base RNG should be the same, assuming that division conforms to the IEEE standard for IEEE doubles. Platform variations in library functions such as log may change results for the more complicated distributions, depending on the particular function involved, which in turn will affect the following sequence of generated results.

I think you are correct that we cannot make guarantees cross-platform, or even between library/compiler versions, when certain distributions are involved.

@charris (Member)

charris commented Mar 11, 2017

To be a bit more precise: roundoff errors and small function differences will only make small differences in floating-point results as long as they are not involved in rejection methods. Differences in functions involved in rejection methods can change the whole following sequence, but that is unlikely unless a very large number of trials is made. We do have tests for repeatability, and I'm not aware of any failures at this time.

@rkern (Member)

rkern commented Mar 11, 2017

There's one other source of cross-platform wonkiness that is somewhat easier to realize: internal integer values are C longs. With certain extreme parameter values, 32-bit longs overflow and give different results than on platforms with 64-bit longs. Of course, since they are overflowing, the result is crap anyways, but they can change the stream afterwards for all of the non-crap results that follow.
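A tiny sketch of the long-width point (hypothetical values, not NumPy code): the same double-to-long conversion that rk_zipf performs is well-defined on LP64 platforms but overflows where long is 32 bits, and the out-of-range case is formally undefined behaviour in C:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* long is 64 bits on LP64 Linux/macOS, but 32 bits on 32-bit
     * platforms and on Windows (LLP64). */
    printf("long: %zu bits, LONG_MAX = %ld\n",
           8 * sizeof(long), (long)LONG_MAX);

    /* A value of pow(U, -1.0/am1) that fits a 64-bit long but not a
     * 32-bit one.  The out-of-range conversion is undefined behaviour
     * in C; in practice it often wraps to LONG_MIN, the "-sys.maxint-1"
     * mentioned in the rk_zipf comment, which the X < 1 check then
     * rejects, consuming extra rk_double calls on that platform only. */
    double big = 4.0e9;   /* > 2^31 - 1, well below 2^63 - 1 */
    printf("(long)big = %ld\n", (long)big);
    return 0;
}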

@mdickinson (Contributor, Author)

I don't know if e.g. pow has enough round off difference for any problems to be possible (even then I guess they are extremely unlikely)

Possible, definitely: a single ulp difference in a pow result could be enough to cause a rejection-method inner loop to go for one more iteration, and such single ulp differences are very common: there's wide variation in pow implementations across platforms. In contrast to pow, reproducibility of arithmetic operations on IEEE 754-implementing hardware should be a fairly safe assumption, provided well-known issues associated with x87 usage are avoided, and I'd expect basic libm functions like exp, cos, log, etc. to be mostly correctly rounded most of the time.

I agree that you'd have to be very unlucky to run into any of these cases by accident, so that in practice the guarantee is likely to hold almost always. So perhaps my objection can be dismissed under "practicality beats purity" rules.

@rkern (Member)

rkern commented Mar 11, 2017

Yeah, there are two faces to the compatibility policy: what guarantees users can expect and what goals developers should strive towards. Being compatible "up to numerical rounding errors" across platforms is really an aspirational goal of the latter kind, not a firm guarantee of the former. I think I had the developer-goal in my head when I approved the language. And I was probably thinking of the transformation-type algorithms instead of the rejection-type ones.

@charris (Member)

charris commented Mar 11, 2017

The multivariate case mentioned by @seberg is probably the most likely to cause problems: the factorization is unique only up to the signs of the eigenvectors when the singular values are distinct, and even less unique when the covariance has degenerate eigenvalues, so differences in the SVD implementation used can have a big effect. I believe the Cholesky factorization would be better specified (it is unique for a positive-definite covariance), as the sketch below illustrates.
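A minimal 2x2 illustration of that point (a hand-worked sketch, not NumPy's implementation): both the Cholesky factor and an eigenvector-based factor with one eigenvector's sign flipped reproduce the same covariance, yet they map the same standard-normal draw to different samples, so whichever factor the linear-algebra library happens to return changes the output stream:

#include <math.h>
#include <stdio.h>

/* y = M z for a 2x2 matrix stored row-major */
static void mat2_apply(const double M[4], const double z[2], double y[2])
{
    y[0] = M[0]*z[0] + M[1]*z[1];
    y[1] = M[2]*z[0] + M[3]*z[1];
}

/* P = M M^T */
static void mat2_gram(const double M[4], double P[4])
{
    P[0] = M[0]*M[0] + M[1]*M[1];
    P[1] = M[0]*M[2] + M[1]*M[3];
    P[2] = P[1];
    P[3] = M[2]*M[2] + M[3]*M[3];
}

int main(void)
{
    /* Target covariance [[2, 1], [1, 2]], eigenvalues 3 and 1. */

    /* Cholesky factor: unique given positive diagonal entries. */
    const double L[4] = { sqrt(2.0),     0.0,
                          1.0/sqrt(2.0), sqrt(1.5) };

    /* Eigenvector-based factor (what an SVD route would produce),
     * with the second eigenvector's sign flipped; equally valid. */
    const double A[4] = { sqrt(1.5), -sqrt(0.5),
                          sqrt(1.5),  sqrt(0.5) };

    double P[4], y[2];
    const double z[2] = { 0.3, -1.2 };   /* one "standard normal" draw */

    mat2_gram(L, P);   /* both Gram products print [[2, 1], [1, 2]] */
    printf("L Lt = [[%.3f, %.3f], [%.3f, %.3f]]\n", P[0], P[1], P[2], P[3]);
    mat2_gram(A, P);
    printf("A At = [[%.3f, %.3f], [%.3f, %.3f]]\n", P[0], P[1], P[2], P[3]);

    mat2_apply(L, z, y);   /* same z, different samples */
    printf("L z  = (%.6f, %.6f)\n", y[0], y[1]);
    mat2_apply(A, z, y);
    printf("A z  = (%.6f, %.6f)\n", y[0], y[1]);
    return 0;
}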

@mattip (Member)

mattip commented Oct 16, 2018

Can we close this or do we need to clarify the documentation?

@WarrenWeckesser (Member)

Given that NEP 19 has been accepted, it looks like this issue can be closed.
