
BUG: fix negative samples generated by Wald distribution #29609


Open: mmaaz-git wants to merge 3 commits into main

Conversation

mmaaz-git

In numpy.random, when generating Wald samples, if the mean and scale have a large discrepancy, the current implementation sometimes generates negative samples. An explicit example of this has been added to test_generator_mt19937.py: mean=1e8, scale=2.25 (specific to the seed used in that test file).

The key line implicated in distributions.c:

```
X = mean + mu_2l * (Y - sqrt(4 * scale * Y + Y * Y));
```

which suffers from catastrophic cancellation when Y >> scale: the two terms in the subtraction are nearly equal, and the small difference is then amplified by mu_2l = mean / (2 * scale).

I have replaced this with the equivalent rationalized form:

```
d = Y + sqrt(Y) * sqrt(Y + 4 * scale);
X = mean - (2 * mean * Y) / d;
```

This can be seen by noting the following algebraic identity:

Y - sqrt(4 * scale * Y + Y * Y)
= -4 * scale * Y / (Y + sqrt(Y^2 + 4 * scale * Y))

As mu_2l = mean / (2 * scale), we then have:

X = mean - 2 * mean * Y / (Y + sqrt(Y^2 + 4 * scale * Y))

Introducing a new variable for the denominator and factoring gives the final expressions:

```
d = Y + sqrt(Y) * sqrt(Y + 4 * scale)
X = mean - (2 * mean * Y) / d
```
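
As a quick symbolic sanity check of the algebraic identity above (a sketch using sympy; not part of the PR):

```
import sympy as sp

# positive=True lets sympy cancel the radicals safely
Y, scale = sp.symbols("Y scale", positive=True)
lhs = Y - sp.sqrt(4 * scale * Y + Y**2)
rhs = -4 * scale * Y / (Y + sp.sqrt(Y**2 + 4 * scale * Y))
print(sp.simplify(lhs - rhs))  # 0
```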

And now the test passes, i.e., all samples are non-negative.
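
For a standalone illustration (a sketch, not the PR diff), the loop below mirrors the sampling step, assuming Y = mean * N**2 with N a standard normal draw as in random_wald, and compares the two forms:

```
import math
import numpy as np

mean, scale = 1e9, 2.25
mu_2l = mean / (2 * scale)

rng = np.random.default_rng(1234567890)  # seed is arbitrary
neg_current = neg_fixed = 0
for N in rng.standard_normal(100_000):
    Y = mean * N * N
    # current form: sqrt(4*scale*Y + Y*Y) ~= Y + 2*scale, so the
    # subtraction yields ~ -2*scale with an absolute rounding error of
    # order ulp(Y) ~ 1e-7; mu_2l (~2.2e8) amplifies that to O(10),
    # which can swamp the small true value (~scale/N**2) and flip its sign
    x_current = mean + mu_2l * (Y - math.sqrt(4 * scale * Y + Y * Y))
    # rationalized form: no subtraction of nearly equal quantities
    d = Y + math.sqrt(Y) * math.sqrt(Y + 4 * scale)
    x_fixed = mean - (2 * mean * Y) / d
    neg_current += x_current < 0
    neg_fixed += x_fixed < 0

print(neg_current, neg_fixed)  # expect a handful of negatives vs. zero
```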

I have not made this change to the legacy implementation (nor the legacy test), as I figure they are deprecated, although the legacy code suffers from the same issue.

Disclaimer: I found this failing property during a research project I am working on using LLM agents to write property-based tests (i.e., in Hypothesis). This property, that all Wald samples should be non-negative, was proposed and written "autonomously" by the agent. I spent several hours analyzing the failure and becoming acquainted with the NumPy codebase myself in order to verify it was a real issue, running it with the seed in the test suite, etc. I also pinpointed the cause of the problem and wrote this fix and the PR myself. I take full responsibility for it.

The Hypothesis test that caught the issue was simply:

```
from hypothesis import given, strategies as st, settings
from numpy.random import MT19937, Generator

@given(
    st.floats(min_value=1e8, max_value=1e15),
    st.floats(min_value=0.1, max_value=10.0)
)
@settings(max_examples=50)
def test_wald_negative_values(mean, scale):
    """Wald distribution should never produce negative values"""
    random = Generator(MT19937(1234567890))
    samples = random.wald(mean, scale, size=1000)
    assert all(s >= 0 for s in samples), f"Found negative values with mean={mean}, scale={scale}"
```

Note that even after this PR, this property still fails, although the minimal failing example is now mean=280432326968266.0, scale=0.25 (i.e., even more extreme values). But at such values we appear to hit the limits of double-precision arithmetic.
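
A quick check of why (a sketch; math.ulp gives the spacing of adjacent doubles):

```
import math

mean = 280432326968266.0   # the new minimal failing example
print(math.ulp(mean))      # 0.03125: spacing of adjacent doubles near mean
# the small root of the transformation is O(scale) = O(0.25), only a few
# ulps of mean, so mean - (2 * mean * Y) / d cannot resolve it reliably
```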

WarrenWeckesser (Member) commented Aug 21, 2025

Thanks @mmaaz-git, I confirmed that the problem exists on the main development branch.

At first glance, your reformulation looks mathematically correct and avoids the problem. I probably won't get to a closer look until this weekend (anyone else can jump in, of course).

One change I suggest before I click the "Approve workflows to run" button: you say the new test fails for you without the modified formulas, but in my build of numpy, your new test passes using the main development branch. I was able to get the test to fail by increasing mean from 1e8 to 1e9; then I get 3 negative values. Could you update the test with a change like that, to increase the chances that it fails on more platforms?

mmaaz-git (Author)

I've changed the test case to mean=1e9 and can confirm I also get 3 negative values on my machine.

melissawm moved this to Awaiting a code review in NumPy first-time contributor PRs, Aug 22, 2025.