BUG: quantile should error when weights are all zeros #28589

lucyleeow opened this issue Mar 26, 2025 · 12 comments · May be fixed by #28594 or #28597

@lucyleeow

Describe the issue:

np.quantile with weights being all zeros should probably give an error/warning. It's effectively like asking for a quantile of an empty array. Currently NumPy seems to return the first sample.

Reproduce the code example:

np.quantile([1,2,3,4], 0.5, weights=[0,0,0,0], method='inverted_cdf')

Error message:

/.../numpy/lib/_function_base_impl.py:4858: RuntimeWarning: invalid value encountered in divide
  cdf /= cdf[-1, ...]  # normalization to 1
Out[9]: array(1)

Python and NumPy Versions:

NumPy: 2.1.3
Python: 3.13.0 | packaged by conda-forge | (main, Nov 27 2024, 19:18:50) [GCC 13.3.0]

Runtime Environment:

No response

Context for the issue:

Context: Noticed when looking into scikit-learn/scikit-learn#31032

@seberg
Member

seberg commented Mar 26, 2025

Thanks for reporting! Interesting that it ends up with something after giving an invalid-value warning. CC @lorentzenchr, but I suspect this shouldn't be hard for anyone to fix; it probably just needs an explicit check.

(Ah, I suppose the weights end up NaN, but the rest still works -- somewhat...)
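The guess above can be illustrated with a small sketch (a simplified picture of one plausible mechanism, not NumPy's exact code path): all-zero weights produce an all-NaN normalized CDF, and NaN comparisons then let the lookup fall through to the first sample.

```python
import numpy as np

# With all-zero weights the cumulative sum is all zeros, so the
# normalization step divides 0/0 and leaves an all-NaN CDF.
weights = np.array([0., 0., 0., 0.])
cdf = np.cumsum(weights)          # [0., 0., 0., 0.]
with np.errstate(invalid="ignore"):
    cdf /= cdf[-1]                # 0/0 -> [nan, nan, nan, nan]

# Every comparison against NaN is False, so a sorted search for the
# first position where cdf >= q falls through to index 0, which
# selects the first sample.
idx = np.searchsorted(cdf, 0.5)   # -> 0
```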

@techyorzl

Would love to work on this issue!

@seberg
Member

seberg commented Mar 26, 2025

We don't assign issues, please just work on a PR with tests.

@lorentzenchr
Contributor

Maybe this should be aligned with np.average (for the weighted quantiles, I followed np.average).

@Tontonio3

I also found this weird behavior in np.quantile: if any weight is np.nan it outputs 1, but it doesn't raise any errors.

import numpy as np

wgt_lst = [1] * 100
x = np.linspace(1, 100, 100)
q = np.zeros(100)

# One NaN weight at a time (the weight is restored after each call):
for i in range(100):
    wgt_lst[i] = np.nan
    q[i] = np.quantile(x, 0.5, weights=wgt_lst, method='inverted_cdf')
    wgt_lst[i] = 1

print(np.all(q == 1.))

# NaN weights accumulate (never restored):
q = np.zeros(100)
for i in range(100):
    wgt_lst[i] = np.nan
    q[i] = np.quantile(x, 0.5, weights=wgt_lst, method='inverted_cdf')

print(np.all(q == 1.))

it outputs:

True
True

Also, np.inf in the weights causes weird behaviour.

import numpy as np

a = np.quantile([1,2,3,4], 0.5, weights=[1,1,1,np.inf], method='inverted_cdf')
b = np.quantile([1,2,3,4], 0.5, weights=[np.inf,1,1,1], method='inverted_cdf')
print(a)
print(b)

outputs:

\...\numpy\lib\_function_base_impl.py:4891: RuntimeWarning: invalid value encountered in divide
  cdf /= cdf[-1, ...]  # normalization to 1
4
1

rottenstea added a commit to rottenstea/numpy that referenced this issue Mar 27, 2025
…eights are all zeros numpy#28589

Previously, np.quantile with all weights set to zero returned the first sample. This is now fixed by adding an explicit check that raises a ValueError when all weights are zero, preventing division by zero in the CDF.

- Added the weight check in `_quantile` (line 4874 ff) to detect all-zero weights and raise an error.
- Added a test in `test_function_base.py` to verify this behavior.
- Added a second test to ensure existing quantile functionality remains unchanged.

This should close numpy#28589
@galio1237 galio1237 linked a pull request Mar 27, 2025 that will close this issue
@lucyleeow
Author

Linking related issue: #21091

Not sure what the right behaviour is when a weight is nan; potentially ignore that sample, like numpy.nanquantile does for nan values. If that is the case, maybe only numpy.nanquantile should handle nan weights?

@ngoldbaum
Member

ngoldbaum commented Mar 28, 2025

We now have three open pull requests to close this issue (#28594, #28595, and #28597), all coming in within a day or so. We should probably close two of them. I'd appreciate it if the contributors who opened pull requests for this could figure out which one we should review.

Also, before opening a pull request, please double-check that there isn't an open PR already.

@seberg
Member

seberg commented Mar 28, 2025

I don't think we need to handle nan weights. For all-zero weights, it should be identical to a quantile over no points, i.e. quantile([], ...) without weights (I think an error).
I have not yet looked at the open PRs.
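For comparison, the unweighted empty-array case already fails today (the exact exception type may vary across NumPy versions), which is the behaviour suggested above for all-zero weights:

```python
import numpy as np

# An unweighted quantile over no points errors out rather than
# returning a value; all-zero weights should arguably do the same.
try:
    np.quantile(np.array([]), 0.5)
    raised = False
except Exception:
    raised = True
```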

@seberg
Copy link
Member

seberg commented Apr 1, 2025

CC @rottenstea , @Tontonio3, and @galio1237 can you have a look at #28589 (comment)?

I think all of these PRs need some work/thought still. I don't want to review all 3 and some got helpful comments already, so I'll make some broad comments here:

  • Before we add this type of check to 2 out of 4 code paths, I would like to see a refactor that introduces a new function (I doubt negative weights in average make sense, in which case we don't need a new function, just a rename). @lorentzenchr had hinted at that, since he mentioned this code is based on patterns also used elsewhere.
  • I don't like dtype=object special cases unless they are very important. Just don't add them, and rather explain why you are tempted to, so we can decide if it's worthwhile.
  • As Marten commented, find the minimal checks. Personally, I wouldn't care about distinct error messages, but that is a matter of opinion.
  • The way comparisons are done matters if you are worried about NaNs. E.g. all(value > 0) is very different from not any(value <= 0).
  • Some checks may make more sense after summing (before the normalization). But we should also see if they are even needed (if they are only about improving the error message, we may not need them).

(If I trusted compilers/systems, all we would actually need to do is add a with np.errstate(invalid="error"): around an astype(intp) somewhere. @mattip wanted to look at that recently, but for now we don't have it.)
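The point about NaN-aware comparisons can be demonstrated in a few lines (a standalone illustration, not code from any of the PRs):

```python
import numpy as np

w = np.array([1.0, np.nan])

# Every comparison with NaN is False, so these two superficially
# equivalent checks disagree once a NaN weight is present:
strict = np.all(w > 0)          # False: nan > 0 is False
loose = not np.any(w <= 0)      # True:  nan <= 0 is also False
```

For validation purposes, the `not any(...)` form silently lets NaN weights pass, while the `all(...)` form rejects them.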

@lorentzenchr
Contributor

I doubt negative weights in average make sense

I have to raise concerns against this statement. Please do not restrict weights in any way for np.average. There are myriad applications of taking averages, and some of them require negative weights for very good reasons.

For quantiles, it can be argued, things are different: an empirical CDF is calculated from the weights, which are therefore required to be non-negative and to sum to a positive number.
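To make the CDF argument concrete, here is a simplified sketch of a weighted inverted-CDF quantile (illustrative only; the function name and validation are assumptions, not NumPy's implementation):

```python
import numpy as np

def weighted_inverted_cdf(values, weights, q):
    # Simplified sketch of an inverted-CDF weighted quantile:
    # sort the samples, build a normalized cumulative weight curve,
    # and return the smallest value whose cumulative weight reaches q.
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    total = weights.sum()
    if not total > 0:  # rejects all-zero (and NaN-sum) weights
        raise ValueError("weights must sum to a positive value")
    cdf = np.cumsum(weights) / total
    return values[np.searchsorted(cdf, q, side="left")]
```

With values [1, 2, 3, 4] and equal weights, q=0.5 lands on 2, matching method='inverted_cdf'; all-zero weights are rejected up front instead of propagating NaN through the normalization.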

@seberg
Member

seberg commented Apr 1, 2025

some of them require negative weights for very good reasons

Cool, thanks for pointing that out. Sorry, I think there were more functions than just percentile/quantile that check < 0, but I think I just mistook them for others...

@Tontonio3

Tontonio3 commented Apr 2, 2025

I don't like dtype=object special cases unless there are very important. Just don't add them and rather explain why you are tempted to, so we can decide if it's worthwhile.

There is a test in .\numpy\lib\tests\test_function_base.py that uses dtype=object, which causes np.isnan and np.isinf to give an error.

That's why I opened issue #28602

6 participants