-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG,DEP: Fixup quantile/percentile and rename interpolation->method #20327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -4445,19 +4492,22 @@ def _lerp(a, b, t, out=None): | |||
|
|||
def _get_gamma_mask(shape, default_value, conditioned_value, where): | |||
out = np.full(shape, default_value) | |||
out[where] = conditioned_value | |||
np.copyto(out, conditioned_value, where=where, casting="unsafe") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally condition_value
is just a scalar, so this doesn't matter. But in one of the branches, it is an array that matches the shape of out
, so we have to use the copyto
logic of copying both. (To be honest, the casting is just for sports right now.)
@bzah if you have some time, maybe you can also have a look? |
15b731c
to
bf047da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also rename the interpolation
keyword in the stub file?
With the way things are currently set up you'll only have to update np.percentile
.
61ab79b
to
c9f06ad
Compare
@seberg maybe it would be interesting to have a separate graph with only the discontinuous methods ? For the record, the discontinuous methods are: "inverted_cdf", "averaged_inverted_cdf", "closest_observation", and "lower", "higher", "nearest" and "midpoint". The names are also quite confusing, without a graph one could potentially expect the same results from |
I guess I could do two plots, one with all the discrete and one with the non-discrete. Although I wonder if it isn't also nice to mix. Right,
(wikipedia is similar) The tricky part is the |
I think there's merit in both approaches: having a single plot is nice because all of the information is collected in one place; however, as mentioned, the plot is quite busy. Two plots would have the advantage of being less busy, but the discontinuous distinction may not be particularly meaningful for the average user just trying to learn about the different methods. FWIW I'd vote to add the plot as-is then evaluate alternatives in a follow-up PR to make comparison easier. |
Just found some other projects where they use the numpy-based notation, if this is of interest: https://www.tensorflow.org/probability/api_docs/python/tfp/stats/percentile |
In the Numpy Zoom session today I've mentioned that we use 'lower' and 'upper' interpolation. We typically do so to compensate a bias and ensure that we obtain a value which has actually been measured (especially a value which is integral if all measurements are integral). If there are no NaNs, this can also be achieved by sorting along one axis and picking the target value, e.g. array entry 3 if there are 4 entries if one wants interpolation='upper' and q=0.5 ("slightly above median", especially for larger arrays). It would also be possible with interpolation='linear' and picking q=(3-1)/(4-1) = 2/3, but this could yield some numerical instability, with 2/3 being a good example. Probably changing our codebase to np.sort would be more appropriate (and not too much of an effort) anyway. I was maybe a little bit too worried my favorite (interpolation) methods would go away in the future ;) |
Also, the closest-observation did not correctly support multiple quantiles calculated at the same time (broadcasting error).
Apparently, sphinx does not resolve references to footnotes from parameter descriptions.
Co-authored-by: abel <aoun@cerfacs.fr>
Mainly fixes the method list slightly, tones down the warning a bit and fixes the link to the paper (I did not realize that the link failed to work due only because the reference was missing from nanquantile/nanpercentile).
Also updates the old release note to include the info (and generally tweaks it a bit)
f3d8e28
to
be15716
Compare
OK, updated, hopefully pretty much final, but especially release note bike shedding very welcome :):
|
Btw. I won't worry about projects like tensorflow personally, they should be able to keep up. We can't keep potentially serious flaws in what methods we offer stand in the way. |
[np.percentile, np.quantile, np.nanpercentile, np.nanquantile]) | ||
def test_both_passed(self, func): | ||
with warnings.catch_warnings(): | ||
# catch the warning, but make sure it does not raise: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"does not" <- "does"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning is given before the error is thrown. So we make sure the DeprecationWarning
does not raise, yes. (In order to test the error that comes after the warning).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "catch the DeprecationWarning so that it does not raise".
@@ -3854,7 +3854,7 @@ def _median(a, axis=None, out=None, overwrite_input=False): | |||
|
|||
|
|||
def _percentile_dispatcher(a, q, axis=None, out=None, overwrite_input=None, | |||
interpolation=None, keepdims=None): | |||
method=None, keepdims=None, *, interpolation=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. The change in order is probably safe. We should make the options keyword only at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NVM, this is private, so no problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the dispatch, so it is the same as the public signature. This is something __array_function__
implementers will have to catch up with.
I guess one annoyance is that it is hard to use avoid the warning if you are writing for both old and new NumPy versions. That would be an argument to delay the DeprecationWarning
for a bit...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see how the first release goes.
method="linear", | ||
keepdims=False, | ||
*, | ||
interpolation=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting approach, replace "interpolation" by "method" when used as positional arguments and only worry if "interpolation" is used as a keyword.
Co-authored-by: Charles Harris <charlesr.harris@gmail.com>
6130a87
to
546c47a
Compare
Thanks Sebastian. |
commit 30228578ac65ec6b90961700a5783c23d8e1b879 Author: Thomas Li <47963215+lithomas1@users.noreply.github.com> Date: Wed Nov 17 16:21:27 2021 -0800 Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Update LICENSE_win32.txt Update LICENSE_win32.txt Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Update LICENSE_win32.txt Update LICENSE_win32.txt Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Update LICENSE_win32.txt Update LICENSE_win32.txt Add Windows config to GHA update script [wheel build] typo [wheel build] fix typo? [wheel build] fix linux builds? [wheel build] typo [wheel build] add license and pin to windows 2016 skip tests [wheel build] pin to windows 2019 instead [wheel build] try to find out the error on windows [wheel build] maybe fix? [wheel build] maybe fix? [wheel build] fix? [wheel build] cleanup [wheel build] Update LICENSE_win32.txt Update LICENSE_win32.txt commit 4bd12df Author: Thomas Li <47963215+lithomas1@users.noreply.github.com> Date: Mon Nov 15 17:28:47 2021 -0800 # This is a combination of 14 commits. # This is the 1st commit message: Add Windows config to GHA # This is the commit message numpy#2: update script [wheel build] # This is the commit message numpy#3: typo [wheel build] # This is the commit message numpy#4: fix typo? [wheel build] # This is the commit message numpy#5: fix linux builds? [wheel build] # This is the commit message numpy#6: typo [wheel build] # This is the commit message numpy#7: add license and pin to windows 2016 # This is the commit message numpy#8: skip tests [wheel build] # This is the commit message numpy#9: pin to windows 2019 instead [wheel build] # This is the commit message numpy#10: try to find out the error on windows [wheel build] # This is the commit message numpy#11: maybe fix? [wheel build] # This is the commit message numpy#12: maybe fix? [wheel build] # This is the commit message numpy#13: fix? [wheel build] # This is the commit message numpy#14: cleanup [wheel build] commit 444a721 Merge: 376ad69 22448b4 Author: Charles Harris <charlesr.harris@gmail.com> Date: Mon Nov 15 17:47:23 2021 -0700 Merge pull request numpy#20274 from h-vetinari/fix_15179 TST: Some fixes & refactoring around glibc-dependent skips in test_umath.py commit 376ad69 Merge: b75fe57 546c47a Author: Charles Harris <charlesr.harris@gmail.com> Date: Mon Nov 15 17:31:41 2021 -0700 Merge pull request numpy#20327 from seberg/rename-interpolation BUG,DEP: Fixup quantile/percentile and rename interpolation->method commit 546c47a Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 15 16:13:50 2021 -0600 DOC: Fixups for interpolation rename comments from review Co-authored-by: Charles Harris <charlesr.harris@gmail.com> commit b75fe57 Merge: 7310c09 cbc25d2 Author: Warren Weckesser <warren.weckesser@gmail.com> Date: Mon Nov 15 17:27:08 2021 -0500 Merge pull request numpy#20369 from bilderbuchi/missing_diagnostics_newlines MAINT: Fix newlines in diagnostics output of numpy.f2py. commit cbc25d2 Author: Christoph Buchner <bilderbuchi@phononoia.at> Date: Sun Nov 14 08:36:03 2021 +0100 MAINT: Fix newlines in diagnostics output of numpy.f2py. Linebreaks were not consistently added to errmess/outmess arguments, which led to very long lines and wrong concatenation with compiler messages in f2py console output. commit be15716 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Fri Nov 12 12:10:20 2021 -0600 DOC: Add release not for quantile `interpolation` rename Also updates the old release note to include the info (and generally tweaks it a bit) commit 7d8a8e7 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Fri Nov 12 11:57:22 2021 -0600 DOC: Update percentile/quantile docs Mainly fixes the method list slightly, tones down the warning a bit and fixes the link to the paper (I did not realize that the link failed to work due only because the reference was missing from nanquantile/nanpercentile). commit 5bd71fb Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Tue Nov 9 09:48:59 2021 -0600 DOC: Add ticks to quantile interpolation/method error Co-authored-by: abel <aoun@cerfacs.fr> commit 0d5fb81 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 20:39:50 2021 -0600 DOC: Remove reference to paper from quantile `method` kwarg Apparently, sphinx does not resolve references to footnotes from parameter descriptions. commit 8437663 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 18:25:37 2021 -0600 MAINT: Rename interpolation to method in percentile stubs commit 8cfb6b5 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 16:47:27 2021 -0600 TST: Add deprecation testcase for quantile interpolation rename commit a5ac5a5 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 16:41:24 2021 -0600 DOC: Fixup the percentile methods plot commit 85f3dda Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 16:37:41 2021 -0600 BUG: quantile discrete methods ended up using -1 as index sometimes Also, the closest-observation did not correctly support multiple quantiles calculated at the same time (broadcasting error). commit 3993408 Author: Sebastian Berg <sebastian@sipsolutions.net> Date: Mon Nov 8 15:38:30 2021 -0600 API,DEP: Rename percentile/quantile `interpolation=` to `method=` commit 22448b4 Author: H. Vetinari <h.vetinari@gmx.com> Date: Tue Nov 2 16:51:59 2021 +1100 TST: parametrize glibc-version check in test_umath.py commit 12923c2 Author: H. Vetinari <h.vetinari@gmx.com> Date: Tue Nov 2 14:21:01 2021 +1100 TST: skip coverage of large elements in sincos_float32 for ancient glibc Fixes numpy#15179 commit 56268d5 Author: H. Vetinari <h.vetinari@gmx.com> Date: Tue Nov 2 14:19:24 2021 +1100 TST: use existence of glibc-version to clean up a test commit 01443e8 Author: H. Vetinari <h.vetinari@gmx.com> Date: Tue Nov 2 14:18:52 2021 +1100 TST: turn glibc_older_than_2.17 into boolean instead of xfail this anticipates reuse of this boolean within the test module
This (unfortunately currently) does 3 things:
interpolation
kwarg asmethod
The main
method
docstring is modified to:Here is the plot, note that it is smaller in the artifact: https://23201-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.percentile.html
(yeah, colors/linestyles are a bit annoying, some of the colors match, because they are somewhat related, I think)
Closes gh-20306.