-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Remove median_absolute_error
from METRICS_WITHOUT_SAMPLE_WEIGHT
#30787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
lucyleeow
wants to merge
13
commits into
scikit-learn:main
Choose a base branch
from
lucyleeow:med_abs_er
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
13 commits
Select commit
Hold shift + click to select a range
2e9796e
add len check
lucyleeow 890aa74
fix test list
lucyleeow 5bfc5b6
fix tests
lucyleeow 967212c
lint
lucyleeow 92884bc
use arange sample weight
lucyleeow 0b6e9b6
merge main
lucyleeow acd8442
try av weighted percentile
lucyleeow 0bfb464
amend test
lucyleeow 20e97f4
revert even sample size
lucyleeow 4c7b87e
add comment
lucyleeow c672285
merge main
lucyleeow 917930c
amend comment
lucyleeow c2cb13e
add whats new
lucyleeow File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
- :func:`metrics.median_absolute_error` now uses `_averaged_weighted_percentile` | ||
instead of `_weighted_percentile` to calculate median when `sample_weight` is not | ||
`None`. This is equivalent to using the "averaged_inverted_cdf" instead of | ||
the "inverted_cdf" quantile method, which gives results equivalent to `numpy.median` | ||
if equal weights used. | ||
By :user:`Lucy Liu <lucyleeow>` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -555,7 +555,6 @@ def precision_recall_curve_padded_thresholds(*args, **kwargs): | |
|
||
# No Sample weight support | ||
METRICS_WITHOUT_SAMPLE_WEIGHT = { | ||
"median_absolute_error", | ||
"max_error", | ||
"ovo_roc_auc", | ||
"weighted_ovo_roc_auc", | ||
|
@@ -1474,9 +1473,10 @@ def test_averaging_multilabel_all_ones(name): | |
check_averaging(name, y_true, y_true_binarize, y_pred, y_pred_binarize, y_score) | ||
|
||
|
||
def check_sample_weight_invariance(name, metric, y1, y2): | ||
def check_sample_weight_invariance(name, metric, y1, y2, sample_weight=None): | ||
rng = np.random.RandomState(0) | ||
sample_weight = rng.randint(1, 10, size=len(y1)) | ||
if sample_weight is None: | ||
sample_weight = rng.randint(1, 10, size=len(y1)) | ||
|
||
# top_k_accuracy_score always lead to a perfect score for k > 1 in the | ||
# binary case | ||
|
@@ -1552,7 +1552,11 @@ def check_sample_weight_invariance(name, metric, y1, y2): | |
if not name.startswith("unnormalized"): | ||
# check that the score is invariant under scaling of the weights by a | ||
# common factor | ||
for scaling in [2, 0.3]: | ||
# Due to numerical instability of floating points in `cumulative_sum` in | ||
# `median_absolute_error` (not fixable by `stable_cumsum`), it is not | ||
# always equivalent when scaling by a float. | ||
scaling_values = [2] if name == "median_absolute_error" else [2, 0.3] | ||
for scaling in scaling_values: | ||
assert_allclose( | ||
weighted_score, | ||
metric(y1, y2, sample_weight=sample_weight * scaling), | ||
|
@@ -1584,8 +1588,10 @@ def test_regression_sample_weight_invariance(name): | |
# regression | ||
y_true = random_state.random_sample(size=(n_samples,)) | ||
y_pred = random_state.random_sample(size=(n_samples,)) | ||
sample_weight = np.arange(len(y_true)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed as suggested in #30787 (comment). This works great. I've amended |
||
metric = ALL_METRICS[name] | ||
check_sample_weight_invariance(name, metric, y_true, y_pred) | ||
|
||
check_sample_weight_invariance(name, metric, y_true, y_pred, sample_weight) | ||
|
||
|
||
@pytest.mark.parametrize( | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a problem due to numerical instability in cumulative sum (NOT fixed by
stable_cumsum
) in_weighted_percentile
. It is rare for this problem to appear.In this test, the final value in
cumulative_sum
was a small amount (within tolerance instable_cumsum
) higher than 'actual' value, this resulted in theadjusted_percentile_rank
being a small amount higher than the 'actual' value:adjusted_percentile_rank
was 17.400000000000002 , when it should have been 17.4, which just happens to be a value inweight_cdf
. Thus when we dosearchsorted
scikit-learn/sklearn/utils/stats.py
Lines 98 to 99 in bff3d7d
the index should be that of the 17.4 element in
weight_cdf
, but instead it is the next index.stable_cumsum
does not fix this particular problem as it does not matter how closeadjusted_percentile_rank
value is to the true value, if the true value is itself present withinweight_cdf
,searchsorted
will take the adjacent index.Note that I checked using
numpy.quantile
(with "inverted_cdf`, which now supports weights) and got the same test failure).For completeness, I will reference the recent discussion regarding use of
stable_cumsum
(#29431 (comment)) - it was decided to not be required in_weighted_percentile
(cc @ogrisel).Further numpy's
quantile
implementation simply usescumsum
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check the behavior of
numpy.quantile
with"averaged_inverted_cdf"
and_averaged_weighted_percentile
on this kind of edge case?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: I am not sure if the comment above was written before the switch from
_weighted_percentile
to_averaged_weighted_percentile
or not.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I should have clarified that the failure persists with
_averaged_weighted_percentile
. But I agree that I also thought_averaged_weighted_percentile
should fix the problem so I did some more digging._averaged_weighted_percentile
errors with:_weighted_percentile
errors with:So
_averaged_weighted_percentile
halves the error. And the reason is because:array
the cumulative sum has the instability problem (i.e.adjusted_percentile_rank
is 17.400000000000002)-array
the cumulative sum does not have the instability problem (i.e.adjusted_percentile_rank
is 17.4) 🙃If cumulative sum had the instability problem with both
array
and-array
I think the problem would have been fixed.numpy.quantile
with weights only supportsinverted_cdf
(I double checked in dev docs https://numpy.org/devdocs/reference/generated/numpy.quantile.html and numpy PRs).