Add `sample_weight` support for `QuantileTransformer` when fit on dense data #31147

kaekkr · 2025-04-04T08:59:45Z

Reference Issues/PRs

Fixes #30707
See also the discussion in #30707.

What does this implement/fix? Explain your changes.

This PR adds a sample_weight parameter to QuantileTransformer.fit so that samples can be weighted when the quantiles are computed on dense data. This is useful when samples differ in importance, for example in imbalanced datasets.

Changes made:

- Added sample_weight parameter to fit and _dense_fit.
- Implemented weighted quantile logic.
- Updated tests to check for correct behavior with and without sample_weight.
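To make "weighted quantile logic" concrete, here is a minimal pure-Python sketch of one common definition, an averaged inverted CDF. It is not the code from this PR; the function name `weighted_quantile` and its signature are hypothetical and chosen only for illustration. The property it illustrates is that integer weights behave like repeated samples.

```python
def weighted_quantile(values, weights, p):
    """Weighted quantile at level p in [0, 1] (averaged inverted CDF).

    Hypothetical helper for illustration only, not scikit-learn code.
    """
    # Drop zero-weight samples and sort the rest by value.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("at least one sample needs a positive weight")
    total = sum(w for _, w in pairs)

    def first_reaching(ordered, target):
        # First value whose cumulative weight reaches the target.
        cumulative = 0.0
        for value, weight in ordered:
            cumulative += weight
            if cumulative >= target:
                return value
        return ordered[-1][0]

    # Scan from below and from above, then average the two candidates.
    # They differ only when p * total falls exactly on a step of the CDF.
    lower = first_reaching(pairs, p * total)
    upper = first_reaching(pairs[::-1], (1.0 - p) * total)
    return 0.5 * (lower + upper)


values = [1.0, 2.0, 3.0]
weights = [1, 3, 2]
# Repeating each value `weight` times should give the same quantiles.
repeated = [v for v, w in zip(values, weights) for _ in range(w)]

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    weighted = weighted_quantile(values, weights, p)
    unweighted = weighted_quantile(repeated, [1] * len(repeated), p)
    print(p, weighted, unweighted)
```

Averaging the two scans keeps the result symmetric when the data are negated, which is one reason an averaged variant is attractive for a weighting/repetition equivalence check.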

Any other comments?

The implementation ensures backward compatibility.
Tests pass and maintain previous behavior when sample_weight is not provided.
Would appreciate feedback on edge cases or numerical accuracy concerns.

Thanks for the review!

github-actions · 2025-04-04T09:01:17Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

_Generated for commit: 06960cf. Link to the linter CI: here_

ogrisel · 2025-04-04T15:46:22Z

Thanks for the PR. Could you please use sklearn.utils.stats._averaged_weighted_percentile instead of reimplementing a new version of weighted quantiles?

For the unweighted case, we should use np.nanquantile/np.percentile with method="averaged_inverted_cdf" instead.

The two changes together should help make the weighting/repetition semantic check of check_sample_weight_equivalence_on_dense_data pass.
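As a rough illustration of why the quantile method matters for the repetition semantic, the sketch below compares NumPy's default `"linear"` method with `"averaged_inverted_cdf"` when every sample is duplicated, which corresponds to a uniform weight of 2. It uses only the public `np.percentile` API (the `method` argument needs NumPy 1.22 or later) and does not call the private scikit-learn helper; the toy data and variable names are assumptions made for the example.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])
x_repeated = np.repeat(x, 2)  # same data, every sample counted twice
levels = [10, 30, 50, 70, 90]  # percentile ranks to evaluate


def quantiles(data, method):
    # Thin wrapper so both methods are evaluated on the same grid.
    return np.percentile(data, levels, method=method)


# True if duplicating every sample changes the estimated quantiles.
linear_changes = not np.allclose(
    quantiles(x, "linear"), quantiles(x_repeated, "linear")
)
averaged_changes = not np.allclose(
    quantiles(x, "averaged_inverted_cdf"),
    quantiles(x_repeated, "averaged_inverted_cdf"),
)

print("linear changes under repetition:", linear_changes)
print("averaged_inverted_cdf changes under repetition:", averaged_changes)
```

On this toy input the linear method shifts after duplication while the averaged inverted CDF does not, because the latter depends only on the empirical CDF, which duplication leaves unchanged.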

Please mark check_sample_weight_equivalence_on_sparse_data XFAIL in the PER_ESTIMATOR_XFAIL_CHECKS dict.
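A sketch of what such an XFAIL entry could look like, assuming the dict maps an estimator class to a `{check name: reason}` mapping. The stand-in class, the dict name with the `_SKETCH` suffix, the reason text, and the lookup helper are all hypothetical so that the snippet runs without scikit-learn; the real entry would use the actual `QuantileTransformer` class.

```python
class QuantileTransformerStandIn:
    """Placeholder for the real estimator class (hypothetical)."""


# Assumed layout: estimator class -> {check name: reason for expected failure}.
PER_ESTIMATOR_XFAIL_CHECKS_SKETCH = {
    QuantileTransformerStandIn: {
        "check_sample_weight_equivalence_on_sparse_data": (
            # Hypothetical reason text.
            "sample_weight is only supported when fitting on dense data"
        ),
    },
}


def xfail_reason(estimator, check_name):
    """Return the recorded reason, or None if the check is expected to pass."""
    checks = PER_ESTIMATOR_XFAIL_CHECKS_SKETCH.get(type(estimator), {})
    return checks.get(check_name)


est = QuantileTransformerStandIn()
print(xfail_reason(est, "check_sample_weight_equivalence_on_sparse_data"))
print(xfail_reason(est, "check_sample_weight_equivalence_on_dense_data"))
```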

ogrisel · 2025-04-04T15:56:15Z

Please also don't forget to document your change in a changelog entry by adding a file under doc/whats_new/upcoming_changes. See #29907 for an example.

kaekkr · 2025-04-04T17:36:46Z

@ogrisel Okay, thank you for your reply! I will fix that.

Karassay and others added 3 commits April 3, 2025 12:53

add sample_weight parameter in fit function

841ed26

add support for sample_weight in QuantileTransformer

eb78c16

Merge branch 'main' into quantile-transformer-sample-weight

7d51a7d

github-actions bot added the module:preprocessing label Apr 4, 2025

ogrisel changed the title ~~Quantile transformer sample weight~~ Add sample_weight support for QuantileTransformer when fit on dense data Apr 4, 2025

Karassay and others added 2 commits April 10, 2025 12:29

remove custom weighted_percentile function, use _averaged_weighted_pe…

0b0f0e6

…rcentile, add XFAIL for sparse_data

Merge branch 'main' into quantile-transformer-sample-weight

06960cf

StefanieSenger added the Waiting for Reviewer label Jun 26, 2025
