DOC Clarify usage of d2_pinball_score with model selection tools #31239


Open

wants to merge 3 commits into main from doc/update-d2-pinball-score

Conversation

@MaddyRizvi commented Apr 22, 2025

Reference Issues/PRs

Towards #28671
This PR addresses confusion among users who try to pass d2_pinball_score directly as a string value for the scoring parameter in model selection APIs such as GridSearchCV and RandomizedSearchCV.

Although d2_pinball_score is a valid scoring function, it is not registered as a string scorer and must be wrapped using make_scorer. This was not clearly explained in the docstring.

What does this implement/fix? Explain your changes.

This improves the documentation of sklearn.metrics.d2_pinball_score by:

Adding a note to clarify that this metric is not a valid string identifier for the scoring parameter in model selection tools.

Providing a code example showing how to use make_scorer to wrap the function correctly for use with GridSearchCV or RandomizedSearchCV.

Adding a usage snippet under the Examples section for easy discoverability.

This change is intended to make usage of d2_pinball_score more transparent and reduce common user errors and confusion.

Any other comments?

This PR does not affect the behavior of the function or its API — it is documentation-only.

The motivation arose from real-world usage in probabilistic forecasting, where d2_pinball_score is useful but hard to integrate into model selection workflows due to missing documentation around make_scorer.

github-actions bot commented Apr 22, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: fe85a06.

Comment on lines 1753 to 1761
This metric is not a built-in scoring string for use in model selection
tools such as `GridSearchCV` or `RandomizedSearchCV`.

To use it as a custom scoring function, wrap it using
:func:`~sklearn.metrics.make_scorer`:

>>> from sklearn.metrics import make_scorer, d2_pinball_score
>>> scorer = make_scorer(d2_pinball_score, alpha=0.95)
>>> # Then use it as `scoring=scorer` in RandomizedSearchCV or GridSearchCV
Contributor:
Suggested change (drop the inline doctest so the note points to the Examples section instead):

    This metric is not a built-in scoring string for use in model selection
    tools such as `GridSearchCV` or `RandomizedSearchCV`.
    To use it as a custom scoring function, wrap it using
    :func:`~sklearn.metrics.make_scorer`. See Examples for details.

We can keep the code snippet in the examples section.

Author:

Moved the scorer usage example to the Examples section and cleaned up the Notes.
Thanks for the feedback — ready for your next review whenever you have time!

Comment on lines 1786 to 1788
>>> # Using with make_scorer
>>> from sklearn.metrics import make_scorer
>>> scorer = make_scorer(d2_pinball_score, alpha=0.95)
Contributor:
Suggested change (expand the snippet):

    Using with :func:`~sklearn.metrics.make_scorer`:

    >>> from sklearn.metrics import make_scorer, d2_pinball_score
    >>> pinball_95_scorer = make_scorer(d2_pinball_score, alpha=0.95)
    >>> from sklearn.model_selection import GridSearchCV
    >>> from sklearn.svm import LinearSVC
    >>> grid = GridSearchCV(
    ...     LinearSVC(),
    ...     param_grid={"C": [1, 10]},
    ...     scoring=pinball_95_scorer,
    ...     cv=5,
    ... )

Expand the example a bit.

Author:

Expanded the Examples section to show how to use d2_pinball_score with
make_scorer and GridSearchCV, and clarified in the Notes that
d2_pinball_score is not a built-in scorer string.

Member:

@yuanx749 note that LinearSVC is a classifier (suited to model discrete class observations in the target variable), while d2_pinball_score is a metric for quantile regression problems. Better use a (quantile) regression model in the example.

@MaddyRizvi force-pushed the doc/update-d2-pinball-score branch from e4f733b to fe85a06 on April 26, 2025, 15:04
@ogrisel (Member) left a comment:

Thanks for the PR. Overall this looks good to me, but the failures reported by the continuous integration need to be addressed. Please find details and further suggestions below:

>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> d2_pinball_score(y_true, y_pred, alpha=0.95)
0.968...
Member:

This should help fix the broken tests.

Suggested change: replace the expected doctest output `0.968...` with `0.578...`.
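The corrected value can be double-checked without sklearn. Below is a minimal pure-Python sketch, assuming the score is one minus the ratio of the mean pinball loss of the predictions to that of a null model predicting the empirical alpha-quantile of y_true (linear interpolation, NumPy's default method):

```python
# Hand-rolled check of the corrected doctest value (no sklearn needed).
# Assumption: the null model predicts the empirical alpha-quantile of
# y_true, computed with linear interpolation.

def pinball_loss(y_true, y_pred, alpha):
    # Mean pinball (quantile) loss.
    return sum(
        alpha * max(t - p, 0.0) + (1 - alpha) * max(p - t, 0.0)
        for t, p in zip(y_true, y_pred)
    ) / len(y_true)

def quantile_linear(values, q):
    # q-quantile with linear interpolation between order statistics.
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def d2_pinball(y_true, y_pred, alpha):
    null_pred = [quantile_linear(y_true, alpha)] * len(y_true)
    return 1.0 - pinball_loss(y_true, y_pred, alpha) / pinball_loss(
        y_true, null_pred, alpha
    )

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(d2_pinball(y_true, y_pred, 0.95))  # ~0.5785, matching 0.578...
```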

Member:

Next time, please run pytest --doctest-modules path/to/the/code/you/change.py when editing doctests.

Alternatively, read the logs of the failing continuous integration reports linked from the PR to find out what caused the failures.

... scoring=pinball_95_scorer,
... cv=2,
... )
>>> _ = grid.fit(X, y)
Member:

Maybe you could display the value of grid.best_params_ to make the example more complete. E.g. something like the following:

    >>> grid.fit(X, y).best_params_
    {'fit_intercept': True}

Run the doctest locally to check that this is actually the best param:

$ pytest -v --doctest-modules sklearn/metrics/_regression.py


>>> X = np.array([[1], [2], [3], [4]])
>>> y = np.array([2.5, 0.0, 2, 8])
>>> grid = GridSearchCV(
... LinearRegression(),
Member:

It would make more sense to tune the fit_intercept parameter of QuantileRegressor(quantile=0.95) instead of LinearRegression. LinearRegression predicts an estimate of E[y|X] instead of an estimate of Q_{0.95}(y|X).
