API to predict multiple quantiles at once #23334

@ogrisel

Classifiers have a predict_proba method that makes it possible to quantify probabilistically the certainty in the predictions for a given input X_i.

Currently, most regressors in scikit-learn only predict a conditional expectation E[Y|X]. Some have a return_std option that also makes it possible to estimate sqrt(Var[Y|X]), which can be used to quantify the certainty when assuming a Gaussian predictive distribution (typically for Gaussian processes, which estimate a Gaussian posterior predictive distribution).

We do have pointwise quantile estimators (linear models, gradient boosting, hist gradient boosting) where the predict method returns a single point estimate for a target quantile passed as a hyper-parameter, instead of estimating an expectation.

Several people have expressed the need for a more generic API that can return an array of quantile estimates for a given input X_i.

The goal of this issue is to centralize the discussion of an API extension that would make this possible more uniformly in scikit-learn, either via a meta-estimator that wraps an array of point-wise quantile estimators to turn them into a quantile-array estimator, or by making the base estimators able to do this directly (and sometimes more efficiently).
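To make the meta-estimator option concrete, here is a minimal sketch. It is not an existing scikit-learn API: the class name `MultiQuantileRegressor`, the `predict_quantiles` method and the `quantile_param` argument are all hypothetical names chosen for illustration. It assumes the wrapped estimator exposes its target quantile as a constructor parameter (`alpha` for `GradientBoostingRegressor(loss="quantile")`).

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.ensemble import GradientBoostingRegressor


class MultiQuantileRegressor(BaseEstimator):
    """Hypothetical wrapper: fits one pointwise quantile estimator per quantile."""

    def __init__(self, estimator, quantiles=(0.05, 0.5, 0.95), quantile_param="alpha"):
        self.estimator = estimator
        self.quantiles = quantiles
        # Name of the wrapped estimator's quantile hyper-parameter (assumption).
        self.quantile_param = quantile_param

    def fit(self, X, y):
        self.estimators_ = []
        for q in self.quantiles:
            est = clone(self.estimator).set_params(**{self.quantile_param: q})
            self.estimators_.append(est.fit(X, y))
        return self

    def predict_quantiles(self, X):
        # One column per requested quantile: shape (n_samples, n_quantiles).
        return np.column_stack([est.predict(X) for est in self.estimators_])


# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = X.ravel() + rng.normal(scale=1.0, size=200)

base = GradientBoostingRegressor(loss="quantile", n_estimators=50, random_state=0)
model = MultiQuantileRegressor(base, quantiles=(0.1, 0.5, 0.9)).fit(X, y)
pred = model.predict_quantiles(X[:5])
```

Because each quantile is fitted independently, nothing in this sketch prevents the predicted quantiles from crossing, and the cost is one full fit per quantile. Both points are arguments for letting some base estimators handle several quantiles natively.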

A non-exhaustive list of related PRs and issues (feel free to add or suggest new ones):

Also related:

Furthermore, for models like Poisson regression that make a specific assumption about the conditional Y|X distribution, it would be possible to return estimates of the inverse-CDF values of the estimated Y|X distribution, for instance. Those could probably also benefit from an expanded API.
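As an illustration of the Poisson case, the sketch below does by hand what such an expanded API might expose: it takes the conditional mean predicted by `PoissonRegressor` and plugs it into the Poisson inverse CDF from SciPy. The synthetic data and the choice of quantile levels are assumptions made for the example.

```python
import numpy as np
from scipy.stats import poisson
from sklearn.linear_model import PoissonRegressor

# Synthetic count data, for illustration only.
rng = np.random.RandomState(0)
X = rng.uniform(0, 2, size=(500, 1))
y = rng.poisson(lam=np.exp(0.5 + X.ravel()))

model = PoissonRegressor(alpha=1e-3).fit(X, y)
mu = model.predict(X[:5])  # estimated E[Y|X], the Poisson rate

# Inverse CDF of Poisson(mu) at each level, broadcast to
# shape (n_samples, n_quantiles).
quantiles = np.array([0.05, 0.5, 0.95])
q_pred = poisson.ppf(quantiles[np.newaxis, :], mu[:, np.newaxis])
```

These quantiles are only as good as the Poisson assumption itself: if the data are over-dispersed, the resulting intervals will be too narrow.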

If we do this, then we have the side question of how to evaluate such multi-quantile models. We could probably extend the pinball_loss scorer to average the pinball scores for an array of quantiles for instance.

/cc @GaelVaroquaux @amueller @lorentzenchr
