Add transformation option to inspect functions #18309

mayer79 · 2020-08-31T15:25:22Z

Describe the workflow you want to enable

I like the inspect module very much. Sometimes, interpretation of a model is more natural on a different scale (e.g. log) than on the scale of the predictions.

Here are some examples:

We fit a GLM. It is often more natural to inspect the model on the linear scale of its link function, not on the scale of the response variable.
We fit a Poisson regression in XGBoost/LGB. These models are fitted on log scale, so it might be worth inspecting them on the log scale, even if it is only to compare with a benchmark GLM model.

Describe your proposed solution

Add an argument transformation to all explainers (partial dependence, permutation importance). By default, it is None (or the identity). The user can pass e.g. np.log to allow evaluation on the log scale.

Partial dependence plot: Here, it suffices to transform the predictions before averaging them.
Permutation importance: Here, both the response and the predictions need to be transformed. The scorer must be in line with the transformation and provided by the user.
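
The partial dependence case can be illustrated with a minimal brute-force sketch. It is not scikit-learn's implementation: `partial_dependence_1d` and its `transformation` argument are hypothetical names standing in for the proposed option, and the toy `predict` function only mimics a model that is additive on the log scale.

```python
import numpy as np


def partial_dependence_1d(predict, X, feature, grid, transformation=None):
    """Brute-force partial dependence of a single feature.

    `transformation` is the hypothetical argument proposed above: it is
    applied to the predictions before they are averaged.
    """
    X = np.asarray(X, dtype=float)
    averaged = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # fix the feature at the grid value
        pred = predict(X_mod)
        if transformation is not None:
            pred = transformation(pred)  # transform first ...
        averaged.append(pred.mean())  # ... then average
    return np.array(averaged)


# Toy "Poisson-like" model (an assumption): additive on the log scale.
def predict(X):
    return np.exp(0.5 * X[:, 0] - 0.2 * X[:, 1])


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd_response = partial_dependence_1d(predict, X, feature=0, grid=grid)
pd_log = partial_dependence_1d(
    predict, X, feature=0, grid=grid, transformation=np.log
)
```

On the log scale the curve is a straight line in the feature, while taking the log of the response-scale curve afterwards gives a different result, because averaging and transforming do not commute.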

Describe alternatives you've considered, if relevant

An alternative would be to change the prediction function of the Classifier/Regressor.

glemaitre · 2020-09-01T08:01:06Z

An alternative would be to change the prediction function of the Classifier/Regressor.

I would be more in favour of this and it would be easy for somebody to create an estimator that does such transform:

import numpy as np
from sklearn.base import BaseEstimator


class TransformPredictionRegressor(BaseEstimator):
    def __init__(self, estimator=None, transform=np.log):
        self.estimator = estimator
        self.transform = transform

    def fit(self, X, y):
        # TODO: validate the parameters
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        # Apply the transform to the wrapped estimator's predictions.
        return self.transform(self.estimator.predict(X))

mayer79 · 2020-09-01T12:04:52Z

Okay, agreed. The wish to apply such a transform in the explainers will arise as soon as we get interaction importance measures like Friedman's H. There, it will be essential to quickly switch to the "raw" level of predictions.

jnothman · 2020-09-01T12:38:31Z

I don't get that suggestion. y in fit and the return value of predict must be on the same scale, or metrics don't work.
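
A small numeric sketch of this point, using a hand-written mean squared error and made-up values (both are assumptions for illustration only): a model that predicts y exactly still gets a poor score once only its predictions are moved to the log scale.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Plain mean squared error, written out to keep the sketch self-contained."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


y = np.array([1.0, 10.0, 100.0])
perfect_pred = y.copy()  # a model that predicts y exactly

# Same scale on both sides: the metric behaves as expected.
mse_same_scale = mean_squared_error(y, perfect_pred)

# Only the predictions are log-transformed: the metric is meaningless.
mse_mixed_scale = mean_squared_error(y, np.log(perfect_pred))

# Both sides transformed: the metric is consistent again.
mse_both_log = mean_squared_error(np.log(y), np.log(perfect_pred))
```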

glemaitre · 2020-09-01T13:02:57Z

I don't get that suggestion. y in fit and the return value of predict must
be on the same scale, or metrics don't work.

To my limited knowledge, I agree. I thought that we developed the TransformedTargetRegressor specifically to avoid such evaluation in practice.

mayer79 · 2020-09-01T14:06:23Z

I don't get that suggestion. y in fit and the return value of predict must
be on the same scale, or metrics don't work.

To my limited knowledge, I agree. I thought that we developed the TransformedTargetRegressor specifically to avoid such evaluation in practice.

For permutation importance, being able to transform the scale is less important than for partial dependence (and for many other interpretability methods that are hopefully still to come). We could thus focus on partial dependence (and keep it in mind for upcoming extensions of inspection).

jnothman · 2020-09-01T21:51:32Z

It sounds like for permutation importance the TransformedTargetRegressor will suffice too? So maybe we should trial such a change on partial dependence? Would you be willing to attempt a pull request?

lorentzenchr · 2020-09-02T11:23:00Z

To make the argument explicit, let us assume a PoissonRegressor (similar for LogisticRegression) that predicts exp(a + b + c). On the "raw"/"log" scale, there are no interactions, i.e. a + b + c, whereas on the scale of y there is a 3-fold interaction exp(a) * exp(b) * exp(c). Therefore, it would be very convenient for a user if she or he could switch the analysis scale for partial dependence plots.
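
The argument can be checked numerically with a tiny sketch. The functions `log_pred` and `pred` and the chosen values of a, b, c are illustrative assumptions, not part of any estimator: the effect of changing a is the same for every b on the log scale, but depends on b on the scale of y.

```python
import math


def log_pred(a, b, c):
    """Prediction on the "raw"/log scale: purely additive."""
    return a + b + c


def pred(a, b, c):
    """Prediction on the scale of y: exp(a) * exp(b) * exp(c)."""
    return math.exp(log_pred(a, b, c))


# Effect of moving a from 0 to 1, evaluated at b = 0 and at b = 2.
effect_log_b0 = log_pred(1.0, 0.0, 0.0) - log_pred(0.0, 0.0, 0.0)
effect_log_b2 = log_pred(1.0, 2.0, 0.0) - log_pred(0.0, 2.0, 0.0)

effect_resp_b0 = pred(1.0, 0.0, 0.0) - pred(0.0, 0.0, 0.0)
effect_resp_b2 = pred(1.0, 2.0, 0.0) - pred(0.0, 2.0, 0.0)

# Log scale: identical effects, hence no interaction between a and b.
# Response scale: the effect at b = 2 is exp(2) times the effect at b = 0.
```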

TransformedTargetRegressor is a bit overkill. We don't want to fit at all, but only apply a transformation to the predict output of an already fitted model, i.e. a FunctionTransformer applied to the outcome of predict. But FunctionTransformer(lambda x: np.log(reg.predict(X))) doesn't work as it has no predict method.

import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.inspection import partial_dependence
from sklearn.linear_model import PoissonRegressor


class PredictionTransformRegressor(RegressorMixin, BaseEstimator):
    """Apply transformation to an already fitted estimator."""

    def __init__(self, estimator=None, transform=np.log):
        self.estimator = estimator
        self.transform = transform

    def fit(self, X, y):
        # TODO: validate the parameters
        # This should be a no-op, i.e. we do not fit anything.
        return self

    def __sklearn_is_fitted__(self):
        # Report as fitted so that check_is_fitted(self) passes without fit.
        return True

    def predict(self, X):
        # We do not need to check if self was fitted.
        return self.transform(self.estimator.predict(X))  # will call check_is_fitted(estimator)


reg = PoissonRegressor().fit(X, y)
transformed_reg = PredictionTransformRegressor(reg, transform=np.log)
partial_dependence(transformed_reg, X=X, features=...)

I wonder if a simple option in partial_dependence would be more user friendly.

lorentzenchr · 2024-02-16T16:25:13Z

I still think this is very important for usability. To be more explicit, we are discussing sklearn.inspection.partial_dependence and sklearn.inspection.PartialDependenceDisplay.

Options are:

Add an argument like transform_to_link_space="auto" | "identity" | "log" | "logit" | callable (todo: better name!!!)
Extend allowed values of method, e.g. method="_linear_predictor" for PoissonRegressor.
- advantage: Could be faster
- disadvantage: it's either a private method (we could change that) or one needs to add such a function to an estimator (how does that impact pickle?).
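
As a rough sketch of how the first option could resolve its argument: the helper `resolve_transform` and the lookup table below are hypothetical and not part of scikit-learn. The "auto" value is deliberately left out, since it would need to know the link function of the estimator at hand.

```python
import numpy as np

# Hypothetical mapping from option names to functions that map predictions
# to the link space.
_NAMED_TRANSFORMS = {
    "identity": lambda p: p,
    "log": np.log,
    "logit": lambda p: np.log(p) - np.log1p(-p),
}


def resolve_transform(transform):
    """Return a callable for a named transform or a user-supplied callable."""
    if callable(transform):
        return transform
    if isinstance(transform, str) and transform in _NAMED_TRANSFORMS:
        return _NAMED_TRANSFORMS[transform]
    raise ValueError(
        f"transform must be a callable or one of {sorted(_NAMED_TRANSFORMS)}; "
        f"got {transform!r}."
    )


to_link = resolve_transform("logit")
link_values = to_link(np.array([0.25, 0.5, 0.75]))
```

A helper like this keeps the string shortcuts and the callable escape hatch behind a single code path, so the averaging step does not need to know which form the user passed.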

mayer79 · 2024-02-17T09:06:41Z

This would be extremely useful. In the currently available inspection tools, it is indeed only partial dependence that would need to be touched. (Later also the H-statistics drafted in this PR.)

In this jupyter notebook, we use a custom LogRegressor class to see that certain ICE curves are parallel in a Poisson boosted trees model with interaction constraints.

lorentzenchr · 2024-02-17T11:18:06Z

@glemaitre @adrinjalali @NicolasHug @thomasjpfan Do you have preferences or other proposals? See #18309 (comment).

thomasjpfan · 2024-02-17T16:38:41Z

For PoissonRegressor, is it enough to define a decision_function that outputs the raw scale? Then we can use response_method="decision_function" with partial_dependence to get the desired plots.

mayer79 added the New Feature label Aug 31, 2020

cmarmo added the module:inspection label Feb 5, 2021

lorentzenchr added the Needs Decision Requires decision label Nov 15, 2021

lorentzenchr mentioned this issue Oct 9, 2022

ENH FEA add interaction constraints to HGBT #21020

Merged

7 tasks

lorentzenchr mentioned this issue Jun 3, 2024

RFC make response / inverse link / activation function official #29169

Open
