Create a way to transform the target variable within a custom sklearn transformer. #26936

amorimds · 2023-07-29T14:07:16Z

Describe the workflow you want to enable

It is not uncommon for the target variable in the raw dataset to be in a format that is not ideal for fitting the estimator:

In multiclass classification, we may need to apply a custom encoding.
In regression, we may want to scale the target.

It is good practice to keep all data transformations within the sklearn pipeline. This ensures that the model can accept the raw features and target as input when performing streaming predictions. If the transformations are not all concentrated in the sklearn pipeline, the input data for online requests must first pass through a separate preprocessing pipeline, which adds a lot of unnecessary complexity. If a transformation in this preprocessing pipeline needs to be stateful (learning its parameters from the training dataset through fit), creating such a preprocessing pipeline becomes even more complicated.

Describe your proposed solution

Provide a way for a transformer to change the target variable and pass it forward as y.
The default behavior of a transformer should not change, and by default it should not return y:

If the transformer doesn't return y (the default), then we can assume that y did not change.
If it returns X, y, we should replace the old y with the returned y.
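
To make the two rules above concrete, here is a minimal pure-Python sketch of how a pipeline loop could apply them. It is not sklearn code: `run_steps`, `DoubleFeatures`, and `LogTarget` are hypothetical names invented for illustration, and treating "a 2-tuple" as the signal for "returned X, y" is an assumption (a real design would need a less ambiguous convention).

```python
import math


class DoubleFeatures:
    """Hypothetical ordinary step: returns only X, so y is left untouched."""

    def fit_transform(self, X, y=None):
        return [[2 * value for value in row] for row in X]


class LogTarget:
    """Hypothetical step that also changes the target and returns (X, y)."""

    def fit_transform(self, X, y=None):
        return X, [math.log(value) for value in y]


def run_steps(steps, X, y):
    """Apply each step in order, replacing y only when a step returns (X, y)."""
    for step in steps:
        out = step.fit_transform(X, y)
        if isinstance(out, tuple) and len(out) == 2:
            # Assumed convention: a 2-tuple means the step returned (X, y).
            X, y = out
        else:
            # Default behavior: only X changed, y is carried forward as-is.
            X = out
    return X, y


X_raw = [[1.0], [2.0]]
y_raw = [1.0, math.e]
X_out, y_out = run_steps([DoubleFeatures(), LogTarget()], X_raw, y_raw)
```

With this convention, existing transformers that return only X keep working unchanged, which is the backward-compatibility property the proposal asks for.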

Describe alternatives you've considered, if relevant

No response

Additional context

No response

thomasjpfan · 2023-07-29T15:45:33Z

For transforming regression targets there is TransformedTargetRegressor.
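
As a rough illustration of the regression case, the sketch below wraps a `LinearRegression` in `TransformedTargetRegressor` so that the target is log-transformed before fitting and predictions are mapped back to the original scale. The toy data and the choice of `np.log` / `np.exp` are assumptions made for the example, not something taken from the issue.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Toy data (assumed for illustration): y is exponential in X,
# so log(y) is exactly linear in X.
X = np.arange(1, 21, dtype=float).reshape(-1, 1)
y = np.exp(0.1 * X.ravel() + 1.0)

# The wrapper applies `func` to y before fitting the inner regressor
# and applies `inverse_func` to the inner regressor's predictions.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log,
    inverse_func=np.exp,
)
model.fit(X, y)

# Predictions come back on the original (untransformed) scale of y.
pred = model.predict(X)
```

A stateful transformation such as scaling can be passed through the `transformer` parameter instead of `func` / `inverse_func`.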

For generically transforming y there is already an open issue: #4143. According to our triaging guidelines, I am closing this issue as a duplicate. You are welcome to continue the discussion in #4143.

amorimds added the Needs Triage and New Feature labels Jul 29, 2023

amorimds changed the title ~~Create a way to transform the target valued within a custom sklearn transformer.~~ Create a way to transform the target variable within a custom sklearn transformer. Jul 29, 2023

thomasjpfan closed this as completed Jul 29, 2023
