FEA Add TransformedTargetClassifier #29952

Open: wants to merge 23 commits into base: main
1 change: 1 addition & 0 deletions doc/api_reference.py
@@ -186,6 +186,7 @@ def _get_submodule(module_name, submodule_name):
 "title": None,
 "autosummary": [
 "ColumnTransformer",
+"TransformedTargetClassifier",
 "TransformedTargetRegressor",
 "make_column_selector",
 "make_column_transformer",
19 changes: 8 additions & 11 deletions doc/faq.rst
@@ -214,19 +214,16 @@ for an example of working with heterogeneous (e.g. categorical and numeric) data

 Do you plan to implement transform for target ``y`` in a pipeline?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Currently transform only works for features ``X`` in a pipeline. There's a
-long-standing discussion about not being able to transform ``y`` in a pipeline.
-Follow on GitHub issue :issue:`4143`. Meanwhile, you can check out
-:class:`~compose.TransformedTargetRegressor`,
-`pipegraph <https://github.com/mcasl/PipeGraph>`_,
-and `imbalanced-learn <https://github.com/scikit-learn-contrib/imbalanced-learn>`_.
-Note that scikit-learn solved for the case where ``y``
+Only the features ``X`` can be transformed in a pipeline and scikit-learn will
+not support arbitrary transformation of the target ``y`` in a pipeline. However,
+please check out :class:`~compose.TransformedTargetClassifier` and
+:class:`~compose.TransformedTargetRegressor` for the case where ``y``
 has an invertible transformation applied before training
-and inverted after prediction. scikit-learn intends to solve for
-use cases where ``y`` should be transformed at training time
-and not at test time, for resampling and similar uses, like at
+and inverted after prediction. For use cases where ``y`` should be transformed
+at training time and not at test time, such as resampling and similar uses, have
+a look at
 `imbalanced-learn <https://github.com/scikit-learn-contrib/imbalanced-learn>`_.
-In general, these use cases can be solved
+In general, other use cases can be solved
 with a custom meta estimator rather than a :class:`~pipeline.Pipeline`.

 Why are there so many different estimators for linear models?
1 change: 1 addition & 0 deletions doc/metadata_routing.rst
@@ -276,6 +276,7 @@ Meta-estimators and functions supporting metadata routing:

 - :class:`sklearn.calibration.CalibratedClassifierCV`
 - :class:`sklearn.compose.ColumnTransformer`
+- :class:`sklearn.compose.TransformedTargetClassifier`
 - :class:`sklearn.compose.TransformedTargetRegressor`
 - :class:`sklearn.covariance.GraphicalLassoCV`
 - :class:`sklearn.ensemble.StackingClassifier`
@@ -0,0 +1,4 @@
+- Added :class:`compose.TransformedTargetClassifier` which transforms the
+target y before fitting a classification model. The predictions are
+mapped back to the original space via an inverse transform.
+By :user:`Guillaume Tauzin <gtauzin>`
68 changes: 34 additions & 34 deletions doc/whats_new/v1.6.rst
@@ -748,38 +748,38 @@ Python and CPython ecosystem, for example :user:`Nathan Goldbaum <ngoldbaum>`,
 Thanks to everyone who has contributed to the maintenance and improvement of
 the project since version 1.5, including:

-Aaron Schumacher, Abdulaziz Aloqeely, abhi-jha, Acciaro Gennaro Daniele, Adam
-J. Stewart, Adam Li, Adeel Hassan, Adeyemi Biola, Aditi Juneja, Adrin Jalali,
-Aisha, Akanksha Mhadolkar, Akihiro Kuno, Alberto Torres, alexqiao, Alihan
-Zihna, Aniruddha Saha, antoinebaker, Antony Lee, Anurag Varma, Arif Qodari,
-Arthur Courselle, ArthurDbrn, Arturo Amor, Aswathavicky, Audrey Flanders,
-aurelienmorgan, Austin, awwwyan, AyGeeEm, a.zy.lee, baggiponte, BlazeStorm001,
-bme-git, Boney Patel, brdav, Brigitta Sipőcz, Cailean Carter, Camille
-Troillard, Carlo Lemos, Christian Lorentzen, Christian Veenhuis, Christine P.
-Chai, claudio, Conrad Stevens, datarollhexasphericon, Davide Chicco, David
-Matthew Cherney, Dea María Léon, Deepak Saldanha, Deepyaman Datta,
-dependabot[bot], dinga92, Dmitry Kobak, Domenico, Drew Craeton, dymil, Edoardo
-Abati, EmilyXinyi, Eric Larson, Evelyn, fabianhenning, Farid "Freddie" Taba,
-Gael Varoquaux, Giorgio Angelotti, Gleb Levitski, Guillaume Lemaitre, Guntitat
-Sawadwuthikul, Haesun Park, Hanjun Kim, Henrique Caroço, hhchen1105, Hugo
-Boulenger, Ilya Komarov, Inessa Pawson, Ivan Pan, Ivan Wiryadi, Jaimin Chauhan,
-Jakob Bull, James Lamb, Janez Demšar, Jérémie du Boisberranger, Jérôme
-Dockès, Jirair Aroyan, João Morais, Joe Cainey, Joel Nothman, John Enblom,
-JorgeCardenas, Joseph Barbier, jpienaar-tuks, Julian Chan, K.Bharat Reddy,
-Kevin Doshi, Lars, Loic Esteve, Lucas Colley, Lucy Liu, lunovian, Marc Bresson,
-Marco Edward Gorelli, Marco Maggi, Marco Wolsza, Maren Westermann,
-MarieS-WiMLDS, Martin Helm, Mathew Shen, mathurinm, Matthew Feickert, Maxwell
-Liu, Meekail Zain, Michael Dawson, Miguel Cárdenas, m-maggi, mrastgoo, Natalia
-Mokeeva, Nathan Goldbaum, Nathan Orgera, nbrown-ScottLogic, Nikita Chistyakov,
-Nithish Bolleddula, Noam Keidar, NoPenguinsLand, Norbert Preining, notPlancha,
-Olivier Grisel, Omar Salman, ParsifalXu, Piotr, Priyank Shroff, Priyansh Gupta,
-Quentin Barthélemy, Rachit23110261, Rahil Parikh, raisadz, Rajath,
-renaissance0ne, Reshama Shaikh, Roberto Rosati, Robert Pollak, rwelsch427,
-Santiago Castro, Santiago M. Mola, scikit-learn-bot, sean moiselle, SHREEKANT
-VITTHAL NANDIYAWAR, Shruti Nath, Søren Bredlund Caspersen, Stefanie Senger,
-Stefano Gaspari, Steffen Schneider, Štěpán Sršeň, Sylvain Combettes,
-Tamara, Thomas, Thomas Gessey-Jones, Thomas J. Fan, Thomas Li, ThorbenMaa,
-Tialo, Tim Head, Tuhin Sharma, Tushar Parimi, Umberto Fasci, UV, vedpawar2254,
-Velislav Babatchev, Victoria Shevchenko, viktor765, Vince Carey, Virgil Chan,
-Wang Jiayi, Xiao Yuan, Xuefeng Xu, Yao Xiao, yareyaredesuyo, Zachary Vealey,
+Aaron Schumacher, Abdulaziz Aloqeely, abhi-jha, Acciaro Gennaro Daniele, Adam
+J. Stewart, Adam Li, Adeel Hassan, Adeyemi Biola, Aditi Juneja, Adrin Jalali,
+Aisha, Akanksha Mhadolkar, Akihiro Kuno, Alberto Torres, alexqiao, Alihan
+Zihna, Aniruddha Saha, antoinebaker, Antony Lee, Anurag Varma, Arif Qodari,
+Arthur Courselle, ArthurDbrn, Arturo Amor, Aswathavicky, Audrey Flanders,
+aurelienmorgan, Austin, awwwyan, AyGeeEm, a.zy.lee, baggiponte, BlazeStorm001,
+bme-git, Boney Patel, brdav, Brigitta Sipőcz, Cailean Carter, Camille
+Troillard, Carlo Lemos, Christian Lorentzen, Christian Veenhuis, Christine P.
+Chai, claudio, Conrad Stevens, datarollhexasphericon, Davide Chicco, David
+Matthew Cherney, Dea María Léon, Deepak Saldanha, Deepyaman Datta,
+dependabot[bot], dinga92, Dmitry Kobak, Domenico, Drew Craeton, dymil, Edoardo
+Abati, EmilyXinyi, Eric Larson, Evelyn, fabianhenning, Farid "Freddie" Taba,
+Gael Varoquaux, Giorgio Angelotti, Gleb Levitski, Guillaume Lemaitre, Guntitat
+Sawadwuthikul, Haesun Park, Hanjun Kim, Henrique Caroço, hhchen1105, Hugo
+Boulenger, Ilya Komarov, Inessa Pawson, Ivan Pan, Ivan Wiryadi, Jaimin Chauhan,
+Jakob Bull, James Lamb, Janez Demšar, Jérémie du Boisberranger, Jérôme
+Dockès, Jirair Aroyan, João Morais, Joe Cainey, Joel Nothman, John Enblom,
+JorgeCardenas, Joseph Barbier, jpienaar-tuks, Julian Chan, K.Bharat Reddy,
+Kevin Doshi, Lars, Loic Esteve, Lucas Colley, Lucy Liu, lunovian, Marc Bresson,
+Marco Edward Gorelli, Marco Maggi, Marco Wolsza, Maren Westermann,
+MarieS-WiMLDS, Martin Helm, Mathew Shen, mathurinm, Matthew Feickert, Maxwell
+Liu, Meekail Zain, Michael Dawson, Miguel Cárdenas, m-maggi, mrastgoo, Natalia
+Mokeeva, Nathan Goldbaum, Nathan Orgera, nbrown-ScottLogic, Nikita Chistyakov,
+Nithish Bolleddula, Noam Keidar, NoPenguinsLand, Norbert Preining, notPlancha,
+Olivier Grisel, Omar Salman, ParsifalXu, Piotr, Priyank Shroff, Priyansh Gupta,
+Quentin Barthélemy, Rachit23110261, Rahil Parikh, raisadz, Rajath,
+renaissance0ne, Reshama Shaikh, Roberto Rosati, Robert Pollak, rwelsch427,
+Santiago Castro, Santiago M. Mola, scikit-learn-bot, sean moiselle, SHREEKANT
+VITTHAL NANDIYAWAR, Shruti Nath, Søren Bredlund Caspersen, Stefanie Senger,
+Stefano Gaspari, Steffen Schneider, Štěpán Sršeň, Sylvain Combettes,
+Tamara, Thomas, Thomas Gessey-Jones, Thomas J. Fan, Thomas Li, ThorbenMaa,
+Tialo, Tim Head, Tuhin Sharma, Tushar Parimi, Umberto Fasci, UV, vedpawar2254,
+Velislav Babatchev, Victoria Shevchenko, viktor765, Vince Carey, Virgil Chan,
+Wang Jiayi, Xiao Yuan, Xuefeng Xu, Yao Xiao, yareyaredesuyo, Zachary Vealey,
 Ziad Amerr
3 changes: 2 additions & 1 deletion sklearn/compose/__init__.py
@@ -13,11 +13,12 @@
 make_column_selector,
 make_column_transformer,
 )
-from ._target import TransformedTargetRegressor
+from ._target import TransformedTargetClassifier, TransformedTargetRegressor

 __all__ = [
 "ColumnTransformer",
 "make_column_transformer",
+"TransformedTargetClassifier",
 "TransformedTargetRegressor",
 "make_column_selector",
 ]
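
For context, the FAQ text in this PR describes a target ``y`` that has an invertible transformation applied before training and inverted after prediction. The existing `TransformedTargetRegressor` already works this way, so a minimal sketch of the pattern might look like the following. The synthetic data and the log/exp pair are assumptions chosen for illustration; they do not come from this PR.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Illustrative synthetic data (an assumption, not from the PR):
# y grows exponentially with X, so log(y) is exactly linear in X.
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.exp(1.0 + 0.5 * X[:, 0])

# The wrapper applies `func` to y before fitting the inner regressor and
# applies `inverse_func` to the inner regressor's predictions.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log,
    inverse_func=np.exp,
)
model.fit(X, y)

# Predictions are returned in the original (untransformed) target space.
y_pred = model.predict(X)
```

The changelog entry describes `TransformedTargetClassifier` as following the same transform-then-invert idea for classification models. Its parameters are not visible in this excerpt, so the sketch deliberately uses only the released regressor API.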