FEA Add TransformedTargetClassifier #29952

Open: gtauzin wants to merge 23 commits into main

Conversation

@gtauzin gtauzin commented Sep 27, 2024

Reference Issues/PRs

Fixes #20952.

What does this implement/fix? Explain your changes.

This PR adds TransformedTargetClassifier, a classification counterpart to the regression-oriented TransformedTargetRegressor.

Task list

  • Implement base class for both TransformedTargetRegressor and TransformedTargetClassifier
  • Deprecate regressor in favor of estimator in TransformedTargetRegressor
  • Add TransformedTargetClassifier to api reference
  • Check docstring
  • Update documentation wherever relevant
  • Allow proper input validation
  • Enable hyperparameter validation with _parameter_constraints
  • Ensure consistency w.r.t. estimator tags
  • Enable metadata routing
  • Add test in tests/test_metaestimators_metadata_routing.py
  • Make sure modified docs render nicely
  • Update changelog
  • Add tests
  • Update user guide
  • Ensure compatibility with multi-label targets

Important points

  • I created a base class BaseTransformedTarget, in a similar spirit to what is done for the bagging estimators with BaseBagging. However, this introduces a breaking change for TransformedTargetRegressor: the estimator constructor argument is renamed from regressor to estimator. I deprecated the former in favor of the latter.
  • For TransformedTargetRegressor, the transformer is meant to be any sklearn transformer and therefore accepts 2d inputs, but things are not so clear for TransformedTargetClassifier. Natural transformers in that case would be label transformers: LabelEncoder, LabelBinarizer and MultiLabelBinarizer, which will warn if the input is 2d. To avoid that, I can make sure the input is 1d if input_tags.two_d_array = False.
  • As the historical issue Allow for Transformers on y #4143 is now closed, I updated the FAQ. I took the liberty of removing the reference to pipegraph in the same section, as it has not been maintained in the past 5 years. Please let me know if this is not fine, and I'll add it back.
  • Following the discussion in TransformedTargetClassifier #20952, it seems to me the main use case is to make it possible to use LabelEncoder along with classifiers that do not use it internally (so external to sklearn, e.g. XGBClassifier from xgboost). As it only makes sense along with 3rd party classifiers, I feel I should probably not try to add an example to the gallery.
  • I have added some tests, but to make a proper test plan I need to better understand what the potential use cases are. For example, should I ensure it works for multi-label targets? What forms of y are allowed?
  • I am happy to update the user guide, but would prefer to get some feedback from maintainers first.
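The 2d-input warning mentioned for label transformers can be reproduced with a short sketch. It assumes a recent scikit-learn release and only shows LabelEncoder; the array `y_2d` and the variable names are illustrative and not taken from the PR.

```python
import warnings

import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.preprocessing import LabelEncoder

# Illustrative column-vector target, shape (3, 1)
y_2d = np.array([["cat"], ["dog"], ["cat"]])

# Fitting on the column vector is expected to emit a DataConversionWarning
with warnings.catch_warnings(record=True) as caught_2d:
    warnings.simplefilter("always")
    LabelEncoder().fit(y_2d)
warned_2d = any(issubclass(w.category, DataConversionWarning) for w in caught_2d)

# Flattening to 1d before calling the encoder avoids the warning
with warnings.catch_warnings(record=True) as caught_1d:
    warnings.simplefilter("always")
    encoded = LabelEncoder().fit_transform(y_2d.ravel())
warned_1d = any(issubclass(w.category, DataConversionWarning) for w in caught_1d)

print(warned_2d, warned_1d, encoded)
```

Flattening with `ravel()` here stands in for whatever 1d conversion the meta-estimator would apply; how the PR actually does this is not shown in this thread.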

Please don't hesitate to leave any feedback, I am here to learn :)
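For readers unfamiliar with the use case described above, here is a minimal sketch of the idea: encode the labels with LabelEncoder before fitting, and decode the predictions afterwards. The class name `LabelEncodedClassifier` is hypothetical and is not the class added by this PR; LogisticRegression merely stands in for a third-party classifier that expects integer labels.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder


class LabelEncodedClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical illustration only, not the PR's TransformedTargetClassifier."""

    def __init__(self, estimator=None):
        self.estimator = estimator

    def fit(self, X, y):
        # Map arbitrary labels to integers 0..n_classes-1
        self.encoder_ = LabelEncoder().fit(y)
        base = LogisticRegression() if self.estimator is None else self.estimator
        # The wrapped classifier only ever sees the integer-encoded target
        self.estimator_ = clone(base).fit(X, self.encoder_.transform(y))
        self.classes_ = self.encoder_.classes_
        return self

    def predict(self, X):
        # Map integer predictions back to the original labels
        return self.encoder_.inverse_transform(self.estimator_.predict(X))


X = np.array([[0.0], [0.1], [0.9], [1.0]])
y = np.array(["no", "no", "yes", "yes"])

clf = LabelEncodedClassifier(LogisticRegression()).fit(X, y)
pred = clf.predict(X)
print(clf.classes_, clf.estimator_.classes_, pred)
```

The sketch leaves out predict_proba, input validation, parameter constraints and metadata routing, all of which appear in the task list above.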

Signed-off-by: Guillaume Tauzin <4648633+gtauzin@users.noreply.github.com>
@gtauzin gtauzin changed the title from "Add TransformedTargetClassifier" to "ENH Add TransformedTargetClassifier" on Sep 27, 2024

github-actions bot commented Sep 27, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 8943c5f. Link to the linter CI: here

@gtauzin gtauzin changed the title from "ENH Add TransformedTargetClassifier" to "[WIP] FEA Add TransformedTargetClassifier" on Sep 29, 2024
@oelhammouchi

Hi, came across your work while looking for this functionality. Just wondering whether you're planning to finish this up? I'm considering having a go at a PR of my own otherwise. Thanks!

gtauzin commented Feb 24, 2025

Hi @oelhammouchi, sorry it took so long, I had to focus on other things lately.

I am planning to take this up again in the next few weeks. The TransformedTargetClassifier itself is basically implemented and I just want to add a few tests before I ask for a review.

@oelhammouchi

Great, then I'll hold off on it. Thanks again!

gtauzin added 3 commits March 1, 2025 11:43
@gtauzin gtauzin changed the title from "[WIP] FEA Add TransformedTargetClassifier" to "FEA Add TransformedTargetClassifier" on Mar 1, 2025
@gtauzin gtauzin marked this pull request as ready for review March 1, 2025 11:16
gtauzin commented Mar 1, 2025

There is a single test failing: sklearn/tests/test_common.py::test_estimators[TransformedTargetClassifier()-check_classifiers_classes]. With the current default, the effective transformer is set to a FunctionTransformer with validate=True, which ends up calling check_array without allowing non-numeric dtypes. The common classification test passes string targets.

I am not sure what the best way to tackle that is (changing the default, removing validation, etc.), so I'll wait for the maintainers' guidance.
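The failure mode described above can be reproduced in isolation with a small sketch, assuming a recent scikit-learn release where numeric validation rejects string arrays. The target array is illustrative, and the snippet exercises FunctionTransformer directly rather than the PR's code.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Illustrative string target reshaped to 2d, as a transformer would receive it
y = np.array(["a", "b", "a"]).reshape(-1, 1)

# With validate=True the input is checked as numeric, so strings are rejected
try:
    FunctionTransformer(validate=True).fit_transform(y)
    rejected = False
except ValueError:
    rejected = True

# With validate=False the identity transformer passes the strings through
passed_through = FunctionTransformer(validate=False).fit_transform(y)

print(rejected, passed_through.ravel())
```

This only illustrates the two options named in the comment (keeping or removing validation); which default the PR should adopt is left to the maintainers.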

Development

Successfully merging this pull request may close these issues.

TransformedTargetClassifier
2 participants