Fix LabelEncoder set_output method availability #31943
Open
Reference Issues/PRs
Fixes #26711
What does this implement/fix? Explain your changes.
This PR fixes the `LabelEncoder.set_output()` method availability issue, where the method was listed in the documentation but raised an `AttributeError` when called.
Problem:
- `LabelEncoder` inherited `set_output` from `TransformerMixin`, but it was conditionally unavailable due to the `@available_if(_auto_wrap_is_configured)` decorator
- The decorator requires a `get_feature_names_out` method and proper `auto_wrap_output_keys` configuration, both of which were missing
- Users saw `set_output` in the documentation but got runtime errors when trying to use it
Solution:
- Removed `auto_wrap_output_keys=None` from the `LabelEncoder` class definition, which was explicitly blocking the auto-wrapping functionality
- Added a `get_feature_names_out()` method that returns appropriate feature names for the single output that `LabelEncoder` produces
Implementation Details:
- `get_feature_names_out()` returns a single-element array with either the first input feature name (if provided) or a default name, `'labelencoder_output'`
- These changes satisfy the `_auto_wrap_is_configured()` condition in `sklearn/utils/_set_output.py`
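The gating behavior and the naming rule described above can be illustrated with a small, dependency-free sketch. This is not scikit-learn's code: `sketch_available_if`, `_sketch_is_configured`, `SketchMixin`, `BlockedEncoder`, `FixedEncoder`, and the `_sketch_auto_wrap_output_keys` attribute are all hypothetical stand-ins, and the sketch returns a plain list where the PR describes an array. It only assumes, as the description states, that `set_output` is exposed when the estimator has `get_feature_names_out` and auto-wrapping is not disabled.

```python
import types


def _sketch_is_configured(estimator):
    # Hypothetical stand-in for the gate: both conditions the PR names must hold.
    keys = getattr(estimator, "_sketch_auto_wrap_output_keys", None)
    return hasattr(estimator, "get_feature_names_out") and keys is not None


class sketch_available_if:
    """Hypothetical descriptor: expose a method only when check(obj) is true."""

    def __init__(self, check):
        self.check = check
        self.fn = None

    def __call__(self, fn):
        # Used as @sketch_available_if(check): store the function, keep the descriptor.
        self.fn = fn
        return self

    def __get__(self, obj, owner=None):
        if obj is None:
            return self.fn
        if not self.check(obj):
            # Raising AttributeError makes the method look absent on this instance.
            raise AttributeError(
                f"{type(obj).__name__!r} object has no attribute {self.fn.__name__!r}"
            )
        return types.MethodType(self.fn, obj)


class SketchMixin:
    @sketch_available_if(_sketch_is_configured)
    def set_output(self, transform=None):
        self._sketch_output = transform
        return self


class BlockedEncoder(SketchMixin):
    # State before the fix: auto-wrapping disabled, no get_feature_names_out.
    _sketch_auto_wrap_output_keys = None


class FixedEncoder(SketchMixin):
    # State after the fix: auto-wrapping enabled and a naming method present.
    _sketch_auto_wrap_output_keys = ("transform",)

    def get_feature_names_out(self, input_features=None):
        # First input feature name if provided, otherwise the default name.
        if input_features is not None and len(input_features) > 0:
            return [input_features[0]]
        return ["labelencoder_output"]


blocked_has_set_output = hasattr(BlockedEncoder(), "set_output")
fixed_encoder = FixedEncoder().set_output(transform="pandas")
```

In this sketch, `hasattr` reports the method as missing on `BlockedEncoder` because the descriptor raises `AttributeError` on access, which matches the symptom described in the Problem section: the method exists on the class (and so appears in documentation) but cannot be reached on an instance.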
Testing:
The fix enables the following workflow that previously failed:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
le.set_output(transform='pandas') # This now works without AttributeError
le.fit(['cat', 'dog', 'bird'])
result = le.transform(['cat', 'dog']) # Can now return pandas Series when configured
Any other comments?
This is a minimal, targeted fix that addresses the API consistency issue without introducing breaking changes. The implementation follows sklearn's established patterns for transformers and maintains the principle that `LabelEncoder` is designed for target variable transformation (1D output).
The fix enables users to use `set_output` with `LabelEncoder` as they would expect from reading the documentation, resolving the confusion between documented and actual behavior.