Fix LabelEncoder set_output method availability #31943

Open
wants to merge 5 commits into
base: main
atheendre130505

Reference Issues/PRs

Fixes #26711

What does this implement/fix? Explain your changes.

This PR fixes the availability of LabelEncoder.set_output(): the method was listed in the documentation but raised an AttributeError when called.

Problem:

  • LabelEncoder inherited set_output from TransformerMixin, but the method was conditionally unavailable because of the @available_if(_auto_wrap_is_configured) decorator
  • The condition required get_feature_names_out method and proper auto_wrap_output_keys configuration, both of which were missing
  • Users saw set_output in documentation but got runtime errors when trying to use it
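
The mechanism described above can be sketched in plain Python. This is a toy re-creation, not the scikit-learn source: `available_if_sketch`, `MixinSketch`, `Blocked`, and `Enabled` are hypothetical names, and the attribute name `_sklearn_auto_wrap_output_keys` is an assumption about how the class keyword `auto_wrap_output_keys` is stored.

```python
class available_if_sketch:
    """Toy descriptor: expose a method only when check(instance) is true."""

    def __init__(self, check):
        self.check = check

    def __call__(self, fn):
        # Used as a decorator: remember the wrapped function, return the descriptor.
        self.fn = fn
        return self

    def __get__(self, obj, owner=None):
        # Attribute access on an instance fails when the condition is not met,
        # which also makes hasattr() return False.
        if obj is not None and not self.check(obj):
            raise AttributeError(
                f"This {owner.__name__!r} has no attribute {self.fn.__name__!r}"
            )
        return self.fn.__get__(obj, owner)


def auto_wrap_is_configured(estimator):
    # Assumed shape of the condition: both requirements must hold.
    keys = getattr(estimator, "_sklearn_auto_wrap_output_keys", set())
    return hasattr(estimator, "get_feature_names_out") and "transform" in keys


class MixinSketch:
    @available_if_sketch(auto_wrap_is_configured)
    def set_output(self, *, transform=None):
        self._output_config = {"transform": transform}
        return self


class Blocked(MixinSketch):
    # Mirrors the problem state: no output keys, no get_feature_names_out.
    _sklearn_auto_wrap_output_keys = set()


class Enabled(MixinSketch):
    # Mirrors the fixed state: keys configured and the method present.
    _sklearn_auto_wrap_output_keys = {"transform"}

    def get_feature_names_out(self, input_features=None):
        return ["x0"]


blocked_has_it = hasattr(Blocked(), "set_output")
enabled = Enabled().set_output(transform="pandas")
```

Under these assumptions `blocked_has_it` is false even though `set_output` is defined on the shared base class, which is how a method can appear in generated documentation and still be missing at runtime.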

Solution:

  1. Removed auto_wrap_output_keys=None from the LabelEncoder class definition, which was explicitly blocking the auto-wrapping functionality
  2. Added get_feature_names_out() method that returns appropriate feature names for the single output that LabelEncoder produces
  3. The method handles both cases where input feature names are provided and when they need to be generated

Implementation Details:

  • get_feature_names_out() returns a single-element array with either the first input feature name (if provided) or a default name 'labelencoder_output'
  • This satisfies the _auto_wrap_is_configured() condition in sklearn/utils/_set_output.py
  • Maintains backward compatibility - all existing functionality remains unchanged
  • Follows the same pattern as other sklearn transformers
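
A minimal sketch of the naming rule described in the first bullet, written as a standalone function so it runs without scikit-learn. The function name `label_encoder_feature_names_out` is hypothetical, it returns a plain list where the PR describes an array, and treating an empty `input_features` like a missing one is an assumption the description does not state.

```python
def label_encoder_feature_names_out(input_features=None):
    """Return a one-element list naming the single output column."""
    # Assumption: None and an empty sequence both fall back to the default name.
    if input_features is None or len(input_features) == 0:
        return ["labelencoder_output"]
    # Only the first input name is used, since there is a single output.
    return [str(input_features[0])]


default_names = label_encoder_feature_names_out()
named = label_encoder_feature_names_out(["species", "ignored"])
```

In the PR itself this logic lives in a method on LabelEncoder and, per the later commit messages, also calls check_is_fitted before returning names.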

Testing:
The fix enables the following workflow that previously failed:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
le.set_output(transform='pandas') # This now works without AttributeError
le.fit(['cat', 'dog', 'bird'])
result = le.transform(['cat', 'dog']) # Can now return pandas Series when configured

Any other comments?

This is a minimal, targeted fix that addresses the API consistency issue without breaking changes. The implementation follows sklearn's established patterns for transformers and maintains the principle that LabelEncoder is designed for target variable transformation (1D output).

The fix lets users call set_output on LabelEncoder as the documentation suggests, resolving the mismatch between documented and actual behavior.

- Remove auto_wrap_output_keys=None from class definition
- Add get_feature_names_out method to satisfy _auto_wrap_is_configured condition
- Fixes GitHub issue scikit-learn#26711 where set_output was documented but not available

github-actions bot commented Aug 13, 2025

❌ Linting issues

Merging with upstream/main might fix / improve the issues if you haven't done that since 21.06.2023.

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


ruff check

ruff detected issues. Please run ruff check --fix --output-format=full locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.11.7.


ruff format

ruff detected issues. Please run ruff format locally and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.11.7.


Deprecation Order

Deprecation order check detected issues. Please fix them locally and push the changes. Here you can see the detected issues.


Doctest Directives

doctest directive check detected issues. Please fix them locally and push the changes. Here you can see the detected issues.


Joblib Imports

joblib import check detected issues. Please fix them locally and push the changes. Here you can see the detected issues.

Generated for commit: e816151. Link to the linter CI: here

- Run ruff check --fix and ruff format
- Ensure code follows project style guidelines
atheendre130505 force-pushed the fix-labelencoder-set-output branch from 8424691 to c9519d5 on August 14, 2025 18:18
- Add check_is_fitted to get_feature_names_out
- Fix codespell issues
- Fix import ordering and other lint violations
- Update test to expect LabelEncoder to have set_output
- Fix get_feature_names_out to properly handle input_features array
- Resolve pandas indexing errors in set_output functionality
Development

Successfully merging this pull request may close these issues.

AttributeError: This 'LabelEncoder' has no attribute 'set_output'