FIX check_transformer_data_not_an_array for ColumnTransformer #29938

Merged

Contributor

@mrastgoo mrastgoo commented Sep 27, 2024

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Removed the _xfail_checks entry for check_transformer_data_not_an_array from ColumnTransformer.
The check can now run because _check_X gains an if hasattr(X, "shape") condition, following a discussion with @glemaitre and @jeremiedbb.
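
To illustrate the idea of the guard, here is a minimal standard-library sketch. It is not the scikit-learn implementation: NotAnArraySketch, Validated and check_x_sketch are hypothetical names, and the exact condition used in _check_X is not shown in this description, so the pass-through rule below is an assumption.

```python
class NotAnArraySketch:
    """Hypothetical stand-in: convertible via __array__, but has no `shape`."""

    def __init__(self, rows):
        self._rows = [list(row) for row in rows]

    def __array__(self, dtype=None, copy=None):
        # A real implementation would return a numpy.ndarray here.
        return self._rows


class Validated:
    """Hypothetical validated container that always exposes `shape`."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]
        n_cols = len(self.rows[0]) if self.rows else 0
        self.shape = (len(self.rows), n_cols)


def check_x_sketch(X):
    # Assumption: inputs that already expose `shape` pass through untouched.
    if hasattr(X, "shape"):
        return X
    # Everything else is converted, so later code can rely on `X.shape`.
    rows = X.__array__() if hasattr(X, "__array__") else X
    return Validated(rows)


wrapped = NotAnArraySketch([[0.0, 1.0], [2.0, 3.0]])
checked = check_x_sketch(wrapped)
n_columns = checked.shape[1]  # safe: `checked` is guaranteed to have `shape`
```

The point of the sketch is only that validation runs before any code reads X.shape, so an object that is array-convertible but shape-less no longer reaches the column-index lookup.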

Any other comments?

github-actions bot commented Sep 27, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 1b54949. Link to the linter CI: here

Member

@ogrisel ogrisel left a comment

LGTM, thanks @mrastgoo!

For reference, here is the traceback one would get without the fix:

__________________________________ test_estimators[ColumnTransformer(transformers=[('trans1',StandardScaler(),[0,1])])-check_transformer_data_not_an_array] __________________________________

estimator = ColumnTransformer(transformers=[('trans1', StandardScaler(), [0, 1])])
check = functools.partial(<function check_transformer_data_not_an_array at 0x123672de0>, 'ColumnTransformer')
request = <FixtureRequest for <Function test_estimators[ColumnTransformer(transformers=[('trans1',StandardScaler(),[0,1])])-check_transformer_data_not_an_array]>>

    @parametrize_with_checks(list(_tested_estimators()))
    def test_estimators(estimator, check, request):
        # Common tests for estimator instances
        with ignore_warnings(
            category=(FutureWarning, ConvergenceWarning, UserWarning, LinAlgWarning)
        ):
>           check(estimator)

sklearn/tests/test_common.py:120: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/utils/_testing.py:140: in wrapper
    return fn(*args, **kwargs)
sklearn/utils/estimator_checks.py:1594: in check_transformer_data_not_an_array
    _check_transformer(name, transformer, this_X, this_y)
sklearn/utils/estimator_checks.py:1646: in _check_transformer
    transformer.fit(X, y_)
sklearn/compose/_column_transformer.py:951: in fit
    self.fit_transform(X, y=y, **params)
sklearn/utils/_set_output.py:319: in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
sklearn/base.py:1244: in wrapper
    return fit_method(estimator, *args, **kwargs)
sklearn/compose/_column_transformer.py:997: in fit_transform
    self._validate_column_callables(X)
sklearn/compose/_column_transformer.py:551: in _validate_column_callables
    transformer_to_input_indices[name] = _get_column_indices(X, columns)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

X = <sklearn.utils.estimator_checks._NotAnArray object at 0x123411e20>, key = [0, 1]

    def _get_column_indices(X, key):
        """Get feature column indices for input data X and key.
    
        For accepted values of `key`, see the docstring of
        :func:`_safe_indexing`.
        """
        key_dtype = _determine_key_type(key)
        if _use_interchange_protocol(X):
            return _get_column_indices_interchange(X.__dataframe__(), key, key_dtype)
    
>       n_columns = X.shape[1]
E       AttributeError: '_NotAnArray' object has no attribute 'shape'

sklearn/utils/_indexing.py:333: AttributeError
----------------------------------------------------------------------------------- Captured stdout setup ------------------------------------------------------------------------------------
I: Seeding RNGs with 1235426376
================================================================================== short test summary info ===================================================================================
FAILED sklearn/tests/test_common.py::test_estimators[ColumnTransformer(transformers=[('trans1',StandardScaler(),[0,1])])-check_transformer_data_not_an_array] - AttributeError: '_NotAnArray' object has no attribute 'shape'
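
The failure above comes down to reading shape from an object that only supports array conversion. A minimal standard-library reproduction of that pattern, where ShapelessWrapper and count_columns are hypothetical names and not scikit-learn code:

```python
class ShapelessWrapper:
    """Hypothetical stand-in for an array-convertible object with no `shape`."""

    def __init__(self, rows):
        self._rows = [list(row) for row in rows]

    def __array__(self, dtype=None, copy=None):
        # A real implementation would return a numpy.ndarray here.
        return self._rows


def count_columns(X):
    # Mirrors the failing line in the traceback: assumes X exposes `shape`.
    return X.shape[1]


try:
    count_columns(ShapelessWrapper([[0.0, 1.0], [2.0, 3.0]]))
except AttributeError as exc:
    message = str(exc)  # same kind of error as reported for _NotAnArray
```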

Contributor

@OmarManzoor OmarManzoor left a comment

LGTM. Thanks @mrastgoo

@OmarManzoor OmarManzoor merged commit 9732b58 into scikit-learn:main Oct 7, 2024
30 checks passed
BenJourdan pushed a commit to gregoryschwartzman/scikit-learn that referenced this pull request Oct 11, 2024