Skip to content

[MRG] Test __array_function__ not called in non-estimator API (#15865) #18292

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions sklearn/inspection/tests/test_permutation_importance.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,8 +23,21 @@
from sklearn.preprocessing import scale
from sklearn.utils import parallel_backend
from sklearn.utils._testing import _convert_container
from sklearn.utils.estimator_checks import _NotAnArray


def test_array_function_not_called():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To provide context for future maintainers that are not familiar with NEP18:

Suggested change
def test_array_function_not_called():
def test_array_function_not_called():
"""Check that `__array_function__` (NEP18) is not called."""

Small nit: may this test be moved to the end of the file?

X = np.array([[1, 1], [1, 2], [1, 3], [1, 4],
[2, 1], [2, 2], [2, 3], [2, 4],
[3, 1], [3, 2], [3, 3], [3, 4]])
X = _NotAnArray(X)
y = _NotAnArray([1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 2])
estimator = LogisticRegression()
estimator.fit(X, y)
rng = np.random.RandomState(42)
permutation_importance(estimator, X, y, n_repeats=5,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the test is based on _NotAnArray not raising an error, we do not need to repeat too much:

Suggested change
permutation_importance(estimator, X, y, n_repeats=5,
permutation_importance(estimator, X, y, n_repeats=2,

random_state=rng, n_jobs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to include n_jobs=1 since it is the default when n_jobs=None.

Suggested change
random_state=rng, n_jobs=1)
random_state=rng)



@pytest.mark.parametrize("n_jobs", [1, 2])
def test_permutation_importance_correlated_feature_regression(n_jobs):
Expand Down
12 changes: 12 additions & 0 deletions sklearn/model_selection/tests/test_validation.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@

from sklearn.model_selection.tests.test_search import FailingClassifier

from sklearn.utils.estimator_checks import _NotAnArray
from sklearn.utils._testing import assert_almost_equal
from sklearn.utils._testing import assert_raises
from sklearn.utils._testing import assert_raise_message
Expand Down Expand Up @@ -226,6 +227,17 @@ def get_params(self, deep=False):
return {'a': self.a, 'allow_nd': self.allow_nd}


def test_array_function_not_called():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here and nit about moving the test to the end of the file.

Suggested change
def test_array_function_not_called():
def test_array_function_not_called():
"""Check that `__array_function__` (NEP18) is not called."""

X = np.array([[1, 1], [1, 2], [1, 3], [1, 4],
[2, 1], [2, 2], [2, 3], [2, 4],
[3, 1], [3, 2], [3, 3], [3, 4]])
X = _NotAnArray(X)
y = _NotAnArray([1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 2])
estimator = LogisticRegression()
grid = GridSearchCV(estimator, param_grid={'C': [1, 10]})
cross_validate(grid, X, y, n_jobs=2)
Comment on lines +237 to +238
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we need nested cross validation to check that cross_validate itself does not use __array_function__:

Suggested change
grid = GridSearchCV(estimator, param_grid={'C': [1, 10]})
cross_validate(grid, X, y, n_jobs=2)
cross_validate(estimator, X, y, cv=2)

Also cv=2 to speed the test up and leaving n_jobs=None as default.



# XXX: use 2D array, since 1D X is being detected as a single sample in
# check_consistent_length
X = np.ones((10, 2))
Expand Down