MNT Fix E721 linting issues to do type comparisons with is #29501

adrinjalali · 2024-07-16T11:03:14Z

Got these from running ruff check . and I hope they don't change any behavior.

github-actions · 2024-07-16T11:04:26Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3f4bdb8. Link to the linter CI: here

jeremiedbb · 2024-07-17T09:10:17Z

Is that because you have a more recent version of ruff than the minimum, 0.2.1?
The latest is 0.5.2, so maybe it's time to bump the minimum version.

glemaitre · 2024-07-25T08:36:25Z

sklearn/cluster/_optics.py

@@ -324,7 +324,7 @@ def fit(self, X, y=None):
            Returns a fitted instance of self.
        """
        dtype = bool if self.metric in PAIRWISE_BOOLEAN_FUNCTIONS else float
-        if dtype == bool and X.dtype != bool:
+        if dtype is bool and X.dtype != bool:


X.dtype is also a type. I would expect:

Suggested change

-        if dtype is bool and X.dtype != bool:
+        if dtype is bool and X.dtype is not bool:

glemaitre · 2024-07-25T08:36:45Z

sklearn/metrics/pairwise.py

@@ -2388,7 +2388,7 @@ def pairwise_distances(

        dtype = bool if metric in PAIRWISE_BOOLEAN_FUNCTIONS else "infer_float"

-        if dtype == bool and (X.dtype != bool or (Y is not None and Y.dtype != bool)):
+        if dtype is bool and (X.dtype != bool or (Y is not None and Y.dtype != bool)):


glemaitre · 2024-07-25T08:37:31Z

sklearn/model_selection/tests/test_validation.py

+            assert type(cv_results["test_r2"]) is np.ndarray
+            assert type(cv_results["test_neg_mean_squared_error"]) is np.ndarray
+            assert type(cv_results["fit_time"]) is np.ndarray
+            assert type(cv_results["score_time"]) is np.ndarray


It would be better to use isinstance here.

Suggested change

-            assert type(cv_results["test_r2"]) is np.ndarray
-            assert type(cv_results["test_neg_mean_squared_error"]) is np.ndarray
-            assert type(cv_results["fit_time"]) is np.ndarray
-            assert type(cv_results["score_time"]) is np.ndarray
+            assert isinstance(cv_results["test_r2"], np.ndarray)
+            assert isinstance(cv_results["test_neg_mean_squared_error"], np.ndarray)
+            assert isinstance(cv_results["fit_time"], np.ndarray)
+            assert isinstance(cv_results["score_time"], np.ndarray)

glemaitre · 2024-07-25T08:38:47Z

sklearn/utils/estimator_checks.py

@@ -1501,7 +1501,7 @@ def _apply_on_subsets(func, X):
    result_by_batch = [func(batch.reshape(1, n_features)) for batch in X]

    # func can output tuple (e.g. score_samples)
-    if type(result_full) == tuple:
+    if type(result_full) is tuple:


Suggested change

-    if type(result_full) is tuple:
+    if isinstance(result_full, tuple):
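
Note that the two checks in this suggestion are not strictly equivalent: an exact type check rejects subclasses, while isinstance accepts them. A minimal sketch of the difference, using a hypothetical namedtuple called Point that does not appear in scikit-learn:

```python
from collections import namedtuple

# Hypothetical tuple subclass, used only to illustrate the difference.
Point = namedtuple("Point", ["x", "y"])

plain = (1, 2)
named = Point(1, 2)

# Exact type check: only a plain tuple passes.
exact = [type(v) is tuple for v in (plain, named)]

# isinstance: tuple subclasses such as namedtuples pass as well.
loose = [isinstance(v, tuple) for v in (plain, named)]

print(exact)  # [True, False]
print(loose)  # [True, True]
```

For a function that may return a namedtuple, the isinstance form is likely the intended behavior, which is presumably why it is preferred here.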

glemaitre · 2024-07-25T08:39:24Z

sklearn/utils/tests/test_validation.py

@@ -1341,7 +1341,7 @@ def test_check_scalar_invalid(
            include_boundaries=include_boundaries,
        )
    assert str(raised_error.value) == str(err_msg)
-    assert type(raised_error.value) == type(err_msg)
+    assert type(raised_error.value) is type(err_msg)


Suggested change

-    assert type(raised_error.value) is type(err_msg)
+    assert isinstance(raised_error.value, type(err_msg))
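
One consequence of this suggestion is that the assertion becomes slightly looser: a raised exception that is a subclass of the expected type would now pass. A small sketch under that assumption, with a hypothetical CustomValueError class that is not part of the test suite:

```python
# Hypothetical subclass, used only to show how the two assertions differ.
class CustomValueError(ValueError):
    pass


raised = CustomValueError("bad value")
expected = ValueError("bad value")

# Exact comparison: fails because the classes are not the same object.
strict = type(raised) is type(expected)

# isinstance: passes because CustomValueError derives from ValueError.
loose = isinstance(raised, type(expected))

print(strict, loose)  # False True
```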

glemaitre · 2024-07-25T08:41:04Z

It is strange that we did not catch this before. I always thought that flake8 (or pycodestyle) complained about those patterns.

adrinjalali · 2024-07-30T13:22:24Z

@glemaitre I remember why I hadn't changed the ones you suggested: doing so results in the following errors in the CI:

FAILED cluster/tests/test_optics.py::test_nowarn_if_metric_bool_data_bool - sklearn.exceptions.DataConversionWarning: Data will be converted to boolean...
FAILED cluster/tests/test_optics.py::test_warn_if_metric_bool_data_no_bool - AssertionError
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[dice] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[jaccard] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[rogerstanimoto] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[russellrao] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[sokalmichener] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[sokalsneath] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[yule] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-boolean-bool] - ValueError: could not convert string to float: 'a'
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-Int64-int64] - ValueError: could not convert string to float: 'a'
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-Float64-float64] - ValueError: could not convert string to float: 'a'

adrinjalali · 2024-07-31T09:16:14Z

Here's where the issue comes from, for instance:

In [2]: X.dtype == bool
Out[2]: True
In [3]: X.dtype != bool
Out[3]: False
In [4]: X.dtype is not bool
Out[4]: True
In [5]: X.dtype
Out[5]: dtype('bool')
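
The session above can be reproduced with a short standalone script. This is a sketch assuming X is a small boolean NumPy array. The last check, on X.dtype.type, is only one possible identity-safe alternative and is not necessarily what this PR adopted:

```python
import numpy as np

X = np.array([True, False])

# np.dtype equality coerces the other operand to a dtype before comparing,
# so the builtin class `bool` compares equal to dtype('bool').
eq_result = X.dtype == bool

# `is` compares object identity. X.dtype is an np.dtype instance, not the
# builtin class `bool`, so the two are never the same object.
is_result = X.dtype is bool

# The scalar type attached to the dtype is a real class, so an identity
# check against np.bool_ works.
scalar_type_is_bool = X.dtype.type is np.bool_

print(eq_result, is_result, scalar_type_is_bool)  # True False True
```

This is why replacing X.dtype != bool with X.dtype is not bool makes the condition always true and triggers the DataConversionWarning failures listed above.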

glemaitre · 2024-08-02T09:56:02Z

sklearn/utils/validation.py

@@ -919,7 +919,7 @@ def is_sparse(dtype):
        )
        if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
            dtype_orig = np.result_type(*dtypes_orig)
-        elif pandas_requires_conversion and any(d == object for d in dtypes_orig):
+        elif pandas_requires_conversion and any(d is object for d in dtypes_orig):


Apparently here we change the behaviour because of the following rule:

In [32]: np.dtype("object") is object
Out[32]: False

In [33]: np.dtype("object") == object
Out[33]: True
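
To illustrate the effect on the branch condition, here is a minimal sketch with a hypothetical dtypes_orig list holding plain NumPy dtypes. The real code may also see pandas extension dtypes, which are not modeled here:

```python
import numpy as np

# Hypothetical stand-in for the per-column dtypes of a DataFrame.
dtypes_orig = [np.dtype("float64"), np.dtype("object")]

# Equality: dtype('O') compares equal to the builtin `object`.
has_object_eq = any(d == object for d in dtypes_orig)

# Identity: no np.dtype instance is the builtin class `object`,
# so this is False and the elif branch would be skipped silently.
has_object_is = any(d is object for d in dtypes_orig)

print(has_object_eq, has_object_is)  # True False
```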

adrinjalali · 2024-08-06T11:46:48Z

CI is now green here @glemaitre

…arn#29501)

MNT Fix E721 linting issues to do type comparisons with is

7448c04

jeremiedbb mentioned this pull request Jul 17, 2024

MNT Add author/license note where missing and add the linter #29477

Merged

glemaitre reviewed Jul 25, 2024

glemaitre added the No Changelog Needed label Jul 25, 2024

adrinjalali added 2 commits July 30, 2024 13:36

Merge remote-tracking branch 'upstream/main' into ruff/e721

b324c81

apply Guillaume's suggestions

8705dc3

revert numpy type comparison with is

488a191

glemaitre reviewed Aug 2, 2024

adrinjalali added 2 commits August 6, 2024 12:34

Merge remote-tracking branch 'upstream/main' into ruff/e721

2b5f391

fix another is vs == issue

13ce565

Merge remote-tracking branch 'upstream/main' into ruff/e721

3f4bdb8

glemaitre approved these changes Aug 30, 2024

adrinjalali merged commit 68d8c2c into scikit-learn:main Aug 30, 2024
29 of 30 checks passed

adrinjalali deleted the ruff/e721 branch August 30, 2024 13:13

MarcBresson pushed a commit to MarcBresson/scikit-learn that referenced this pull request Sep 2, 2024

MNT Fix E721 linting issues to do type comparisons with is (scikit-le…

b03cd37

…arn#29501)

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 9, 2024

MNT Fix E721 linting issues to do type comparisons with is (scikit-le…

55b3607

…arn#29501)

glemaitre pushed a commit that referenced this pull request Sep 11, 2024

MNT Fix E721 linting issues to do type comparisons with is (#29501)

8a1fe4c
