-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Fix E721 linting issues to do type comparisons with is #29501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is that because you a more recent version of ruff than the min 0.2.1 ? |
@@ -324,7 +324,7 @@ def fit(self, X, y=None): | |||
Returns a fitted instance of self. | |||
""" | |||
dtype = bool if self.metric in PAIRWISE_BOOLEAN_FUNCTIONS else float | |||
if dtype == bool and X.dtype != bool: | |||
if dtype is bool and X.dtype != bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
X.dtype
is also a type. I would expect:
if dtype is bool and X.dtype != bool: | |
if dtype is bool and X.dtype is not bool: |
@@ -2388,7 +2388,7 @@ def pairwise_distances( | |||
|
|||
dtype = bool if metric in PAIRWISE_BOOLEAN_FUNCTIONS else "infer_float" | |||
|
|||
if dtype == bool and (X.dtype != bool or (Y is not None and Y.dtype != bool)): | |||
if dtype is bool and (X.dtype != bool or (Y is not None and Y.dtype != bool)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
assert type(cv_results["test_r2"]) is np.ndarray | ||
assert type(cv_results["test_neg_mean_squared_error"]) is np.ndarray | ||
assert type(cv_results["fit_time"]) is np.ndarray | ||
assert type(cv_results["score_time"]) is np.ndarray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use isinstance
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert type(cv_results["test_r2"]) is np.ndarray | |
assert type(cv_results["test_neg_mean_squared_error"]) is np.ndarray | |
assert type(cv_results["fit_time"]) is np.ndarray | |
assert type(cv_results["score_time"]) is np.ndarray | |
assert isinstance(cv_results["test_r2"], np.ndarray) | |
assert isinstance(cv_results["test_neg_mean_squared_error"], np.ndarray) | |
assert isinstance(cv_results["fit_time"], np.ndarray) | |
assert isinstance(cv_results["score_time"], np.ndarray) |
sklearn/utils/estimator_checks.py
Outdated
@@ -1501,7 +1501,7 @@ def _apply_on_subsets(func, X): | |||
result_by_batch = [func(batch.reshape(1, n_features)) for batch in X] | |||
|
|||
# func can output tuple (e.g. score_samples) | |||
if type(result_full) == tuple: | |||
if type(result_full) is tuple: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if type(result_full) is tuple: | |
if isinstance(result_full, tuple): |
@@ -1341,7 +1341,7 @@ def test_check_scalar_invalid( | |||
include_boundaries=include_boundaries, | |||
) | |||
assert str(raised_error.value) == str(err_msg) | |||
assert type(raised_error.value) == type(err_msg) | |||
assert type(raised_error.value) is type(err_msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert type(raised_error.value) is type(err_msg) | |
assert isinstance(raised_error.value, type(err_msg)) |
This is strange that we did not catch this up before. I always thought that |
@glemaitre I remember why I hadn't changed the ones you suggested, cause it results in the following errors in the CI: FAILED cluster/tests/test_optics.py::test_nowarn_if_metric_bool_data_bool - sklearn.exceptions.DataConversionWarning: Data will be converted to boolean...
FAILED cluster/tests/test_optics.py::test_warn_if_metric_bool_data_no_bool - AssertionError
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[dice] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[jaccard] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[rogerstanimoto] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[russellrao] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[sokalmichener] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[sokalsneath] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED metrics/tests/test_pairwise.py::test_pairwise_boolean_distance[yule] - sklearn.exceptions.DataConversionWarning: Data was converted to boolean for...
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-boolean-bool] - ValueError: could not convert string to float: 'a'
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-Int64-int64] - ValueError: could not convert string to float: 'a'
FAILED utils/tests/test_validation.py::test_check_array_multiple_extensions[True-Float64-float64] - ValueError: could not convert string to float: 'a' |
Here's where the issue comes from for instance: X.dtype == bool
Out[2]: True
X.dtype != bool
Out[3]: False
X.dtype is not bool
Out[4]: True
X.dtype
Out[5]: dtype('bool') |
sklearn/utils/validation.py
Outdated
@@ -919,7 +919,7 @@ def is_sparse(dtype): | |||
) | |||
if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig): | |||
dtype_orig = np.result_type(*dtypes_orig) | |||
elif pandas_requires_conversion and any(d == object for d in dtypes_orig): | |||
elif pandas_requires_conversion and any(d is object for d in dtypes_orig): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently here we change the behaviour because of the following rule:
In [32]: np.dtype("object") is object
Out[32]: False
In [33]: np.dtype("object") == object
Out[33]: True
CI is now green here @glemaitre |
Got these from a
ruff check .
, and hope they don't change anything.