Description
Describe the bug
We have a custom estimator class that inherits from sklearn.base.BaseEstimator and RegressorMixin. We run automated unit tests in Azure DevOps pipelines on both Windows Server 2022 and Ubuntu 22.04.1. All the tests pass on Windows. On Python 3.12.6 in Linux the test with the stacktrace shown below fails with:
RuntimeWarning: invalid value encountered in cast
This causes the test, and hence the build, to fail because we set PYTHONWARNINGS=error before running the tests. On Python 3.11.10 in Linux this test actually passes, but a different test using the same custom estimator fails with an identical stacktrace. And yet this latter test passes on Python 3.12 in Linux!
Note this change in numpy 1.24.0: https://numpy.org/doc/stable/release/1.24.0-notes.html#numpy-now-gives-floating-point-errors-in-casts; especially this bit:
The precise behavior is subject to the C99 standard and its implementation in both software and hardware.
I can probably work around this error in our tests by using a numpy.errstate context manager, but could there be a bug in sklearn?
I don't know if this issue is related to #25319. AFAIK the test data has no nan values; the feature data columns are all float64.
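The numpy.errstate workaround mentioned above could look roughly like the sketch below. It is not taken from our test suite: the helper name cast_ignoring_invalid and the sample array are made up for illustration, and warnings.simplefilter("error") merely stands in for PYTHONWARNINGS=error.

```python
import warnings

import numpy as np


def cast_ignoring_invalid(values, dtype):
    """Cast values to dtype with NumPy's 'invalid' floating-point error silenced.

    Hypothetical helper, for illustration only.
    """
    with np.errstate(invalid="ignore"):
        return np.asarray(values).astype(dtype)


with warnings.catch_warnings():
    # Mimic PYTHONWARNINGS=error: any warning that escapes becomes an exception.
    warnings.simplefilter("error")
    # NaN has no integer representation, so this cast is "invalid". The value
    # stored in its place is platform-dependent, but no warning escapes here.
    result = cast_ignoring_invalid(np.array([1.0, np.nan]), np.int64)
```

In the failing test the equivalent would presumably be to wrap the grid_search.fit(...) call in the same `with np.errstate(invalid="ignore"):` block. That only hides the warning in our tests; it does not address whatever triggers the cast.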
Steps/Code to Reproduce
Sorry, this is proprietary code which I didn't write and don't understand!
Expected Results
The call to fit() succeeds without throwing a RuntimeWarning.
Actual Results
Stacktrace from Python 3.12.6 x64 on Linux (Ubuntu 22.04.1):
Traceback (most recent call last):
  File "/home/vsts/work/1/tests/<our_test_module>", line 76, in test_gen_data
    grid_search.fit(data[features].values)
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/model_selection/_search.py", line 1019, in fit
    self._run_search(evaluate_candidates)
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/model_selection/_search.py", line 1573, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/model_selection/_search.py", line 1013, in evaluate_candidates
    results = self._format_results(
              ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/model_selection/_search.py", line 1137, in _format_results
    for param, ma in _yield_masked_array_for_each_param(candidate_params):
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/sklearn/model_selection/_search.py", line 429, in _yield_masked_array_for_each_param
    ma = MaskedArray(np.empty(n_candidates), mask=True, dtype=arr_dtype)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.6/x64/lib/python3.12/site-packages/numpy/ma/core.py", line 2820, in __new__
    _data = np.array(data, dtype=dtype, copy=copy,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeWarning: invalid value encountered in cast
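One possible reading of the last two frames, offered as a guess rather than a diagnosis: np.empty(n_candidates) returns uninitialised float64 memory, and MaskedArray then casts it to arr_dtype. If arr_dtype is an integer type (an assumption; the traceback does not show its value) and the leftover bytes happen to decode as NaN or an out-of-range float, the cast could raise exactly this warning, which would also explain why it comes and goes between Python versions and platforms. The sketch below imitates that situation deterministically by filling the array with NaN instead of relying on np.empty; n_candidates = 3 and arr_dtype = np.int64 are made-up values.

```python
import warnings

import numpy as np
from numpy.ma import MaskedArray

n_candidates = 3      # hypothetical size, not taken from the failing test
arr_dtype = np.int64  # assumption: the parameter values are integers

# Deterministic stand-in for np.empty(n_candidates), whose contents are arbitrary.
uninitialised_like = np.full(n_candidates, np.nan)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same call shape as in the traceback: float64 data cast to arr_dtype.
    ma_cast = MaskedArray(uninitialised_like, mask=True, dtype=arr_dtype)

# Whether a warning is recorded depends on the platform (see the NumPy note above).
cast_warnings = [
    str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
]

# Variant that allocates directly in the target dtype, so no float-to-int cast occurs.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    ma_direct = MaskedArray(np.empty(n_candidates, dtype=arr_dtype), mask=True)
```

On platforms where the cast sets the floating-point "invalid" flag, cast_warnings should contain the message from the traceback. The second variant is only meant to show that the warning is tied to the cast and not to the masking itself.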
Versions
Relevant pip-installed package versions, which were all the same in Python 3.11 and 3.12 in both Linux and Windows on Azure DevOps:
numpy 1.26.4
pandas 2.2.3
scikit-learn 1.5.2
scipy 1.14.1