FIX Fix GridSearchCV regression in 1.5 with parameter grid with heterogeneous parameter values #29078

Merged (5 commits) on May 23, 2024
68 changes: 43 additions & 25 deletions doc/whats_new/v1.5.rst
@@ -13,6 +13,24 @@ For a short description of the main highlights of the release, please refer to

.. include:: changelog_legend.inc

.. _changes_1_5_1:

Version 1.5.1
=============

**TODO**

Changelog
---------

:mod:`sklearn.model_selection`
..............................

- |Fix| Fix a regression in :class:`model_selection.GridSearchCV` for parameter
  grids that have heterogeneous parameter values.
  :pr:`29078` by :user:`Loïc Estève <lesteve>`


.. _changes_1_5:

Version 1.5.0
@@ -550,29 +568,29 @@ Changelog
Thanks to everyone who has contributed to the maintenance and improvement of
the project since version 1.4, including:

101AlexMartin, Abdulaziz Aloqeely, Adam J. Stewart, Adam Li, Adarsh Wase, Adrin
Jalali, Advik Sinha, Akash Srivastava, Akihiro Kuno, Alan Guedes, Alexis
IMBERT, Ana Paula Gomes, Anderson Nelson, Andrei Dzis, Arnaud Capitaine, Arturo
Amor, Aswathavicky, Bharat Raghunathan, Brendan Lu, Bruno, Cemlyn, Christian
Lorentzen, Christian Veenhuis, Cindy Liang, Claudio Salvatore Arcidiacono,
Connor Boyle, Conrad Stevens, crispinlogan, davidleon123, DerWeh, Dipan Banik,
Duarte São José, DUONG, Eddie Bergman, Edoardo Abati, Egehan Gunduz, Emad
Izadifar, Erich Schubert, Filip Karlo Došilović, Franck Charras, Gael
Varoquaux, Gönül Aycı, Guillaume Lemaitre, Gyeongjae Choi, Harmanan Kohli,
Hong Xiang Yue, Ian Faust, itsaphel, Ivan Wiryadi, Jack Bowyer, Javier Marin
Tur, Jérémie du Boisberranger, Jérôme Dockès, Jiawei Zhang, Joel Nothman,
Johanna Bayer, John Cant, John Hopfensperger, jpcars, jpienaar-tuks, Julian
Libiseller-Egger, Julien Jerphanion, KanchiMoe, Kaushik Amar Das, keyber,
Koustav Ghosh, kraktus, Krsto Proroković, ldwy4, LeoGrin, lihaitao, Linus
Sommer, Loic Esteve, Lucy Liu, Lukas Geiger, manasimj, Manuel Labbé, Manuel
Morales, Marco Edward Gorelli, Maren Westermann, Marija Vlajic, Mark Elliot,
Mateusz Sokół, Mavs, Michael Higgins, Michael Mayer, miguelcsilva, Miki
Watanabe, Mohammed Hamdy, myenugula, Nathan Goldbaum, Naziya Mahimkar, Neto,
Olivier Grisel, Omar Salman, Patrick Wang, Pierre de Fréminville, Priyash
Shah, Puneeth K, Rahil Parikh, raisadz, Raj Pulapakura, Ralf Gommers, Ralph
Urlus, Randolf Scholz, Reshama Shaikh, Richard Barnes, Rodrigo Romero, Saad
Mahmood, Salim Dohri, Sandip Dutta, SarahRemus, scikit-learn-bot, Shaharyar
Choudhry, Shubham, sperret6, Stefanie Senger, Suha Siddiqui, Thanh Lam DANG,
thebabush, Thomas J. Fan, Thomas Lazarus, Thomas Li, Tialo, Tim Head, Tuhin
Sharma, VarunChaduvula, Vineet Joshi, virchan, Waël Boukhobza, Weyb, Will
Dean, Xavier Beltran, Xiao Yuan, Xuefeng Xu, Yao Xiao
2 changes: 1 addition & 1 deletion sklearn/model_selection/_search.py
@@ -1090,7 +1090,7 @@ def _store(key_name, array, weights=None, splits=False, rank=False):
             param_list = list(param_result.values())
             try:
                 arr_dtype = np.result_type(*param_list)
-            except TypeError:
+            except (TypeError, ValueError):
                 arr_dtype = object
             if len(param_list) == n_candidates and arr_dtype != object:
                 # Exclude `object` else the numpy constructor might infer a list of
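To illustrate why the broader `except` clause matters, here is a minimal sketch of the dtype-inference fallback in isolation. The helper name `infer_param_dtype` is hypothetical and not part of scikit-learn; only `np.result_type` and the `(TypeError, ValueError)` pair come from the hunk above. Which of the two exceptions NumPy raises for a given value may differ between inputs, so the sketch only looks at the fallback result.

```python
import numpy as np


def infer_param_dtype(param_list):
    """Return a common NumPy dtype for the values, or ``object`` if none exists.

    Hypothetical helper that mirrors the try/except in the patched hunk.
    """
    try:
        return np.result_type(*param_list)
    except (TypeError, ValueError):
        # Values such as dicts, lists, or arbitrary strings cannot be
        # interpreted by np.result_type, so fall back to a generic dtype.
        return object


# Homogeneous numeric values resolve to a numeric dtype.
numeric_dtype = infer_param_dtype([0.1, 1.0, 10.0])

# Parameter values taken from the non-regression test in this PR.
heterogeneous_grids = [
    [None, {"option": "A"}, {"option": "B"}],
    [None, [1, 2]],
    [{"a": 1}],
    ["str1", "str2"],
]
fallback_dtypes = [infer_param_dtype(values) for values in heterogeneous_grids]
```

Catching only `TypeError` would let a `ValueError` from any of these lists propagate out of the search, which appears to be the regression this PR addresses.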
45 changes: 45 additions & 0 deletions sklearn/model_selection/tests/test_search.py
@@ -2641,3 +2641,48 @@ def test_score_rejects_params_with_no_routing_enabled(SearchCV, param_search):

# End of Metadata Routing Tests
# =============================


def test_cv_results_dtype_issue_29074():
    """Non-regression test for https://github.com/scikit-learn/scikit-learn/issues/29074"""

    class MetaEstimator(BaseEstimator, ClassifierMixin):
        def __init__(
            self,
            base_clf,
            parameter1=None,
            parameter2=None,
            parameter3=None,
            parameter4=None,
        ):
            self.base_clf = base_clf
            self.parameter1 = parameter1
            self.parameter2 = parameter2
            self.parameter3 = parameter3
            self.parameter4 = parameter4

        def fit(self, X, y=None):
            self.base_clf.fit(X, y)
            return self

        def score(self, X, y):
            return self.base_clf.score(X, y)

    # Values of param_grid are such that np.result_type gives slightly
    # different errors, in particular ValueError and TypeError
    param_grid = {
        "parameter1": [None, {"option": "A"}, {"option": "B"}],
        "parameter2": [None, [1, 2]],
        "parameter3": [{"a": 1}],
        "parameter4": ["str1", "str2"],
    }
    grid_search = GridSearchCV(
        estimator=MetaEstimator(LogisticRegression()),
        param_grid=param_grid,
        cv=3,
    )

    X, y = make_blobs(random_state=0)
    grid_search.fit(X, y)
    for param in param_grid:
        assert grid_search.cv_results_[f"param_{param}"].dtype == object
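The `dtype == object` assertion reflects how heterogeneous values have to be stored. As a rough sketch of that storage pattern, assuming an element-wise fill of a fully masked object array (the variable names are illustrative, not scikit-learn internals):

```python
import numpy as np

values = [None, [1, 2], {"a": 1}, "str1"]

# Start fully masked, then fill one slot at a time so that each Python
# object is stored as a single element.
column = np.ma.MaskedArray(np.empty(len(values)), mask=True, dtype=object)
for index, value in enumerate(values):
    column[index] = value  # assigning a value unmasks that slot

# By contrast, handing a list of equal-length tuples to the constructor
# yields a 2D array rather than a 1D array of tuples.
inferred = np.array([(1, 2), (3, 4)], dtype=object)
```

Filling slot by slot keeps `column` one-dimensional regardless of what the values look like, whereas the constructor call produces an array of shape `(2, 2)`.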