ENH Add Array API compatibility tests for `*SearchCV` classes #27096

betatim · 2023-08-18T09:43:28Z

RandomizedSearchCV and GridSearchCV appear to just work with Array API inputs.

This adds a test that makes sure that they will keep working.

For the common tests to pass we need Ridge to support the Array API.

github-actions · 2023-08-18T09:45:11Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: aebb43e.

ogrisel

Nice! Can you please add those estimators to the list of estimators in the array api section of the user guide?

ogrisel · 2024-05-24T16:02:19Z

> For the common tests to pass we need Ridge to support the Array API.

This is now the case. Let's see if this can work.

ogrisel · 2024-05-24T16:30:26Z

I started fixing some of the failures reported by our test suite after the merge with main, but it's still WIP. I have to tune out for today.

ogrisel

Marking this PR as not accepted because there is still work to do.

sklearn/model_selection/tests/test_search.py

ogrisel · 2024-05-27T09:24:31Z

sklearn/model_selection/tests/test_validation.py

-        train=None,
-        test=None,
+        train=train,
+        test=test,


Not sure why this did not fail in main, but based on the docstrings, _fit_and_score was never supposed to accept None for its train and test arguments, so it is better to pass valid integer arrays in this test instead.

ogrisel · 2024-05-27T09:25:22Z

sklearn/tests/test_common.py

@@ -332,7 +332,9 @@ def _generate_search_cv_instances():
        extra_params = (
            {"min_resources": "smallest"} if "min_resources" in init_params else {}
        )
-        search_cv = SearchCV(Estimator(), param_grid, cv=2, **extra_params)
+        search_cv = SearchCV(
+            Estimator(), param_grid, cv=2, error_score="raise", **extra_params


Setting error_score="raise" makes pytest traceback reading much more direct, especially in CI logs.
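To illustrate the point about `error_score="raise"`, here is a minimal sketch. `FailsOnNegativeAlpha` and `run_search` are hypothetical names made up for this example and are not part of the PR; the sketch only assumes the public `GridSearchCV` and `Ridge` APIs.

```python
import warnings

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV


class FailsOnNegativeAlpha(Ridge):
    """Hypothetical estimator (not part of this PR): fit fails when alpha < 0."""

    def fit(self, X, y, sample_weight=None):
        if self.alpha < 0:
            raise ValueError(f"fit failed for alpha={self.alpha}")
        return super().fit(X, y, sample_weight=sample_weight)


def run_search(error_score):
    """Run a tiny grid search where one of the two candidates always fails."""
    rng = np.random.RandomState(0)
    X = rng.normal(size=(20, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    search = GridSearchCV(
        FailsOnNegativeAlpha(),
        {"alpha": [-1.0, 1.0]},
        cv=2,
        error_score=error_score,
        refit=False,  # skip the final refit; only the CV scores matter here
    )
    with warnings.catch_warnings():
        # Silence the warning emitted for failed fits to keep the demo quiet.
        warnings.simplefilter("ignore")
        return search.fit(X, y)
```

With `error_score=np.nan`, the failing candidate is recorded as NaN in `cv_results_` and the search carries on. With `error_score="raise"`, the original `ValueError` propagates, so the traceback points straight at the failing `fit` call.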

sklearn/model_selection/_split.py

ogrisel

I think this is +1 on my side. The fact that the CV splitters are currently NumPy-only makes the conversion for stratification a bit ugly.

We probably need to check that all CV splitters work when splitting array API inputs with train_test_split or other tools that accept array API inputs and a cv= parameter, but I would rather do that in a dedicated follow-up PR and focus this one on what is necessary for the *SearchCV classes themselves.

/cc @OmarManzoor and @betatim.

ogrisel · 2024-05-27T10:04:26Z

BTW, I tested this PR on a CUDA host and on an MPS host and all tests pass.

betatim

LGTM.

Not sure I should merge it because I opened the PR. But voting to merge it anyway :D

OmarManzoor

Thanks @betatim @ogrisel

OmarManzoor · 2024-06-05T05:20:05Z

sklearn/model_selection/_split.py

+        # we need the following explicit conversion:
+        xp, is_array_api = get_namespace(y)
+        if is_array_api:
+            y = _convert_to_numpy(y, xp)


Would it be possible to add a test to cover this?

betatim · 2024-06-06

I think we had coverage for it via the test I removed in the last commit. But I thought that we tested the same thing via a common test, so I am a bit puzzled why that doesn't happen :-/

ogrisel · 2024-06-07

I understand what's going on:

- the common tests run the search CV meta-estimators using LogisticRegression and Ridge as base estimators;
- the LogisticRegression test is skipped because LogisticRegression does not have the array API estimator tag, hence the wrapping search CV estimators do not have it either;
- the stratified k-fold CV splitter is only used for classification problems, hence the Ridge-based common test does not cover it;
- the previous hard-coded test of this PR used LinearDiscriminantAnalysis, which is a classifier and supports the array API, hence it could cover this line.
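The point about stratification can be checked with a small sketch. It assumes, as I understand it, that an integer `cv` is resolved through the public `check_cv` helper; the toy targets below are made up for the example.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold, check_cv

# Toy targets, invented for this illustration only.
y_class = np.array([0, 1, 0, 1, 0, 1])
y_reg = np.array([0.1, 1.3, 2.2, 3.7, 4.1, 5.9])

# An integer cv becomes a stratified splitter only for classifiers
# with binary or multiclass targets.
cv_for_classifier = check_cv(2, y_class, classifier=True)
cv_for_regressor = check_cv(2, y_reg, classifier=False)

print(type(cv_for_classifier).__name__)
print(type(cv_for_regressor).__name__)
```

This is why a Ridge-based test never reaches the stratified splitter, while a classifier-based test does.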

Since adding array API support to LogisticRegression might take a bit of time, I would be in favor of re-adding the previous non-common test that was removed from this PR.

OmarManzoor · 2024-06-07

Yeah, that seems right. Thanks for the explanation! I think the test is valid for now and not redundant.

OmarManzoor · 2024-06-07

@ogrisel I added the test back. Can you have a look? Then we can probably merge.

ogrisel · 2024-06-07T09:15:58Z

BTW, I tested on CUDA with Google Colab and https://gist.github.com/EdAbati/ff3bdc06bafeb92452b3740686cc8d7c and the tests are green.

OmarManzoor

LGTM. Thanks @betatim @ogrisel

ogrisel · 2024-06-07T09:46:46Z

I marked this PR for auto-merge. Thanks all!

github-actions bot added the module:model_selection label Aug 18, 2023

Add Array API compatibility tests for SearchCV classes

6f74abb

RandomizedSearchCV and GridSearchCV appear to just work with Array API inputs.

betatim force-pushed the array-api-randomsearch branch from 970c612 to 6f74abb Compare August 18, 2023 10:01

ogrisel approved these changes Aug 31, 2023

Merge branch 'main' into array-api-randomsearch

28dcbae

FIX attempt to fix array API related test failures

cf8cc15

ogrisel marked this pull request as draft May 24, 2024 16:30

ogrisel requested changes May 27, 2024

ogrisel added 3 commits May 27, 2024 10:57

FIX _fit_and_score was never supposed to accept None as train or test indices

a5653dd

Make test_search_cv easier to debug by setting error_score='raise'

3c3a569

FIX make SearchCV meta-estimator expose base estimator array_api_support tag

0e3c7d7

ogrisel reviewed May 27, 2024

sklearn/model_selection/tests/test_search.py (outdated, resolved)

ogrisel reviewed May 27, 2024

sklearn/model_selection/_split.py (outdated, resolved)

ogrisel added 2 commits May 27, 2024 11:58

Typo

452a5e9

DOC update doc and changelog

fcdd39d

ogrisel added this to the 1.6 milestone May 27, 2024

ogrisel marked this pull request as ready for review May 27, 2024 10:00

ogrisel approved these changes May 27, 2024

ogrisel and others added 2 commits June 4, 2024 17:31

Merge branch 'main' into array-api-randomsearch

214a41c

Remove test that is superseeded by comon test

d0069c6

betatim commented Jun 4, 2024

OmarManzoor reviewed Jun 5, 2024

Add the search cv test for classifier back

725d649

OmarManzoor added 3 commits June 7, 2024 14:37

Fix linting

7618f4d

Merge branch 'main' into array-api-randomsearch

2b0cc89

Merge branch 'main' into array-api-randomsearch

aebb43e

OmarManzoor approved these changes Jun 7, 2024

ogrisel enabled auto-merge (squash) June 7, 2024 09:46

ogrisel merged commit 5692e59 into scikit-learn:main Jun 7, 2024
28 checks passed

ogrisel mentioned this pull request Jun 20, 2024

Array API support for cross_validation and friends #28677

Closed

jeremiedbb mentioned this pull request Jul 2, 2024

Release 1.5.1 #29382

Merged

11 tasks
