ENH Array API support for confusion_matrix converting to numpy array #30562
Conversation
sklearn/utils/_array_api.py
```python
def _nan_to_num(array, xp=None):
    """Substitutes NaN values of an array with 0 and inf values with the maximum or
    minimum numbers available for the dtype respectively; like np.nan_to_num."""
    xp, _ = get_namespace(array, xp=xp)
    try:
        array = xp.nan_to_num(array)
    except AttributeError:  # currently catching exceptions from array_api_strict
        array[xp.isnan(array)] = 0
        if xp.isdtype(array.dtype, "real floating"):
            array[xp.isinf(array) & (array > 0)] = xp.finfo(array.dtype).max
            array[xp.isinf(array) & (array < 0)] = xp.finfo(array.dtype).min
        else:  # xp.isdtype(array.dtype, "integral")
            array[xp.isinf(array) & (array > 0)] = xp.iinfo(array.dtype).max
            array[xp.isinf(array) & (array < 0)] = xp.iinfo(array.dtype).min
    return array
```
`_nan_to_num` is a leftover from the original branch and is not needed for this PR, since here we are converting to numpy arrays, and `_nan_to_num` is only needed to work with `array_api_strict` arrays.
But we're also using `np.nan_to_num` in `silhouette_samples`, in `BaseSearchCV._format_results._store()` and in two tests. So it might be needed later anyway (?), in which case it could stay.
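For reference, this is the numpy behaviour the helper mirrors (a minimal illustration, not part of the PR code):

```python
import numpy as np

a = np.array([np.nan, np.inf, -np.inf, 1.0])
b = np.nan_to_num(a)

# NaN becomes 0.0; +inf / -inf become the largest / smallest
# finite values representable by the dtype (here float64)
print(b)
```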
Oh, I just saw this, including the link to this issue. Maybe I have to remove it then, depending on what will be discussed in the issue.
I can also make a separate PR with only `_nan_to_num()`, which can stay open for months or years until we know whether the issue will be fruitful. People can then find it more easily if it is needed sometime in the future.
I think this needs more discussion anyway, so it would be nicer to move it out of this PR. I'm personally not convinced by what's happening in this function, TBH.
Okay, I will delete it then. I don't think it makes sense to keep it somewhere anymore.
```python
result = confusion_matrix(y_true, y_pred)
xp_result, _ = get_namespace(result)
assert _is_numpy_namespace(xp_result)

# Since the computation always happens with NumPy / SciPy on the CPU, this
# function is expected to return an array allocated on the CPU even when it does
# not match the input array's device.
assert result.device == "cpu"
```
I have adjusted this test to your suggestions from this comment, @ogrisel. But here, the test is narrower because we made the return value of `confusion_matrix` always be a numpy array on CPU.

Regarding the question of how to document the return type of …
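The narrowed expectation can also be exercised end to end with plain numpy inputs (a hypothetical standalone check, separate from the test in the diff):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])
cm = confusion_matrix(y_true, y_pred)

# regardless of the input namespace, the result is a plain numpy array
assert isinstance(cm, np.ndarray)
print(cm)
```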
sklearn/utils/_array_api.py
```python
# TODO: remove when minimum pandas version is pandas==1.2.0, when
# `numpy.asarray(pd.Series)` with nullable dtypes no longer returns nd.arrays with
# `object` dtypes:
def convert_pandas_nullable_dtypes(pandas_series):
    """Convert from pandas nullable extension dtypes to numpy dtypes. Without this
    conversion, numpy.asarray(array) creates a numpy array with dtype `object` for
    older pandas versions.
    """
    dtype_mapping = {
        **{f"pd.Int{x}Dtype()": f"int{x}" for x in [8, 16, 32, 64]},
        **{f"pd.Float{x}Dtype()": f"float{x}" for x in [32, 64]},
        "pd.BooleanDtype()": "bool",
    }
    return pandas_series.astype(dtype_mapping.get(pandas_series.dtype, None))
```
This fixes the CI failure with pandas==1.1.5 that appeared because we now use `_convert_to_numpy()`, which has a `numpy.asarray(array)` line that returns a numpy array with dtype `object`, which then causes `_check_targets()` to raise.
Mapping to the desired numpy dtype prevents this.
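The underlying pitfall can be shown in isolation (a hedged sketch; on pandas >= 1.2, `np.asarray` already yields the proper numeric dtype, so the workaround only matters for older versions):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3], dtype="Int64")  # nullable extension dtype

# Going through the plain numpy dtype first sidesteps the object-dtype
# array that np.asarray(s) produced on pandas < 1.2:
arr = np.asarray(s.astype("int64"))
print(arr.dtype)  # int64
```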
I suspect the CI failure (about the Test Library) is a false positive, as I couldn't reproduce the same error after running the tests multiple times on my local machine. Instead, I encountered a different set of errors related to CUDA. I believe the issue might be linked to how we convert `sample_weight` into a NumPy array.

I've updated the PR. Since #30613 is merged, we don't need the … Edit: Oh my, we still get the same CI error:

So, should we: … I think the second option is the more secure one, because the helper function fixes only this specific test. But @ogrisel, you surely have an intuition here?

Thanks for checking this @virchan. I have now re-tried, running pytest in an environment created from the lockfile for the min dependencies. I now get the failures locally too. When I tried before, the test passed locally with pandas==1.2.0. However, @lesteve pointed me to the possibility that there might be some interplay between dependencies that causes this. I'm actually not sure anymore how I ran the test. Maybe I wasn't in my …
To keep the ball rolling and make the CIs green again, I have added calls to `check_array()` in `confusion_matrix()`.
An alternative would be to re-introduce the helper function `_convert_pandas_nullable_dtypes()` as it was before removing it (which also leads to green CIs).
Please let me know if there is anything else I can or should do here.
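As a rough illustration of why `check_array()` helps here (assuming a pandas nullable-dtype input, which is the case the failing test exercises):

```python
import numpy as np
import pandas as pd
from sklearn.utils import check_array

y = pd.Series([0, 1, 1], dtype="Int64")  # nullable integer dtype

# check_array materialises the Series as a plain numpy ndarray before
# any namespace-based dispatching sees it
arr = check_array(y, dtype=None, ensure_2d=False)
assert isinstance(arr, np.ndarray)
```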
sklearn/metrics/_classification.py
```python
y_true = check_array(y_true, dtype=None, ensure_2d=False, ensure_min_samples=0)
y_pred = check_array(y_pred, dtype=None, ensure_2d=False, ensure_min_samples=0)
y_true = _convert_to_numpy(y_true, xp)
y_pred = _convert_to_numpy(y_pred, xp)
```
Comment for reviewers:
Only using `check_array()` without `_convert_to_numpy()` afterwards results in a test failure for array_api_strict arrays: `FAILED sklearn/metrics/tests/test_classification.py::test_confusion_matrix_array_api[array_api_strict-None-None] - TypeError: unhashable type: 'Array'`
This is why I left it there. However, I did not investigate the reason.
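The `unhashable type` error is plausibly triggered by an array being used where a hashable label is expected (e.g. as a dict key or set element). numpy arrays have the same restriction, which this snippet demonstrates as an analogue (array_api_strict arrays behave alike, per the error message above):

```python
import numpy as np

a = np.asarray([1, 2])
try:
    hash(a)  # ndarray sets __hash__ = None, so this raises
except TypeError as e:
    msg = str(e)

print(msg)  # unhashable type: 'numpy.ndarray'
```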
I think this looks good overall. Thanks @StefanieSenger!
We just need to finalize whether we want to keep the current flow, where we return a numpy array on the CPU, or whether we should convert to the initial namespace and device.
CC: @ogrisel
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
```python
if not _is_numpy_namespace(get_namespace(labels)[0]):
    labels = _convert_to_numpy(labels, xp)
else:
    labels = np.asarray(labels)
```
I think we can replace these with a single line:

```python
labels = _convert_to_numpy(labels, xp)
```

About the namespace, I would say for now the least controversial thing to do (from the point of view of array API support) would be to convert to the initial namespace, for example see #30440 (comment). About the device, I am not quite sure ... the SciPy array API support blog post's compiled-code section (from October 2023) says:

Maybe @lucascolley has some opinion on what to do with the device, based on his SciPy experience?
Thanks for the ping @lesteve, I left my initial thoughts at #30440 (comment), then discussed this with Olivier and Tim in a call. The summary is that I think you'll want to think carefully before setting a precedent for different defaults across functions as to whether they return the input namespace or NumPy. That seems like it could be confusing for users and difficult to document. In this case, if (some of) the computation has to be done with NumPy, then I would suggest attempting to convert back to the input namespace. In SciPy we have chosen to let this error for now and skip the tests for device transfers, but you could choose to let the device transfers happen (as I think you do in your …)
I ran some benchmarks on a Kaggle kernel with CUDA using this branch, where in one case the current flow (returning a numpy array) is used, while in the other we convert to the input namespace and device when returning: …

Looking at this, I think we might as well return the confusion matrix in the original namespace and device.
Thanks, everyone, for your thoughts on the return type! To summarize: the original decision (based on @ogrisel's suggestion) was to default to returning numpy arrays. So, should we finalise the PR by converting back to the input namespace and device, despite the minor performance difference? Or should we keep the numpy default? I believe the advantages of converting back to the input namespace and device outweigh the costs, and in case we later judge differently for some reason, array API support is experimental and we could still change its behaviour without annoying users too much. @ogrisel, since your suggestion was to convert to numpy, do you have any thoughts on whether we should stick with that or return types from the input namespace, based on the discussion?
+1 from me! For clarity, what is the NumPy code (i.e. which lines) that we have been unable to write in a performant way with the standard (which caused the initial performance discussion)?
Basically where we create the confusion matrix using a `coo_matrix`: see `sklearn/metrics/_classification.py`, lines 413 to 417 at 869f568.
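The construction being referred to boils down to this pattern (a simplified sketch, not the exact scikit-learn code; the real version also handles `sample_weight` and dtype selection):

```python
import numpy as np
from scipy.sparse import coo_matrix

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])
n_labels = 2

# coo_matrix sums duplicate (row, col) entries, which is exactly the
# per-cell counting a confusion matrix needs
cm = coo_matrix(
    (np.ones(y_true.shape[0], dtype=np.int64), (y_true, y_pred)),
    shape=(n_labels, n_labels),
).toarray()
print(cm)
```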
This seems the least surprising option to me for now, and we can revisit it later if the need arises. I would think metrics computation is unlikely to be the bottleneck in most real-life use cases.
@ogrisel Could you kindly provide your feedback so that we can finalize this PR?
Reference Issues/PRs
towards #26024
closes #30440 (supersedes)
This PR is an alternative, discussed in #30440, that converts the input arrays to numpy arrays right away and returns the confusion matrix as a numpy array. For more details, see the discussion there.