Ensure predictions sparse before `sp.hstack` in `ClassifierChain`

We use `sp.hstack` in a number of places in `ClassifierChain` where we may be stacking a sparse `X` with dense predictions, e.g.:

https://github.com/scikit-learn/scikit-learn/blob/36f6734789fc7e4940792c1cfb6a6e90dfcae484/sklearn/multioutput.py#L948

and

https://github.com/scikit-learn/scikit-learn/blob/36f6734789fc7e4940792c1cfb6a6e90dfcae484/sklearn/multioutput.py#L693

AFAICT, stacking a sparse matrix with a dense array via `sp.hstack` gives you a sparse result (even though `sp.hstack` is not documented to support dense input):

```python
In [34]: import numpy as np
    ...: from scipy.sparse import coo_matrix, hstack
    ...:
    ...: A = coo_matrix([[1, 2], [3, 4]])

In [35]: B = np.zeros((2, 2))

In [36]: hstack([A, B])
Out[36]:
<2x4 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in COOrdinate format>
```

Maybe due to: https://github.com/scipy/scipy/blob/f990b1d2471748c79bc4260baf8923db0a5248af/scipy/sparse/_construct.py#L654 ?

Should we ensure `y` (the predictions) is sparse before using `sp.hstack`?
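
For illustration, a minimal sketch of what an explicit conversion could look like. The helper name `hstack_features_and_predictions` is hypothetical and is not existing scikit-learn code; it assumes the predictions arrive as a 2D dense array and that CSR is an acceptable output format.

```python
import numpy as np
import scipy.sparse as sp


def hstack_features_and_predictions(X, y_pred):
    """Hypothetical helper: append predictions to X as extra columns.

    If X is sparse, dense predictions are converted to sparse first so
    that sp.hstack only ever receives sparse blocks.
    """
    if sp.issparse(X):
        if not sp.issparse(y_pred):
            # Explicit conversion instead of relying on sp.hstack
            # accepting a dense block.
            y_pred = sp.csr_matrix(y_pred)
        return sp.hstack((X, y_pred), format="csr")
    # Dense X: plain NumPy stacking (assumes y_pred is dense too).
    return np.hstack((X, y_pred))


X_sparse = sp.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
y_pred = np.array([[1, 0], [0, 1]])
X_aug = hstack_features_and_predictions(X_sparse, y_pred)
```

The sketch does not handle a dense `X` combined with sparse predictions; whether that case can occur in `ClassifierChain` would need checking.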

I had a quick look at our code and could not find any other cases where we could end up stacking dense + sparse. I think `ClassifierChain` is unique in that we do not usually combine `X` with `y`.

Discussed here: https://github.com/scikit-learn/scikit-learn/pull/27700#discussion_r1378691272

cc @glemaitre 

Ensure predictions sparse before `sp.hstack` in `ClassifierChain` #27905
