fetch_mldata needs to handle sparse matrices as labels #700
Comments
Done in af3b08a.
Hi,

My fix looks like this:

    # set axes to sklearn conventions
    if transpose_data:
        dataset['data'] = dataset['data'].T
    if 'target' in dataset:
        if issparse(dataset['target']):
            dataset['target'] = dataset['target'].todense()
        dataset['target'] = dataset['target'].squeeze()

Do you think this is the right way to solve this bug? What else is there to do? Add tests?
You're just half an hour too late; I already fixed the bug. Sorry about that. My fix calls squeeze only when the target is not sparse, which should be more efficient than converting to dense if there are many outputs. Let me have a look and see if I can find anything else that you could give a try.
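For reference, a minimal sketch of the approach described above (squeeze only when the target is not sparse). This is not the code from the commit itself: `normalize_target` is a hypothetical helper name, and the dictionaries stand in for whatever `fetch_mldata` builds internally.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def normalize_target(dataset):
    """Hypothetical helper: squeeze dense targets, leave sparse ones alone."""
    if 'target' in dataset and not issparse(dataset['target']):
        # Dense case: drop singleton axes, e.g. shape (1, n) becomes (n,).
        dataset['target'] = np.asarray(dataset['target']).squeeze()
    # Sparse case: nothing to do; the matrix is returned unchanged.
    return dataset


# Dense single-output labels stored as a row vector.
dense = normalize_target({'target': np.array([[0, 1, 2]])})

# Sparse multi-label indicator matrix (3 samples, 3 labels).
sparse = normalize_target({'target': csr_matrix(np.eye(3))})
```

The sparse branch avoids allocating a dense copy, which is the efficiency argument made above for data sets with many outputs.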
Thanks, I'll look at those issues. I wasn't sure where the labels are used, or whether any code depends on getting a dense matrix.
Most of the code assumes dense matrices, but not the fetch_mldata part. So I thought it would be reasonable to let users decide what to do once they have their hands on the targets.
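If downstream code does need a dense label matrix, the conversion can be done on the user side. A small sketch, assuming the targets arrive as a SciPy sparse matrix; `to_dense_labels` is a hypothetical name, and the toy matrix stands in for real labels:

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def to_dense_labels(target):
    """Hypothetical helper: return labels as a plain 2-D ndarray."""
    if issparse(target):
        # toarray() yields an ndarray; todense() would yield np.matrix.
        return target.toarray()
    return np.asarray(target)


# Toy sparse indicator matrix: 2 samples, 3 labels.
labels = csr_matrix(np.array([[1, 0, 1], [0, 1, 0]]))
Y = to_dense_labels(labels)
```

Using `toarray()` rather than `todense()` keeps the result an ordinary `ndarray`, which is usually what code expecting dense input assumes.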
This might be a special case but the "yeast" data set returns a sparse matrix as labels.
As this is a standard dataset for multi-label prediction, it would be good if we supported that.
Maybe we could even make an example.