-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH Add chain_method
to ClassifierChain
#27700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments regarding the code.
@@ -844,6 +889,10 @@ class labels for each estimator in the chain. | |||
order_ : list | |||
The order of labels in the classifier chain. | |||
|
|||
chain_method_ : str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be provided when using _return_response_method_used
in _get_response_values
.
sklearn/multioutput.py
Outdated
if sp.issparse(X): | ||
X_aug = sp.hstack((X, previous_predictions)) | ||
else: | ||
X_aug = np.hstack((X, previous_predictions)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if sp.issparse(X): | |
X_aug = sp.hstack((X, previous_predictions)) | |
else: | |
X_aug = np.hstack((X, previous_predictions)) | |
hstack = sp.hstack if sp.issparse(X) else np.hstack | |
X_aug = hstack([X, previous_prediction]) |
Is it possible to get a previous_prediction
that is dense and X
being sparse and thus the stacking will make something not expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT it seems stacking a sparse with dense via sp.hstack
gives you a sparse array (even though sp.hstack
is not documented to support dense):
In [34]: from scipy.sparse import coo_matrix, hstack
...:
...: A = coo_matrix([[1, 2], [3, 4]])
In [35]: B = np.zeros((2,2))
In [36]: hstack([A,B])
Out[36]:
<2x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
Maybe from here: https://github.com/scipy/scipy/blob/f990b1d2471748c79bc4260baf8923db0a5248af/scipy/sparse/_construct.py#L654 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Probably this is the np.hstack
that did not behave properly then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you may be right. The documentation does not mention that it would support sparse + dense.
I can't find any reference about whether this is intentional. This scipy PR mentions that the stack functions converts everything to COO format (but we knew this from the code). Also, I found this stackoverflow answer saying that you can stack sparse + dense.
Happy to add previous_prediction
sparse conversion ? (Though this would be difficult to test since sp.hstack
allows sparse + dense)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a quick look at our code, I could not find any other cases where it would be possible to be stacking dense + sparse, if that helps the decision in whether we need to covert to sparse.
@glemaitre tests are currently failing because |
I did not think about |
The funny part about |
Yes I noticed that too, I mostly added it because it was suggested by Joel here: #9316 (review) I can add support for |
I open #27720 already. |
Ah amazing, thanks! |
Sorry I missed that in your comment! |
Since we merged the blocking PR, I can give a review to this one once this is ready. Ping me @lucyleeow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round of review but it looks good on my side.
Thanks @glemaitre , changes made. I think we did not decide if should convert to sparse: #27700 (comment) I am fine with changing this in a separate PR as well, since this is not strictly related to this PR |
Let's keep the same behaviour here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @lucyleeow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments on moving to v1.5, otherwise LGTM
doc/whats_new/v1.4.rst
Outdated
:mod:`sklearn.multioutput` | ||
.......................... | ||
|
||
- |Enhancement| `chain_method` parameter added to `:class:``multioutput.ClassifierChain`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be moved to 1.5
sklearn/multioutput.py
Outdated
preference. The method used corresponds to the first method in | ||
the list that is implemented by `base_estimator`. | ||
|
||
.. versionadded:: 1.4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. versionadded:: 1.4 | |
.. versionadded:: 1.5 |
Looks like the CI is failing. It has been a while, so I do not know if it is related.
Updated, thanks @thomasjpfan ! |
Thanks @glemaitre and @thomasjpfan for the reviews! |
It seems like PR broke The error does reminded me of some issues that were seen in #27576.
Maybe @StefanieSenger has some insights into this. |
Yes, I do: :) Since merging this PR estimator_checks also check for sparse arrays, and for See this diff for more info. Can I fix that? |
Reference Issues/PRs
Fixes #9247
Closes #9316 (supersedes)
What does this implement/fix? Explain your changes.
Add
chain_method
toClassifierChain
. Supports{'predict', 'predict_proba', 'predict_log_proba', 'decision_function'}
(as suggested in #9316 (review))Any other comments?