OneVsOneClassifier decision function shape non-standard #8049
Well, you have the option to use a FutureWarning to help users anticipate the change.

On 14 December 2016 at 02:56, Andreas Mueller wrote:

> For binary tasks, OvO has a shape of (n_samples, 2), which pretty much violates our standards. I'm not sure what the best solution is apart from just breaking it as a bug-fix. We could add a parameter and deprecate it, and then remove the parameter if we really want.
>
> The OvR classifier also has a potential issue where the decision_function for binary is (n_samples, 1) instead of (n_samples,). That also seems non-standard. I'm not sure we have other multi-output classifiers with a decision function, so I'm not entirely certain what the standard should be. All the multi-label ones do (n_samples,) according to the tests.
>
> Found via #8022.
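For concreteness, a minimal sketch reproducing the reported shapes (behaviour as described in this thread; later scikit-learn releases may differ):

```python
# Reproduce the reported shapes for a binary task.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = make_classification(n_samples=100, n_classes=2, random_state=0)

ovo = OneVsOneClassifier(LogisticRegression()).fit(X, y)
print(ovo.decision_function(X).shape)  # reported: (100, 2); standard: (100,)

ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(ovr.decision_function(X).shape)  # reported: (100, 1); standard: (100,)
```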
yeah, I guess so. I don't like those because there is no easy way to silence them except filtering them.
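To illustrate the point: silencing a FutureWarning does indeed require an explicit filter. A sketch, reusing ovo and X from the snippet above:

```python
import warnings

# The only way to suppress a FutureWarning is to filter it explicitly.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    scores = ovo.decision_function(X)  # no FutureWarning reaches the user
```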
Note that, actually, this OvO decision_function is enshrined in the codebase in SVC, which has a parameter decision_function_shape.
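For reference, a sketch of SVC's decision_function_shape parameter on a four-class problem, showing the OvO pairwise layout versus the OvR per-class layout:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X4, y4 = make_classification(n_samples=200, n_classes=4, n_informative=8,
                             random_state=0)

# 'ovo' yields one column per class pair: 4 * 3 / 2 = 6 columns.
print(SVC(decision_function_shape="ovo").fit(X4, y4)
      .decision_function(X4).shape)  # (200, 6)

# 'ovr' yields one column per class.
print(SVC(decision_function_shape="ovr").fit(X4, y4)
      .decision_function(X4).shape)  # (200, 4)
```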
@jnothman If I read this correctly, then I disagree. In SVC, the decision function of …
Okay. Sorry for making claims without giving them full attention!
For OvR classifiers, because of the following line:

    return np.array([est.decision_function(X).ravel()
                     for est in self.estimators_]).T

I think the only option is to add a separate binary check for consistency in the shape.
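A hypothetical sketch of that separate binary check (this is an illustration, not the actual scikit-learn patch):

```python
import numpy as np

def ovr_decision_function(estimators, X):
    """Hypothetical helper mirroring OneVsRestClassifier.decision_function."""
    # Stack one column of scores per underlying binary estimator.
    scores = np.array([est.decision_function(X).ravel()
                       for est in estimators]).T
    # The suggested binary check: OvR fits a single estimator when there
    # are only two classes, so flatten (n_samples, 1) to (n_samples,).
    if len(estimators) == 1:
        return scores.ravel()
    return scores
```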
I'm not sure what you mean.
As @amueller mentioned, we are getting the shape as (n_samples, 1) instead of the standard (n_samples,) that we have for multi-label classifiers. I wanted to add that, because of the above line, even if len(self.estimators_) == 1 the shape will always be (n_samples, 1) in the binary case. So, the only workaround that seems plausible here is to add a separate check.
Well yeah, fixing it is obvious, and I did that in my PR (for now; will probably revert unless we can figure something out). The problem is fixing it in a backwards-compatible way.
@amueller Yes, I get it now. I would like to work on the issue, but am not really sure where I should start. Could you please give me a head start?
Hello @amueller, is it feasible to add a parameter (say standard_binary_output) to decision_function() which can be either True or False, and will be False by default? Then, if the task is binary, we can issue a warning that output of shape (n_samples, 2) or (n_samples, 1) is not standard and will be of shape (n_samples,) in a future version. When the passed parameter is True, we can return the desired standard output through a binary check in decision_function. Please correct me if I'm wrong.
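If it helps the discussion, a rough sketch of that proposal (standard_binary_output is hypothetical, not an existing scikit-learn parameter):

```python
import warnings
import numpy as np

def decision_function(estimators, X, standard_binary_output=False):
    """Hypothetical sketch; standard_binary_output is not a real parameter."""
    scores = np.array([est.decision_function(X).ravel()
                       for est in estimators]).T
    if scores.shape[1] == 1:  # binary task: a single underlying estimator
        if standard_binary_output:
            return scores.ravel()  # the standard (n_samples,) shape
        warnings.warn("Returning shape (n_samples, 1) for a binary task; "
                      "this will become (n_samples,) in a future version.",
                      FutureWarning)
    return scores
```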
We need to handle the case where decision_function is being called by
library code, not user code, and hence requires a standard API for
decision_function.
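An illustration of why library callers need a standard shape, reusing ovo, X, and y from the first snippet: metric functions expect (n_samples,) scores for binary problems and cannot guess which column to use.

```python
from sklearn.metrics import roc_auc_score

scores = ovo.decision_function(X)
try:
    roc_auc_score(y, scores)  # fails for the non-standard (n_samples, 2)
except ValueError as exc:
    print("metric rejected the non-standard shape:", exc)
```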
@lucyleeow I don't feel like I can close... this comment makes me think that something is missing... @amueller do you mind clarifying? This will probably also be useful for contributors wanting to address this issue. Thanks!
Ah good point! I missed that. Though if we haven't been backwards compatible since 2016, changing it now will make it not backwards compatible 😅