Skip to content

OneVsOneClassifier decision function shape non-standard #8049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
amueller opened this issue Dec 13, 2016 · 15 comments
Open

OneVsOneClassifier decision function shape non-standard #8049

amueller opened this issue Dec 13, 2016 · 15 comments

Comments

@amueller
Copy link
Member

For binary tasks, OvO has a shape of (n_samples, 2) which pretty much violates our standards.
I'm not sure what the best solution is apart from just breaking it as a bug-fix.
We could add a parameter and deprecate it and then remove the parameter if we really want.

The OVR classifier also has a potential issue where the decision_function for binary is (n_samples, 1) instead of (n_samples,). That also seems non-standard. I'm not sure we have other multi-output classifiers with a decision function, so I'm not entirely certain what the standard should be. All the multi-label ones do (n_samples,) according to the tests.

Found via #8022.

@jnothman
Copy link
Member

jnothman commented Dec 13, 2016 via email

@amueller
Copy link
Member Author

yeah, I guess so. I don't like those because there is no easy way to silence them except filtering them.

@jnothman
Copy link
Member

Note that, actually, this OvO decision_function is enshrined in the codebase in SVC which has a parameter decision_function_shape in {'ovo', 'ovr'}.

@amueller
Copy link
Member Author

amueller commented Dec 14, 2016

@jnothman If I read this correctly, then I disagree. In SVC decision_function_shape (which I think I added) only impacts the n_classes > 2 case, and decides whether decision_function is of shape (n_samples, n_classes) or (n_samples, n_classes * (n_classes - 1) / 2).

The decision function of OneVsOneClassifier is always (n_samples, n_classes) (and never (n_samples, n_classes * (n_classes - 1) / 2) (except for 3 when they coincide ;)). But it's also that for n_classes = 2 which is not what it is anywhere else in the code base (as far as I know).

@jnothman
Copy link
Member

Okay. Sorry for making claims without giving them full attention!

@dalmia
Copy link
Contributor

dalmia commented Dec 17, 2016

For OvR classifiers, because of the following in decision_function:

return np.array([est.decision_function(X).ravel()
                         for est in self.estimators_]).T

I think the only option is to add a separate binary check for consistency in the shape.

@jnothman
Copy link
Member

I'm not sure what you mean.

@dalmia
Copy link
Contributor

dalmia commented Dec 19, 2016

As @amueller mentioned that we are getting the shape as (n_samples,1) instead of the standard (n_samples,) that we have as standard for multi-label classifiers, I wanted to add that because of the above line, even if len(self.estimators_)==1 the shape will always be (n_samples,1) for the binary case. So, the only workaround that seems plausible here is to add a separate check.

@amueller
Copy link
Member Author

amueller commented Dec 19, 2016 via email

@dalmia
Copy link
Contributor

dalmia commented Dec 21, 2016

@amueller Yes, I get it now. I would like to work on the issue, but am not really sure as to where should I start. Could you please give me a head start?

@Akshay0724
Copy link
Contributor

Hello @amueller, Is it feasible to add a parameter (say standard_binary_output) to decision_function() which can be either True or False and will be False by default.So, now if task is binary we can use a warning that output of shape (n_classes,2 or 1) is not standard and will be of shape (n_classes,) in future version.When passed parameter is True than we can return desired standard output through a binary check in decision_function.
Please correct me if I'm wrong.

@jnothman
Copy link
Member

jnothman commented Jan 7, 2017 via email

@lucyleeow
Copy link
Member

ping @cmarmo I think this was fixed by #9100 and can be closed

@cmarmo
Copy link
Contributor

cmarmo commented Aug 27, 2020

@lucyleeow I don't feel like I can close... this comment makes me think that something is missing... @amueller do you mind clarifying? This will probably be useful also for contributors willing to address this issue. Thanks!

@lucyleeow
Copy link
Member

Ah good point! I missed that. Though if we haven't been backwards compatible since 2016, changing it now will make it not backwards compatible 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants