Skip to content

[WIP] Correct ducktyping of predict proba in GridSearchCV #4909

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

raghavrv
Copy link
Member

Fixes #4902

The decorator (used for ducktyping) was set to 'estimator' (the unfitted est.) rather than the final 'best_estimator_' at #3982

And so predict_proba_ checked for the attr probA_ in the estimator instead of best_estimator_...

Also we need to make sure NotFittedError gets raised first since estimator is not fitted...

@agramfort @adammenges Please take a look!

@raghavrv raghavrv changed the title [Correct delegation of predict proba. [MRG] Correct ducktyping of predict proba in GridSearchCV Jun 30, 2015
@raghavrv
Copy link
Member Author

Selfnote: Needs to be patched to #4294

@raghavrv
Copy link
Member Author

Ah! This is interesting... So we need it to delegate to either estimator or best_estimator_ based on the fit status of GridSearchCV! :/

@raghavrv raghavrv changed the title [MRG] Correct ducktyping of predict proba in GridSearchCV [WIP] Correct ducktyping of predict proba in GridSearchCV Jun 30, 2015
@amueller
Copy link
Member

We should always delegate to estimator.
This is caused by #4835, which is apparently not the right fix.

Err... what happens if someone searches over probability? hum..

@jnothman always knows what to do in this kind of situations ;)

@amueller
Copy link
Member

Maybe we should ignore the issue with people searching over probability and do either of the two:

  1. run the check for probA_ and probB_ only if the estimator was fitted.

  2. move the check for probA_ and probB_ into _predict_proba, not predict_proba.

The first one is a bit more magic and I don't really see it as being very necessary. If people set probability=True after fitting, I think it is more appropriate to raise an error "not fitted with probability=True" then branching to decision_function in our usual duck-typing.

@amueller
Copy link
Member

So I vote for 2.

@adammenges
Copy link

Unless I'm missing something, that example I provided should always have probability=True, is that not the case?

@raghavrv
Copy link
Member Author

raghavrv commented Jul 1, 2015

@adammenges The example you provided is fine! This is an issue with delegation of the meta-estimator (GridSearchCV)'s methods to best_estimator_, so one can directly access those methods from the GridSearchCV instance instead of (grid_svm.best_estimator_....), (if I understood this correctly) :)

@raghavrv
Copy link
Member Author

raghavrv commented Jul 1, 2015

And Andreas was talking about setting probability to True after fitting which is not correct...

@jnothman
Copy link
Member

jnothman commented Jul 1, 2015

I've not the time to read all comments right now, but this is the wrong solution. We need to check the unfitted estimator and delegate to the fitted one.

@jnothman jnothman closed this Jul 1, 2015
@raghavrv raghavrv deleted the correct_delegation_of_predict_proba branch July 6, 2015 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants