-
-
Notifications
You must be signed in to change notification settings - Fork 26.1k
FEA return final cross-validation score in SequentialFeatureSelector
#31483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @cboseak
doc/whats_new/upcoming_changes/sklearn.feature_selection/31483.feature.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @cboseak
See latest changes to address your comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd need more opinions on this to see if we'd like to include it.
cc @scikit-learn/core-devs
doc/whats_new/upcoming_changes/sklearn.feature_selection/31483.feature.rst
Outdated
Show resolved
Hide resolved
….feature.rst update based on PR suggestions Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
…_cv_score` to return raw values instead of mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind the implementation as is, but I do wonder its usecases and whether it's useful to enough users.
Tagging for a second opinion: @OmarManzoor @adam2392
scores : ndarray of shape (n_splits,) | ||
Array of cross-validation scores for each split. | ||
""" | ||
_raise_for_params(params, self, "get_final_cv_score") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please have a test for this.
When I approved this I considered it being added as an attribute but since that increases the fit time I am not so sure about having a separate function that will still need to be called separately. Wouldn't that kind of be similar to just calling the code within the function? I guess if it adds some convenience to users we can add it. |
Not sure which function you mean. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this is purely a convenience function right?
The computation time to get the answer that you'd want is the same with or without the function.
In that case, my main criterion would be looking at whether this makes the API more usable. Is this function name also present in other feature selectors? If so, let's add it imo. If not, shouldn't we consolidate?
@@ -193,6 +193,21 @@ def __init__( | |||
self.cv = cv | |||
self.n_jobs = n_jobs | |||
|
|||
def _get_cv(self, y): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this function is needed. Perhaps I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a suggestion in one of the comments but basically we had duplicate code in 2 places (cv = check_cv(self.cv, y, classifier=is_classifier(self.estimator))
) so it was moved into a function to clean it up
I wanted to check in on this one. What do we need to do to finish this PR out. I can make any updates needed either way |
I agree with @adam2392 here that it'd be nice to consider where else this could be used. Since we don't add estimator level functions lightly, I'd be happy if you could investigate @cboseak , for a consistent API across estimators. |
SequentialFeatureSelector
Reference Issues/PRs
What does this implement/fix? Explain your changes.