FEA return final cross-validation score in SequentialFeatureSelector #31483


Open
wants to merge 15 commits into
base: main

Conversation

@cboseak cboseak commented Jun 4, 2025

Reference Issues/PRs

What does this implement/fix? Explain your changes.

  • Adds an attribute (e.g., final_cv_score_) that stores the mean cross-validation score of the final model fitted on the selected features. This avoids having to run a separate cross-validation externally to obtain the final performance score.
    • Currently, SequentialFeatureSelector performs cross-validation internally to decide which features to select, based on the scoring function. However, the final cross-validation score (e.g., recall) is not exposed by the SFS object.
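
For readers unfamiliar with the workaround being described, the following is a minimal sketch of the external cross-validation a user would run today. The dataset, estimator, feature count, and `recall` scoring are illustrative assumptions, not taken from the PR.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data only: keep the first 6 columns so the example runs quickly.
X, y = load_breast_cancer(return_X_y=True)
X = X[:, :6]

estimator = make_pipeline(StandardScaler(), LogisticRegression())

# Step 1: the selector cross-validates candidate feature subsets internally,
# but does not keep the score of the subset it finally settles on.
sfs = SequentialFeatureSelector(
    estimator, n_features_to_select=2, scoring="recall", cv=5
)
sfs.fit(X, y)

# Step 2: a separate cross-validation on the selected columns is needed
# to obtain a score for the final feature subset.
scores = cross_val_score(estimator, sfs.transform(X), y, scoring="recall", cv=5)
mean_recall = scores.mean()
print(sfs.get_support(), mean_recall)
```

The second step is what the proposed attribute or method is meant to make unnecessary for the user to write by hand.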

github-actions bot commented Jun 4, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 1944c07. Link to the linter CI: here

@betatim betatim changed the title per issue 31473, return final cross-validation score in SequentialFea… FEA return final cross-validation score in SequentialFea… Jun 6, 2025
Contributor

@OmarManzoor OmarManzoor left a comment

Thanks for the PR @cboseak

Contributor

@OmarManzoor OmarManzoor left a comment

LGTM. Thanks @cboseak

@OmarManzoor OmarManzoor added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jun 11, 2025
@adrinjalali adrinjalali removed the Waiting for Second Reviewer First reviewer is done, need a second one! label Jun 12, 2025
@adrinjalali adrinjalali changed the title FEA return final cross-validation score in SequentialFea… FEA return final cross-validation score in SequentialFeatureSelector Jun 12, 2025
@cboseak
Author

cboseak commented Jun 12, 2025

See the latest changes, which address your comments.

Member

@adrinjalali adrinjalali left a comment

I'd need more opinions on this to see if we'd like to include it.

cc @scikit-learn/core-devs

cboseak and others added 3 commits June 12, 2025 09:06
….feature.rst


update based on PR suggestions

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
…_cv_score` to return raw values instead of mean
Member

@adrinjalali adrinjalali left a comment

I don't mind the implementation as is, but I do wonder about its use cases and whether it's useful to enough users.

Tagging for a second opinion: @OmarManzoor @adam2392

        scores : ndarray of shape (n_splits,)
            Array of cross-validation scores for each split.
        """
        _raise_for_params(params, self, "get_final_cv_score")
Member

Please add a test for this.

@OmarManzoor
Contributor

When I approved this, I was considering it as an added attribute. But since that increases the fit time, I am not so sure about having a separate function that will still need to be called separately. Wouldn't that kind of be similar to just calling the code within the function? I guess if it adds some convenience to users, we can add it.

@adrinjalali
Member

> Wouldn't that kind of be similar to just calling the code within the function?

Not sure which function you mean.

@OmarManzoor
Contributor

> Not sure which function you mean.

get_final_cv_score, the one that is added in this PR.
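
To make the trade-off in this exchange concrete, here is a rough sketch of what such a convenience function would wrap. `final_cv_scores` is a hypothetical free function written for illustration; it is not the PR's implementation, and the synthetic data and estimator are assumptions. It returns the raw per-split scores rather than their mean.

```python
from sklearn.base import clone, is_classifier
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import check_cv, cross_val_score


def final_cv_scores(sfs, X, y):
    """Hypothetical helper: per-split CV scores on the selected features.

    Reuses the estimator, scoring, cv and n_jobs settings of a fitted
    SequentialFeatureSelector.
    """
    cv = check_cv(sfs.cv, y, classifier=is_classifier(sfs.estimator))
    return cross_val_score(
        clone(sfs.estimator),
        sfs.transform(X),  # keep only the selected columns
        y,
        scoring=sfs.scoring,
        cv=cv,
        n_jobs=sfs.n_jobs,
    )


# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
sfs = SequentialFeatureSelector(
    LogisticRegression(), n_features_to_select=2, cv=3
).fit(X, y)

scores = final_cv_scores(sfs, X, y)  # one score per CV split
print(scores, scores.mean())
```

Either way, one extra round of cross-validation is run: storing the result as an attribute would pay that cost inside `fit` for every user, while a separate function pays it only when called.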

Member

@adam2392 adam2392 left a comment

IIUC, this is purely a convenience function right?

The computation time to get the answer that you'd want is the same with or without the function.

In that case, my main criterion would be looking at whether this makes the API more usable. Is this function name also present in other feature selectors? If so, let's add it imo. If not, shouldn't we consolidate?

@@ -193,6 +193,21 @@ def __init__(
        self.cv = cv
        self.n_jobs = n_jobs

    def _get_cv(self, y):
Member

I don't see why this function is needed. Perhaps I'm missing something?

Author

It was a suggestion in one of the comments. Basically, we had duplicate code in two places (cv = check_cv(self.cv, y, classifier=is_classifier(self.estimator))), so it was moved into a function to clean it up.
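
For context, the duplicated expression resolves the user's `cv` setting into a concrete splitter. The sketch below shows what `check_cv` returns for an integer `cv`; the toy targets and estimators are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.base import is_classifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, check_cv

y_class = np.array([0, 1] * 10)    # binary target
y_reg = np.linspace(0.0, 1.0, 20)  # continuous target

# For a classifier with a binary or multiclass target, an integer cv
# becomes a StratifiedKFold; otherwise it becomes a plain KFold.
clf_cv = check_cv(5, y_class, classifier=is_classifier(LogisticRegression()))
reg_cv = check_cv(5, y_reg, classifier=is_classifier(LinearRegression()))

print(type(clf_cv).__name__, type(reg_cv).__name__)
```

Because the result depends on both `y` and the estimator type, any code path that cross-validates needs the same resolution step, which is why the expression appeared in more than one place.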

@cboseak
Author

cboseak commented Jun 23, 2025

I wanted to check in on this one. What do we need to do to finish this PR? I can make any updates needed either way.

@adrinjalali
Member

> In that case, my main criterion would be looking at whether this makes the API more usable. Is this function name also present in other feature selectors? If so, let's add it imo. If not, shouldn't we consolidate?

I agree with @adam2392 here that it'd be nice to consider where else this could be used. Since we don't add estimator-level functions lightly, I'd be happy if you could investigate this, @cboseak, so that we end up with a consistent API across estimators.

@adrinjalali adrinjalali changed the title FEA return final cross-validation score in SequentialFeatureSelector FEA return final cross-validation score in SequentialFeatureSelector Jul 2, 2025
Development

Successfully merging this pull request may close these issues.

Add option to return final cross-validation score in SequentialFeatureSelector
4 participants