Extend SequentialFeatureSelector example to demonstrate how to use negative tol #25525
Comments
During the triaging meeting, we think that allowing for |
For the record, we also discussed the alternative to deprecate the |
Maybe we could also extend the existing example or craft a new one to demonstrate how to use negative tol values. The breast cancer dataset seems to be an interesting use case. Anybody interested in doing a PR for some or all of the above? (The example can come in a distinct PR so as not to delay the review and merge of the fix.)
@ogrisel which existing example is referred to in your post? I could look into improving it. I can expand the breast cancer case and show how different values of |
I checked and there is only one example. The body of this example uses the diabetes dataset, which probably has too few features to demonstrate the use of negative tolerance values. I think it's fine to keep the existing contents as they are but then append a new section that demonstrates backward selection with a negative tolerance to select a smaller number of features on the breast cancer dataset.
I opened #25664 to fix the regression but leave the example part for another PR to not delay the 1.2.2 release.
Hey @ogrisel, I created a Colab Notebook for feature selection on the breast cancer dataset using |
Describe the bug
I utilized the SequentialFeatureSelector for feature selection in my code, with the direction set to "backward." The tolerance value is negative and the selection process stops when the decrease in the metric, AUC in this case, is less than the specified tolerance. Generally, increasing the number of features results in a higher AUC, but sacrificing some features, especially correlated ones that offer little contribution, can produce a pessimistic model with a lower AUC. The code worked as expected in sklearn 1.1.1, but when I updated to sklearn 1.2.1, I encountered the following error.
Steps/Code to Reproduce
Expected Results
Actual Results
Versions