Skip to content

FEA SLEP006: Metadata routing for SelfTrainingClassifier #28494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 64 commits into from
Jul 9, 2024

Conversation

adam2392
Copy link
Member

@adam2392 adam2392 commented Feb 21, 2024

Reference Issues/PRs

Towards: #22893

What does this implement/fix? Explain your changes.

Implements metadata routing for SelfTrainingClassifier. Note the added code diff simply comes from a replacement of base_estimator for estimator.

  • convert base_estimator to estimator and add a deprecation
  • uncomments other functions that previously were not tested in NonConsumingClassifier
  • fixes some minor design choices in the unit-testing framework of test_metaestimators_metadata_routing.py (e.g. try/except -> if/else to be more transparent)

Any other comments?

cc: @adrinjalali

Some open questions:

1. I presume, we want to forward metadata within all the functions possibly?
2. As a result, I'm not sure if the unit-testing approach is the best, so I was wondering if you have any suggestions? Should I try potentially refactoring the existing unit-testing code to allow testing for more than just fit?

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link

github-actions bot commented Feb 21, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2cacdc9. Link to the linter CI: here

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 marked this pull request as ready for review February 23, 2024 20:38
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes, thanks @adam2392

Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Member Author

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review and pointers! I went thru and fixed the issues in the doc-strings and Bunch.

@adam2392 adam2392 requested a review from adrinjalali February 27, 2024 17:17
@adam2392
Copy link
Member Author

Resolved conflicts. Feel free to ping me if there's additional changes desired

@adam2392
Copy link
Member Author

adam2392 commented Apr 2, 2024

This PR should not be affected by #28734

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. A few other comments.

Comment on lines 652 to 655
if method_name in ["fit", "partial_fit", "score"]:
# `fit`, `partial_fit`, 'score' accept y, others don't.
method(X, y, **method_kwargs)
except TypeError:
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with try, except we don't really need to bother to maintain this list. However I am fine with either way. I'll let @adrinjalali finalise.

@adam2392
Copy link
Member Author

Thanks for the review @OmarManzoor!

I'll change the last comment if @adrinjalali has any issues (#28494 (comment))

Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @adam2392

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adam2392

Comment on lines 652 to 655
if method_name in ["fit", "partial_fit", "score"]:
# `fit`, `partial_fit`, 'score' accept y, others don't.
method(X, y, **method_kwargs)
except TypeError:
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this makes things hard to debug really. The try/except is more foolproof since a library like imbalance-learn adds some methods to the whole system by patching a few things in sklearn and things would just work.

adam2392 and others added 3 commits July 5, 2024 08:16
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Member Author

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @adrinjalali ! I addressed your comments in 3b5a3f0

Comment on lines 652 to 655
if method_name in ["fit", "partial_fit", "score"]:
# `fit`, `partial_fit`, 'score' accept y, others don't.
method(X, y, **method_kwargs)
except TypeError:
else:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I think it was hard before because the error message was not very clear on what method was failing from what class, so I assume that's been fixed with #29226

Comment on lines 652 to 655
if method_name in ["fit", "partial_fit", "score"]:
# `fit`, `partial_fit`, 'score' accept y, others don't.
method(X, y, **method_kwargs)
except TypeError:
else:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted it to the try/except

)
else:
estimator_ = clone(self.estimator)
return estimator_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still missing the case where both estimator and base_estimator are passed, in which case we need to raise

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



# TODO(1.8): remove in 1.8
def test_deprecation_warning_base_estimator():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also test for all other cases in _get_estimator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adam2392 added 3 commits July 5, 2024 09:12
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 requested a review from adrinjalali July 5, 2024 14:21
@adam2392
Copy link
Member Author

adam2392 commented Jul 6, 2024

Lmk if I missed anything else @adrinjalali.

Thanks for the review and patience!

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

I'll let @OmarManzoor have another look since this changed a bit since last he reviewed.

adam2392 added 2 commits July 8, 2024 09:42
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 requested a review from OmarManzoor July 8, 2024 13:42
@adam2392
Copy link
Member Author

adam2392 commented Jul 8, 2024

SG! Thanks for the reviews.

Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @adam2392

@OmarManzoor OmarManzoor merged commit cef803a into scikit-learn:main Jul 9, 2024
30 checks passed
@adam2392 adam2392 deleted the self-learn-meta branch July 9, 2024 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants