-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[WIP] first step in fixing minimum 2 required samples for fixing MLPRegressor attribute error #24788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] first step in fixing minimum 2 required samples for fixing MLPRegressor attribute error #24788
Conversation
…or attribute error
Thanks for making this PR! I think a better place to perform this check is in |
Thanks @betatim , great tip. Will get back on this.
|
Hi @betatim , I updated the Of course, this implies that for any training input where len(X) is less than 11, a ValueError will be raised. This is the intention or? Could there be some angles of the bigger picture that I might be missing here? Thank you. |
Apologies, I didn't read the code carefully enough, if X_val.shape[0] < 2:
raise ValueError("The validation set is too small. Increase the validation_fraction or the size of your dataset.") after line 581 (where This way we can directly check that I've also made a suggestion for the error message. |
Thanks for the swift feedback @betatim
|
|
b8ec6ec
to
29f15d8
Compare
I see. Then, if I understood to somewhat correct degree, here is what could work:
You should already see it in the PR update. Please let me know if it makes sense or if I am missing something in the issue. |
Hi @betatim , wonder if you had a chance to look at the solution above? Let me know if it addresses the issue. Thanks. |
HI, unfortunately we misunderstood each other.
|
Ok, so the recent commit contains the following two additions:
if (self.validation_fraction * X.shape[0] < 2):
raise ValueError("The validation set is too small. Increase the validation_fraction or the size of your dataset.")
if X_val.shape[0] < 2:
raise ValueError("The validation set is too small. Increase the validation_fraction or the size of your dataset.") I suspect the reason to add this additional check in |
I would remove your first addition and only check in |
I see, got it @betatim - thanks for the clarification. Done - now, we only have the check in |
Two more things are needed:
|
@betatim - sorry for a bit of delay in coming back to it. I have added a test (last function) in the It passes the test when
Could it be due to some deprecated code? |
@betatim - could you look at the code (test submitted) above? Any feedback welcome. |
The linting tests that check the formatting of the code are failing. Take a look at step 9 of https://scikit-learn.org/stable/developers/contributing.html#how-to-contribute or the section after that for the tools you need to run to have the formatting fixed automatically. The easiest way is to use pre-commit. I'm not sure why the other two tests fail, they shouldn't be related to your changes. If you fix the linter errors we can see if the problems also happen in the CI itself. |
You are right @betatim, shouldn't be from the fix I suggested. Let me have look. Thanks for the feedback. |
Done! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @mohitthakur13
…gressor (scikit-learn#24788) Co-authored-by: Mohit Singh Thakur <mohit.thakur@tum.de> Co-authored-by: Mohit Singh Thakur <thakur.mohit@gmail.com> Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Reference Issues/PRs
Fixes #24713
What does this implement/fix? Explain your changes.
Added an if loop to check 10% of input training data (X) is at least 2 - based on the comment by @glemaitre in the _validate_data function in the BaseEstimator class in sklearn/base.py
If this is the correct approach to design the fix, what needs to be further done:
Any other comments?
Any further comments or directions will be appreciated to design a proper fix.