ENH Adds n_features_in_ checks to linear and svm modules #18578
Conversation
sklearn/linear_model/_ransac.py
Outdated
@@ -478,7 +478,7 @@ def predict(self, X):
            Returns predicted values.
        """
        check_is_fitted(self)

        X = self._validate_data(X, accept_sparse='csr', reset=False)
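The key change in this hunk is passing reset=False to _validate_data. As a rough sketch of what that buys (a hypothetical minimal estimator, not scikit-learn's actual implementation): the number of features is recorded at fit time and only compared, never overwritten, at predict time.

import numpy as np

class EstimatorSketch:
    """Hypothetical estimator illustrating the fit/predict split of the check."""

    def fit(self, X, y=None):
        X = np.asarray(X)
        # reset=True behaviour: record the number of features seen during fit.
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        X = np.asarray(X)
        # reset=False behaviour: compare against the recorded value, never overwrite it.
        if X.shape[1] != self.n_features_in_:
            raise ValueError(
                f"X has {X.shape[1]} features, but this estimator was fitted "
                f"with {self.n_features_in_} features."
            )
        return np.zeros(X.shape[0])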
Another approach is to not validate if self.estimator_.n_features_in_ is defined and delegate the check to self.estimator_. A sketch of that idea follows below.
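A rough sketch of the delegation idea (the class name and structure are illustrative, not the merged code): the meta-estimator skips its own reset=False check and lets the fitted sub-estimator, which already carries n_features_in_, raise on a mismatch.

from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

class DelegatingMetaSketch:
    """Hypothetical meta-estimator that delegates the n_features_in_ check."""

    def fit(self, X, y):
        # The fitted sub-estimator records n_features_in_ itself.
        self.estimator_ = LinearRegression().fit(X, y)
        return self

    def predict(self, X):
        check_is_fitted(self)
        # No self._validate_data(X, reset=False) here: if X has the wrong
        # number of columns, self.estimator_.predict raises its own error.
        return self.estimator_.predict(X)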
Indeed, I think I am in favor of delegating the check to the underlying estimator.
If we are delegating, I think it would be good to adjust the error message in #18585 to not include the estimator name.
After reviewing this, I am in favor of not checking n_features_in_ when it's missing for any reason (not just for stateless estimators). It cuts some useless code complexity and is likely to be nicer to third-party libraries. A sketch of that behaviour is shown below.
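A minimal sketch of "do not check when n_features_in_ is missing" (the helper name is hypothetical, not scikit-learn API): the comparison is simply skipped if the fitted estimator never set the attribute, e.g. stateless or older third-party estimators.

def check_consistent_n_features(estimator, X):
    n_features_in = getattr(estimator, "n_features_in_", None)
    if n_features_in is None:
        # Attribute absent: silently skip instead of raising, which keeps
        # estimators that predate n_features_in_ working unchanged.
        return
    if X.shape[1] != n_features_in:
        raise ValueError(
            f"X has {X.shape[1]} features, but the estimator was fitted "
            f"with {n_features_in} features."
        )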
Now that the …
WDYT of #18585 where the check expects any name?
OK, I merged #18585. Let me resolve the conflict.
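Since the point above is that the check should accept any estimator name, here is a hypothetical illustration (the message wording and the regular expression are assumed, not the exact strings used in scikit-learn or in #18585): the assertion matches on the structure of the error rather than on a hard-coded class name, so a check delegated to an inner estimator still passes.

import re

msg = "X has 3 features, but LinearRegression is expecting 5 features as input."
# Name-agnostic pattern: any word in the estimator-name position is accepted.
pattern = r"X has \d+ features?, but \w+ is expecting \d+ features? as input"
assert re.search(pattern, msg) is not None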
LGTM when (hopefully) green.
@NicolasHug this one is also ready for quick merge :)
minor comment but LGTM
@@ -489,10 +490,6 @@ def _validate_for_predict(self, X):
                raise ValueError("X.shape[1] = %d should be equal to %d, "
                                 "the number of samples at training time" %
                                 (X.shape[1], self.shape_fit_[0]))
        elif not callable(self.kernel) and X.shape[1] != self.shape_fit_[1]:
Do we still need shape_fit_ then?
shape_fit_[0] is still being used in some places in the codebase; see the example below.
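A small runnable illustration (public scikit-learn API only; the internal checks may differ) of why shape_fit_ is still useful: with kernel='precomputed' the predict-time input is a Gram matrix of shape (n_samples_test, n_samples_train), and shape_fit_[0] records that training sample count, which is exactly what the error message in the hunk above reports.

import numpy as np
from sklearn.svm import SVC

X_train = np.random.RandomState(0).randn(20, 4)
y_train = np.array([0, 1] * 10)
X_test = np.random.RandomState(1).randn(5, 4)

K_train = X_train @ X_train.T      # (20, 20) Gram matrix seen at fit time
K_test = X_test @ X_train.T        # (5, 20): columns == n_samples at fit time

clf = SVC(kernel='precomputed').fit(K_train, y_train)
print(clf.shape_fit_)              # (20, 20): dimensions of the fit-time input
print(clf.predict(K_test))         # column count is checked against shape_fit_[0]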
Two approvals and a conflict, do you want to update?
Synced up the PR with master.
@ogrisel is this PR a good candidate for the final 0.24 release? I believe it just needs to be merged...
Reference Issues/PRs
Continues #18514