ENH Adds n_features_in_ checks to linear and svm modules #18578


Merged: 9 commits into scikit-learn:master on Jan 2, 2021

Conversation

thomasjpfan (Member)

Reference Issues/PRs

Continues #18514

@@ -478,7 +478,7 @@ def predict(self, X):
Returns predicted values.
"""
check_is_fitted(self)

X = self._validate_data(X, accept_sparse='csr', reset=False)
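For context, the `reset=False` argument in the line above is what triggers the `n_features_in_` consistency check at predict time. A minimal sketch of that logic (illustrative names only, not scikit-learn's actual private implementation):

```python
# Sketch of the n_features_in_ check performed when reset=False (predict time)
# versus reset=True (fit time). Names are illustrative.
def check_n_features(estimator, X, reset):
    n_features = len(X[0])
    if reset:
        # fit time: record how many input features were seen
        estimator.n_features_in_ = n_features
        return
    if not hasattr(estimator, "n_features_in_"):
        # nothing was recorded (e.g. a stateless estimator): skip the check
        return
    if n_features != estimator.n_features_in_:
        raise ValueError(
            f"X has {n_features} features, but "
            f"{type(estimator).__name__} is expecting "
            f"{estimator.n_features_in_} features as input."
        )

class DemoRegressor:
    """Bare-bones hypothetical stand-in for an estimator."""

est = DemoRegressor()
check_n_features(est, [[1.0, 2.0]], reset=True)   # fit: records 2 features
check_n_features(est, [[3.0, 4.0]], reset=False)  # predict: consistent, OK
```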
thomasjpfan (Member Author):

Another approach is to skip validation here when self.estimator_.n_features_in_ is defined and delegate the check to self.estimator_.
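A rough sketch of this delegation idea (the class names are hypothetical; in scikit-learn the meta-estimator would also clone the wrapped estimator): the meta-estimator forwards X without validating it, so the error, if any, is raised and named by the inner estimator.

```python
# Hypothetical sketch: a meta-estimator delegates the feature-count check
# to the fitted inner estimator instead of validating X itself.
class InnerEstimator:
    def fit(self, X, y):
        self.n_features_in_ = len(X[0])
        return self

    def predict(self, X):
        if len(X[0]) != self.n_features_in_:
            raise ValueError(
                f"X has {len(X[0])} features, but InnerEstimator is "
                f"expecting {self.n_features_in_} features as input."
            )
        return [0.0 for _ in X]

class MetaEstimator:
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        self.estimator_ = self.estimator  # real code would clone here
        self.estimator_.fit(X, y)
        return self

    def predict(self, X):
        # no n_features_in_ check here: the error (if any) comes from
        # self.estimator_ and therefore names the inner estimator
        return self.estimator_.predict(X)
```

This is exactly why, later in the thread, the common-test error message ends up naming Ridge rather than RANSACRegressor.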

Member:

Indeed, I think I am in favor of delegating the check to the underlying estimator.

thomasjpfan (Member Author):

If we are delegating, I think it would be good to adjust the error message (#18585) so that it does not include the estimator name.

ogrisel (Member) left a comment:

After reviewing this, I am in favor of not checking n_features_in_ when it is missing for any reason (not just for stateless estimators). It cuts some useless code complexity and is likely to be friendlier to third-party libraries.
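Under this proposal, the guard could reduce to a simple early return when the attribute is absent. A sketch under that assumption (names are illustrative):

```python
# Sketch of the proposed behavior: if n_features_in_ is missing for any
# reason, skip the check rather than fail. Illustrative, not sklearn code.
def check_consistent_n_features(estimator, X):
    expected = getattr(estimator, "n_features_in_", None)
    if expected is None:
        return  # attribute missing (stateless or third-party): no check
    if len(X[0]) != expected:
        raise ValueError(
            f"X has {len(X[0])} features, but {type(estimator).__name__} "
            f"is expecting {expected} features as input."
        )

class ThirdPartyTransformer:
    """Hypothetical third-party estimator that never sets n_features_in_."""
    def fit(self, X, y=None):
        return self

# passes silently even though n_features_in_ was never set
check_consistent_n_features(ThirdPartyTransformer().fit([[1.0]]), [[1.0, 2.0]])
```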



ogrisel commented Oct 14, 2020

```
E   AssertionError: The error message should contain one of the following patterns:
E   X has 1 features, but RANSACRegressor is expecting 2 features as input
E   Got X has 1 features, but Ridge is expecting 2 features as input.
```

Now that the base_estimator is in charge of doing the check, the common test has to be adapted to reflect this logic.
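One way the adapted common test could accept any estimator name in the message is a pattern with a wildcard for the class name. This is a sketch of the idea, not the actual test code from #18585:

```python
import re

# Illustrative pattern: match the error whether it names the meta-estimator
# (RANSACRegressor) or the inner estimator it delegates to (Ridge).
pattern = r"X has \d+ features?, but \w+ is expecting \d+ features? as input"

msg_meta = "X has 1 features, but RANSACRegressor is expecting 2 features as input"
msg_inner = "X has 1 features, but Ridge is expecting 2 features as input."

assert re.search(pattern, msg_meta)
assert re.search(pattern, msg_inner)
```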

thomasjpfan (Member Author):

WDYT of #18585 where the check expects any name?


ogrisel commented Oct 14, 2020

OK, I merged #18585. Let me resolve the conflict.

ogrisel (Member) left a comment:

LGTM when (hopefully) green.


ogrisel commented Oct 14, 2020

@NicolasHug this one is also ready for quick merge :)

NicolasHug (Member) left a comment:

Minor comment, but LGTM.

@@ -489,10 +490,6 @@ def _validate_for_predict(self, X):
raise ValueError("X.shape[1] = %d should be equal to %d, "
"the number of samples at training time" %
(X.shape[1], self.shape_fit_[0]))
elif not callable(self.kernel) and X.shape[1] != self.shape_fit_[1]:
Member:

do we still need shape_fit_ then?

thomasjpfan (Member Author):

shape_fit_[0] is still used in several places in the codebase.

amueller (Member):

Two approvals and a conflict; do you want to update?

thomasjpfan (Member Author):

Synced up PR with master.


cmarmo commented Dec 14, 2020

@ogrisel is this PR a good candidate for the final 0.24 release? I believe it just needs to be merged...

@lorentzenchr lorentzenchr merged commit 5946f8b into scikit-learn:master Jan 2, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
6 participants