scikit-learn · glemaitre · Sep 28, 2020 · Oct 5, 2020 · Oct 5, 2020 · Oct 6, 2020
diff --git a/doc/developers/develop.rst b/doc/developers/develop.rst
@@ -249,22 +249,16 @@ Rolling your own estimator
 If you want to implement a new estimator that is scikit-learn-compatible,
 whether it is just for you or for contributing it to scikit-learn, there are
 several internals of scikit-learn that you should be aware of in addition to
-the scikit-learn API outlined above. You can check whether your estimator
-adheres to the scikit-learn interface and standards by running
-:func:`~sklearn.utils.estimator_checks.check_estimator` on an instance. The
-:func:`~sklearn.utils.estimator_checks.parametrize_with_checks` pytest
-decorator can also be used (see its docstring for details and possible
-interactions with `pytest`)::
-
-  >>> from sklearn.utils.estimator_checks import check_estimator
-  >>> from sklearn.svm import LinearSVC
-  >>> check_estimator(LinearSVC())  # passes
+the scikit-learn API outlined above.
 
 The main motivation to make a class compatible to the scikit-learn estimator
 interface might be that you want to use it together with model evaluation and
 selection tools such as :class:`model_selection.GridSearchCV` and
 :class:`pipeline.Pipeline`.
 
+Checking the compatibility of your estimator with scikit-learn is described
+in :ref:`checking_compatibility`.
+
 Before detailing the required interface below, we describe two ways to achieve
 the correct interface more easily.
 
@@ -499,6 +493,35 @@ patterns.
 The :mod:`sklearn.utils.multiclass` module contains useful functions
 for working with multiclass and multilabel problems.
 
+.. _checking_compatibility:
+
+Checking the estimator's compatibility
+--------------------------------------
+
+You can check whether your estimator adheres to the scikit-learn interface
+and standards by running
+:func:`~sklearn.utils.estimator_checks.check_estimator` on an instance.
+
+The :func:`~sklearn.utils.estimator_checks.parametrize_with_checks` pytest
+decorator can also be used (see its docstring for details and possible
+interactions with `pytest`)::
+
+  >>> from sklearn.utils.estimator_checks import check_estimator
+  >>> from sklearn.svm import LinearSVC
+  >>> check_estimator(LinearSVC())  # passes
+
+Both :func:`~sklearn.utils.estimator_checks.check_estimator` and
+:func:`~sklearn.utils.estimator_checks.parametrize_with_checks` expose an
+`api_only` parameter: when True, the check suite will only consider pure
+API-compatibility checks. Some more advanced checks will be ignored, such as
+ensuring that error messages are informative, or ensuring that a classifier
+is able to properly discriminate classes on a simple problem. We recommend
+leaving this parameter set to False to guarantee robust and user-friendly
+estimators.
+
+The kind of checks that the check suite will run can also be partially
+controlled by setting estimator tags, described below:
+
 .. _estimator_tags:
 
 Estimator Tags

diff --git a/doc/glossary.rst b/doc/glossary.rst
@@ -142,7 +142,9 @@ General Concepts
             We provide limited backwards compatibility assurances for the
             estimator checks: we may add extra requirements on estimators
             tested with this function, usually when these were informally
-            assumed but not formally tested.
+            assumed but not formally tested. In particular, checks that are
+            not API-related (i.e. those that are ignored when `api_only` is
+            True) may enforce backward-incompatible requirements.
 
         Despite this informal contract with our users, the software is provided
         as is, as stated in the license.  When a release inadvertently

diff --git a/doc/whats_new/v0.24.rst b/doc/whats_new/v0.24.rst
@@ -97,7 +97,7 @@ Changelog
   `init_size_`, are deprecated and will be removed in 0.26. :pr:`17864` by
   :user:`Jérémie du Boisberranger <jeremiedbb>`.
 
-- |Enhancement| Added :func:`cluster.kmeans_plusplus` as public function. 
+- |Enhancement| Added :func:`cluster.kmeans_plusplus` as public function.
   Initialization by KMeans++ can now be called separately to generate
   initial cluster centroids. :pr:`17937` by :user:`g-walsh`
 
@@ -736,7 +736,7 @@ Changelog
   when `handle_unknown='error'` and `drop=None` for samples
   encoded as all zeros. :pr:`14982` by
   :user:`Kevin Winata <kwinata>`.
-  
+
 :mod:`sklearn.semi_supervised`
 ..............................
 
@@ -775,6 +775,12 @@ Changelog
 :mod:`sklearn.utils`
 ....................
 
+- |Feature| :func:`~utils.estimator_checks.check_estimator` and
+  :func:`~utils.estimator_checks.parametrize_with_checks` now expose an
+  `api_only` parameter which controls whether the check suite should
+  only check for pure API-compatibility, or also run more advanced checks.
+  :pr:`18582` and :pr:`17361` by `Nicolas Hug`_.
+
 - |Enhancement| Add ``check_methods_sample_order_invariance`` to
   :func:`~utils.estimator_checks.check_estimator`, which checks that
   estimator methods are invariant if applied to the same dataset
@@ -793,12 +799,10 @@ Changelog
   dimensions do not match in :func:`utils.sparse_func.incr_mean_variance_axis`.
   By :user:`Alex Gramfort <agramfort>`.
 
-
 - |Enhancement| Add support for weights in
   :func:`utils.sparse_func.incr_mean_variance_axis`.
   By :user:`Maria Telenczuk <maikia>` and :user:`Alex Gramfort <agramfort>`.
 
-
 Miscellaneous
 .............
 

diff --git a/sklearn/tests/test_common.py b/sklearn/tests/test_common.py
@@ -219,10 +219,13 @@ def test_class_support_removed():
 
 class MyNMFWithBadErrorMessage(NMF):
     # Same as NMF but raises an uninformative error message if X has negative
-    # value. This estimator would fail the check suite in strict mode,
-    # specifically it would fail check_fit_non_negative
-    # FIXME : should be removed in 0.26
+    # value. This estimator would fail the check suite with api_only=False,
+    # specifically it would fail check_fit_non_negative because its error
+    # message doesn't match the expected one.
+
     def __init__(self):
+        # declare init to avoid deprecation warning since default has changed
+        # FIXME : __init__ should be removed in 0.26
         super().__init__()
         self.init = 'nndsvda'
         self.max_iter = 500
@@ -238,51 +241,52 @@ def fit(self, X, y=None, **params):
         return super().fit(X, y, **params)
 
 
-def test_strict_mode_check_estimator():
-    # Tests various conditions for the strict mode of check_estimator()
+def test_api_only_check_estimator():
+    # Tests various conditions for the api_only parameter of check_estimator()
     # Details are in the comments
 
-    # LogisticRegression has no _xfail_checks, so when strict_mode is on, there
+    # LogisticRegression has no _xfail_checks, so when api_only=False, there
     # should be no skipped tests.
     with pytest.warns(None) as catched_warnings:
-        check_estimator(LogisticRegression(), strict_mode=True)
+        check_estimator(LogisticRegression(), api_only=False)
     assert not any(isinstance(w, SkipTestWarning) for w in catched_warnings)
-    # When strict mode is off, check_n_features should be skipped because it's
-    # a fully strict check
-    msg_check_n_features_in = 'check_n_features_in is fully strict '
-    with pytest.warns(SkipTestWarning, match=msg_check_n_features_in):
-        check_estimator(LogisticRegression(), strict_mode=False)
+    # When api_only is True, check_fit2d_1sample should be skipped
+    # because it's not an API check
+    skip_match = 'check_fit2d_1sample is not an API check'
+    with pytest.warns(SkipTestWarning, match=skip_match):
+        check_estimator(LogisticRegression(), api_only=True)
 
     # NuSVC has some _xfail_checks. They should be skipped regardless of
-    # strict_mode
+    # api_only
     with pytest.warns(SkipTestWarning,
                       match='fails for the decision_function method'):
-        check_estimator(NuSVC(), strict_mode=True)
-    # When strict mode is off, check_n_features_in is skipped along with the
-    # rest of the xfail_checks
-    with pytest.warns(SkipTestWarning, match=msg_check_n_features_in):
-        check_estimator(NuSVC(), strict_mode=False)
-
-    # MyNMF will fail check_fit_non_negative() in strict mode because it yields
-    # a bad error message
+        check_estimator(NuSVC(), api_only=False)
+    # When api_only is True, check_fit2d_1sample is skipped along
+    # with the rest of the xfail_checks
+    with pytest.warns(SkipTestWarning, match=skip_match):
+        check_estimator(NuSVC(), api_only=True)
+
+    # MyNMF will fail check_fit_non_negative() with api_only=False because it
+    # yields a bad error message
     with pytest.raises(
         AssertionError, match="The error message should contain"
     ):
-        check_estimator(MyNMFWithBadErrorMessage(), strict_mode=True)
-    # However, it should pass the test suite in non-strict mode because when
-    # strict mode is off, check_fit_non_negative() will not check the exact
-    # error messsage. (We still assert that the warning from
-    # check_n_features_in is raised)
-    with pytest.warns(SkipTestWarning, match=msg_check_n_features_in):
-        check_estimator(MyNMFWithBadErrorMessage(), strict_mode=False)
+        check_estimator(MyNMFWithBadErrorMessage(), api_only=False)
+    # However, it should pass the test suite with api_only=True because, in
+    # this case, check_fit_non_negative() will not check the exact error
+    # message. (We still assert that the warning from
+    # check_fit2d_1sample is raised)
+    with pytest.warns(SkipTestWarning, match=skip_match):
+        check_estimator(MyNMFWithBadErrorMessage(), api_only=True)
 
 
 @parametrize_with_checks([LogisticRegression(),
                           NuSVC(),
                           MyNMFWithBadErrorMessage()],
-                         strict_mode=False)
-def test_strict_mode_parametrize_with_checks(estimator, check):
-    # Ideally we should assert that the strict checks are Xfailed...
+                         api_only=True)
+def test_api_only_parametrize_with_checks(estimator, check):
+    # Ideally we should assert that the NON_API checks are either Xfailed or
+    # Xpassed
     check(estimator)
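
Note on the behaviour exercised by the tests above: the sketch below is a standalone illustration, not scikit-learn's implementation, of how an `api_only` flag could skip non-API checks with a warning while still running the API checks. Everything in it (`run_checks`, `ToyEstimator`, `CHECKS`, the local `SkipTestWarning` class, and the expected error text) is a hypothetical stand-in that uses only the standard library.

```python
import warnings


class SkipTestWarning(UserWarning):
    """Local stand-in for sklearn.exceptions.SkipTestWarning (assumption)."""


def run_checks(estimator, checks, api_only=False):
    """Run (name, func, is_api_check) triples against ``estimator``.

    With api_only=True, non-API checks are skipped and a SkipTestWarning
    is emitted for each of them. Returns the names of the checks that ran.
    """
    ran = []
    for name, check, is_api_check in checks:
        if api_only and not is_api_check:
            warnings.warn(f"{name} is not an API check", SkipTestWarning)
            continue
        check(estimator)
        ran.append(name)
    return ran


def check_has_fit(estimator):
    # API check: the estimator must expose a callable ``fit``.
    assert callable(getattr(estimator, "fit", None)), "estimator must define fit"


def check_error_message(estimator):
    # Non-API check: the error raised on negative input must be informative.
    try:
        estimator.fit([[-1.0]])
    except ValueError as exc:
        assert "Negative values" in str(exc), (
            "The error message should contain 'Negative values'"
        )
    else:
        raise AssertionError("fit should reject negative input")


class ToyEstimator:
    """Hypothetical estimator that rejects negative input with a vague message."""

    def fit(self, X, y=None):
        if any(value < 0 for row in X for value in row):
            raise ValueError("bad input")
        return self


CHECKS = [
    ("check_has_fit", check_has_fit, True),
    ("check_error_message", check_error_message, False),
]
```

With `api_only=True`, `run_checks(ToyEstimator(), CHECKS, api_only=True)` runs only `check_has_fit` and warns about the skipped check. With `api_only=False`, the vague error message makes `check_error_message` raise an `AssertionError`, which mirrors the two outcomes the test asserts for `MyNMFWithBadErrorMessage`.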