[NoMRG] evaluate minimal implementation for sklearn estimator #18811


Closed
wants to merge 30 commits

Conversation

@glemaitre (Member) commented Nov 11, 2020

Built on top of:

This PR evaluates the minimal implementation required for a compatible estimator (regressor/classifier/transformer) that does not inherit from BaseEstimator.

The relevant changes are in test_estimator_checks.py, in the function test_check_estimator_minimal.
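As a rough illustration, here is a minimal sketch of such an estimator, not inheriting from BaseEstimator. The attribute names (param, _most_frequent) are illustrative; the actual MinimalClassifier used by the test lives in test_estimator_checks.py:

```python
import numpy as np


class MinimalClassifier:
    """Sketch of a classifier honouring the API contract without BaseEstimator."""

    _estimator_type = "classifier"

    def __init__(self, param=None):
        # Constructor arguments must be stored as-is so that
        # get_params/set_params (and therefore cloning) round-trip exactly.
        self.param = param

    def get_params(self, deep=True):
        return {"param": self.param}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y):
        # A trivial majority-class model: enough to exercise the API checks.
        X, y = np.asarray(X), np.asarray(y)
        self.n_features_in_ = X.shape[1]
        self.classes_, counts = np.unique(y, return_counts=True)
        self._most_frequent = self.classes_[np.argmax(counts)]
        return self

    def predict(self, X):
        return np.full(np.asarray(X).shape[0], self._most_frequent)

    def score(self, X, y):
        return float(np.mean(self.predict(X) == np.asarray(y)))
```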

@glemaitre glemaitre marked this pull request as draft November 11, 2020 12:57
@glemaitre glemaitre changed the title Safe tags api only [NoMRG] evaluate minimal implementation for sklearn estimator Nov 11, 2020
@ogrisel (Member) commented Nov 13, 2020

Have you ever seen this kind of pytest error before?

==================================== ERRORS ====================================
_____________________________ ERROR collecting gw1 _____________________________
Different tests were collected between gw0 and gw1. The difference is:
--- gw0

+++ gw1

@@ -18216,102 +18216,102 @@

 utils/tests/test_estimator_checks.py::test_check_estimators_unfitted
 utils/tests/test_estimator_checks.py::test_check_no_attributes_set_in_init
 utils/tests/test_estimator_checks.py::test_check_estimator_pairwise
-utils/tests/test_estimator_checks.py::test_check_estimator_minimal[<sklearn.utils.tests.test_estimator_checks.MinimalClassifier object at 0x7f42e607cb80>-check_no_attributes_set_in_init(api_only=True)]
...
+utils/tests/test_estimator_checks.py::test_check_estimator_minimal[<sklearn.utils.tests.test_estimator_checks.MinimalTransformer object at 0x7f9612b0b2b0>-check_n_features_in(api_only=True)]
+utils/tests/test_estimator_checks.py::test_check_estimator_minimal[<sklearn.utils.tests.test_estimator_checks.MinimalTransformer object at 0x7f9612b0b2b0>-check_fit1d(api_only=True)]
+utils/tests/test_estimator_checks.py::test_check_estimator_minimal[<sklearn.utils.tests.test_estimator_checks.MinimalTransformer object at 0x7f9612b0b2b0>-check_fit2d_predict1d(api_only=True)]
 utils/tests/test_estimator_checks.py::test_check_classifier_data_not_an_array
 utils/tests/test_estimator_checks.py::test_check_regressor_data_not_an_array
 utils/tests/test_estimator_checks.py::test_check_class_weight_balanced_linear_classifier
--------- generated xml file: /home/vsts/work/tmp_folder/test-data.xml ---------

@glemaitre (Member Author)

Have you ever seen this kind of pytest error before?

Nope. I don't know whether this is linked to running a test that calls run_tests_without_pytest while I am using parametrize_with_checks, which requires pytest. Depending on the order in which the tests run, things can go sideways.
It passes locally.

@thomasjpfan (Member)

Have you ever seen this kind of pytest error before?

That error appears when the ids of the tests are not the same across processes when using pytest-xdist. For this specific case, we would need to give MinimalClassifier an id that does not include the memory location, such as 0x7f42e607cb80.
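A sketch of that suggestion: derive the test id from a deterministic __repr__ (or pass explicit ids to the parametrization) so that every xdist worker collects identical ids. The test body here is a placeholder:

```python
import pytest


class MinimalClassifier:
    # The default object repr embeds id(self) (e.g. 0x7f42e607cb80), which
    # differs between processes; a deterministic __repr__ keeps the
    # parametrized test id identical on every pytest-xdist worker.
    def __repr__(self):
        return self.__class__.__name__


@pytest.mark.parametrize(
    "estimator",
    [MinimalClassifier()],
    ids=repr,  # build ids from the stable repr, not the memory address
)
def test_check_estimator_minimal(estimator):
    assert repr(estimator) == "MinimalClassifier"  # placeholder body
```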

@ogrisel (Member) left a comment

Thanks for the new test for the minimal estimators. I think we should add an estimator check that verifies that if a model has a predict method, then exactly one of is_classifier or is_regressor returns True: not both at the same time, and not neither of them.
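A sketch of what such a check could look like; the check name is hypothetical, and this is not an existing scikit-learn check:

```python
from sklearn.base import is_classifier, is_regressor
from sklearn.linear_model import LinearRegression, LogisticRegression


def check_predictor_is_classifier_xor_regressor(name, estimator):
    # An estimator exposing predict must be exactly one of the two:
    # is_classifier XOR is_regressor.
    if hasattr(estimator, "predict"):
        assert is_classifier(estimator) != is_regressor(estimator), (
            f"{name} has a predict method, so exactly one of is_classifier "
            "or is_regressor must return True"
        )


check_predictor_is_classifier_xor_regressor("LogisticRegression", LogisticRegression())
check_predictor_is_classifier_xor_regressor("LinearRegression", LinearRegression())
```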

Comment on lines +637 to +641
def __getstate__(self):
return self.__dict__.copy()

def __setstate__(self, state):
self.__dict__.update(state)
Member

I think you can just remove those methods, no?
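They can indeed be removed: when __getstate__ and __setstate__ are not defined, pickle's default behaviour already copies and restores self.__dict__, which is exactly what these hand-written methods do. A quick demonstration (the MinimalEstimator class here is illustrative):

```python
import pickle


class MinimalEstimator:
    # No __getstate__/__setstate__: pickle's default protocol already
    # serializes and restores self.__dict__.
    def __init__(self, param=1):
        self.param = param


est = pickle.loads(pickle.dumps(MinimalEstimator(param=3)))
assert est.param == 3
```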

@@ -2251,8 +2331,7 @@ def check_classifiers_predictions(X, y, name, classifier_orig,
(classifier, ", ".join(map(str, y_exp)),
", ".join(map(str, y_pred))))

# training set performance
-if name != "ComplementNB":
+if not api_only and name != "ComplementNB":
Member Author

@NicolasHug we are missing this check in the api_only PR. However, there is something fishy here, because even when I pass api_only=True, this if statement is run. I will investigate.

Member

It's because api_only isn't passed in the calls to check_classifiers_predictions. I'll update the other PR, thanks for noticing.

Member Author

We did not pass api_only in the other PR when calling this function in check_classifiers_classes.

Member Author

I was 16 seconds too slow :)
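For reference, a sketch of the fix being discussed, with signatures assumed from the api_only PR; the point is only that the flag has to be forwarded at the call site in check_classifiers_classes:

```python
def check_classifiers_predictions(X, y, name, classifier_orig, api_only=False):
    # ... fitting and label-consistency assertions elided in this sketch ...
    if not api_only and name != "ComplementNB":
        pass  # training-set performance assertion: a quality check, not an API one


def check_classifiers_classes(name, classifier_orig, api_only=False):
    X, y = [[0.0], [1.0]], [0, 1]  # placeholder data for the sketch
    # Forwarding api_only here was the missing piece discussed above.
    check_classifiers_predictions(X, y, name, classifier_orig, api_only=api_only)
```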

@glemaitre (Member Author)

The only issue here is that my test should be placed in test_estimator_checks.py, because it requires pytest.

@ogrisel (Member) commented Nov 23, 2020

Yes, we could move test_check_estimator_minimal into test_common, although this is weird because the test is written to check the checks rather than the minimal estimators themselves.

@amueller (Member)

For the record, I thought we already had this, but maybe I was hallucinating. I think we certainly need it. Maybe having at least one constructor argument would be good?

@glemaitre (Member Author)

I am closing this PR, but I will introduce the test with the minimal estimator implementations.
We will have to skip it for this release because it will fail the minimal performance checks.

for the record I thought we already had this, but maybe I was hallucinating. I think we certainly need it. Maybe having at least one constructor argument would be good?

I think this is a good point. It would be even more meaningful to test the set_params and get_params checks, and we could even check compatibility with SearchCV and Pipeline.
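A sketch of that idea, assuming a minimal regressor with one constructor argument (shrinkage is illustrative): clone relies only on get_params plus the constructor, and set_params is what SearchCV uses to try candidate parameters, so this round-trip is the core of the compatibility.

```python
import numpy as np
from sklearn.base import clone


class MinimalRegressor:
    _estimator_type = "regressor"

    def __init__(self, shrinkage=1.0):
        # Stored unmodified so get_params/set_params round-trip exactly.
        self.shrinkage = shrinkage

    def get_params(self, deep=True):
        return {"shrinkage": self.shrinkage}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y):
        self.mean_ = self.shrinkage * float(np.mean(y))
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def predict(self, X):
        return np.full(np.asarray(X).shape[0], self.mean_)


# clone builds a fresh instance from get_params + __init__, which is what
# SearchCV and Pipeline do internally before fitting.
est = clone(MinimalRegressor(shrinkage=0.5))
assert est.set_params(shrinkage=2.0).get_params() == {"shrinkage": 2.0}
```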

@glemaitre glemaitre closed this Nov 25, 2020
avm19 pushed a commit to avm19/scikit-learn that referenced this pull request Jan 7, 2023
Tests in the CI pipeline return the error:
Different tests were collected between gw0 and gw1
For details and a similar situation, see: scikit-learn#18811 (comment)
5 participants