Skip to content

TST allow categorisation of tests into API and legacy #29699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 4, 2024

Conversation

adrinjalali
Copy link
Member

This is a very minimal PR, allowing us to start working on categorisation of tests. The idea is:

  • a set of API only checks which will always be run
  • a set of legacy tests which start with almost all common tests we have
  • gradually move / create API only tests into the API only category, and document them in the developer guides
  • gradually move other tests into their own categories, while adding those categories as a boolean parameter to parametrize_with_checks, such as statistical, etc.

cc @glemaitre @thomasjpfan @OmarManzoor @adam2392

@adrinjalali adrinjalali added the Developer API Third party developer API related label Aug 21, 2024
Copy link

github-actions bot commented Aug 21, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 38d82bc. Link to the linter CI: here

Copy link
Member

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume documentation will come later?

Is there an associated GH issue to track what will move where and such?

@glemaitre
Copy link
Member

At least for the moment, this is going to solve: #16241 -> it means that we need to have a API documentation page in line with the check that we are doing.

Then, adding tests to other category would be for later.

@adrinjalali
Copy link
Member Author

Yes, there will be a bunch of documentation work for this page: https://scikit-learn.org/stable/developers/develop.html

I rather we settle with this API here and merge, and then work on API only tests dedicated in a different PR, and follow up PR to update our documentation. Along with other PRs here: Developer API Third party developer API related my goal is to make it nicer and cleaner for third party developers to develop scikit-learn compatible estimators.

@adrinjalali
Copy link
Member Author

Also for some historical context, @NicolasHug had some a lot of work in #16882, #16890, #17252, and #17361. But we ended up reverting them last minute before a release.

@adam2392
Copy link
Member

I see! That's exciting. I always did find the parametrize_with_checks just a blackbox, so I think having some more fine-grained control over what's "checked" is useful.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali Can you open an issue or comment on an issue that gives an overview on what the categories will be?

@@ -513,9 +522,14 @@ def _should_be_skipped_or_marked(estimator, check):
return False, "placeholder reason that will never be used"


def parametrize_with_checks(estimators):
def parametrize_with_checks(estimators, legacy=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be added to check_estimator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it.

Copy link
Member Author

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasjpfan

Can you open an issue or comment on an issue that gives an overview on what the categories will be?

We don't really know at this point what those categories are gonna be. API and "statistical" tests are probably the two we have, then there's array API, sample weight, and I'm not sure about the rest. My plan is to start going through them and to create the categories as we go, while having an overview once I start working on it.

@@ -513,9 +522,14 @@ def _should_be_skipped_or_marked(estimator, check):
return False, "placeholder reason that will never be used"


def parametrize_with_checks(estimators):
def parametrize_with_checks(estimators, legacy=True):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it.

@adrinjalali
Copy link
Member Author

Issue for test categories: #29703

@adrinjalali
Copy link
Member Author

At least for the moment, this is going to solve: #16241 -> it means that we need to have a API documentation page in line with the check that we are doing.

@glemaitre I agree, but do you want this in this PR? I rather have that in a followup PR since adding all API tests would be quite large.

@glemaitre glemaitre self-requested a review September 3, 2024 09:36
@glemaitre
Copy link
Member

I agree, but do you want this in this PR? I rather have that in a followup PR since adding all API tests would be quite large.

Not in this PR, it was just to provide the big picture here.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a test now to make sure that we have less test runnnings with legacy=False.

@@ -533,6 +547,11 @@ def parametrize_with_checks(estimators):

.. versionadded:: 0.24

legacy : bool (default=True)
Whether to include legacy checks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if you should add a little note to mention that the legacy checks exists during the transition that we create the categorisation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add new test to make sure that the length of the generator with legacy=True is bigger than legacy=False.

@glemaitre
Copy link
Member

Enabling auto-merge and trying to make Debian build works

@glemaitre glemaitre enabled auto-merge (squash) September 3, 2024 14:42
@glemaitre glemaitre merged commit 1c3dcb4 into scikit-learn:main Sep 4, 2024
28 checks passed
@adrinjalali adrinjalali deleted the tests/legacy branch September 4, 2024 11:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants