Skip to content

MAINT avoid fetching lfw datasets when SKLEARN_SKIP_NETWORK_TESTS=1 #28709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 27, 2024

Conversation

glemaitre
Copy link
Member

closes #28707

Avoid fetching LFW datasets if network is disabled.

Copy link

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: f5c4983. Link to the linter CI: here

@glemaitre glemaitre changed the title MAINT avoid fetching lfw datasets MAINT avoid fetching lfw datasets when SKLEARN_SKIP_NETWORK_TESTS=1 Mar 27, 2024
@glemaitre
Copy link
Member Author

ping @jeremiedbb @lesteve since you already know about the issue. It should be a quick fix.
Sorry about this mistake, I probably merged forgetting to check all the fixtures.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The docstring tests of fetch_lfw_xxx no longer appear in the slowest tests report.

@jeremiedbb
Copy link
Member

Which is weird cause the pylatest_conda_forge_mkl job has SKLEARN_SKIP_NETWORK_TESTS=0 :/

@lesteve
Copy link
Member

lesteve commented Mar 27, 2024

The code is not that easy to follow ... I am not 100% sure, but I think this change has the side effect to download more datasets than previously when running network cases in some cases e.g. when not running doctests. Is it really important, probably not?

Which is weird cause the pylatest_conda_forge_mkl job has SKLEARN_SKIP_NETWORK_TESTS=0 :/

Network tests are run only on scheduled runs. I don't remember why exactly but I was involved in this change (maybe there were previously run during scipy-dev tests and it was annoying for some reason) ... Edit: #28383 there was some mysterious slowness of dataset downloads inside pytest when moving the scipy-dev from Python 3.11 to Python 3.12 ...

@jeremiedbb
Copy link
Member

Network tests are run only on scheduled runs. I don't remember why exactly but I was involved in this change (maybe there were previously run during scipy-dev tests and it was annoying for some reason) ...

Oh right, I read too quickly

@jeremiedbb
Copy link
Member

jeremiedbb commented Mar 27, 2024

I think this change has the side effect to download more datasets than previously when running network cases in some cases e.g. when not running doctests. Is it really important, probably not?

I don't think there are cases where we enable network tests but disable docstring tests, so I think it's fine to merge as is and not bother too much

@lesteve
Copy link
Member

lesteve commented Mar 27, 2024

OK let's merge this one then!

@lesteve lesteve merged commit 1f46775 into scikit-learn:main Mar 27, 2024
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fetchers docstring examples trigger dataset fetch in CI
3 participants