DEBUG Thread parallel tests #30041

Draft · wants to merge 15 commits into main
Conversation

@ogrisel (Member) commented Oct 10, 2024

This is an investigative branch, not meant to be merged as such, to investigate what effort would be required to tackle #30007 as a whole (using pytest-run-parallel instead of pytest-freethreaded).

It includes commits from:

That can be merged independently.

On top of this, I started to use @pytest.mark.parallel_threads(1) (EDIT: now @pytest.mark.thread_unsafe) on tests that are fundamentally not thread-safe, or that use fixtures such as tmpdir that would need to be initialized in the thread running the test (as opposed to the main pytest thread) to function properly.
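
As a minimal sketch (the test name and body are illustrative, not from scikit-learn), marking a test as thread-unsafe with the pytest-run-parallel marker looks like this:

```python
import warnings
import pytest

# Illustrative test; the @pytest.mark.thread_unsafe marker is provided by
# pytest-run-parallel and tells the plugin to run this test in a single
# thread instead of in parallel worker threads.
@pytest.mark.thread_unsafe
def test_emits_user_warning():
    # pytest.warns manipulates the process-global warnings machinery, so
    # entering it concurrently from several threads is racy.
    with pytest.warns(UserWarning):
        warnings.warn("captured", UserWarning)
```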

Many more similar changes would be needed to silence all the non-informative failures, but I will stop here for now.

This already highlights that the most common patterns are:

  • testing for warnings with pytest.warns;
  • testing for sys.stdout with capsys;
  • usage of the tmpdir fixture;
  • usage of the monkeypatch fixture.
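
These patterns share a root cause: they rely on process-global state. A small stdlib-only sketch (names illustrative) of why warning capture, which pytest.warns uses under the hood, races across threads:

```python
import threading
import warnings

# warnings.catch_warnings swaps the process-global warnings filters and
# showwarning hook, so overlapping contexts from several threads can route
# one thread's warnings into another thread's record list.
results = []

def capture():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warnings.warn("hello", UserWarning)
        # Each thread hopes to see exactly its own warning here, but nothing
        # guarantees that once the filter state is shared.
        results.append(len(caught))

threads = [threading.Thread(target=capture) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

On a GIL build the interleaving is usually benign, but on a free-threaded build the recorded counts can drift, which is exactly the class of non-informative failure being silenced above.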

EDIT: since the first version of this PR, pytest-run-parallel has been updated to make the tmpdir and tmp_path fixtures work by default, and to automatically detect tests that use problematic fixtures or code patterns involving warnings. The remaining problematic fixtures can probably be added by configuration.

I have also found a failure in test_minibatch_kmeans_partial_fit_init with the lambda init case using:

pytest --parallel-threads=4  sklearn/cluster/tests/test_k_means.py -k test_minibatch_kmeans_partial_fit_init

and I cannot explain it yet (using regular Python with GIL enabled).
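
One plausible mechanism (an assumption, not a confirmed diagnosis of this failure): when the same test runs concurrently, an init callable that closes over mutable state can be observed mid-update by another thread. A hypothetical sketch of the hazard, not the scikit-learn code:

```python
import threading

# Hypothetical stateful init callable shared across threads: the closure's
# unsynchronized read-modify-write on `counter` is the hazard.
calls = []

def make_init():
    counter = {"n": 0}
    def init():
        counter["n"] += 1      # unsynchronized read-modify-write
        calls.append(counter["n"])
    return init

shared_init = make_init()
threads = [threading.Thread(target=shared_init) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the GIL the increments usually interleave safely; on a free-threaded
# build, duplicated or skipped values in `calls` become possible.
print(sorted(calls))
```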

github-actions bot commented Oct 10, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c26fd32.

@ogrisel (Member, Author) commented Oct 10, 2024

Note that I don't plan to keep flagging test functions manually one by one. It's too tedious and invasive.

@ogrisel (Member, Author) commented May 9, 2025

I updated the PR to simplify it and rely on the automated thread-unsafety heuristics of the latest version of pytest-run-parallel as much as possible. The heuristics seem to work as expected, and the amount of manual annotation is much reduced as a result.

I started to run pytest --parallel-threads=4 --iterations=3 on a nogil install. However, I interrupted it after 12% because it was very slow and using all the RAM available on my machine.

Note that test collection itself is slow, but still bearable compared to the actual test execution time and the memory usage problem. I guess we will need to profile a run on a submodule of scikit-learn to check whether the slowdown is expected: when running with --parallel-threads=4 --iterations=3 there are 4 × 3 = 12 times as many tests to run compared to a regular pytest run.

For the record, I created the nogil env using conda-forge:

mamba create -n nogil -c conda-forge python-freethreading ipython numpy scipy cython meson-python compilers

@ogrisel (Member, Author) commented May 10, 2025

To avoid the memory usage problem, I reduced the parallelism to 2 threads and ran 1 iteration of the tests per thread:

pytest -v --parallel-threads=2 --iterations=1  2>&1 | tee pytest_all.log

Here are the results: pytest_all.log

There are 146 errors. Many of those are caused by the use of stateful generators in @pytest.mark.parametrize.
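
A sketch of the generator-in-parametrize hazard (names illustrative): a generator can be consumed only once, so any mechanism that iterates the parameter set more than once (such as repeated collection or repeated iterations) sees it exhausted.

```python
# Generators are single-use: passing one to @pytest.mark.parametrize breaks
# as soon as the parameter set is iterated a second time.
def param_values():
    yield from (1, 2, 3)

gen = param_values()
first_pass = list(gen)    # consumes the generator
second_pass = list(gen)   # generator is now exhausted

print(first_pass, second_pass)  # [1, 2, 3] []

# The robust pattern is to materialize a re-iterable container instead:
params = tuple(param_values())
assert list(params) == [1, 2, 3]
assert list(params) == [1, 2, 3]  # safe to iterate again
```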
