TST Speed up some of the slowest tests #27383

Merged: 8 commits merged into scikit-learn:main from speed-up-slow-tests on Sep 16, 2023

Conversation

Member

@ogrisel ogrisel commented Sep 15, 2023

Mostly by reducing the dataset sizes.

  • slowest tests for pylatest_conda_forge_mkl on main:
============================= slowest 20 durations =============================
31.32s call     decomposition/_dict_learning.py::sklearn.decomposition._dict_learning.DictionaryLearning
26.73s call     ensemble/tests/test_gradient_boosting.py::test_classification_synthetic[5-log_loss]
25.88s call     neighbors/tests/test_neighbors.py::test_knn_forcing_backend[kd_tree-multiprocessing]
25.35s call     neighbors/tests/test_neighbors.py::test_knn_forcing_backend[auto-multiprocessing]
23.85s call     ensemble/tests/test_gradient_boosting.py::test_classification_synthetic[5-exponential]
22.09s call     neighbors/tests/test_neighbors.py::test_knn_forcing_backend[ball_tree-multiprocessing]
18.64s call     tree/tests/test_tree.py::test_min_impurity_decrease
17.27s call     decomposition/tests/test_sparse_pca.py::test_mini_batch_correct_shapes
14.74s call     experimental/tests/test_enable_successive_halving.py::test_imports_strategies
14.27s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
13.12s call     experimental/tests/test_enable_iterative_imputer.py::test_imports_strategies
12.71s call     decomposition/tests/test_dict_learning.py::test_cd_work_on_joblib_memmapped_data
11.94s call     utils/tests/test_parallel.py::test_dispatch_config_parallel[2]
11.44s call     feature_selection/tests/test_sequential.py::test_n_features_to_select_stopping_criterion[forward]
11.35s call     ensemble/tests/test_gradient_boosting.py::test_gradient_boosting_validation_fraction
11.28s call     decomposition/tests/test_sparse_pca.py::test_transform_inverse_transform_round_trip[MiniBatchSparsePCA]
10.55s call     decomposition/tests/test_dict_learning.py::test_sparse_encode_shapes_omp
10.18s call     feature_selection/tests/test_rfe.py::test_rfe_cv_groups
9.86s call     feature_extraction/image.py::sklearn.feature_extraction.image.PatchExtractor
9.83s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[5]
  • same report on this PR:
============================= slowest 20 durations =============================
13.73s call     utils/tests/test_parallel.py::test_dispatch_config_parallel[2]
11.97s call     experimental/tests/test_enable_successive_halving.py::test_imports_strategies
11.21s call     decomposition/tests/test_dict_learning.py::test_cd_work_on_joblib_memmapped_data
11.06s call     model_selection/tests/test_split.py::test_nested_cv
10.62s call     decomposition/tests/test_sparse_pca.py::test_transform_inverse_transform_round_trip[MiniBatchSparsePCA]
10.44s call     decomposition/tests/test_dict_learning.py::test_dict_learning_lassocd_readonly_data
10.43s call     feature_extraction/image.py::sklearn.feature_extraction.image.PatchExtractor
10.37s call     feature_selection/tests/test_sequential.py::test_n_features_to_select_stopping_criterion[forward]
10.32s call     experimental/tests/test_enable_iterative_imputer.py::test_imports_strategies
9.76s call     decomposition/tests/test_dict_learning.py::test_sparse_encode_shapes_omp
9.11s call     metrics/tests/test_pairwise.py::test_sparse_manhattan_readonly_dataset
9.04s call     linear_model/tests/test_sgd.py::test_multi_core_gridsearch_and_early_stopping
8.89s call     ensemble/tests/test_gradient_boosting.py::test_gradient_boosting_validation_fraction
8.79s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
8.73s call     inspection/tests/test_partial_dependence.py::test_partial_dependence_non_null_weight_idx[1-estimator2]
8.69s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[5]
8.67s call     inspection/tests/test_partial_dependence.py::test_partial_dependence_non_null_weight_idx[0-estimator2]
8.55s call     model_selection/tests/test_search.py::test_random_search_bad_cv
8.34s call     manifold/tests/test_t_sne.py::test_tsne_with_mahalanobis_distance
8.10s call     inspection/tests/test_partial_dependence.py::test_partial_dependence_equivalence_equal_sample_weight[LogisticRegression-data1]
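
For reference, the two reports above come from pytest's built-in duration profiling, e.g. `pytest --durations=20 sklearn/` (the exact target path and options used on the CI may differ).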

@github-actions

github-actions bot commented Sep 15, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 8cb8ff3.

@@ -2044,10 +2043,10 @@ def test_same_radius_neighbors_parallel(algorithm):
     assert_allclose(graph, graph_parallel)

-@pytest.mark.parametrize("backend", JOBLIB_BACKENDS)
+@pytest.mark.parametrize("backend", ["threading", "loky"])
Member

we don't want to test multiprocessing and sequential backends?

Member Author

The multiprocessing case is very slow on macOS (and maybe Windows as well): this backend creates new processes from scratch on each call, which can take several tens of seconds on the overloaded macOS Azure CI.

I think the loky backend is enough to test for multiprocessing-related problems in our Cython code. The main difference between multiprocessing and loky is that loky reuses its workers across calls, making it much more efficient on overloaded machines.
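
A minimal sketch (not from the PR; `square` is just a stand-in workload) of forcing a specific joblib backend. It illustrates why loky is cheaper on a loaded machine: its worker processes persist across consecutive Parallel calls, while the multiprocessing backend starts fresh ones each time.

```python
from joblib import Parallel, delayed, parallel_backend

def square(x):
    return x * x

# Force the loky backend explicitly (joblib's default process-based backend).
with parallel_backend("loky", n_jobs=2):
    first = Parallel()(delayed(square)(i) for i in range(4))
    # This second call reuses the same worker pool; with the
    # "multiprocessing" backend, new processes would be spawned again.
    second = Parallel()(delayed(square)(i) for i in range(4))

print(first, second)  # [0, 1, 4, 9] [0, 1, 4, 9]
```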

Member Author

Also, sequential execution is already tested everywhere else in test_neighbors.py (as it's the effective backend for the default value of n_jobs).

-    ... n_samples=100, n_components=15, n_features=20, n_nonzero_coefs=10,
+    ... n_samples=30, n_components=15, n_features=20, n_nonzero_coefs=10,
Member

curious, how much speedup are we getting here? are we THAT slow? 😁

Member Author

On the macOS CI build, this test is sometimes one of the slowest and can last more than 20s on an overloaded CI runner.

Locally it's fast, but decreasing this dataset size makes it ~3x faster.
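
For context, those parameter names match sklearn.datasets.make_sparse_coded_signal, presumably called in one of the dictionary-learning doctests. A standalone call with the reduced size (the random_state here is arbitrary) would look like:

```python
from sklearn.datasets import make_sparse_coded_signal

# Reduced problem size from the diff above: each of the 30 samples is a
# sparse combination of 10 of the 15 dictionary atoms.
X, dictionary, code = make_sparse_coded_signal(
    n_samples=30,
    n_components=15,
    n_features=20,
    n_nonzero_coefs=10,
    random_state=42,
)
```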

Member Author

I updated the description of the PR to show the impact on the macOS CI between main and this PR.

I observed in the latest run that the docstring of MiniBatchDictionaryLearning has a similar problem (and a bug in the direction of the assertion about the sparsity of the transformed data). I pushed a96dee2 to fix it and improve the speed. We will have to wait for the end of this CI run to see the concrete impact of that last commit.
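
For illustration only (not the exact docstring code): a sparsity check of that kind asserts on the fraction of exactly-zero coefficients in the transformed data, and flipping the comparison operator inverts its meaning.

```python
import numpy as np

code = np.array([[0.0, 0.0, 1.2], [0.0, 0.9, 0.0]])  # toy sparse codes

sparsity = np.mean(code == 0)  # fraction of exactly-zero coefficients
assert sparsity > 0.5  # the codes are mostly sparse
# A bug "in the direction of the assertion" would be `sparsity < 0.5`,
# which instead claims the codes are mostly dense.
```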

Member Author

I updated the description to take a96dee2 into account.

Member

@glemaitre glemaitre left a comment

Apart from my comments, I am fine with the changes.

I assume that all the merging regarding the sparse containers hurt the test run times. While merging, I tried to make sure that we were not duplicating tests for no reason.

I am wondering if we could have been slightly more lenient with the tests: make sure that fit/predict/transform work, but not always run the statistical checks in every test.

@ogrisel
Member Author

ogrisel commented Sep 15, 2023

I am wondering if we could have been slightly more lenient with the tests: make sure that fit/predict/transform work, but not always run the statistical checks in every test.

In the long term, we will remove the tests for scipy sparse matrices and only keep those for scipy sparse arrays.

In the short term, we could maybe change the definition of the *_CONTAINERS lists (see the sketch after this list) to:

  • only include scipy sparse matrices for versions of scipy < 1.8 (already the case);
  • only include scipy sparse arrays for recent versions of scipy;
  • run both only if a specific environment variable is set; we would enable this on one of the fast runners with a recent version of scipy only.
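
A hypothetical sketch of that proposal (the list name and environment variable are illustrative, not the actual scikit-learn code):

```python
import os

import scipy
import scipy.sparse as sp
from packaging.version import parse

if parse(scipy.__version__) < parse("1.8"):
    # Sparse arrays do not exist before scipy 1.8: matrices only.
    CSR_CONTAINERS = [sp.csr_matrix]
elif os.environ.get("SKLEARN_TESTS_ALL_SPARSE_CONTAINERS"):
    # Opt-in: exercise both containers, e.g. on one fast CI runner.
    CSR_CONTAINERS = [sp.csr_matrix, sp.csr_array]
else:
    # Default on recent scipy: sparse arrays only.
    CSR_CONTAINERS = [sp.csr_array]
```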

@ogrisel ogrisel removed the Quick Review label Sep 15, 2023
Member

@lorentzenchr lorentzenchr left a comment

LGTM

@lorentzenchr
Member

@ogrisel ready to merge or anything else?

@ogrisel ogrisel merged commit 1fbf5fb into scikit-learn:main Sep 16, 2023
@ogrisel ogrisel deleted the speed-up-slow-tests branch September 16, 2023 09:09
@ogrisel
Member Author

ogrisel commented Sep 16, 2023

Merged. There are more tests to optimize down the line, but with diminishing returns.

Another PR that should greatly speed up the tests is the refactoring of the tests for the pairwise argkmin & radius reductions: #27281.

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 18, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023