MAINT make ClassifierChain test more efficient #28705

ogrisel · 2024-03-27T09:40:13Z

ogrisel · 2024-03-27T09:41:58Z

For the sake of curiosity, I triggered the pypy CI to check whether that would fix the timeout problem before we get to merge #28704 to main.

github-actions · 2024-03-27T09:42:20Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

_{Generated for commit: fe84886. Link to the linter CI: here}

lesteve · 2024-03-27T10:15:13Z

I doubt that this is a single issue but more a combination of multiple small issues. In particular there seems to be quite some variability in the slowest tests. In the last four logs where PyPy passed, none of them mentioned this particular test, see below.

Wild-guess: the fact that a test is slow does not necessarily indicate an issue in this particular test but it could be that memory usage is high (for example caused by another test running in parallel because we are using xdist or even previous tests) and the OS starts to swap and everything become slow.

============================= slowest 20 durations =============================
595.61s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_pairs
177.22s call     tests/test_multioutput.py::test_multi_target_regression
175.45s call     tests/test_multioutput.py::test_classifier_chain_vs_independent_models
72.16s call     tests/test_multioutput.py::test_base_chain_fit_and_predict_with_sparse_data_and_cv[csr_array]
49.70s call     tests/test_multioutput.py::test_base_chain_fit_and_predict_with_sparse_data_and_cv[csr_matrix]
41.11s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
40.39s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_people
39.32s call     decomposition/tests/test_dict_learning.py::test_dict_learning_online_numerical_consistency[lars]
35.22s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[29]
30.50s call     decomposition/_dict_learning.py::sklearn.decomposition._dict_learning.dict_learning
28.77s call     decomposition/tests/test_dict_learning.py::test_cd_work_on_joblib_memmapped_data
28.01s call     feature_selection/tests/test_rfe.py::test_rfe_cv_groups
26.76s call     utils/tests/test_estimator_checks.py::test_check_estimator
26.20s call     tests/test_multioutput.py::test_classifier_chain_fit_and_predict[predict_log_proba-predict_log_proba]
20.63s call     tests/test_kernel_approximation.py::test_polynomial_count_sketch[0-3-5000-1]
18.78s call     ensemble/tests/test_stacking.py::test_stacking_classifier_multilabel_auto_predict[True-auto]
17.41s call     utils/tests/test_parallel.py::test_configuration_passes_through_to_joblib[multiprocessing-2]
17.01s call     ensemble/tests/test_stacking.py::test_stacking_classifier_multilabel_auto_predict[True-predict]
16.77s call     ensemble/tests/test_stacking.py::test_stacking_classifier_multilabel_auto_predict[False-predict]
16.53s call     tests/test_config.py::test_config_threadsafe_joblib[multiprocessing]

============================= slowest 20 durations =============================
244.85s call     tests/test_common.py::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_dict_unchanged]
228.94s call     tests/test_common.py::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_pipeline_consistency]
206.38s call     utils/tests/test_extmath.py::test_incremental_weighted_mean_and_variance[1-1e-08-1e-08--10000000.0]
171.47s call     tests/test_common.py::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_estimator_sparse_array]
166.41s call     tree/_classes.py::sklearn.tree._classes.DecisionTreeClassifier
158.86s call     tree/tests/test_tree.py::test_sparse_input[digits-DecisionTreeClassifier]
105.92s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
101.60s call     tests/test_multioutput.py::test_base_chain_fit_and_predict_with_sparse_data_and_cv[csr_array]
80.18s call     tests/test_random_projection.py::test_inverse_transform[28-False-SparseRandomProjection-2-2-coo_matrix]
73.72s call     tests/test_random_projection.py::test_random_projection_dtype_match[float64-float64-GaussianRandomProjection]
52.50s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_pairs
50.28s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_people
49.38s call     linear_model/tests/test_huber.py::test_huber_sample_weights[csr_matrix]
46.43s call     tests/test_multioutput.py::test_classifier_chain_fit_and_predict[predict_proba-decision_function]
42.10s call     decomposition/tests/test_dict_learning.py::test_dict_learning_online_numerical_consistency[lars]
41.02s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[28]
40.13s call     tests/test_common.py::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_estimator_sparse_array]
39.87s call     decomposition/tests/test_sparse_pca.py::test_spca_n_components_[None-MiniBatchSparsePCA]
34.23s call     decomposition/tests/test_sparse_pca.py::test_spca_max_iter_None_deprecation
33.41s call     decomposition/_dict_learning.py::sklearn.decomposition._dict_learning.dict_learning

============================= slowest 20 durations =============================
212.77s call     tests/test_common.py::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_supervised_y_2d]
197.74s call     tests/test_common.py::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_methods_subset_invariance]
77.04s call     utils/tests/test_estimator_checks.py::test_check_estimator
69.94s call     tests/test_common.py::test_search_cv[RandomizedSearchCV(cv=2,estimator=LogisticRegression(),param_distributions={'C':[0.1,1.0]},random_state=0)-check_classifiers_train]
37.06s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_pairs
33.16s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
30.80s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[8]
28.30s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_people
26.63s call     tests/test_config.py::test_config_threadsafe_joblib[multiprocessing]
26.53s call     decomposition/tests/test_dict_learning.py::test_cd_work_on_joblib_memmapped_data
25.61s call     tests/test_multioutput.py::test_multi_target_sparse_regression[csr_array]
25.32s call     decomposition/tests/test_dict_learning.py::test_dict_learning_online_numerical_consistency[lars]
23.93s call     utils/tests/test_extmath.py::test_randomized_eigsh_reconst_low_rank[10-7]
20.50s call     feature_selection/tests/test_rfe.py::test_rfe_cv_groups
16.92s call     decomposition/_dict_learning.py::sklearn.decomposition._dict_learning.dict_learning
15.72s call     tree/tests/test_tree.py::test_min_weight_fraction_leaf_with_min_samples_leaf_on_dense_input[DecisionTreeRegressor]
14.05s call     tests/test_common.py::test_search_cv[GridSearchCV(cv=2,estimator=LogisticRegression(),param_grid={'C':[0.1,1.0]})-check_classifiers_one_label_sample_weights]
13.90s call     ensemble/tests/test_bagging.py::test_regression
13.79s call     tests/test_common.py::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_estimators_dtypes]
13.47s call     ensemble/tests/test_stacking.py::test_stacking_classifier_multilabel_auto_predict[True-auto]

============================= slowest 20 durations =============================
358.09s call     tests/test_config.py::test_config_threadsafe_joblib[multiprocessing]
224.04s call     tests/test_random_projection.py::test_correct_RandomProjection_dimensions_embedding[0-coo_array]
215.13s call     tree/tests/test_tree.py::test_sparse_input[digits-DecisionTreeClassifier]
175.26s call     tree/_classes.py::sklearn.tree._classes.DecisionTreeClassifier
171.64s call     tests/test_random_projection.py::test_random_projection_dtype_match[int32-float64-SparseRandomProjection]
56.14s call     tests/test_kernel_approximation.py::test_polynomial_count_sketch[0-3-5000-1]
48.04s call     tests/test_random_projection.py::test_random_projection_numerical_consistency[SparseRandomProjection]
47.30s call     tests/test_random_projection.py::test_inverse_transform[0-False-SparseRandomProjection-1000-1000-coo_matrix]
45.57s call     utils/tests/test_estimator_checks.py::test_check_estimator_clones
41.98s call     tests/test_multioutput.py::test_base_chain_fit_and_predict_with_sparse_data_and_cv[csr_array]
36.68s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_pairs
34.31s call     tests/test_common.py::test_f_contiguous_array_estimator[TSNE]
34.28s call     preprocessing/tests/test_target_encoder.py::test_fit_transform_not_associated_with_y_if_ordinal_categorical_is_not[0]
30.91s call     datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_people
27.36s call     tree/_classes.py::sklearn.tree._classes.ExtraTreeClassifier
26.65s call     decomposition/tests/test_dict_learning.py::test_cd_work_on_joblib_memmapped_data
25.55s call     feature_selection/tests/test_rfe.py::test_rfe_cv_groups
25.35s call     decomposition/tests/test_dict_learning.py::test_dict_learning_online_numerical_consistency[lars]
20.88s call     tree/tests/test_tree.py::test_sparse_input[multilabel-DecisionTreeRegressor]
20.53s call     utils/tests/test_estimator_checks.py::test_check_estimator

ogrisel · 2024-03-27T10:26:28Z

And the change in this PR broken the 32-bit linux tests:

    @pytest.mark.parametrize("order_type", [list, np.array, tuple])
    def test_classifier_chain_tuple_order(order_type):
        X = [[1, 2, 3], [4, 5, 6], [1.5, 2.5, 3.5]]
        y = [[3, 2], [2, 3], [3, 2]]
        order = order_type([1, 0])
    
        chain = ClassifierChain(RandomForestClassifier(n_estimators=2), order=order)
    
        chain.fit(X, y)
        X_test = [[1.5, 2.5, 3.5]]
        y_test = [[3, 2]]
>       assert_array_almost_equal(chain.predict(X_test), y_test)
E       AssertionError: 
E       Arrays are not almost equal to 6 decimals
E       
E       Mismatched elements: 1 / 2 (50%)
E       Max absolute difference: 1.
E       Max relative difference: 0.33333333
E        x: array([[2., 2.]])
E        y: array([[3, 2]])

not sure why though, as I would not have expected this test to be particularly sensitive to platform bitness...

ogrisel · 2024-03-27T10:28:11Z

Let's wait for the CI to complete for the sake of curiosity but I don't plan to invest more time on this. We can close without merging once the CI has ended.

ogrisel · 2024-03-27T10:30:42Z

not sure why though, as I would not have expected this test to be particularly sensitive to platform bitness...

Platform bitness is probably not the culprit but we do need to fix the random seed.

ogrisel · 2024-03-27T10:33:19Z

I pushed a commit to fix the random_state in the test. If CI is green this time, I think it's it's worth merging, irrespective of the (lack of) impact on the PyPy CI. It's just a small test optim and reliability improvement in the end.

jeremiedbb · 2024-03-27T10:46:31Z

from #28705 (comment)

datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_pairs
datasets/_lfw.py::sklearn.datasets._lfw.fetch_lfw_people

The exemples recently added in the docstrings of fetchers have the effect that we now fetch the lfw datasets in all CI jobs.

EDIT: I opened a dedicated issue for that (#28707) since it impacts all CI jobs

lesteve

LGTM.

For completeness, I suspect this does not significantly improve the situation for PyPy.

lesteve · 2024-03-27T13:17:28Z

I set auto-merge and pushed a commit to trigger the CI since the PyPy build timed out.

MAINT make ClassifierChain test more efficient

fce0f88

ogrisel added module:test-suite everything related to our tests No Changelog Needed labels Mar 27, 2024

Trigger [pypy]

5ed015a

Fix random_state and retrigger [pypy]

2cdc465

lesteve approved these changes Mar 27, 2024

View reviewed changes

lesteve enabled auto-merge (squash) March 27, 2024 13:16

Trigger CI

fe84886

lesteve merged commit 52138e2 into scikit-learn:main Mar 27, 2024

ogrisel deleted the classifier-chain-test-optim branch March 27, 2024 14:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MAINT make ClassifierChain test more efficient #28705

MAINT make ClassifierChain test more efficient #28705

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

github-actions bot commented Mar 27, 2024 •

edited

Loading

lesteve commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

jeremiedbb commented Mar 27, 2024 •

edited

Loading

lesteve left a comment

lesteve commented Mar 27, 2024

MAINT make ClassifierChain test more efficient #28705

MAINT make ClassifierChain test more efficient #28705

Conversation

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

github-actions bot commented Mar 27, 2024 • edited Loading

✔️ Linting Passed

lesteve commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

ogrisel commented Mar 27, 2024

jeremiedbb commented Mar 27, 2024 • edited Loading

lesteve left a comment

Choose a reason for hiding this comment

lesteve commented Mar 27, 2024

github-actions bot commented Mar 27, 2024 •

edited

Loading

jeremiedbb commented Mar 27, 2024 •

edited

Loading