MNT add durations=20 to makefile to show test runtimes locally #11147

amueller · 2018-05-26T16:18:12Z

This is already happening on travis. Times on travis and locally are quite different sometimes, though.

jnothman · 2018-05-28T06:21:07Z

should this go in setup.cfg?

amueller · 2018-06-01T18:42:03Z

should showlocals also go there?

rth

In its current state this LGTM.

I don't think that putting --duration into setup.cfg is a good idea though, as when one selects a single test with pytest it will show its setup/teardown/run time which we don't care about (just tried it).

+1 for putting --showlocals there though.

rth · 2018-06-01T22:09:07Z

+1 for putting --showlocals there though.

Though one can always just select and rerun the failed test locally with --pdb so I'm not sure it's really useful. Do others use it locally? (I don't).

In some way using pytest defaults, unless there is a reason not to, might be easier.

amueller · 2018-06-02T00:36:11Z

I'm not sure about showlocals... I think it might give pretty long tracebacks sometimes.
And I did put the duration in the setup.cfg, but apparently failed to push before you commented.

amueller · 2018-06-02T00:37:15Z

Why do you think providing the runtime of a single test is a bad idea? I think it's good because it reminds us to be careful.

rth · 2018-06-02T14:11:01Z

Why do you think providing the runtime of a single test is a bad idea? I think it's good because it reminds us to be careful.

Yeah, before this PR,

$ pytest sklearn/tests/test_grid_search.py
==================================================================== test session starts ====================================================================
platform linux -- Python 3.6.5, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /home/rth/src/scikit-learn, inifile: setup.cfg
collected 34 items                                                                                                                                          

sklearn/tests/test_grid_search.py ..................................                                                                                  [100%]

========================================================== 34 passed, 32 warnings in 2.24 seconds ===========================================================

with --duration=20 in setup.cfg (this PR),

$ pytest sklearn/tests/test_grid_search.py
==================================================================== test session starts ====================================================================
platform linux -- Python 3.6.5, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /home/rth/src/scikit-learn, inifile: setup.cfg
collected 34 items                                                                                                                                          

sklearn/tests/test_grid_search.py ..................................                                                                                  [100%]

================================================================= slowest 20 test durations =================================================================
0.51s call     sklearn/tests/test_grid_search.py::test_randomized_search_grid_scores
0.31s call     sklearn/tests/test_grid_search.py::test_unsupervised_grid_search
0.20s call     sklearn/tests/test_grid_search.py::test_pandas_input
0.11s call     sklearn/tests/test_grid_search.py::test_grid_search_with_multioutput_data
0.08s call     sklearn/tests/test_grid_search.py::test_grid_search_sparse_scoring
0.06s call     sklearn/tests/test_grid_search.py::test_grid_search_score_consistency
0.05s call     sklearn/tests/test_grid_search.py::test_grid_search_sparse
0.04s call     sklearn/tests/test_grid_search.py::test_grid_search_one_grid_point
0.04s call     sklearn/tests/test_grid_search.py::test_grid_search_no_score
0.03s call     sklearn/tests/test_grid_search.py::test_grid_search_score_method
0.02s call     sklearn/tests/test_grid_search.py::test_classes__property
0.02s call     sklearn/tests/test_grid_search.py::test_grid_search_precomputed_kernel
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_iid
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_failing_classifier
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_allows_nans
0.01s call     sklearn/tests/test_grid_search.py::test_refit
0.00s call     sklearn/tests/test_grid_search.py::test_gridsearch_no_predict
0.00s call     sklearn/tests/test_grid_search.py::test_y_as_list
0.00s call     sklearn/tests/test_grid_search.py::test_predict_proba_disabled
0.00s call     sklearn/tests/test_grid_search.py::test_gridsearch_nd
========================================================== 34 passed, 32 warnings in 1.96 seconds ===========================================================

IMO this is just visual noise - can tell just by running it that it takes 2 s (or that some test is slow as you did in #11146).

jnothman · 2018-06-04T01:01:14Z

can we set it to a smaller number in setup.cfg, and to 20 in Travis?

rth · 2018-06-04T09:07:54Z

can we set it to a smaller number in setup.cfg, and to 20 in Travis?

With respect to putting it in setup.cfg, I'm not convinced there is a problem that needs solving. Overall, test runtime is fairly well optimized, I tried to improve it a few month ago (or maybe in 2017) and couldn't find anything that could be trivially improved that would have an impact on the overall run time (I think thanks to @lesteve's and other contributors efforts).

Pytest already prints cumulated run time by default so when selecting a couple of tests this doesn't bring anything relevant. For all test suite, why not, but we already have this in Travis. Run time shouldn't differ wildly between local runs and Travis unless there is an issue with the implementation (as might be the case with #11146 (comment) ).

Also if this is added, and a contributor doesn't want it, he/she would need to keep a local copy of setup.cfg with git update-index --assume-unchanged. While on the opposite side, occasionally adding --duration to pytest cli parameters if there is any suspicion of slow tests is straightforward.

lesteve · 2018-06-04T14:03:28Z

I would be in favour of leaving things as they are too. Locally I run selected tests a lot more than the full test suite, so adding a --duration argument in setup.cfg would err on the noisy side for me.

While on the opposite side, occasionally adding --duration to pytest cli parameters if there is any suspicion of slow tests is straightforward.

That would be my suggestion too.

amueller · 2018-06-04T17:24:54Z

@lesteve but ok with adding to makefile?

amueller · 2018-06-04T17:26:42Z

The problem that needs solving is that for three years (or whenever the regression happened if there was one) no-one noticed that the parallel argument in MeanShift was terribly broken, and that 15% of the test runtime of the whole suite was spend there. See #11146 and #5189.

amueller · 2018-06-04T17:29:51Z

@rth was the mean shift already the slowest by far when you debugged it? If you didn't notice it then, maybe something in joblib broke in the meantime?

rth · 2018-06-04T19:11:47Z

As you saw in #11146 (comment) it has only been around 3 weeks since that regression appeared. TBH, the slowdown in that tests was noticeable when running tests locally lately but I never took the time to investigate.

As to the rest, in the slowest tests from your #11146 (comment) test_mice_transform_recovery was also introduced recently in #8478 (the example data size is probably too big). How to prevent introduction of slow tests, I don't know. However I think it's still more productive to check the top 20 slowest tests from time to time rather than see the timings for the remaining 9500+ which individually have very little impact.

rth · 2018-06-04T19:14:27Z

Sorry accidental key press :)

amueller · 2018-06-04T20:28:59Z

Happy to go back to my original change of adding only to the makefile, which runs all the tests.

lesteve · 2018-06-05T07:35:25Z

Happy to go back to my original change of adding only to the makefile, which runs all the tests.

I am fine with adding --duration to the Makefile if that makes you happy, personally I only use make in ...

jnothman · 2018-06-05T21:48:13Z

I agree that printing durations would not likely have fixed the current issue. A CI that told us if there was an unexpected increase in test time might be a better alert, but false positives would be rife due to environmental factors

amueller · 2018-08-21T20:34:37Z

reverted to just doing it in the makefile

rth · 2018-08-22T09:07:00Z

Merging with Loic's +1 in #11147 (comment). Thanks!

* tag '0.20rc1': (1109 commits) MNT rc version DOC Release dates for 0.20 (scikit-learn#11838) DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937) FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936) MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896) [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916) ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905) MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899) DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907) DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910) Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870) MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894) EXA set figure size to avoid overlaps (scikit-learn#11889) MRG/REL fixes /skips for 32bit tests (scikit-learn#11879) add durations=20 to makefile to show test runtimes locally (scikit-learn#11147) DOC loss='l2' is no longer accpeted in l1_min_c DOC add note about brute force nearest neighbors for string data (scikit-learn#11884) DOC Change sign of energy in RBM (scikit-learn#11156) RFC try to warn on iid less often (scikit-learn#11613) DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664) ...

* releases: (1109 commits) MNT rc version DOC Release dates for 0.20 (scikit-learn#11838) DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937) FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936) MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896) [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916) ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905) MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899) DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907) DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910) Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870) MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894) EXA set figure size to avoid overlaps (scikit-learn#11889) MRG/REL fixes /skips for 32bit tests (scikit-learn#11879) add durations=20 to makefile to show test runtimes locally (scikit-learn#11147) DOC loss='l2' is no longer accpeted in l1_min_c DOC add note about brute force nearest neighbors for string data (scikit-learn#11884) DOC Change sign of energy in RBM (scikit-learn#11156) RFC try to warn on iid less often (scikit-learn#11613) DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664) ...

* dfsg: (1109 commits) MNT rc version DOC Release dates for 0.20 (scikit-learn#11838) DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937) FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936) MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896) [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916) ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905) MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899) DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907) DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910) Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870) MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894) EXA set figure size to avoid overlaps (scikit-learn#11889) MRG/REL fixes /skips for 32bit tests (scikit-learn#11879) add durations=20 to makefile to show test runtimes locally (scikit-learn#11147) DOC loss='l2' is no longer accpeted in l1_min_c DOC add note about brute force nearest neighbors for string data (scikit-learn#11884) DOC Change sign of energy in RBM (scikit-learn#11156) RFC try to warn on iid less often (scikit-learn#11613) DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664) ...

add durations=20 to makefile to show test runtimes locally

f46b745

rth approved these changes Jun 1, 2018

View reviewed changes

rth closed this Jun 4, 2018

rth reopened this Jun 4, 2018

amueller force-pushed the topn_slow_tests_makefile branch from 9fbd1aa to f46b745 Compare August 21, 2018 20:34

rth merged commit a8cd4f4 into scikit-learn:master Aug 22, 2018

Uh oh!

MNT add durations=20 to makefile to show test runtimes locally #11147

MNT add durations=20 to makefile to show test runtimes locally #11147

Uh oh!

Conversation

amueller commented May 26, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnothman commented May 28, 2018 via email

Uh oh!

amueller commented Jun 1, 2018

Uh oh!

rth left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rth commented Jun 1, 2018

Uh oh!

amueller commented Jun 2, 2018

Uh oh!

amueller commented Jun 2, 2018

Uh oh!

rth commented Jun 2, 2018

Uh oh!

jnothman commented Jun 4, 2018 via email

Uh oh!

rth commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lesteve commented Jun 4, 2018

Uh oh!

amueller commented Jun 4, 2018

Uh oh!

amueller commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

amueller commented Jun 4, 2018

Uh oh!

rth commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rth commented Jun 4, 2018

Uh oh!

amueller commented Jun 4, 2018

Uh oh!

lesteve commented Jun 5, 2018

Uh oh!

jnothman commented Jun 5, 2018 via email

Uh oh!

amueller commented Aug 21, 2018

Uh oh!

rth commented Aug 22, 2018

Uh oh!

Uh oh!

amueller commented May 26, 2018 •

edited

Loading

rth left a comment •

edited

Loading

rth commented Jun 4, 2018 •

edited

Loading

amueller commented Jun 4, 2018 •

edited

Loading

rth commented Jun 4, 2018 •

edited

Loading