Skip to content

MNT add durations=20 to makefile to show test runtimes locally #11147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 22, 2018

Conversation

amueller
Copy link
Member

@amueller amueller commented May 26, 2018

This is already happening on travis. Times on travis and locally are quite different sometimes, though.

@jnothman
Copy link
Member

jnothman commented May 28, 2018 via email

@amueller
Copy link
Member Author

amueller commented Jun 1, 2018

should showlocals also go there?

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In its current state this LGTM.

I don't think that putting --duration into setup.cfg is a good idea though, as when one selects a single test with pytest it will show its setup/teardown/run time which we don't care about (just tried it).

+1 for putting --showlocals there though.

@rth
Copy link
Member

rth commented Jun 1, 2018

+1 for putting --showlocals there though.

Though one can always just select and rerun the failed test locally with --pdb so I'm not sure it's really useful. Do others use it locally? (I don't).

In some way using pytest defaults, unless there is a reason not to, might be easier.

@amueller
Copy link
Member Author

amueller commented Jun 2, 2018

I'm not sure about showlocals... I think it might give pretty long tracebacks sometimes.
And I did put the duration in the setup.cfg, but apparently failed to push before you commented.

@amueller
Copy link
Member Author

amueller commented Jun 2, 2018

Why do you think providing the runtime of a single test is a bad idea? I think it's good because it reminds us to be careful.

@rth
Copy link
Member

rth commented Jun 2, 2018

Why do you think providing the runtime of a single test is a bad idea? I think it's good because it reminds us to be careful.

Yeah, before this PR,

$ pytest sklearn/tests/test_grid_search.py
==================================================================== test session starts ====================================================================
platform linux -- Python 3.6.5, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /home/rth/src/scikit-learn, inifile: setup.cfg
collected 34 items                                                                                                                                          

sklearn/tests/test_grid_search.py ..................................                                                                                  [100%]

========================================================== 34 passed, 32 warnings in 2.24 seconds ===========================================================

with --duration=20 in setup.cfg (this PR),

$ pytest sklearn/tests/test_grid_search.py
==================================================================== test session starts ====================================================================
platform linux -- Python 3.6.5, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /home/rth/src/scikit-learn, inifile: setup.cfg
collected 34 items                                                                                                                                          

sklearn/tests/test_grid_search.py ..................................                                                                                  [100%]

================================================================= slowest 20 test durations =================================================================
0.51s call     sklearn/tests/test_grid_search.py::test_randomized_search_grid_scores
0.31s call     sklearn/tests/test_grid_search.py::test_unsupervised_grid_search
0.20s call     sklearn/tests/test_grid_search.py::test_pandas_input
0.11s call     sklearn/tests/test_grid_search.py::test_grid_search_with_multioutput_data
0.08s call     sklearn/tests/test_grid_search.py::test_grid_search_sparse_scoring
0.06s call     sklearn/tests/test_grid_search.py::test_grid_search_score_consistency
0.05s call     sklearn/tests/test_grid_search.py::test_grid_search_sparse
0.04s call     sklearn/tests/test_grid_search.py::test_grid_search_one_grid_point
0.04s call     sklearn/tests/test_grid_search.py::test_grid_search_no_score
0.03s call     sklearn/tests/test_grid_search.py::test_grid_search_score_method
0.02s call     sklearn/tests/test_grid_search.py::test_classes__property
0.02s call     sklearn/tests/test_grid_search.py::test_grid_search_precomputed_kernel
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_iid
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_failing_classifier
0.01s call     sklearn/tests/test_grid_search.py::test_grid_search_allows_nans
0.01s call     sklearn/tests/test_grid_search.py::test_refit
0.00s call     sklearn/tests/test_grid_search.py::test_gridsearch_no_predict
0.00s call     sklearn/tests/test_grid_search.py::test_y_as_list
0.00s call     sklearn/tests/test_grid_search.py::test_predict_proba_disabled
0.00s call     sklearn/tests/test_grid_search.py::test_gridsearch_nd
========================================================== 34 passed, 32 warnings in 1.96 seconds ===========================================================

IMO this is just visual noise - can tell just by running it that it takes 2 s (or that some test is slow as you did in #11146).

@jnothman
Copy link
Member

jnothman commented Jun 4, 2018 via email

@rth
Copy link
Member

rth commented Jun 4, 2018

can we set it to a smaller number in setup.cfg, and to 20 in Travis?

With respect to putting it in setup.cfg, I'm not convinced there is a problem that needs solving. Overall, test runtime is fairly well optimized, I tried to improve it a few month ago (or maybe in 2017) and couldn't find anything that could be trivially improved that would have an impact on the overall run time (I think thanks to @lesteve's and other contributors efforts).

Pytest already prints cumulated run time by default so when selecting a couple of tests this doesn't bring anything relevant. For all test suite, why not, but we already have this in Travis. Run time shouldn't differ wildly between local runs and Travis unless there is an issue with the implementation (as might be the case with #11146 (comment) ).

Also if this is added, and a contributor doesn't want it, he/she would need to keep a local copy of setup.cfg with git update-index --assume-unchanged. While on the opposite side, occasionally adding --duration to pytest cli parameters if there is any suspicion of slow tests is straightforward.

@lesteve
Copy link
Member

lesteve commented Jun 4, 2018

I would be in favour of leaving things as they are too. Locally I run selected tests a lot more than the full test suite, so adding a --duration argument in setup.cfg would err on the noisy side for me.

While on the opposite side, occasionally adding --duration to pytest cli parameters if there is any suspicion of slow tests is straightforward.

That would be my suggestion too.

@amueller
Copy link
Member Author

amueller commented Jun 4, 2018

@lesteve but ok with adding to makefile?

@amueller
Copy link
Member Author

amueller commented Jun 4, 2018

The problem that needs solving is that for three years (or whenever the regression happened if there was one) no-one noticed that the parallel argument in MeanShift was terribly broken, and that 15% of the test runtime of the whole suite was spend there. See #11146 and #5189.

@amueller
Copy link
Member Author

amueller commented Jun 4, 2018

@rth was the mean shift already the slowest by far when you debugged it? If you didn't notice it then, maybe something in joblib broke in the meantime?

@rth
Copy link
Member

rth commented Jun 4, 2018

As you saw in #11146 (comment) it has only been around 3 weeks since that regression appeared. TBH, the slowdown in that tests was noticeable when running tests locally lately but I never took the time to investigate.

As to the rest, in the slowest tests from your #11146 (comment) test_mice_transform_recovery was also introduced recently in #8478 (the example data size is probably too big). How to prevent introduction of slow tests, I don't know. However I think it's still more productive to check the top 20 slowest tests from time to time rather than see the timings for the remaining 9500+ which individually have very little impact.

@rth rth closed this Jun 4, 2018
@rth rth reopened this Jun 4, 2018
@rth
Copy link
Member

rth commented Jun 4, 2018

Sorry accidental key press :)

@amueller
Copy link
Member Author

amueller commented Jun 4, 2018

Happy to go back to my original change of adding only to the makefile, which runs all the tests.

@lesteve
Copy link
Member

lesteve commented Jun 5, 2018

Happy to go back to my original change of adding only to the makefile, which runs all the tests.

I am fine with adding --duration to the Makefile if that makes you happy, personally I only use make in ...

@jnothman
Copy link
Member

jnothman commented Jun 5, 2018 via email

@amueller amueller force-pushed the topn_slow_tests_makefile branch from 9fbd1aa to f46b745 Compare August 21, 2018 20:34
@amueller
Copy link
Member Author

reverted to just doing it in the makefile

@rth
Copy link
Member

rth commented Aug 22, 2018

Merging with Loic's +1 in #11147 (comment). Thanks!

@rth rth merged commit a8cd4f4 into scikit-learn:master Aug 22, 2018
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* tag '0.20rc1': (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* releases: (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* dfsg: (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants