MAINT More test runtime optimizations #14136
Conversation
A few explanations below.
The methodology was to measure runtimes with `pytest <module> --durations=10` and modify the slowest tests. All the changes here are those that had a measurable impact on runtime.
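As a minimal sketch of what the profiling step does (the test functions and timings here are purely illustrative): time each test and report the slowest ones, similar to what pytest's `--durations=10` option prints.

```python
# Illustrative sketch of a --durations-style report: time each test and
# list the slowest first. The tests and sleep duration are made up.
import time

def fast_test():
    pass

def slow_test():
    time.sleep(0.05)

durations = []
for test in (fast_test, slow_test):
    start = time.perf_counter()
    test()
    durations.append((time.perf_counter() - start, test.__name__))

# slowest first, as in pytest's --durations report
for elapsed, name in sorted(durations, reverse=True):
    print(f"{elapsed:.3f}s {name}")
```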
@@ -605,7 +603,7 @@ def test_kl_divergence_not_nan(method):
     X = random_state.randn(50, 2)
     tsne = TSNE(n_components=2, perplexity=2, learning_rate=100.0,
-                random_state=0, method=method, verbose=0, n_iter=1003)
+                random_state=0, method=method, verbose=0, n_iter=503)
We just want to be sure that `tsne.kl_divergence_` is computed when `n_iter % n_iter_check != 0`; cf. the comment in the code above.
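To illustrate the point, a quick check (the 50-iteration check interval `N_ITER_CHECK` is an assumption about t-SNE's internals, not taken from this diff): neither the old nor the new iteration count is a multiple of the check interval, so the shortened run still exercises the final-assignment code path.

```python
# Hypothetical check interval: the optimizer only records the error every
# N_ITER_CHECK iterations, plus once at the very end of the run.
N_ITER_CHECK = 50

# Both the old (1003) and new (503) values leave a nonzero remainder, so
# kl_divergence_ must be filled in by the final assignment rather than by
# the periodic bookkeeping -- which is exactly what the test verifies.
for n_iter in (1003, 503):
    assert n_iter % N_ITER_CHECK != 0
```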
@@ -678,7 +678,7 @@ def test_matthews_corrcoef_multiclass():
     assert_almost_equal(mcc, 0.)


-@pytest.mark.parametrize('n_points', [100, 10000, 1000000])
+@pytest.mark.parametrize('n_points', [100, 10000])
The last case was taking 6s on my laptop, which is too much for a test, and doesn't seem to add much compared to the other two.
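As a back-of-the-envelope illustration (the linear per-point cost is an assumption, scaled so the largest case takes about 6 s as reported), the biggest parametrized case dominates the total runtime, so dropping it yields almost the entire saving:

```python
# Illustrative only: assume roughly linear cost per point, scaled so the
# 1_000_000-point case takes ~6 s, matching the reviewer's measurement.
sizes = [100, 10_000, 1_000_000]
cost_per_point = 6e-6  # seconds per point, an assumption for illustration
cost = {n: n * cost_per_point for n in sizes}

total = sum(cost.values())
# under this model, the removed case accounts for ~99% of the runtime
assert cost[1_000_000] / total > 0.98
```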
for iid in (True, False):
    for refit in (True, False):
        random_searches = []
        for scoring in (('accuracy', 'recall'), 'accuracy', 'recall'):
            # If True, for multi-metric pass refit='accuracy'
            if refit:
                probability = True
`predict_proba` is only used below when `refit=True`.
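A minimal sketch of the pattern (the helper name and parameter dict are hypothetical, not the test's actual code): expensive probability calibration on the estimator is only enabled when the refit path will actually call `predict_proba`.

```python
# Hypothetical helper mirroring the optimization: only pay for probability
# calibration when predict_proba will actually be called afterwards.
def estimator_params(refit):
    params = {"kernel": "linear"}
    if refit:
        # predict_proba is exercised only on the refit code path
        params["probability"] = True
    return params

assert "probability" not in estimator_params(refit=False)
assert estimator_params(refit=True)["probability"] is True
```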
@@ -812,20 +812,21 @@ def split(self, X, y=None, groups=None):
                        'not match total number of classes (3). '
                        'Results may not be appropriate for your use case.')
     assert_warns_message(RuntimeWarning, warning_message,
-                         cross_val_predict, LogisticRegression(),
+                         cross_val_predict,
+                         LogisticRegression(solver="liblinear"),
liblinear converges faster than lbfgs here
LGTM
LGTM, let's see if it breaks the cron jobs :P
* Feature extraction / feature selection
* Metrics, manifold, impute, GP optimization
* Optimize mixture
* Optimize model_selection
* Fix tests
* Lint
Optimizes the test runtime of a few more modules.
For the changed modules, this PR reduces the runtime by 40% (or 30s),