Skip to content

MAINT More test runtime optimizations #14136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 22, 2019

Conversation

rth
Copy link
Member

@rth rth commented Jun 21, 2019

Optimizes the test runtime of a few more modules.

For the changed modules, this PR reduces the runtime by 40% (or 30s),

pytest sklearn/feature_extraction sklearn/feature_selection sklearn/gaussian_process sklearn/impute sklearn/inspection/ sklearn/manifold sklearn/metrics/ sklearn/mixture sklearn/model_selection/

Copy link
Member Author

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few explanations below.

The methodology was to measure runtime with pytest module --duration=10 and modify tests that were the slowest. All the changes here are those that have a measurable impact on runtime.

@@ -605,7 +603,7 @@ def test_kl_divergence_not_nan(method):

X = random_state.randn(50, 2)
tsne = TSNE(n_components=2, perplexity=2, learning_rate=100.0,
random_state=0, method=method, verbose=0, n_iter=1003)
random_state=0, method=method, verbose=0, n_iter=503)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just want to be sure that tsne.kl_divergence_ is computed when n_iter % n_iter_check != 0 cf comment in the code above.

@@ -678,7 +678,7 @@ def test_matthews_corrcoef_multiclass():
assert_almost_equal(mcc, 0.)


@pytest.mark.parametrize('n_points', [100, 10000, 1000000])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last case was taking 6s on my laptop, which is too much for a test, and doesn't seem to add much compared to the other two.

for iid in (True, False):
for refit in (True, False):
random_searches = []
for scoring in (('accuracy', 'recall'), 'accuracy', 'recall'):
# If True, for multi-metric pass refit='accuracy'
if refit:
probability = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

predict_proba is only used below when refit=True

@@ -812,20 +812,21 @@ def split(self, X, y=None, groups=None):
'not match total number of classes (3). '
'Results may not be appropriate for your use case.')
assert_warns_message(RuntimeWarning, warning_message,
cross_val_predict, LogisticRegression(),
cross_val_predict,
LogisticRegression(solver="liblinear"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

liblinear converges faster than lbfgs here

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's see if it breaks the cron jobs :P

@adrinjalali adrinjalali merged commit 5674122 into scikit-learn:master Jun 22, 2019
@rth rth deleted the opt-test-runtime-part-2 branch June 22, 2019 13:38
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
* Feature extraction / feature selection

* Metrics, manifold, impute, GP optimization

* Optimize mixture

* Optimize model_selection

* Fix tests

* Lint
@rth rth restored the opt-test-runtime-part-2 branch July 15, 2019 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants