-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
TST Speed up some of the slowest tests #27383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
ae92dfc
eccf01d
c731f7e
ed81754
44a3a52
700f109
a96dee2
8cb8ff3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -74,7 +74,6 @@ | |
) # type: ignore | ||
|
||
P = (1, 2, 3, 4, np.inf) | ||
JOBLIB_BACKENDS = list(joblib.parallel.BACKENDS.keys()) | ||
|
||
# Filter deprecation warnings. | ||
neighbors.kneighbors_graph = ignore_warnings(neighbors.kneighbors_graph) | ||
|
@@ -2044,10 +2043,10 @@ def test_same_radius_neighbors_parallel(algorithm): | |
assert_allclose(graph, graph_parallel) | ||
|
||
|
||
@pytest.mark.parametrize("backend", JOBLIB_BACKENDS) | ||
@pytest.mark.parametrize("backend", ["threading", "loky"]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we don't want to test multiprocessing and sequential backends? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The multiprocessing case is very slow on macOS (and maybe windows as well): this backend creates new processes from scratch each time which is can take several tens of seconds to run on the overloaded macOS Azure CI. I think the loky backend is enough to test for multiprocess-related problems in our Cython code. The main difference between multiprocessing and loky is that loky reuses its workers across calls making it much more efficient on overloaded machines. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also sequential is already tested everywhere else in |
||
@pytest.mark.parametrize("algorithm", ALGORITHMS) | ||
def test_knn_forcing_backend(backend, algorithm): | ||
# Non-regression test which ensure the knn methods are properly working | ||
# Non-regression test which ensures the knn methods are properly working | ||
# even when forcing the global joblib backend. | ||
with joblib.parallel_backend(backend): | ||
X, y = datasets.make_classification( | ||
|
@@ -2056,12 +2055,12 @@ def test_knn_forcing_backend(backend, algorithm): | |
X_train, X_test, y_train, y_test = train_test_split(X, y) | ||
|
||
clf = neighbors.KNeighborsClassifier( | ||
n_neighbors=3, algorithm=algorithm, n_jobs=3 | ||
n_neighbors=3, algorithm=algorithm, n_jobs=2 | ||
) | ||
clf.fit(X_train, y_train) | ||
clf.predict(X_test) | ||
clf.kneighbors(X_test) | ||
clf.kneighbors_graph(X_test, mode="distance").toarray() | ||
clf.kneighbors_graph(X_test, mode="distance") | ||
|
||
|
||
def test_dtype_convert(): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious, how much speedup are we getting here? are we THAT slow? 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the macOS CI build, this test is sometimes one of the slowest and can last more than 20s on an overloaded CI runner.
Locally it's fast but decreasing this makes it ~3x faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the description of the PR to show the impact on the macOS CI between
main
and this PR.I observed in the latest run than the docstring of
MiniBatchDictionaryLearning
has a similar problem (and a bug in the direction on the assertion about the sparsity of the transformed data). I pushed a96dee2 to fix it and improve the speed. We will have to wait for the end of this CI run to see the concrete impact of that last commit.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the description to take a96dee2 into account.