Description
From #8847 (comment):
and we should have a CI test for non-plotted examples or convert as many as possible to plots
My proposal is to adopt a naming convention such as `run_` for examples that do not produce any plots. sphinx-gallery lets you specify a regex that selects which examples to run; it could be something like `plot_|run_`. See the doc for more details.
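As a rough sketch of what this could look like, the snippet below sets the `filename_pattern` key of `sphinx_gallery_conf` and checks a few paths against it. The exact pattern, the `would_run` helper, and the file names are assumptions made for illustration; it also assumes the pattern is applied as a regex search on the file path, with `/` as the path separator.

```python
import re

# Assumed pattern: a path separator followed by plot_ or run_, so that only
# the start of the file name is matched, not an arbitrary substring.
FILENAME_PATTERN = r"/(plot_|run_)"

# Fragment that would go in the Sphinx conf.py.
sphinx_gallery_conf = {
    "filename_pattern": FILENAME_PATTERN,
}


def would_run(path, pattern=FILENAME_PATTERN):
    """Hypothetical helper: True if the pattern matches anywhere in the path."""
    return re.search(pattern, path) is not None


# Hypothetical file names, for illustration only.
for path in [
    "examples/plot_hypothetical_demo.py",
    "examples/run_hypothetical_demo.py",
    "examples/hypothetical_demo.py",
]:
    print(path, would_run(path))
```

With such a pattern, opting an example into CI would only require renaming the file to start with `run_`.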
I looked at the examples whose filenames do not start with `plot_`. Timings are in seconds, in increasing order.
examples/feature_selection/feature_selection_pipeline.py 1.39
examples/exercises/digits_classification_exercise.py 1.47
examples/applications/svm_gui.py 1.86
examples/missing_values.py 2.01
examples/model_selection/randomized_search.py 2.02
examples/feature_stacker.py 2.14
examples/text/document_clustering.py 3.21
examples/linear_model/lasso_dense_vs_sparse_data.py 3.98
examples/text/hashing_vs_dict_vectorizer.py 4.78
examples/model_selection/grid_search_digits.py 8.29
examples/text/document_classification_20newsgroups.py 8.93
examples/applications/topics_extraction_with_nmf_lda.py 10.53
examples/applications/face_recognition.py 25.02
examples/bicluster/bicluster_newsgroups.py 25.72
examples/hetero_feature_union.py 116.22
examples/applications/wikipedia_principal_eigenvector.py 139.77
examples/model_selection/grid_search_text_feature_extraction.py 156.86
With this in mind, I would be in favour of running all the examples except `svm_gui.py` and the last three examples.
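To make the proposed selection concrete, here is a small sketch that applies it to the timings listed above: drop the GUI example and the three slowest ones, then add up what remains. The variable names and the cutoff of three are illustrative choices, not part of any existing tooling.

```python
# Timings in seconds, copied from the list above.
TIMINGS = {
    "examples/feature_selection/feature_selection_pipeline.py": 1.39,
    "examples/exercises/digits_classification_exercise.py": 1.47,
    "examples/applications/svm_gui.py": 1.86,
    "examples/missing_values.py": 2.01,
    "examples/model_selection/randomized_search.py": 2.02,
    "examples/feature_stacker.py": 2.14,
    "examples/text/document_clustering.py": 3.21,
    "examples/linear_model/lasso_dense_vs_sparse_data.py": 3.98,
    "examples/text/hashing_vs_dict_vectorizer.py": 4.78,
    "examples/model_selection/grid_search_digits.py": 8.29,
    "examples/text/document_classification_20newsgroups.py": 8.93,
    "examples/applications/topics_extraction_with_nmf_lda.py": 10.53,
    "examples/applications/face_recognition.py": 25.02,
    "examples/bicluster/bicluster_newsgroups.py": 25.72,
    "examples/hetero_feature_union.py": 116.22,
    "examples/applications/wikipedia_principal_eigenvector.py": 139.77,
    "examples/model_selection/grid_search_text_feature_extraction.py": 156.86,
}

# Examples that open a GUI and therefore cannot run unattended.
GUI_EXAMPLES = {"examples/applications/svm_gui.py"}

# Illustrative cutoff: skip the three slowest examples.
N_SLOWEST_TO_SKIP = 3

by_time = sorted(TIMINGS, key=TIMINGS.get)
slowest = set(by_time[-N_SLOWEST_TO_SKIP:])

selected = [p for p in by_time if p not in GUI_EXAMPLES and p not in slowest]
total_seconds = sum(TIMINGS[p] for p in selected)

print(f"{len(selected)} examples, {total_seconds:.2f} s in total")
```

These are single-run timings from one machine, so the total only gives an order of magnitude for the extra CI time.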
More details:
`svm_gui.py` pops up a GUI, so it should probably not be run. Whether we should run `wikipedia_principal_eigenvector.py` and `grid_search_text_feature_extraction.py`, which each take more than 2 minutes, is up for debate. On top of that, some of them may require a data download that does not use the typical `~/scikit_learn_data` directory (e.g. the Wikipedia one). If that is the case, these examples would not benefit from the CircleCI cache.