scikit-learn: Accelerate slow examples

These examples take quite a long time to run, and they frequently make our documentation CI fail with timeouts. It’d be nice to speed them up a little.

To contributors: if you want to work on an example, first have a look at it. If you’re comfortable working on it and have found a potential way to speed up its execution time while preserving its educational message, please mention which one you’re working on in the comments below.

Please open a dedicated PR for each example you have found a fix for (with a new git branch off of main for each one) to make reviews faster.

Please focus on the longest running examples first (e.g. 30s or more). Examples that run in less than 15s are probably fine.

Please also keep in mind that we want to keep the example code as simple as possible for educational reasons, while keeping the main points expressed in the example’s text valid and well illustrated by the result of the execution (plots or text outputs).

Finally, we expect that some examples cannot really be accelerated while preserving their educational value (the integrity of the message and the simplicity of the code). In such cases, we might decide to keep them as they are, provided they run in less than 60s.

To maintainers: I’m running a script which automatically updates the following list with connected PRs and “done” checkboxes; no need to update them manually.

Examples to Update

  • …/examples/linear_model/plot_poisson_regression_non_normal_loss.py: 60.41 sec #21787
  • …/examples/impute/plot_missing_values.py: 26.37 sec #21792
  • …/examples/miscellaneous/plot_johnson_lindenstrauss_bound.py: 19.42 sec #21795
  • …/examples/linear_model/plot_sgd_early_stopping.py: 91.61 sec #21627
  • …/examples/kernel_approximation/plot_scalable_poly_kernels.py: 42.52 sec #22903
  • …/examples/ensemble/plot_stack_predictors.py: 32.45 sec #21726
  • …/examples/decomposition/plot_image_denoising.py: 29.42 sec #21799
  • …/examples/applications/plot_model_complexity_influence.py: 28.06 sec #21963
  • …/examples/impute/plot_iterative_imputer_variants_comparison.py: 27.26 sec #21748
  • …/examples/inspection/plot_partial_dependence.py: 21.99 sec #21768
  • …/examples/neighbors/plot_nca_classification.py: 21.13 sec #21771
  • …/examples/miscellaneous/plot_kernel_ridge_regression.py: 18.07 sec #21794 #21791
  • …/examples/linear_model/plot_sparse_logistic_regression_20newsgroups.py: 18.05 sec #21773
  • …/examples/neural_networks/plot_mnist_filters.py: 76.16 sec #21647
  • …/examples/ensemble/plot_gradient_boosting_quantile.py: 60.39 sec #21666
  • …/examples/semi_supervised/plot_semi_supervised_newsgroups.py: 55.99 sec #21673
  • …/examples/ensemble/plot_gradient_boosting_early_stopping.py: 51.35 sec #21609
  • …/examples/manifold/plot_lle_digits.py: 44.89 sec #21736
  • …/examples/svm/plot_svm_scale_c.py: 40.61 sec #21625
  • …/examples/cluster/plot_cluster_comparison.py: 39.24 sec #21624
  • …/examples/compose/plot_digits_pipe.py: 37.29 sec #21728
  • …/examples/model_selection/plot_multi_metric_evaluation.py: 32.78 sec #21626
  • …/examples/ensemble/plot_gradient_boosting_regularization.py: 28.18 sec #21611
  • …/examples/applications/plot_face_recognition.py: 24.58 sec #21725
  • …/examples/linear_model/plot_sgd_comparison.py: 24.05 sec #21610
  • …/examples/ensemble/plot_ensemble_oob.py: 20.69 sec #21730
  • …/examples/feature_selection/plot_select_from_model_diabetes.py: 18.98 sec #21738
  • …/examples/ensemble/plot_gradient_boosting_categorical.py: 18.68 sec #21634
  • …/examples/manifold/plot_compare_methods.py: 14.77 sec #21635
  • …/examples/model_selection/plot_successive_halving_iterations.py: 14.16 sec #21612
  • …/examples/model_selection/plot_randomized_search.py: 253.02 sec #21637
  • …/examples/model_selection/plot_permutation_tests_for_classification.py: 39.82 sec #21649
  • …/examples/cluster/plot_digits_linkage.py: 39.15 sec #21678 #21737
  • …/examples/neural_networks/plot_mlp_alpha.py: 34.27 sec #21648
  • …/examples/preprocessing/plot_discretization_classification.py: 34.11 sec #21661
  • …/examples/manifold/plot_t_sne_perplexity.py: 24.81 sec #21636
  • …/examples/model_selection/plot_validation_curve.py: 15.32 sec #21638
  • …/examples/ensemble/plot_adaboost_multiclass.py: 14.90 sec #21651
  • …/examples/decomposition/plot_pca_vs_fa_model_selection.py: 12.14 sec #21671
  • …/examples/cluster/plot_birch_vs_minibatchkmeans.py: 11.75 sec #21703
  • …/examples/model_selection/plot_learning_curve.py: 10.50 sec #21628

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 63 (44 by maintainers)

Most upvoted comments

@hhnnhh or @marenwestermann may be interested in this.

Could you please change your script to remove examples that run in less than 20s or 15s from the list, to avoid incentivizing PRs with a small added-value to review-time ratio?

@ogrisel done!

I’ll try …/examples/feature_selection/plot_select_from_model_diabetes.py. Is it okay to change the example (use a different dataset) in order to achieve a speedup?

In general, what matters most is the quality of the pedagogical message. It always comes first, and runtime comes second (assuming it’s less than a few minutes). So if you are confident that you can craft an enlightening example that teaches the same concepts with a different dataset, why not. But in general I’m not sure it’s easy or worth it.

@norbusan and I are working on …/examples/ensemble/plot_stack_predictors.py

For instance, you could switch from the digits dataset to the iris dataset in the first and slowest example and speed it up almost 100-fold. The question is then whether that still illustrates the benefit of RandomizedSearchCV. Or you could try HistGradientBoostingClassifier instead of SGDClassifier and see if it runs much faster. Then open a PR, and through discussion we’ll figure out the best choice.
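To make the first suggestion concrete, here is a hedged sketch of the dataset swap. The parameter grid below is illustrative only and does not claim to match the actual `plot_randomized_search` example; the point is that replacing `load_digits` with `load_iris` shrinks the data roughly tenfold in samples and sixteenfold in features, so every candidate fit becomes much cheaper:

```python
# Illustrative sketch only: a small RandomizedSearchCV setup where swapping
# load_digits (1797 samples, 64 features) for load_iris (150 samples,
# 4 features) makes every candidate fit far cheaper. The search space is
# hypothetical, not copied from the real example.
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)  # was: load_digits(return_X_y=True)

clf = SGDClassifier(loss="hinge", penalty="elasticnet", fit_intercept=True)

param_dist = {
    "alpha": loguniform(1e-4, 1e0),
    "average": [True, False],
    "l1_ratio": [0.0, 0.25, 0.5, 0.75, 1.0],
}

search = RandomizedSearchCV(
    clf, param_distributions=param_dist, n_iter=15, random_state=0
)
search.fit(X, y)
print(search.best_params_)
```

Whether the smaller dataset still makes the randomized-vs-grid comparison compelling is exactly the kind of question to settle in the PR discussion.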

@cakiki ideally you’d be able to speed them up just by changing some parameters or reducing the size of the data while still presenting the same outcome, but changing the examples a bit is not necessarily out of scope either, if required.
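The two cheapest levers mentioned above can be sketched as follows. This is a generic, hypothetical snippet rather than a patch to any specific example: it stratified-subsamples the digits dataset and loosens the solver settings where the plot or message does not depend on full convergence:

```python
# Hypothetical sketch of the two cheapest speedups: (1) a stratified
# subsample of the data, (2) looser solver settings (fewer iterations,
# larger tolerance). Neither is taken from a specific example.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# 1) Subsample: 500 stratified samples often tell the same story as 1797.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=500, stratify=y, random_state=0
)

# 2) Loosen parameters where exact convergence does not change the message.
clf = LogisticRegression(max_iter=50, tol=1e-2)
clf.fit(X_small, y_small)
print(f"training accuracy: {clf.score(X_small, y_small):.2f}")
```

The review question for each PR is then whether the subsampled run still produces plots and scores that support the example’s text.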