-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
PERF Euclidean Specialization for ArgKminClassMode #28219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
PERF Euclidean Specialization for ArgKminClassMode #28219
Conversation
Thank you, @OmarManzoor. I do not have time to have a look into it at the moment. IIRC, there are two pathways for this case (extend Can you report benchmarks for various configurations against |
HI @jjerphan I did run some benchmarks. It seems that when we have test samples of about a 100 the performance has decreased. Otherwise for large test sizes the performance seems to have increased. · Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
-- n_features / n_neighbors
------------------ -----------------------------------------------------------------------------------------------------------------------
n_train n_test 50 / 100 50 / 500 50 / 1000 100 / 100 100 / 500 100 / 1000 500 / 100 500 / 500 500 / 1000
========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
1000 100 2.24±0.2ms 4.12±0.3ms 3.42±0.3ms 2.27±0.2ms 4.38±0.3ms 3.76±0.4ms 4.02±0.5ms 6.25±1ms 5.89±0.7ms
1000 1000 4.13±0.02ms 11.7±1ms 9.06±0.1ms 4.72±0.01ms 13.1±0.09ms 12.8±0.1ms 13.4±0.2ms 19.9±0.2ms 19.8±0.2ms
1000 10000 32.9±4ms 65.0±4ms 69.2±5ms 33.6±3ms 70.5±3ms 70.3±3ms 73.7±4ms 105±3ms 118±6ms
10000 100 15.4±4ms 20.1±2ms 28.7±2ms 14.6±1ms 23.1±2ms 32.2±3ms 39.1±4ms 45.6±4ms 58.7±6ms
10000 1000 15.5±0.03ms 33.2±0.4ms 50.2±0.1ms 21.5±0.1ms 39.4±0.1ms 55.9±0.1ms 68.3±0.5ms 87.3±0.4ms 106±0.9ms
10000 10000 119±5ms 243±4ms 373±7ms 175±4ms 301±6ms 446±10ms 639±20ms 773±20ms 937±30ms
========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
-- n_features / n_neighbors
------------------ ----------------------------------------------------------------------------------------------------------------------------
n_train n_test 50 / 100 50 / 500 50 / 1000 100 / 100 100 / 500 100 / 1000 500 / 100 500 / 500 500 / 1000
========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
1000 100 1.60±0.01ms 3.62±0.06ms 5.61±0.09ms 1.66±0.01ms 3.70±0.05ms 5.70±0.09ms 2.22±0.02ms 4.33±0.07ms 6.39±0.1ms
1000 1000 10.8±0.2ms 27.6±9ms 39.9±0.5ms 11.0±0.1ms 26.9±0.6ms 40.4±0.4ms 16.1±0.2ms 32.1±0.5ms 45.8±0.6ms
1000 10000 42.8±1ms 152±4ms 290±7ms 48.5±2ms 156±3ms 297±8ms 88.7±2ms 197±6ms 324±5ms
10000 100 3.94±0.1ms 10.0±0.5ms 14.3±0.6ms 4.93±0.3ms 10.8±0.4ms 14.8±0.7ms 10.1±0.3ms 16.3±0.6ms 20.4±1ms
10000 1000 30.6±1ms 86.7±3ms 117±4ms 36.6±1ms 92.4±3ms 123±4ms 83.4±2ms 141±5ms 173±7ms
10000 10000 136±3ms 333±6ms 591±10ms 192±7ms 395±9ms 649±20ms 639±10ms 878±20ms 1.14±0.01s
========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
| Change | Before [8734d2f0] | After [5d0219cf] <argkmin_classmode_euclidean> | Ratio | Benchmark (Parameter) |
|----------|----------------------|--------------------------------------------------|---------|----------------------------------------------------------------------------------------------|
| + | 3.94±0.1ms | 15.4±4ms | 3.89 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100) |
| + | 10.1±0.3ms | 39.1±4ms | 3.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 100) |
| + | 4.93±0.3ms | 14.6±1ms | 2.95 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100) |
| + | 20.4±1ms | 58.7±6ms | 2.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 1000) |
| + | 16.3±0.6ms | 45.6±4ms | 2.8 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 500) |
| + | 14.8±0.7ms | 32.2±3ms | 2.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000) |
| + | 10.8±0.4ms | 23.1±2ms | 2.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 500) |
| + | 14.3±0.6ms | 28.7±2ms | 2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000) |
| + | 10.0±0.5ms | 20.1±2ms | 2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 500) |
| + | 2.22±0.02ms | 4.02±0.5ms | 1.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 100) |
| + | 4.33±0.07ms | 6.25±1ms | 1.44 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 500) |
| + | 1.60±0.01ms | 2.24±0.2ms | 1.4 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 100) |
| + | 1.66±0.01ms | 2.27±0.2ms | 1.37 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 100) |
| + | 3.70±0.05ms | 4.38±0.3ms | 1.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 500) |
| + | 3.62±0.06ms | 4.12±0.3ms | 1.14 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 500) |
| - | 136±3ms | 119±5ms | 0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100) |
| - | 878±20ms | 773±20ms | 0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 500) |
| - | 16.1±0.2ms | 13.4±0.2ms | 0.84 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 100) |
| - | 88.7±2ms | 73.7±4ms | 0.83 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 100) |
| - | 83.4±2ms | 68.3±0.5ms | 0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 100) |
| - | 1.14±0.01s | 937±30ms | 0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 1000) |
| - | 42.8±1ms | 32.9±4ms | 0.77 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 100) |
| - | 395±9ms | 301±6ms | 0.76 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 500) |
| - | 333±6ms | 243±4ms | 0.73 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 500) |
| - | 48.5±2ms | 33.6±3ms | 0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 100) |
| - | 649±20ms | 446±10ms | 0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |
| - | 5.70±0.09ms | 3.76±0.4ms | 0.66 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 1000) |
| - | 591±10ms | 373±7ms | 0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000) |
| - | 32.1±0.5ms | 19.9±0.2ms | 0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 500) |
| - | 141±5ms | 87.3±0.4ms | 0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 500) |
| - | 5.61±0.09ms | 3.42±0.3ms | 0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 1000) |
| - | 173±7ms | 106±0.9ms | 0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 1000) |
| - | 36.6±1ms | 21.5±0.1ms | 0.59 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100) |
| - | 197±6ms | 105±3ms | 0.54 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 500) |
| - | 30.6±1ms | 15.5±0.03ms | 0.51 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100) |
| - | 26.9±0.6ms | 13.1±0.09ms | 0.49 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 500) |
| - | 156±3ms | 70.5±3ms | 0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 500) |
| - | 123±4ms | 55.9±0.1ms | 0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000) |
| - | 11.0±0.1ms | 4.72±0.01ms | 0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 100) |
| - | 45.8±0.6ms | 19.8±0.2ms | 0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 1000) |
| - | 152±4ms | 65.0±4ms | 0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 500) |
| - | 92.4±3ms | 39.4±0.1ms | 0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 500) |
| - | 117±4ms | 50.2±0.1ms | 0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000) |
| - | 27.6±9ms | 11.7±1ms | 0.42 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 500) |
| - | 10.8±0.2ms | 4.13±0.02ms | 0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 100) |
| - | 86.7±3ms | 33.2±0.4ms | 0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 500) |
| - | 324±5ms | 118±6ms | 0.36 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 1000) |
| - | 40.4±0.4ms | 12.8±0.1ms | 0.32 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 1000) |
| - | 297±8ms | 70.3±3ms | 0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 1000) |
| - | 290±7ms | 69.2±5ms | 0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 1000) |
| - | 39.9±0.5ms | 9.06±0.1ms | 0.23 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 1000) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.
|
Hi @OmarManzoor, thank you for the benchmarks! We might need to adapt the heuristic when Could you perform ASV benchmarks for
? Also, I believe that some part of this contribution might also be reused for |
Yes I think RadiusNeighbors will be similar but I think we will have to override the EuclideanSpecialization for that separately. |
So if I change the strategy to "auto" for "euclidean" the performance increases generically. · Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
-- n_features / n_neighbors
------------------ ---------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============ ============ ============ ============
10000 100 3.64±0.1ms 10.2±0.3ms 4.14±0.1ms 11.1±0.5ms
10000 1000 29.7±2ms 89.9±2ms 34.5±2ms 102±8ms
10000 10000 116±5ms 387±20ms 187±10ms 437±20ms
========= ======== ============ ============ ============ ============
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
-- n_features / n_neighbors
------------------ ---------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============ ============ ============ ============
10000 100 3.96±0.2ms 13.0±0.4ms 4.79±0.2ms 14.0±0.7ms
10000 1000 28.9±0.9ms 112±7ms 40.1±4ms 118±5ms
10000 10000 143±8ms 609±20ms 214±10ms 717±20ms
========= ======== ============ ============ ============ ============
| Change | Before [8734d2f0] | After [d26bde4c] <testing> | Ratio | Benchmark (Parameter) |
|----------|----------------------|------------------------------|---------|----------------------------------------------------------------------------------------------|
| - | 4.79±0.2ms | 4.14±0.1ms | 0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100) |
| - | 214±10ms | 187±10ms | 0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 100) |
| - | 40.1±4ms | 34.5±2ms | 0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100) |
| - | 118±5ms | 102±8ms | 0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000) |
| - | 143±8ms | 116±5ms | 0.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100) |
| - | 112±7ms | 89.9±2ms | 0.8 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000) |
| - | 14.0±0.7ms | 11.1±0.5ms | 0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000) |
| - | 13.0±0.4ms | 10.2±0.3ms | 0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000) |
| - | 609±20ms | 387±20ms | 0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000) |
| - | 717±20ms | 437±20ms | 0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
However changing to "auto" doesn't work for the "manhattan" metric and the performance decreases · Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
-- n_features / n_neighbors
------------------ ---------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============ ============ ============ ============
10000 100 7.59±0.2ms 13.5±0.4ms 14.3±0.7ms 20.3±0.3ms
10000 1000 71.0±2ms 126±5ms 135±5ms 208±9ms
10000 10000 487±10ms 847±60ms 1.15±0.04s 1.45±0.04s
========= ======== ============ ============ ============ ============
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
-- n_features / n_neighbors
------------------ ---------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============ ============ ============ ============
10000 100 24.9±6ms 39.8±0.1ms 54.3±0.5ms 72.7±0.5ms
10000 1000 62.9±0.6ms 110±1ms 147±1ms 199±3ms
10000 10000 516±20ms 862±30ms 1.11±0.02s 1.54±0.06s
========= ======== ============ ============ ============ ============
| Change | Before [8734d2f0] | After [d26bde4c] <testing> | Ratio | Benchmark (Parameter) |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| + | 110±1ms | 126±5ms | 1.15 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000) |
| + | 62.9±0.6ms | 71.0±2ms | 1.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100) |
| - | 39.8±0.1ms | 13.5±0.4ms | 0.34 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000) |
| - | 24.9±6ms | 7.59±0.2ms | 0.3 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100) |
| - | 72.7±0.5ms | 20.3±0.3ms | 0.28 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000) |
| - | 54.3±0.5ms | 14.3±0.7ms | 0.26 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED. Whereas if we keep the strategy as "parallel_on_X" the performance for "manhattan" does not degrade · Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ======== ============= ============= ============ ============
-- n_features / n_neighbors
------------------ -----------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============= ============= ============ ============
10000 100 23.1±0.08ms 39.6±0.08ms 53.8±0.3ms 72.0±3ms
10000 1000 62.3±0.3ms 108±0.3ms 147±2ms 198±2ms
10000 10000 480±9ms 817±20ms 1.06±0.02s 1.38±0.01s
========= ======== ============= ============= ============ ============
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ======== ============= ============= ============= ============
-- n_features / n_neighbors
------------------ ------------------------------------------------------
n_train n_test 50 / 100 50 / 1000 100 / 100 100 / 1000
========= ======== ============= ============= ============= ============
10000 100 23.1±0.09ms 39.6±0.08ms 53.9±0.09ms 72.0±0.1ms
10000 1000 62.2±0.3ms 108±0.4ms 146±0.4ms 197±0.2ms
10000 10000 497±10ms 819±10ms 1.04±0.02s 1.40±0.05s
========= ======== ============= ============= ============= ============
BENCHMARKS NOT SIGNIFICANTLY CHANGED. I think we need to adjust the strategy based on the metric used now that we are including the euclidean specialization. |
@jjerphan Here are the benchmarks that you requested. This is with "euclidean" and strategy "auto" · Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ===================
-- n_test / n_features / n_neighbors
--------- -------------------
n_train 100000 / 100 / 10
========= ===================
10000 1.24±0.03s
========= ===================
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ===================
-- n_test / n_features / n_neighbors
--------- -------------------
n_train 100000 / 100 / 10
========= ===================
10000 1.28±0.03s
========= ===================
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ====================
-- n_test / n_features / n_neighbors
--------- --------------------
n_train 1000000 / 100 / 10
========= ====================
1000 1.49±0.05s
========= ====================
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ====================
-- n_test / n_features / n_neighbors
--------- --------------------
n_train 1000000 / 100 / 10
========= ====================
1000 1.64±0.02s
========= ====================
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[50.00%] ··· ========= ====================
-- n_test / n_features / n_neighbors
--------- --------------------
n_train 1000000 / 100 / 10
========= ====================
100 313±5ms
========= ====================
[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba ok
[100.00%] ··· ========= ====================
-- n_test / n_features / n_neighbors
--------- --------------------
n_train 1000000 / 100 / 10
========= ====================
100 456±9ms
========= ====================
| Change | Before [8734d2f0] | After [d26bde4c] <testing> | Ratio | Benchmark (Parameter) |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| - | 456±9ms | 313±5ms | 0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(100, 1000000, 100, 10) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
|
Thank you. Can you provide the ASV benchmark script that you have used? |
Sure here is the code import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from .common import Benchmark
N_SAMPLES_TRAIN = [10000]
N_SAMPLES_TEST = [100000]
N_FEATURES = [100]
N_NEIGHBORS = [10]
class BruteForceKNeighborsClassifier(Benchmark):
"""Benchmarks for KNeighborsClassifier.predict when using algorithm='brute'."""
param_names = ["n_train", "n_test", "n_features", "n_neighbors"]
params = (
N_SAMPLES_TRAIN,
N_SAMPLES_TEST,
N_FEATURES,
N_NEIGHBORS,
)
def setup(self, *params):
n_train, n_test, n_features, n_neighbors = params
metric = "euclidean"
self.knc = KNeighborsClassifier(
n_neighbors=n_neighbors, algorithm="brute", metric=metric
)
self.rng = np.random.RandomState(0)
self.X_train = self.rng.random_sample((n_train, n_features))
self.X_test = self.rng.random_sample((n_test, n_features))
self.y = self.rng.randint(low=0, high=2, size=(n_train,))
self.knc.fit(X=self.X_train, y=self.y)
def time_predict_proba(self, *params):
self.knc.predict_proba(self.X_test) |
I wonder whether it would make sense to add a heuristic for this specialization or if it would just add too much complexity. I think we need to socilitate others' opinions. |
@adam2392: would you be interested in reviewing this PR? |
I'm not 100% comfortable w/ the codebase around the tempitaing, but I can take a look as a beginner :). It'll also help me familiarize myself. I will revisit sometime this week. Just to summarize, is the following correct? Anything I missed?
|
Thats correct
Yeah that also seems correct. Basically Euclidean distances have their own specialisation already implemented for the ArgKmin and RadiusNeighbors classes. In the original PR to add ArgKminClassMode these specialisations were not included so as to avoid making the PR too complex and thus left for later. |
Reference Issues/PRs
Towards #25888
What does this implement/fix? Explain your changes.
Any other comments?
CC: @jjerphan @Micky774