PERF Euclidean Specialization for ArgKminClassMode #28219

OmarManzoor · 2024-01-22T12:17:17Z

Reference Issues/PRs

Towards #25888

What does this implement/fix? Explain your changes.

Implements the Euclidean Specialization for ArgKminClassMode

Any other comments?

CC: @jjerphan @Micky774

github-actions · 2024-01-22T12:18:38Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

_{Generated for commit: dc14320. Link to the linter CI: here}

jjerphan · 2024-01-22T13:07:38Z

Thank you, @OmarManzoor.

I do not have time to have a look into it at the moment. IIRC, there are two pathways for this case (extend EuclideanArgKmin or extend ArgKminClassMode).

Can you report benchmarks for various configurations against main, please?

OmarManzoor · 2024-01-23T10:16:21Z

HI @jjerphan

I did run some benchmarks. It seems that when we have test samples of about a 100 the performance has decreased. Otherwise for large test sizes the performance seems to have increased.

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
             --                                                                 n_features / n_neighbors                                               
             ------------------ -----------------------------------------------------------------------------------------------------------------------
              n_train   n_test     50 / 100     50 / 500    50 / 1000     100 / 100     100 / 500    100 / 1000   500 / 100    500 / 500    500 / 1000 
             ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
                1000     100      2.24±0.2ms   4.12±0.3ms   3.42±0.3ms    2.27±0.2ms    4.38±0.3ms   3.76±0.4ms   4.02±0.5ms    6.25±1ms    5.89±0.7ms 
                1000     1000    4.13±0.02ms    11.7±1ms    9.06±0.1ms   4.72±0.01ms   13.1±0.09ms   12.8±0.1ms   13.4±0.2ms   19.9±0.2ms   19.8±0.2ms 
                1000    10000      32.9±4ms     65.0±4ms     69.2±5ms      33.6±3ms      70.5±3ms     70.3±3ms     73.7±4ms     105±3ms      118±6ms   
               10000     100       15.4±4ms     20.1±2ms     28.7±2ms      14.6±1ms      23.1±2ms     32.2±3ms     39.1±4ms     45.6±4ms     58.7±6ms  
               10000     1000    15.5±0.03ms   33.2±0.4ms   50.2±0.1ms    21.5±0.1ms    39.4±0.1ms   55.9±0.1ms   68.3±0.5ms   87.3±0.4ms   106±0.9ms  
               10000    10000      119±5ms      243±4ms      373±7ms       175±4ms       301±6ms      446±10ms     639±20ms     773±20ms     937±30ms  
             ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
              --                                                                   n_features / n_neighbors                                                  
              ------------------ ----------------------------------------------------------------------------------------------------------------------------
               n_train   n_test     50 / 100      50 / 500     50 / 1000     100 / 100     100 / 500     100 / 1000    500 / 100     500 / 500    500 / 1000 
              ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
                 1000     100     1.60±0.01ms   3.62±0.06ms   5.61±0.09ms   1.66±0.01ms   3.70±0.05ms   5.70±0.09ms   2.22±0.02ms   4.33±0.07ms   6.39±0.1ms 
                 1000     1000     10.8±0.2ms     27.6±9ms     39.9±0.5ms    11.0±0.1ms    26.9±0.6ms    40.4±0.4ms    16.1±0.2ms    32.1±0.5ms   45.8±0.6ms 
                 1000    10000      42.8±1ms      152±4ms       290±7ms       48.5±2ms      156±3ms       297±8ms       88.7±2ms      197±6ms      324±5ms   
                10000     100      3.94±0.1ms    10.0±0.5ms    14.3±0.6ms    4.93±0.3ms    10.8±0.4ms    14.8±0.7ms    10.1±0.3ms    16.3±0.6ms    20.4±1ms  
                10000     1000      30.6±1ms      86.7±3ms      117±4ms       36.6±1ms      92.4±3ms      123±4ms       83.4±2ms      141±5ms      173±7ms   
                10000    10000      136±3ms       333±6ms       591±10ms      192±7ms       395±9ms       649±20ms      639±10ms      878±20ms    1.14±0.01s 
              ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============

| Change   | Before [8734d2f0]    | After [5d0219cf] <argkmin_classmode_euclidean>   |   Ratio | Benchmark (Parameter)                                                                        |
|----------|----------------------|--------------------------------------------------|---------|----------------------------------------------------------------------------------------------|
| +        | 3.94±0.1ms           | 15.4±4ms                                         |    3.89 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100)     |
| +        | 10.1±0.3ms           | 39.1±4ms                                         |    3.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 100)    |
| +        | 4.93±0.3ms           | 14.6±1ms                                         |    2.95 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)    |
| +        | 20.4±1ms             | 58.7±6ms                                         |    2.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 1000)   |
| +        | 16.3±0.6ms           | 45.6±4ms                                         |    2.8  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 500)    |
| +        | 14.8±0.7ms           | 32.2±3ms                                         |    2.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000)   |
| +        | 10.8±0.4ms           | 23.1±2ms                                         |    2.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 500)    |
| +        | 14.3±0.6ms           | 28.7±2ms                                         |    2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)    |
| +        | 10.0±0.5ms           | 20.1±2ms                                         |    2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 500)     |
| +        | 2.22±0.02ms          | 4.02±0.5ms                                       |    1.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 100)     |
| +        | 4.33±0.07ms          | 6.25±1ms                                         |    1.44 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 500)     |
| +        | 1.60±0.01ms          | 2.24±0.2ms                                       |    1.4  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 100)      |
| +        | 1.66±0.01ms          | 2.27±0.2ms                                       |    1.37 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 100)     |
| +        | 3.70±0.05ms          | 4.38±0.3ms                                       |    1.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 500)     |
| +        | 3.62±0.06ms          | 4.12±0.3ms                                       |    1.14 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 500)      |
| -        | 136±3ms              | 119±5ms                                          |    0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100)   |
| -        | 878±20ms             | 773±20ms                                         |    0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 500)  |
| -        | 16.1±0.2ms           | 13.4±0.2ms                                       |    0.84 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 100)    |
| -        | 88.7±2ms             | 73.7±4ms                                         |    0.83 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 100)   |
| -        | 83.4±2ms             | 68.3±0.5ms                                       |    0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 100)   |
| -        | 1.14±0.01s           | 937±30ms                                         |    0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 1000) |
| -        | 42.8±1ms             | 32.9±4ms                                         |    0.77 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 100)    |
| -        | 395±9ms              | 301±6ms                                          |    0.76 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 500)  |
| -        | 333±6ms              | 243±4ms                                          |    0.73 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 500)   |
| -        | 48.5±2ms             | 33.6±3ms                                         |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 100)   |
| -        | 649±20ms             | 446±10ms                                         |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |
| -        | 5.70±0.09ms          | 3.76±0.4ms                                       |    0.66 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 1000)    |
| -        | 591±10ms             | 373±7ms                                          |    0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000)  |
| -        | 32.1±0.5ms           | 19.9±0.2ms                                       |    0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 500)    |
| -        | 141±5ms              | 87.3±0.4ms                                       |    0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 500)   |
| -        | 5.61±0.09ms          | 3.42±0.3ms                                       |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 1000)     |
| -        | 173±7ms              | 106±0.9ms                                        |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 1000)  |
| -        | 36.6±1ms             | 21.5±0.1ms                                       |    0.59 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100)   |
| -        | 197±6ms              | 105±3ms                                          |    0.54 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 500)   |
| -        | 30.6±1ms             | 15.5±0.03ms                                      |    0.51 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100)    |
| -        | 26.9±0.6ms           | 13.1±0.09ms                                      |    0.49 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 500)    |
| -        | 156±3ms              | 70.5±3ms                                         |    0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 500)   |
| -        | 123±4ms              | 55.9±0.1ms                                       |    0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000)  |
| -        | 11.0±0.1ms           | 4.72±0.01ms                                      |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 100)    |
| -        | 45.8±0.6ms           | 19.8±0.2ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 1000)   |
| -        | 152±4ms              | 65.0±4ms                                         |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 500)    |
| -        | 92.4±3ms             | 39.4±0.1ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 500)   |
| -        | 117±4ms              | 50.2±0.1ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000)   |
| -        | 27.6±9ms             | 11.7±1ms                                         |    0.42 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 500)     |
| -        | 10.8±0.2ms           | 4.13±0.02ms                                      |    0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 100)     |
| -        | 86.7±3ms             | 33.2±0.4ms                                       |    0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 500)    |
| -        | 324±5ms              | 118±6ms                                          |    0.36 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 1000)  |
| -        | 40.4±0.4ms           | 12.8±0.1ms                                       |    0.32 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 1000)   |
| -        | 297±8ms              | 70.3±3ms                                         |    0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 1000)  |
| -        | 290±7ms              | 69.2±5ms                                         |    0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 1000)   |
| -        | 39.9±0.5ms           | 9.06±0.1ms                                       |    0.23 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 1000)    |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

jjerphan · 2024-01-23T10:37:37Z

Hi @OmarManzoor, thank you for the benchmarks!

We might need to adapt the heuristic when n_test is small.

Could you perform ASV benchmarks for n_neighbors=10 and for larger datasets, e.g. for:

(n_train, n_test) = (10_000, 100_000)
(n_train, n_test) = (1000, 1_000_000)
(n_train, n_test) = (100, 1_000_000)

?

Also, I believe that some part of this contribution might also be reused for RadiusNeighborsClassMode then. What do you think?

OmarManzoor · 2024-01-23T10:40:05Z

Yes I think RadiusNeighbors will be similar but I think we will have to override the EuclideanSpecialization for that separately.

OmarManzoor · 2024-01-23T12:13:48Z

So if I change the strategy to "auto" for "euclidean" the performance increases generically.

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
             --                               n_features / n_neighbors             
             ------------------ ---------------------------------------------------
              n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============ ============ ============ ============
               10000     100     3.64±0.1ms   10.2±0.3ms   4.14±0.1ms   11.1±0.5ms 
               10000     1000     29.7±2ms     89.9±2ms     34.5±2ms     102±8ms   
               10000    10000     116±5ms      387±20ms     187±10ms     437±20ms  
             ========= ======== ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
              --                               n_features / n_neighbors             
              ------------------ ---------------------------------------------------
               n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
              ========= ======== ============ ============ ============ ============
                10000     100     3.96±0.2ms   13.0±0.4ms   4.79±0.2ms   14.0±0.7ms 
                10000     1000    28.9±0.9ms    112±7ms      40.1±4ms     118±5ms   
                10000    10000     143±8ms      609±20ms     214±10ms     717±20ms  
              ========= ======== ============ ============ ============ ============

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                        |
|----------|----------------------|------------------------------|---------|----------------------------------------------------------------------------------------------|
| -        | 4.79±0.2ms           | 4.14±0.1ms                   |    0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)    |
| -        | 214±10ms             | 187±10ms                     |    0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 100)  |
| -        | 40.1±4ms             | 34.5±2ms                     |    0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100)   |
| -        | 118±5ms              | 102±8ms                      |    0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000)  |
| -        | 143±8ms              | 116±5ms                      |    0.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100)   |
| -        | 112±7ms              | 89.9±2ms                     |    0.8  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000)   |
| -        | 14.0±0.7ms           | 11.1±0.5ms                   |    0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000)   |
| -        | 13.0±0.4ms           | 10.2±0.3ms                   |    0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)    |
| -        | 609±20ms             | 387±20ms                     |    0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000)  |
| -        | 717±20ms             | 437±20ms                     |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

However changing to "auto" doesn't work for the "manhattan" metric and the performance decreases

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
             --                               n_features / n_neighbors             
             ------------------ ---------------------------------------------------
              n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============ ============ ============ ============
               10000     100     7.59±0.2ms   13.5±0.4ms   14.3±0.7ms   20.3±0.3ms 
               10000     1000     71.0±2ms     126±5ms      135±5ms      208±9ms   
               10000    10000     487±10ms     847±60ms    1.15±0.04s   1.45±0.04s 
             ========= ======== ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
              --                               n_features / n_neighbors             
              ------------------ ---------------------------------------------------
               n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
              ========= ======== ============ ============ ============ ============
                10000     100      24.9±6ms    39.8±0.1ms   54.3±0.5ms   72.7±0.5ms 
                10000     1000    62.9±0.6ms    110±1ms      147±1ms      199±3ms   
                10000    10000     516±20ms     862±30ms    1.11±0.02s   1.54±0.06s 
              ========= ======== ============ ============ ============ ============

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                      |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| +        | 110±1ms              | 126±5ms                      |    1.15 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000) |
| +        | 62.9±0.6ms           | 71.0±2ms                     |    1.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100)  |
| -        | 39.8±0.1ms           | 13.5±0.4ms                   |    0.34 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)  |
| -        | 24.9±6ms             | 7.59±0.2ms                   |    0.3  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100)   |
| -        | 72.7±0.5ms           | 20.3±0.3ms                   |    0.28 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000) |
| -        | 54.3±0.5ms           | 14.3±0.7ms                   |    0.26 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)  |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

Whereas if we keep the strategy as "parallel_on_X" the performance for "manhattan" does not degrade

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============= ============= ============ ============
             --                                n_features / n_neighbors              
             ------------------ -----------------------------------------------------
              n_train   n_test     50 / 100     50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============= ============= ============ ============
               10000     100     23.1±0.08ms   39.6±0.08ms   53.8±0.3ms    72.0±3ms  
               10000     1000     62.3±0.3ms    108±0.3ms     147±2ms      198±2ms   
               10000    10000      480±9ms       817±20ms    1.06±0.02s   1.38±0.01s 
             ========= ======== ============= ============= ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                       ok
[100.00%] ··· ========= ======== ============= ============= ============= ============
              --                                n_features / n_neighbors               
              ------------------ ------------------------------------------------------
               n_train   n_test     50 / 100     50 / 1000     100 / 100    100 / 1000 
              ========= ======== ============= ============= ============= ============
                10000     100     23.1±0.09ms   39.6±0.08ms   53.9±0.09ms   72.0±0.1ms 
                10000     1000     62.2±0.3ms    108±0.4ms     146±0.4ms    197±0.2ms  
                10000    10000      497±10ms      819±10ms     1.04±0.02s   1.40±0.05s 
              ========= ======== ============= ============= ============= ============


BENCHMARKS NOT SIGNIFICANTLY CHANGED.

I think we need to adjust the strategy based on the metric used now that we are including the euclidean specialization.

OmarManzoor · 2024-01-23T12:43:53Z

@jjerphan Here are the benchmarks that you requested. This is with "euclidean" and strategy "auto"

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ===================
             --        n_test / n_features / n_neighbors
             --------- -------------------
              n_train   100000 / 100 / 10 
             ========= ===================
               10000        1.24±0.03s    
             ========= ===================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ===================
              --        n_test / n_features / n_neighbors
              --------- -------------------
               n_train   100000 / 100 / 10 
              ========= ===================
                10000        1.28±0.03s    
              ========= ===================


BENCHMARKS NOT SIGNIFICANTLY CHANGED.



· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ====================
             --        n_test / n_features / n_neighbors
             --------- --------------------
              n_train   1000000 / 100 / 10 
             ========= ====================
                1000        1.49±0.05s     
             ========= ====================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ====================
              --        n_test / n_features / n_neighbors
              --------- --------------------
               n_train   1000000 / 100 / 10 
              ========= ====================
                 1000        1.64±0.02s     
              ========= ====================


BENCHMARKS NOT SIGNIFICANTLY CHANGED.




· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ====================
             --        n_test / n_features / n_neighbors
             --------- --------------------
              n_train   1000000 / 100 / 10 
             ========= ====================
                100          313±5ms       
             ========= ====================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ====================
              --        n_test / n_features / n_neighbors
              --------- --------------------
               n_train   1000000 / 100 / 10 
              ========= ====================
                 100          456±9ms       
              ========= ====================

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                      |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| -        | 456±9ms              | 313±5ms                      |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(100, 1000000, 100, 10) |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

jjerphan · 2024-01-23T14:39:35Z

Thank you.

Can you provide the ASV benchmark script that you have used?

OmarManzoor · 2024-01-24T07:42:06Z

Thank you.

Can you provide the ASV benchmark script that you have used?

Sure here is the code

import numpy as np

from sklearn.neighbors import KNeighborsClassifier

from .common import Benchmark

N_SAMPLES_TRAIN = [10000]
N_SAMPLES_TEST = [100000]
N_FEATURES = [100]
N_NEIGHBORS = [10]


class BruteForceKNeighborsClassifier(Benchmark):
    """Benchmarks for KNeighborsClassifier.predict when using algorithm='brute'."""

    param_names = ["n_train", "n_test", "n_features", "n_neighbors"]
    params = (
        N_SAMPLES_TRAIN,
        N_SAMPLES_TEST,
        N_FEATURES,
        N_NEIGHBORS,
    )

    def setup(self, *params):
        n_train, n_test, n_features, n_neighbors = params
        metric = "euclidean"
        self.knc = KNeighborsClassifier(
            n_neighbors=n_neighbors, algorithm="brute", metric=metric
        )

        self.rng = np.random.RandomState(0)
        self.X_train = self.rng.random_sample((n_train, n_features))
        self.X_test = self.rng.random_sample((n_test, n_features))
        self.y = self.rng.randint(low=0, high=2, size=(n_train,))

        self.knc.fit(X=self.X_train, y=self.y)

    def time_predict_proba(self, *params):
        self.knc.predict_proba(self.X_test)

jjerphan · 2024-05-29T19:52:11Z

I wonder whether it would make sense to add a heuristic for this specialization or if it would just add too much complexity.

I think we need to socilitate others' opinions.

jjerphan · 2024-09-08T15:41:31Z

@adam2392: would you be interested in reviewing this PR?

adam2392 · 2024-09-09T15:04:24Z

@adam2392: would you be interested in reviewing this PR?

I'm not 100% comfortable w/ the codebase around the tempitaing, but I can take a look as a beginner :). It'll also help me familiarize myself. I will revisit sometime this week.

Just to summarize, is the following correct? Anything I missed?

Is the main idea that computing Euclidean distances can further be improved compared to other distance metrics?
The PR leverages some of the class designs for these generic distance metrics, and then further adds some code logic for the special case of euclidean distance metric to realize this performance improvement?

OmarManzoor · 2024-09-11T07:10:46Z

Just to summarize, is the following correct? Anything I missed?

Is the main idea that computing Euclidean distances can further be improved compared to other distance metrics?

Thats correct

The PR leverages some of the class designs for these generic distance metrics, and then further adds some code logic for the special case of euclidean distance metric to realize this performance improvement?

Yeah that also seems correct.

Basically Euclidean distances have their own specialisation already implemented for the ArgKmin and RadiusNeighbors classes. In the original PR to add ArgKminClassMode these specialisations were not included so as to avoid making the PR too complex and thus left for later.

Euclidean Specialization for ArgKminClassMode

5d0219c

github-actions bot added module:metrics cython labels Jan 22, 2024

jjerphan added the Performance label Jan 23, 2024

OmarManzoor added 2 commits January 25, 2024 17:09

Merge branch 'main' into argkmin_classmode_euclidean

f5db7cb

Add strategy auto for euclidean metrics

dc14320

jjerphan changed the title ~~Euclidean Specialization for ArgKminClassMode~~ PERF Euclidean Specialization for ArgKminClassMode Mar 15, 2024

jjerphan mentioned this pull request Jun 6, 2024

PERF PairwiseDistancesReductions subsequent work #25888

Open

21 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PERF Euclidean Specialization for ArgKminClassMode #28219

PERF Euclidean Specialization for ArgKminClassMode #28219

OmarManzoor commented Jan 22, 2024

github-actions bot commented Jan 22, 2024 •

edited

Loading

jjerphan commented Jan 22, 2024

OmarManzoor commented Jan 23, 2024

jjerphan commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024 •

edited

Loading

jjerphan commented Jan 23, 2024

OmarManzoor commented Jan 24, 2024

jjerphan commented May 29, 2024

jjerphan commented Sep 8, 2024

adam2392 commented Sep 9, 2024

OmarManzoor commented Sep 11, 2024

PERF Euclidean Specialization for ArgKminClassMode #28219

Are you sure you want to change the base?

PERF Euclidean Specialization for ArgKminClassMode #28219

Conversation

OmarManzoor commented Jan 22, 2024

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

github-actions bot commented Jan 22, 2024 • edited Loading

✔️ Linting Passed

jjerphan commented Jan 22, 2024

OmarManzoor commented Jan 23, 2024

jjerphan commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024

OmarManzoor commented Jan 23, 2024 • edited Loading

jjerphan commented Jan 23, 2024

OmarManzoor commented Jan 24, 2024

jjerphan commented May 29, 2024

jjerphan commented Sep 8, 2024

adam2392 commented Sep 9, 2024

OmarManzoor commented Sep 11, 2024

github-actions bot commented Jan 22, 2024 •

edited

Loading

OmarManzoor commented Jan 23, 2024 •

edited

Loading