Skip to content

PERF Euclidean Specialization for ArgKminClassMode #28219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

OmarManzoor
Copy link
Contributor

Reference Issues/PRs

Towards #25888

What does this implement/fix? Explain your changes.

  • Implements the Euclidean Specialization for ArgKminClassMode

Any other comments?

CC: @jjerphan @Micky774

Copy link

github-actions bot commented Jan 22, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: dc14320. Link to the linter CI: here

@jjerphan
Copy link
Member

Thank you, @OmarManzoor.

I do not have time to have a look into it at the moment. IIRC, there are two pathways for this case (extend EuclideanArgKmin or extend ArgKminClassMode).

Can you report benchmarks for various configurations against main, please?

@OmarManzoor
Copy link
Contributor Author

HI @jjerphan

I did run some benchmarks. It seems that when we have test samples of about a 100 the performance has decreased. Otherwise for large test sizes the performance seems to have increased.

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
             --                                                                 n_features / n_neighbors                                               
             ------------------ -----------------------------------------------------------------------------------------------------------------------
              n_train   n_test     50 / 100     50 / 500    50 / 1000     100 / 100     100 / 500    100 / 1000   500 / 100    500 / 500    500 / 1000 
             ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============
                1000     100      2.24±0.2ms   4.12±0.3ms   3.42±0.3ms    2.27±0.2ms    4.38±0.3ms   3.76±0.4ms   4.02±0.5ms    6.25±1ms    5.89±0.7ms 
                1000     1000    4.13±0.02ms    11.7±1ms    9.06±0.1ms   4.72±0.01ms   13.1±0.09ms   12.8±0.1ms   13.4±0.2ms   19.9±0.2ms   19.8±0.2ms 
                1000    10000      32.9±4ms     65.0±4ms     69.2±5ms      33.6±3ms      70.5±3ms     70.3±3ms     73.7±4ms     105±3ms      118±6ms   
               10000     100       15.4±4ms     20.1±2ms     28.7±2ms      14.6±1ms      23.1±2ms     32.2±3ms     39.1±4ms     45.6±4ms     58.7±6ms  
               10000     1000    15.5±0.03ms   33.2±0.4ms   50.2±0.1ms    21.5±0.1ms    39.4±0.1ms   55.9±0.1ms   68.3±0.5ms   87.3±0.4ms   106±0.9ms  
               10000    10000      119±5ms      243±4ms      373±7ms       175±4ms       301±6ms      446±10ms     639±20ms     773±20ms     937±30ms  
             ========= ======== ============= ============ ============ ============= ============= ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
              --                                                                   n_features / n_neighbors                                                  
              ------------------ ----------------------------------------------------------------------------------------------------------------------------
               n_train   n_test     50 / 100      50 / 500     50 / 1000     100 / 100     100 / 500     100 / 1000    500 / 100     500 / 500    500 / 1000 
              ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============
                 1000     100     1.60±0.01ms   3.62±0.06ms   5.61±0.09ms   1.66±0.01ms   3.70±0.05ms   5.70±0.09ms   2.22±0.02ms   4.33±0.07ms   6.39±0.1ms 
                 1000     1000     10.8±0.2ms     27.6±9ms     39.9±0.5ms    11.0±0.1ms    26.9±0.6ms    40.4±0.4ms    16.1±0.2ms    32.1±0.5ms   45.8±0.6ms 
                 1000    10000      42.8±1ms      152±4ms       290±7ms       48.5±2ms      156±3ms       297±8ms       88.7±2ms      197±6ms      324±5ms   
                10000     100      3.94±0.1ms    10.0±0.5ms    14.3±0.6ms    4.93±0.3ms    10.8±0.4ms    14.8±0.7ms    10.1±0.3ms    16.3±0.6ms    20.4±1ms  
                10000     1000      30.6±1ms      86.7±3ms      117±4ms       36.6±1ms      92.4±3ms      123±4ms       83.4±2ms      141±5ms      173±7ms   
                10000    10000      136±3ms       333±6ms       591±10ms      192±7ms       395±9ms       649±20ms      639±10ms      878±20ms    1.14±0.01s 
              ========= ======== ============= ============= ============= ============= ============= ============= ============= ============= ============

| Change   | Before [8734d2f0]    | After [5d0219cf] <argkmin_classmode_euclidean>   |   Ratio | Benchmark (Parameter)                                                                        |
|----------|----------------------|--------------------------------------------------|---------|----------------------------------------------------------------------------------------------|
| +        | 3.94±0.1ms           | 15.4±4ms                                         |    3.89 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100)     |
| +        | 10.1±0.3ms           | 39.1±4ms                                         |    3.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 100)    |
| +        | 4.93±0.3ms           | 14.6±1ms                                         |    2.95 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)    |
| +        | 20.4±1ms             | 58.7±6ms                                         |    2.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 1000)   |
| +        | 16.3±0.6ms           | 45.6±4ms                                         |    2.8  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 500, 500)    |
| +        | 14.8±0.7ms           | 32.2±3ms                                         |    2.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000)   |
| +        | 10.8±0.4ms           | 23.1±2ms                                         |    2.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 500)    |
| +        | 14.3±0.6ms           | 28.7±2ms                                         |    2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)    |
| +        | 10.0±0.5ms           | 20.1±2ms                                         |    2.01 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 500)     |
| +        | 2.22±0.02ms          | 4.02±0.5ms                                       |    1.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 100)     |
| +        | 4.33±0.07ms          | 6.25±1ms                                         |    1.44 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 500, 500)     |
| +        | 1.60±0.01ms          | 2.24±0.2ms                                       |    1.4  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 100)      |
| +        | 1.66±0.01ms          | 2.27±0.2ms                                       |    1.37 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 100)     |
| +        | 3.70±0.05ms          | 4.38±0.3ms                                       |    1.18 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 500)     |
| +        | 3.62±0.06ms          | 4.12±0.3ms                                       |    1.14 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 500)      |
| -        | 136±3ms              | 119±5ms                                          |    0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100)   |
| -        | 878±20ms             | 773±20ms                                         |    0.88 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 500)  |
| -        | 16.1±0.2ms           | 13.4±0.2ms                                       |    0.84 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 100)    |
| -        | 88.7±2ms             | 73.7±4ms                                         |    0.83 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 100)   |
| -        | 83.4±2ms             | 68.3±0.5ms                                       |    0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 100)   |
| -        | 1.14±0.01s           | 937±30ms                                         |    0.82 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 500, 1000) |
| -        | 42.8±1ms             | 32.9±4ms                                         |    0.77 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 100)    |
| -        | 395±9ms              | 301±6ms                                          |    0.76 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 500)  |
| -        | 333±6ms              | 243±4ms                                          |    0.73 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 500)   |
| -        | 48.5±2ms             | 33.6±3ms                                         |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 100)   |
| -        | 649±20ms             | 446±10ms                                         |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |
| -        | 5.70±0.09ms          | 3.76±0.4ms                                       |    0.66 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 100, 1000)    |
| -        | 591±10ms             | 373±7ms                                          |    0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000)  |
| -        | 32.1±0.5ms           | 19.9±0.2ms                                       |    0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 500)    |
| -        | 141±5ms              | 87.3±0.4ms                                       |    0.62 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 500)   |
| -        | 5.61±0.09ms          | 3.42±0.3ms                                       |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 100, 50, 1000)     |
| -        | 173±7ms              | 106±0.9ms                                        |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 500, 1000)  |
| -        | 36.6±1ms             | 21.5±0.1ms                                       |    0.59 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100)   |
| -        | 197±6ms              | 105±3ms                                          |    0.54 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 500)   |
| -        | 30.6±1ms             | 15.5±0.03ms                                      |    0.51 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100)    |
| -        | 26.9±0.6ms           | 13.1±0.09ms                                      |    0.49 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 500)    |
| -        | 156±3ms              | 70.5±3ms                                         |    0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 500)   |
| -        | 123±4ms              | 55.9±0.1ms                                       |    0.45 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000)  |
| -        | 11.0±0.1ms           | 4.72±0.01ms                                      |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 100)    |
| -        | 45.8±0.6ms           | 19.8±0.2ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 500, 1000)   |
| -        | 152±4ms              | 65.0±4ms                                         |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 500)    |
| -        | 92.4±3ms             | 39.4±0.1ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 500)   |
| -        | 117±4ms              | 50.2±0.1ms                                       |    0.43 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000)   |
| -        | 27.6±9ms             | 11.7±1ms                                         |    0.42 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 500)     |
| -        | 10.8±0.2ms           | 4.13±0.02ms                                      |    0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 100)     |
| -        | 86.7±3ms             | 33.2±0.4ms                                       |    0.38 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 500)    |
| -        | 324±5ms              | 118±6ms                                          |    0.36 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 500, 1000)  |
| -        | 40.4±0.4ms           | 12.8±0.1ms                                       |    0.32 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 100, 1000)   |
| -        | 297±8ms              | 70.3±3ms                                         |    0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 100, 1000)  |
| -        | 290±7ms              | 69.2±5ms                                         |    0.24 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 10000, 50, 1000)   |
| -        | 39.9±0.5ms           | 9.06±0.1ms                                       |    0.23 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(1000, 1000, 50, 1000)    |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

@jjerphan
Copy link
Member

Hi @OmarManzoor, thank you for the benchmarks!

We might need to adapt the heuristic when n_test is small.

Could you perform ASV benchmarks for n_neighbors=10 and for larger datasets, e.g. for:

  • (n_train, n_test) = (10_000, 100_000)
  • (n_train, n_test) = (1000, 1_000_000)
  • (n_train, n_test) = (100, 1_000_000)

?

Also, I believe that some part of this contribution might also be reused for RadiusNeighborsClassMode then. What do you think?

@OmarManzoor
Copy link
Contributor Author

Yes I think RadiusNeighbors will be similar but I think we will have to override the EuclideanSpecialization for that separately.

@OmarManzoor
Copy link
Contributor Author

So if I change the strategy to "auto" for "euclidean" the performance increases generically.

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
             --                               n_features / n_neighbors             
             ------------------ ---------------------------------------------------
              n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============ ============ ============ ============
               10000     100     3.64±0.1ms   10.2±0.3ms   4.14±0.1ms   11.1±0.5ms 
               10000     1000     29.7±2ms     89.9±2ms     34.5±2ms     102±8ms   
               10000    10000     116±5ms      387±20ms     187±10ms     437±20ms  
             ========= ======== ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
              --                               n_features / n_neighbors             
              ------------------ ---------------------------------------------------
               n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
              ========= ======== ============ ============ ============ ============
                10000     100     3.96±0.2ms   13.0±0.4ms   4.79±0.2ms   14.0±0.7ms 
                10000     1000    28.9±0.9ms    112±7ms      40.1±4ms     118±5ms   
                10000    10000     143±8ms      609±20ms     214±10ms     717±20ms  
              ========= ======== ============ ============ ============ ============

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                        |
|----------|----------------------|------------------------------|---------|----------------------------------------------------------------------------------------------|
| -        | 4.79±0.2ms           | 4.14±0.1ms                   |    0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)    |
| -        | 214±10ms             | 187±10ms                     |    0.87 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 100)  |
| -        | 40.1±4ms             | 34.5±2ms                     |    0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 100)   |
| -        | 118±5ms              | 102±8ms                      |    0.86 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 100, 1000)  |
| -        | 143±8ms              | 116±5ms                      |    0.81 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 100)   |
| -        | 112±7ms              | 89.9±2ms                     |    0.8  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000)   |
| -        | 14.0±0.7ms           | 11.1±0.5ms                   |    0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000)   |
| -        | 13.0±0.4ms           | 10.2±0.3ms                   |    0.79 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)    |
| -        | 609±20ms             | 387±20ms                     |    0.63 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 50, 1000)  |
| -        | 717±20ms             | 437±20ms                     |    0.61 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 10000, 100, 1000) |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

However changing to "auto" doesn't work for the "manhattan" metric and the performance decreases

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============ ============ ============ ============
             --                               n_features / n_neighbors             
             ------------------ ---------------------------------------------------
              n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============ ============ ============ ============
               10000     100     7.59±0.2ms   13.5±0.4ms   14.3±0.7ms   20.3±0.3ms 
               10000     1000     71.0±2ms     126±5ms      135±5ms      208±9ms   
               10000    10000     487±10ms     847±60ms    1.15±0.04s   1.45±0.04s 
             ========= ======== ============ ============ ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ======== ============ ============ ============ ============
              --                               n_features / n_neighbors             
              ------------------ ---------------------------------------------------
               n_train   n_test    50 / 100    50 / 1000    100 / 100    100 / 1000 
              ========= ======== ============ ============ ============ ============
                10000     100      24.9±6ms    39.8±0.1ms   54.3±0.5ms   72.7±0.5ms 
                10000     1000    62.9±0.6ms    110±1ms      147±1ms      199±3ms   
                10000    10000     516±20ms     862±30ms    1.11±0.02s   1.54±0.06s 
              ========= ======== ============ ============ ============ ============

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                      |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| +        | 110±1ms              | 126±5ms                      |    1.15 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 1000) |
| +        | 62.9±0.6ms           | 71.0±2ms                     |    1.13 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 1000, 50, 100)  |
| -        | 39.8±0.1ms           | 13.5±0.4ms                   |    0.34 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 1000)  |
| -        | 24.9±6ms             | 7.59±0.2ms                   |    0.3  | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 50, 100)   |
| -        | 72.7±0.5ms           | 20.3±0.3ms                   |    0.28 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 1000) |
| -        | 54.3±0.5ms           | 14.3±0.7ms                   |    0.26 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(10000, 100, 100, 100)  |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

Whereas if we keep the strategy as "parallel_on_X" the performance for "manhattan" does not degrade

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit 5d0219cf <argkmin_classmode_euclidean> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ======== ============= ============= ============ ============
             --                                n_features / n_neighbors              
             ------------------ -----------------------------------------------------
              n_train   n_test     50 / 100     50 / 1000    100 / 100    100 / 1000 
             ========= ======== ============= ============= ============ ============
               10000     100     23.1±0.08ms   39.6±0.08ms   53.8±0.3ms    72.0±3ms  
               10000     1000     62.3±0.3ms    108±0.3ms     147±2ms      198±2ms   
               10000    10000      480±9ms       817±20ms    1.06±0.02s   1.38±0.01s 
             ========= ======== ============= ============= ============ ============

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                       ok
[100.00%] ··· ========= ======== ============= ============= ============= ============
              --                                n_features / n_neighbors               
              ------------------ ------------------------------------------------------
               n_train   n_test     50 / 100     50 / 1000     100 / 100    100 / 1000 
              ========= ======== ============= ============= ============= ============
                10000     100     23.1±0.09ms   39.6±0.08ms   53.9±0.09ms   72.0±0.1ms 
                10000     1000     62.2±0.3ms    108±0.4ms     146±0.4ms    197±0.2ms  
                10000    10000      497±10ms      819±10ms     1.04±0.02s   1.40±0.05s 
              ========= ======== ============= ============= ============= ============


BENCHMARKS NOT SIGNIFICANTLY CHANGED.

I think we need to adjust the strategy based on the metric used now that we are including the euclidean specialization.

@OmarManzoor
Copy link
Contributor Author

OmarManzoor commented Jan 23, 2024

@jjerphan Here are the benchmarks that you requested. This is with "euclidean" and strategy "auto"

· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ===================
             --        n_test / n_features / n_neighbors
             --------- -------------------
              n_train   100000 / 100 / 10 
             ========= ===================
               10000        1.24±0.03s    
             ========= ===================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ===================
              --        n_test / n_features / n_neighbors
              --------- -------------------
               n_train   100000 / 100 / 10 
              ========= ===================
                10000        1.28±0.03s    
              ========= ===================


BENCHMARKS NOT SIGNIFICANTLY CHANGED.



· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ====================
             --        n_test / n_features / n_neighbors
             --------- --------------------
              n_train   1000000 / 100 / 10 
             ========= ====================
                1000        1.49±0.05s     
             ========= ====================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0..
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ====================
              --        n_test / n_features / n_neighbors
              --------- --------------------
               n_train   1000000 / 100 / 10 
              ========= ====================
                 1000        1.64±0.02s     
              ========= ====================


BENCHMARKS NOT SIGNIFICANTLY CHANGED.




· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For scikit-learn commit d26bde4c <testing> (round 1/1):
[ 0.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[50.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[50.00%] ··· ========= ====================
             --        n_test / n_features / n_neighbors
             --------- --------------------
              n_train   1000000 / 100 / 10 
             ========= ====================
                100          313±5ms       
             ========= ====================

[50.00%] · For scikit-learn commit 8734d2f0 (round 1/1):
[50.00%] ·· Building for conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0.
[50.00%] ·· Benchmarking conda-py3.10-cython3.0.3-joblib1.3.2-numpy1.25.2-pandas2.1.0-scipy1.11.2-threadpoolctl3.2.0
[100.00%] ··· argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba                                                                                                                                                ok
[100.00%] ··· ========= ====================
              --        n_test / n_features / n_neighbors
              --------- --------------------
               n_train   1000000 / 100 / 10 
              ========= ====================
                 100          456±9ms       
              ========= ====================

| Change   | Before [8734d2f0]    | After [d26bde4c] <testing>   |   Ratio | Benchmark (Parameter)                                                                      |
|----------|----------------------|------------------------------|---------|--------------------------------------------------------------------------------------------|
| -        | 456±9ms              | 313±5ms                      |    0.69 | argkmin_classmode.BruteForceKNeighborsClassifier.time_predict_proba(100, 1000000, 100, 10) |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

@jjerphan
Copy link
Member

Thank you.

Can you provide the ASV benchmark script that you have used?

@OmarManzoor
Copy link
Contributor Author

Thank you.

Can you provide the ASV benchmark script that you have used?

Sure here is the code

import numpy as np

from sklearn.neighbors import KNeighborsClassifier

from .common import Benchmark

N_SAMPLES_TRAIN = [10000]
N_SAMPLES_TEST = [100000]
N_FEATURES = [100]
N_NEIGHBORS = [10]


class BruteForceKNeighborsClassifier(Benchmark):
    """Benchmarks for KNeighborsClassifier.predict when using algorithm='brute'."""

    param_names = ["n_train", "n_test", "n_features", "n_neighbors"]
    params = (
        N_SAMPLES_TRAIN,
        N_SAMPLES_TEST,
        N_FEATURES,
        N_NEIGHBORS,
    )

    def setup(self, *params):
        n_train, n_test, n_features, n_neighbors = params
        metric = "euclidean"
        self.knc = KNeighborsClassifier(
            n_neighbors=n_neighbors, algorithm="brute", metric=metric
        )

        self.rng = np.random.RandomState(0)
        self.X_train = self.rng.random_sample((n_train, n_features))
        self.X_test = self.rng.random_sample((n_test, n_features))
        self.y = self.rng.randint(low=0, high=2, size=(n_train,))

        self.knc.fit(X=self.X_train, y=self.y)

    def time_predict_proba(self, *params):
        self.knc.predict_proba(self.X_test)

@jjerphan jjerphan changed the title Euclidean Specialization for ArgKminClassMode PERF Euclidean Specialization for ArgKminClassMode Mar 15, 2024
@jjerphan
Copy link
Member

I wonder whether it would make sense to add a heuristic for this specialization or if it would just add too much complexity.

I think we need to socilitate others' opinions.

@jjerphan
Copy link
Member

jjerphan commented Sep 8, 2024

@adam2392: would you be interested in reviewing this PR?

@adam2392
Copy link
Member

adam2392 commented Sep 9, 2024

@adam2392: would you be interested in reviewing this PR?

I'm not 100% comfortable w/ the codebase around the tempitaing, but I can take a look as a beginner :). It'll also help me familiarize myself. I will revisit sometime this week.

Just to summarize, is the following correct? Anything I missed?

  1. Is the main idea that computing Euclidean distances can further be improved compared to other distance metrics?
  2. The PR leverages some of the class designs for these generic distance metrics, and then further adds some code logic for the special case of euclidean distance metric to realize this performance improvement?

@OmarManzoor
Copy link
Contributor Author

Just to summarize, is the following correct? Anything I missed?

  1. Is the main idea that computing Euclidean distances can further be improved compared to other distance metrics?

Thats correct

  1. The PR leverages some of the class designs for these generic distance metrics, and then further adds some code logic for the special case of euclidean distance metric to realize this performance improvement?

Yeah that also seems correct.

Basically Euclidean distances have their own specialisation already implemented for the ArgKmin and RadiusNeighbors classes. In the original PR to add ArgKminClassMode these specialisations were not included so as to avoid making the PR too complex and thus left for later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants