TSNE performance regression in 1.5 #29665


Closed
gagandeep987123 opened this issue Aug 13, 2024 · 8 comments · Fixed by #29694

Comments

@gagandeep987123

Describe the bug

TSNE fitting is significantly slower on the newer version than on 1.3.1 when using n_jobs=25.
version 1.3.1

df = np.random.rand(30000, 3)
tsne = TSNE(n_components=2, random_state=42, n_jobs=25, verbose=10, n_iter=1500)

1.5.1

df = np.random.rand(30000, 3)
tsne = TSNE(n_components=2, random_state=42, n_jobs=25, verbose=10, max_iter=1500)

Fit time, 1.3.1 vs 1.5.1: 59 s vs 223 s

Is this intended behavior?

Steps/Code to Reproduce

1.3.1

import numpy as np
from sklearn.manifold import TSNE

df = np.random.rand(30000, 3)
# n_iter was renamed to max_iter in 1.5
tsne = TSNE(n_components=2, random_state=42, n_jobs=25, verbose=10, n_iter=1500)
tsne.fit_transform(df)

1.5.1

df = np.random.rand(30000, 3)
tsne = TSNE(n_components=2, random_state=42, n_jobs=25, verbose=10, max_iter=1500)
tsne.fit_transform(df)

Expected Results

Minimal time discrepancy

Actual Results

Roughly 4x slower on 1.5.1: 59 s on 1.3.1 vs 223 s on 1.5.1

Versions

1.5.1

System:
    python: 3.12.3 (main, Jul 31 2024, 17:43:48) [GCC 13.2.0]
executable: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/bin/python
   machine: Linux-6.8.0-40-generic-x86_64-with-glibc2.39

Python dependencies:
      sklearn: 1.5.1
          pip: 23.2.1
   setuptools: 72.2.0
        numpy: 1.26.4
        scipy: 1.14.0
       Cython: None
       pandas: None
   matplotlib: None
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 28
         prefix: libscipy_openblas
       filepath: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/lib/python3.12/site-packages/scipy.libs/libscipy_openblas-c128ec02.so
        version: 0.3.27.dev
threading_layer: pthreads
   architecture: Haswell


1.3.1
System:
    python: 3.12.3 (main, Jul 31 2024, 17:43:48) [GCC 13.2.0]
executable: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/bin/python
   machine: Linux-6.8.0-40-generic-x86_64-with-glibc2.39

Python dependencies:
      sklearn: 1.3.1
          pip: 23.2.1
   setuptools: 72.2.0
        numpy: 1.26.4
        scipy: 1.14.0
       Cython: None
       pandas: None
   matplotlib: None
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 28
         prefix: libopenblas
       filepath: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/lib/python3.12/site-packages/numpy.libs/libopenblas64_p-r0-0cf96a72.3.23.dev.so
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: Prescott

       user_api: openmp
   internal_api: openmp
    num_threads: 28
         prefix: libgomp
       filepath: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/lib/python3.12/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 28
         prefix: libscipy_openblas
       filepath: /home/gagan/PycharmProjects/scikit_tsne_test/.venv/lib/python3.12/site-packages/scipy.libs/libscipy_openblas-c128ec02.so
        version: 0.3.27.dev
threading_layer: pthreads
   architecture: Haswell
@gagandeep987123 gagandeep987123 added Bug Needs Triage Issue requires triage labels Aug 13, 2024
@adrinjalali
Member

Thanks for the report, I confirm that I can reproduce with:

df = np.random.rand(3000, 3)
%timeit tsne = TSNE(n_components=2, random_state=42, n_jobs=7, verbose=10, n_iter=1500).fit(df)

On 1.3.1:

6.15 s ± 82.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

while on main:

13.2 s ± 160 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Sample run on 1.3.1:

[t-SNE] Computing 91 nearest neighbors...
[t-SNE] Indexed 3000 samples in 0.002s...
[t-SNE] Computed neighbors for 3000 samples in 0.048s...
[t-SNE] Computed conditional probabilities for sample 1000 / 3000
[t-SNE] Computed conditional probabilities for sample 2000 / 3000
[t-SNE] Computed conditional probabilities for sample 3000 / 3000
[t-SNE] Mean sigma: 0.079455
[t-SNE] Computed conditional probabilities in 0.048s
[t-SNE] Iteration 50: error = 74.6297531, gradient norm = 0.0126486 (50 iterations in 0.229s)
[t-SNE] Iteration 100: error = 73.1009979, gradient norm = 0.0013591 (50 iterations in 0.222s)
[t-SNE] Iteration 150: error = 73.0318298, gradient norm = 0.0015761 (50 iterations in 0.215s)
[t-SNE] Iteration 200: error = 72.9756241, gradient norm = 0.0007988 (50 iterations in 0.218s)
[t-SNE] Iteration 250: error = 72.9535065, gradient norm = 0.0003498 (50 iterations in 0.215s)
[t-SNE] KL divergence after 250 iterations with early exaggeration: 72.953506
[t-SNE] Iteration 300: error = 1.9057550, gradient norm = 0.0224387 (50 iterations in 0.216s)
[t-SNE] Iteration 350: error = 1.4685167, gradient norm = 0.0172649 (50 iterations in 0.213s)
[t-SNE] Iteration 400: error = 1.2980621, gradient norm = 0.0145089 (50 iterations in 0.199s)
[t-SNE] Iteration 450: error = 1.2117977, gradient norm = 0.0123560 (50 iterations in 0.198s)
[t-SNE] Iteration 500: error = 1.1618414, gradient norm = 0.0108389 (50 iterations in 0.206s)
[t-SNE] Iteration 550: error = 1.1303933, gradient norm = 0.0094740 (50 iterations in 0.200s)
[t-SNE] Iteration 600: error = 1.1095150, gradient norm = 0.0082941 (50 iterations in 0.194s)
[t-SNE] Iteration 650: error = 1.0954374, gradient norm = 0.0072256 (50 iterations in 0.230s)
[t-SNE] Iteration 700: error = 1.0856805, gradient norm = 0.0059221 (50 iterations in 0.193s)
[t-SNE] Iteration 750: error = 1.0789994, gradient norm = 0.0049726 (50 iterations in 0.217s)
[t-SNE] Iteration 800: error = 1.0746853, gradient norm = 0.0037317 (50 iterations in 0.206s)
[t-SNE] Iteration 850: error = 1.0717297, gradient norm = 0.0026455 (50 iterations in 0.197s)
[t-SNE] Iteration 900: error = 1.0694184, gradient norm = 0.0025572 (50 iterations in 0.191s)
[t-SNE] Iteration 950: error = 1.0672817, gradient norm = 0.0023106 (50 iterations in 0.195s)
[t-SNE] Iteration 1000: error = 1.0657473, gradient norm = 0.0017981 (50 iterations in 0.202s)
[t-SNE] Iteration 1050: error = 1.0644907, gradient norm = 0.0015727 (50 iterations in 0.197s)
[t-SNE] Iteration 1100: error = 1.0635033, gradient norm = 0.0014580 (50 iterations in 0.202s)
[t-SNE] Iteration 1150: error = 1.0625070, gradient norm = 0.0012920 (50 iterations in 0.195s)
[t-SNE] Iteration 1200: error = 1.0616151, gradient norm = 0.0012439 (50 iterations in 0.192s)
[t-SNE] Iteration 1250: error = 1.0609202, gradient norm = 0.0011146 (50 iterations in 0.194s)
[t-SNE] Iteration 1300: error = 1.0601972, gradient norm = 0.0011908 (50 iterations in 0.197s)
[t-SNE] Iteration 1350: error = 1.0595425, gradient norm = 0.0010888 (50 iterations in 0.200s)
[t-SNE] Iteration 1400: error = 1.0588894, gradient norm = 0.0009405 (50 iterations in 0.193s)
[t-SNE] Iteration 1450: error = 1.0583086, gradient norm = 0.0010328 (50 iterations in 0.190s)
[t-SNE] Iteration 1500: error = 1.0576378, gradient norm = 0.0010928 (50 iterations in 0.196s)
[t-SNE] KL divergence after 1500 iterations: 1.057638

Sample run on main:

[t-SNE] Computing 91 nearest neighbors...
[t-SNE] Indexed 3000 samples in 0.001s...
[t-SNE] Computed neighbors for 3000 samples in 0.039s...
[t-SNE] Computed conditional probabilities for sample 1000 / 3000
[t-SNE] Computed conditional probabilities for sample 2000 / 3000
[t-SNE] Computed conditional probabilities for sample 3000 / 3000
[t-SNE] Mean sigma: 0.079133
[t-SNE] Computed conditional probabilities in 0.042s
[t-SNE] Iteration 50: error = 74.6860580, gradient norm = 0.0130361 (50 iterations in 0.561s)
[t-SNE] Iteration 100: error = 73.0622330, gradient norm = 0.0016993 (50 iterations in 0.458s)
[t-SNE] Iteration 150: error = 73.0010300, gradient norm = 0.0008503 (50 iterations in 0.440s)
[t-SNE] Iteration 200: error = 72.9877853, gradient norm = 0.0004468 (50 iterations in 0.447s)
[t-SNE] Iteration 250: error = 72.9845200, gradient norm = 0.0002729 (50 iterations in 0.439s)
[t-SNE] KL divergence after 250 iterations with early exaggeration: 72.984520
[t-SNE] Iteration 300: error = 1.9100612, gradient norm = 0.0219436 (50 iterations in 0.426s)
[t-SNE] Iteration 350: error = 1.4687014, gradient norm = 0.0174686 (50 iterations in 0.410s)
[t-SNE] Iteration 400: error = 1.2948278, gradient norm = 0.0145004 (50 iterations in 0.417s)
[t-SNE] Iteration 450: error = 1.2071151, gradient norm = 0.0125211 (50 iterations in 0.420s)
[t-SNE] Iteration 500: error = 1.1559333, gradient norm = 0.0109722 (50 iterations in 0.416s)
[t-SNE] Iteration 550: error = 1.1242188, gradient norm = 0.0094801 (50 iterations in 0.434s)
[t-SNE] Iteration 600: error = 1.1032512, gradient norm = 0.0085060 (50 iterations in 0.447s)
[t-SNE] Iteration 650: error = 1.0888082, gradient norm = 0.0075305 (50 iterations in 0.433s)
[t-SNE] Iteration 700: error = 1.0787106, gradient norm = 0.0061661 (50 iterations in 0.439s)
[t-SNE] Iteration 750: error = 1.0722656, gradient norm = 0.0045898 (50 iterations in 0.433s)
[t-SNE] Iteration 800: error = 1.0677128, gradient norm = 0.0040846 (50 iterations in 0.434s)
[t-SNE] Iteration 850: error = 1.0642205, gradient norm = 0.0034133 (50 iterations in 0.429s)
[t-SNE] Iteration 900: error = 1.0615629, gradient norm = 0.0030121 (50 iterations in 0.427s)
[t-SNE] Iteration 950: error = 1.0593433, gradient norm = 0.0026399 (50 iterations in 0.430s)
[t-SNE] Iteration 1000: error = 1.0574670, gradient norm = 0.0022615 (50 iterations in 0.438s)
[t-SNE] Iteration 1050: error = 1.0560901, gradient norm = 0.0018045 (50 iterations in 0.433s)
[t-SNE] Iteration 1100: error = 1.0549316, gradient norm = 0.0016540 (50 iterations in 0.427s)
[t-SNE] Iteration 1150: error = 1.0540546, gradient norm = 0.0012580 (50 iterations in 0.447s)
[t-SNE] Iteration 1200: error = 1.0533767, gradient norm = 0.0010741 (50 iterations in 0.434s)
[t-SNE] Iteration 1250: error = 1.0526401, gradient norm = 0.0011667 (50 iterations in 0.433s)
[t-SNE] Iteration 1300: error = 1.0518836, gradient norm = 0.0012524 (50 iterations in 0.424s)
[t-SNE] Iteration 1350: error = 1.0511994, gradient norm = 0.0011868 (50 iterations in 0.434s)
[t-SNE] Iteration 1400: error = 1.0507367, gradient norm = 0.0009460 (50 iterations in 0.430s)
[t-SNE] Iteration 1450: error = 1.0502670, gradient norm = 0.0008885 (50 iterations in 0.426s)
[t-SNE] Iteration 1500: error = 1.0498598, gradient norm = 0.0009467 (50 iterations in 0.432s)
[t-SNE] KL divergence after 1500 iterations: 1.049860
13.2 s ± 160 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This requires more investigation to figure out where the performance hit is coming from.

@adrinjalali adrinjalali added Performance Regression and removed Bug Needs Triage Issue requires triage labels Aug 14, 2024
@adrinjalali
Member

Looking at the git blame and our related PRs, I'm struggling to find the relevant change causing this regression.

cc @scikit-learn/core-devs for help.

@jjerphan
Member

I would recommend using git-bisect.
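A bisect like this can be automated with `git bisect run` and a small benchmark script. The sketch below is hypothetical: the script name, n_jobs, and the 10-second good/bad threshold are all illustrative and would need tuning to sit between the two timing regimes, and each bisect step also needs the compiled extensions rebuilt.

```python
# bisect_tsne.py (hypothetical name) -- a run-script for:
#   git bisect start <bad-ref> <good-ref>
#   git bisect run python bisect_tsne.py
# git bisect treats exit code 0 as "good" and 1-124 as "bad".
import time

import numpy as np
from sklearn.manifold import TSNE

FAST_THRESHOLD_S = 10.0  # illustrative; pick a value between the fast and slow regimes


def classify(elapsed_s, threshold_s=FAST_THRESHOLD_S):
    """Map a measured fit time to the exit code git bisect expects."""
    return 0 if elapsed_s < threshold_s else 1


def run_benchmark():
    """Time one TSNE fit on random data, mirroring the reproducer above."""
    X = np.random.rand(3000, 3)
    start = time.perf_counter()
    TSNE(n_components=2, random_state=42, n_jobs=7).fit(X)
    return time.perf_counter() - start


# As a script, one would end with: sys.exit(classify(run_benchmark()))
print(classify(5.0), classify(15.0))  # -> 0 1
```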

@adrinjalali
Member

adrinjalali commented Aug 14, 2024

Something did NOT go well down that git bisect path 😆

$ git bisect good                                                                               
0e19a4822ff49951d2a7606444a1a6085c32b56b is the first bad commit
commit 0e19a4822ff49951d2a7606444a1a6085c32b56b
Author: Lucy Liu <jliu176@gmail.com>
Date:   Tue Apr 2 23:07:24 2024 +1100

    DOC Fix typo `LogisticRegressionCV` docstring (#28746)

 sklearn/linear_model/_logistic.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

I'll try again

EDIT: pretty sure it's because I didn't handle meson/setuptools switch well while bisecting.

@jjerphan
Member

Might it be due to the use of meson for the build system, with the compiler flags now being different? We have seen this in the past in another issue.

@adrinjalali
Member

Yep. Compiling the 1.5.1 tag with python setup.py develop runs TSNE fast (11-12 s), while the same tag built with meson takes about 20 s.

cc @lesteve

@lesteve
Member

lesteve commented Aug 19, 2024

After looking at this a bit, it does seem linked to OpenMP: we apparently need to add the openmp_dep to every extension that needs it. For TSNE, for example, the following seems to improve the situation (I need to look a bit more to be 100% sure this fixes the issue):

diff --git a/sklearn/manifold/meson.build b/sklearn/manifold/meson.build
index b112f63dd4..ee83e8afc5 100644
--- a/sklearn/manifold/meson.build
+++ b/sklearn/manifold/meson.build
@@ -9,7 +9,7 @@ py.extension_module(
 py.extension_module(
   '_barnes_hut_tsne',
   '_barnes_hut_tsne.pyx',
-  dependencies: [np_dep],
+  dependencies: [np_dep, openmp_dep],
   cython_args: cython_args,
   subdir: 'sklearn/manifold',
   install: true

My guess right now is that the setuptools build added the OpenMP flags globally, whereas with meson they need to be added explicitly to each extension module that needs them.
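This is consistent with the threadpoolctl info in the report: the 1.3.1 environment lists a libgomp (OpenMP) entry, while the 1.5.1 environment shows only BLAS runtimes. A sketch of how to check for a loaded OpenMP runtime directly, using threadpoolctl's public API (whether a runtime actually appears depends on how the installed build was compiled):

```python
# Check whether an OpenMP runtime (e.g. libgomp) is loaded after importing
# scikit-learn's compiled extensions; a build whose extensions were compiled
# without OpenMP support would show no "openmp" entry here.
import sklearn.manifold  # noqa: F401  -- pulls in the Cython extension modules
from threadpoolctl import threadpool_info

openmp_runtimes = [d for d in threadpool_info() if d.get("user_api") == "openmp"]
print("OpenMP runtimes loaded:", [d.get("prefix") for d in openmp_runtimes])
```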

@lesteve
Member

lesteve commented Aug 20, 2024

I have opened #29694 which I think fixes this issue.

@lesteve lesteve changed the title TSNE efficiency 1.3.1 vs 1.5.1 when using n_jobs TSNE performance regression in 1.5 Aug 22, 2024