[WIP] PCA NEP-37 adding random pathway and CuPy test #17676

viclafargue · 2020-06-23T15:51:13Z

Reference Issues/PRs

This PR completes the existing experimental attempt to enable NEP-37 for the PCA algorithm.
See #16574

What does this implement/fix? Explain your changes.

Implement the pathway that makes use of randomized_svd when svd_solver='randomized'
Add a CuPy test

viclafargue · 2020-06-24T12:45:24Z

I experimented with CuPy and Dask arrays and identified two blockers:

With PCA parameters svd_solver='randomized' and iterated_power >= 2, linalg.lu is required. Unfortunately, this function is not implemented in CuPy, and linalg.lu_factor does not seem to be usable as an alternative. One solution, suggested by @ogrisel, is to use a QR decomposition when linalg.lu is not available in the module.
PCA requires linalg.svd. The Dask array implementation of this function differs: it does not accept the full_matrices parameter, so the necessary output cannot be retrieved. See "Supporting full_matrices argument with Dask's svd" (dask/dask#3576).
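For readers following along, the QR fallback for the first blocker can be sketched roughly as below. This is a minimal NumPy-only illustration, not the code from this PR: the names `normalize_step` and `randomized_range_finder` and their signatures are hypothetical, and `numpy.linalg` (which also has no `lu`) stands in for a namespace such as CuPy's.

```python
import numpy as np


def normalize_step(linalg, Y):
    """Return a better-conditioned basis for the columns of Y.

    Hypothetical helper: uses LU when the given linalg namespace
    provides it, and falls back to QR otherwise.
    """
    if hasattr(linalg, "lu"):
        # e.g. scipy.linalg.lu(Y, permute_l=True) returns (P @ L, U)
        Q, _ = linalg.lu(Y, permute_l=True)
    else:
        # numpy.linalg.qr returns the reduced factorization by default
        Q, _ = linalg.qr(Y)
    return Q


def randomized_range_finder(A, size, n_iter, linalg=np.linalg, seed=0):
    """Approximate an orthonormal basis for the range of A."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((A.shape[1], size))
    # Power iterations, normalized at each step for numerical stability
    for _ in range(n_iter):
        Q = normalize_step(linalg, A @ Q)
        Q = normalize_step(linalg, A.T @ Q)
    # The final orthonormalization uses QR in both cases
    Q, _ = np.linalg.qr(A @ Q)
    return Q


# Exactly rank-5 matrix, so 8 sampled directions capture its whole range
rng = np.random.default_rng(42)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 30))
Q = randomized_range_finder(A, size=8, n_iter=2)
residual = np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)
print(f"relative residual: {residual:.2e}")
```

Passing `scipy.linalg` as the `linalg` argument would take the LU branch instead; with `numpy.linalg` the `hasattr` check fails and QR is used, which mirrors the situation described for CuPy.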

ogrisel · 2020-06-24T15:54:47Z

One solution, suggested by @ogrisel, is to use a QR decomposition when linalg.lu is not available in the module.

~~Can you please update your PR to implement this solution?~~

Actually you already did.

ogrisel · 2020-06-24T17:25:52Z

For reference I opened cupy/cupy#3483 to document the lack of linalg.lu upstream.

ogrisel · 2020-06-24T17:28:03Z

@WXBN did you run some benchmarks to see what the benefits of this GPU-based implementation of PCA / randomized_svd are?

For instance on a dataset like MNIST or bigger with 50 components.

viclafargue · 2020-06-25T09:09:11Z

I created a benchmark to compare the performance of the PCA algorithm with and without a GPU. The benefit only appears with a large dataset, here a (10k, 100) dataset. I compared the runtime for different PCA parameters including some that make use of randomized_svd.

Here are the results on an NVIDIA Tesla V100:

With svd_solver='full' and iterated_power=2:

Without GPU : 1.220s
With GPU : 0.727s

With svd_solver='full' and iterated_power=10:

Without GPU : 1.204s
With GPU : 0.013s

With svd_solver='randomized' and iterated_power=2:

Without GPU : 0.106s
With GPU : 0.641s

With svd_solver='randomized' and iterated_power=10:

Without GPU : 0.261s
With GPU : 0.276s

ogrisel · 2020-06-25T12:57:22Z

Thanks for the benchmarks. It also probably depends on the number of features and components to extract.

Also keep in mind that because the GPU version uses QR instead of LU, the results might not have the same explained variance.

ogrisel · 2020-06-25T12:58:59Z

It's weird that you see a difference when changing iterated_power with svd_solver="full". iterated_power should only impact the randomized solver.

viclafargue · 2020-06-25T14:40:26Z

Also keep in mind that because the GPU version uses QR instead of LU, the results might not have the same explained variance.

Yes indeed.

It's weird that you see a difference when changing iterated_power with svd_solver="full". iterated_power should only impact the randomized solver.

Thank you for noticing this. I forgot to run a warm-up launch. CuPy seems to be loading something on the first call, maybe some JIT CUDA code.

I got better results, still on an NVIDIA Tesla V100:

With svd_solver='full' and iterated_power=2:

Without GPU : 1.195s
With GPU : 0.013s

With svd_solver='full' and iterated_power=10:

Without GPU : 1.195s
With GPU : 0.013s

With svd_solver='randomized' and iterated_power=2:

Without GPU : 0.034s
With GPU : 0.010s

With svd_solver='randomized' and iterated_power=10:

Without GPU : 0.186s
With GPU : 0.010s
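The warm-up issue mentioned above can be illustrated with a small timing helper. This is only a sketch under assumptions, not the benchmark script from this PR: `time_call` and its parameters are hypothetical names, and `workload` is a stand-in for a call such as `pca.fit(X)` whose first invocation pays a one-time cost.

```python
import time


def time_call(fn, n_warmup=2, n_repeat=5, sync=None):
    """Return the best wall-clock time of fn() over n_repeat timed runs.

    The n_warmup untimed runs absorb one-time costs such as lazy
    initialization or kernel compilation. `sync` is an optional callable
    that blocks until asynchronous work has finished; for CuPy one would
    typically pass cupy.cuda.Stream.null.synchronize (assumption: a
    single default stream is in use).
    """
    for _ in range(n_warmup):
        fn()
        if sync is not None:
            sync()
    timings = []
    for _ in range(n_repeat):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)
    return min(timings)


calls = []


def workload():
    # The first call simulates a one-time setup cost
    if not calls:
        time.sleep(0.05)
    calls.append(1)
    sum(i * i for i in range(10_000))


best = time_call(workload, n_warmup=2, n_repeat=5)
print(f"best of 5 after warm-up: {best:.4f}s")
```

Without the warm-up runs, the simulated setup cost would land in the first timed measurement, which is the kind of distortion seen in the first set of GPU timings.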

ogrisel · 2020-06-25T14:43:53Z

Thanks for the update, that's interesting :)

ogrisel · 2020-06-25T15:34:32Z

In your benchmark script could you please report pca.explained_variance_.sum() in addition to the timings?

viclafargue · 2020-06-25T16:19:02Z

Done ;)

ogrisel · 2020-06-25T17:23:45Z

I don't have my GPU machine handy (it's too warm today, and I want to keep my flat cool today and tomorrow ;). What are the results? Do the GPU variants with QR instead of LU explain approximately the same amount of variance?

viclafargue · 2020-06-26T09:02:47Z

Do the GPU variants with QR instead of LU explain approximately the same amount of variance?

There seems to be a small difference:

With svd_solver='full' and iterated_power=2:

Without GPU : explained variance: 5.658
With GPU : explained variance: 5.658

With svd_solver='full' and iterated_power=10:

Without GPU : explained variance: 5.658
With GPU : explained variance: 5.658

With svd_solver='randomized' and iterated_power=2:

Without GPU : explained variance: 5.145
With GPU : explained variance: 5.144

With svd_solver='randomized' and iterated_power=10:

Without GPU : explained variance: 5.624
With GPU : explained variance: 5.636
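To make the comparison above easier to reproduce without a GPU, here is a rough NumPy-only sketch of how the summed explained variance of a full SVD compares with that of a randomized, QR-normalized approximation. It is not scikit-learn's implementation: the function names, the oversampling default, and the synthetic data are assumptions made for illustration. The explained variance is computed as the squared singular values of the centered data divided by `n_samples - 1`.

```python
import numpy as np


def explained_variance_full(X, n_components):
    """Explained variance of the top components from an exact SVD."""
    Xc = X - X.mean(axis=0)
    S = np.linalg.svd(Xc, compute_uv=False)
    return S[:n_components] ** 2 / (X.shape[0] - 1)


def explained_variance_randomized(X, n_components, n_iter,
                                  n_oversamples=10, seed=0):
    """Same quantity from a randomized SVD with QR-normalized iterations."""
    Xc = X - X.mean(axis=0)
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((Xc.shape[1], n_components + n_oversamples))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(Xc @ Q)
        Q, _ = np.linalg.qr(Xc.T @ Q)
    Q, _ = np.linalg.qr(Xc @ Q)
    # Singular values of the projected matrix never exceed the exact ones,
    # so the approximation can only underestimate the explained variance.
    S = np.linalg.svd(Q.T @ Xc, compute_uv=False)
    return S[:n_components] ** 2 / (X.shape[0] - 1)


# Synthetic data with slowly decaying feature scales (illustrative only)
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50)) * np.linspace(3.0, 0.5, 50)

full = explained_variance_full(X, 10).sum()
approx_2 = explained_variance_randomized(X, 10, n_iter=2).sum()
approx_10 = explained_variance_randomized(X, 10, n_iter=10).sum()
print(f"full: {full:.3f}  n_iter=2: {approx_2:.3f}  n_iter=10: {approx_10:.3f}")
```

More power iterations usually bring the randomized value closer to the exact one, which matches the pattern in the numbers reported above, although the exact gap depends on the data and the random seed.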

ogrisel · 2020-06-26T10:00:16Z

@WXBN It would be interesting to compare those results with the cuml.decomposition.PCA implementation from rapidsai.

viclafargue · 2020-06-26T13:31:34Z

Unfortunately, the only options available for cuML's PCA svd_solver parameter are 'full' and 'jacobi'. Because of this, I can only compare on svd_solver='full'.

With svd_solver='full' and iterated_power=2:
Without GPU : runtime: 1.197s, explained variance: 6.967
With CuPy : runtime: 0.016s, explained variance: 6.967
With cuML : runtime: 0.007s, explained variance: 6.967

With svd_solver='full' and iterated_power=10:
Without GPU : runtime: 1.210s, explained variance: 6.967
With CuPy : runtime: 0.016s, explained variance: 6.967
With cuML : runtime: 0.007s, explained variance: 6.967

With svd_solver='randomized' and iterated_power=2:
Without GPU : runtime: 0.032s, explained variance: 6.745
With CuPy : runtime: 0.012s, explained variance: 6.693

With svd_solver='randomized' and iterated_power=10:
Without GPU : runtime: 0.204s, explained variance: 6.945
With CuPy : runtime: 0.040s, explained variance: 6.949

thomasjpfan added 30 commits February 21, 2020 16:09

MNT Adds labeler

3882723

BUG Fix

d79390b

Double quotes are better

d71ecae

BUG Fix

919a519

BUG Fix

a89754c

MNT Adds build ci tag

0f56d96

MNT Use fork for new feature

e4ee673

Merge branch 'only_change_setup'

409be8d

MNT Uses tagged version

3e729e4

Merge remote-tracking branch 'upstream/master'

faef88c

WIP Testing nep37 [skip ci]

60c3834

WIP Testing nep37 [skip ci]

e7bfda8

WIP Testing nep37 [skip ci]

8cdfd2d

WIP Testing nep37 [skip ci]

a8dc598

WIP Testing nep37 [skip ci]

adb07db

WIP Testing nep37 [skip ci]

df8ca82

WIP Testing nep37 [skip ci]

ea1c0fb

WIP Testing nep37 [skip ci]

4fe65fe

Merge remote-tracking branch 'upstream/master' into pca_array_functio…

5ca03e0

…n_nep37

WIP Testing nep37 [skip ci]

fe6293d

WIP Testing nep37 [skip ci]

8c3001f

WIP Testing nep37 [skip ci]

33bbd52

WIP Testing nep37 [skip ci]

19b9d2a

WIP Testing nep37 [skip ci]

a988b8a

WIP Testing nep37 [skip ci]

a33bde0

WIP Testing nep37 [skip ci]

d50fbbf

BUG Fix

817c4f7

Merge remote-tracking branch 'upstream/master' into pca_array_functio…

2718bb7

…n_nep37_bk

WIP Update

d14166f

WIP Enables support for JAx

e2b74a5

Use QR when LU is not available

79f6177

ogrisel reviewed Jun 24, 2020

sklearn/utils/extmath.py (outdated, resolved)

viclafargue added 2 commits June 25, 2020 08:38

Use _get_array_module for randomized_svd

fc93b8d

PCA GPU benchmark

fe60628

Adding warm-up for reliable benchmarking

9340f7e

viclafargue force-pushed the pca_nep37_random_pathway branch from 08eacfe to da3135d Compare June 25, 2020 14:43

2 warmups better than 1

cc3e539

viclafargue force-pushed the pca_nep37_random_pathway branch from da3135d to cc3e539 Compare June 25, 2020 14:45

Display sum of explained variance

0610142

Comparison with cuML

51e1f11

oleksandr-pavlyk mentioned this pull request Jul 20, 2020

Related topic: NumPy array protocols data-apis/array-api#1

Closed

stsievert mentioned this pull request Aug 18, 2020

CuPy arrays adriangb/scikeras#64

Open

jeremiedbb mentioned this pull request Nov 9, 2020

Compatibility with scikit-learn API rapidsai/cuml#3125

Closed

Base automatically changed from master to main January 22, 2021 10:52

ogrisel mentioned this pull request Feb 1, 2022

Path for Adopting the Array API spec #22352

Open
