[WIP] PCA NEP-37 adding random pathway and CuPy test #17676
I experimented with CuPy and Dask arrays and identified two blockers:
|
Actually you already did. |
For reference I opened cupy/cupy#3483 to document the lack of |
@WXBN did you run any benchmarks to see what the benefits are of this GPU-based implementation of For instance, on a dataset like MNIST or bigger with 50 components. |
I created a benchmark to compare the performance of the PCA algorithm with and without a GPU. The benefit only appears with a large dataset, here a (10k, 100) dataset. I compared the runtime for different PCA parameters including some that make use of Here are the results on an NVIDIA Tesla V100 : With
With
With
With
|
Thanks for the benchmarks. The speed-up probably also depends on the number of features and the number of components to extract. Also keep in mind that because the GPU version uses QR instead of LU, the results might not have the same explained variance. |
It's weird that you see a difference when changing |
Yes indeed.
Thank you for noticing this. I forgot to run a warm-up launch. CuPy seems to load something on the first call, maybe some JIT-compiled CUDA code. I got better results, still on an NVIDIA Tesla V100: With
With
With
With
|
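The warm-up issue described above can be handled by a small timing helper. This is only a minimal sketch, not the benchmark script used in this PR: `time_with_warmup` and its parameters are hypothetical names, and it assumes the callable blocks until its result is ready (for CuPy that would mean synchronizing the device inside the callable, for example with `cupy.cuda.Stream.null.synchronize()`).

```python
import time


def time_with_warmup(fn, *, warmup=1, repeats=5):
    """Return the best wall-clock time of fn() after `warmup` untimed calls.

    Hypothetical helper for illustration; `fn` is assumed to block until
    its work is finished, otherwise the timings only measure launch cost.
    """
    for _ in range(warmup):
        # Untimed calls absorb one-off costs such as library loading
        # or kernel compilation on the first invocation.
        fn()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)

    # The minimum is less sensitive to background noise than the mean.
    return min(timings)
```

A call such as `time_with_warmup(lambda: pca.fit(X))` would then report a time that excludes the first-call overhead, where `pca` and `X` stand for whatever estimator and data are being benchmarked.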
08eacfe to da3135d
Thanks for the update, that's interesting :) |
da3135d to cc3e539
In your benchmark script could you please report |
Done ;) |
I don't have my GPU machine handy (it's too warm today and I want to keep my flat cool today and tomorrow ;). What are the results? Do the GPU variants with QR instead of LU explain approximately the same amount of variance? |
There seems to be a small difference: With
With
With
With
|
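For readers who want to reproduce this kind of check without a GPU, here is a rough NumPy-only sketch of a randomized SVD that normalizes the power iterations with QR, compared against an exact SVD on explained variance ratio. It is not the code from this PR or from scikit-learn: the function names, the synthetic data, and the parameter defaults are all assumptions made for illustration.

```python
import numpy as np


def randomized_svd_qr(X, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Approximate the top singular values/vectors of X.

    Illustrative sketch: power iterations are re-orthonormalized with QR
    at every step (no LU variant is shown here).
    """
    rng = np.random.default_rng(seed)
    # Random test matrix with a few extra columns for accuracy.
    Q = rng.standard_normal((X.shape[1], n_components + n_oversamples))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(X @ Q)    # orthonormal basis, shape (n_samples, k)
        Q, _ = np.linalg.qr(X.T @ Q)  # orthonormal basis, shape (n_features, k)
    Q, _ = np.linalg.qr(X @ Q)        # final basis for the range of X
    B = Q.T @ X                       # small projected matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return s[:n_components], Vt[:n_components]


def explained_variance_ratio(X_centered, singular_values):
    """Explained variance ratio from singular values of centered data."""
    n_samples = X_centered.shape[0]
    total_var = np.var(X_centered, axis=0, ddof=1).sum()
    return singular_values ** 2 / (n_samples - 1) / total_var


# Synthetic data with a quickly decaying spectrum (an assumption, not MNIST).
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 50)) * 0.7 ** np.arange(50)
X_centered = X - X.mean(axis=0)

k = 5
s_qr, _ = randomized_svd_qr(X_centered, n_components=k)
s_exact = np.linalg.svd(X_centered, compute_uv=False)[:k]

ratio_qr = explained_variance_ratio(X_centered, s_qr)
ratio_exact = explained_variance_ratio(X_centered, s_exact)
max_abs_diff = np.abs(ratio_qr - ratio_exact).max()
```

On data like this the two ratios should agree closely; on data with a flat spectrum, or with fewer power iterations, a larger gap would be expected.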
@WXBN It would be interesting to compare those results with the |
Unfortunately, the only options available for cuML's PCA With With With With |
Reference Issues/PRs
This PR completes the existing experimental attempt to enable NEP-37 for the PCA algorithm.
See #16574
What does this implement/fix? Explain your changes.
randomized_svd when svd_solver='randomized'