Build libafcuda with dynamically loaded CUDA numeric binaries #3205


Merged
merged 2 commits into from
Mar 22, 2022

Conversation

umar456
Member

@umar456 umar456 commented Jan 13, 2022

Dynamically link the CUDA numeric libraries instead of always statically linking them.

Description

ArrayFire's CUDA backend linked against the CUDA numeric libraries
statically before this change. This caused the libafcuda library to be
in the 1.1 GB range for CUDA 11.5, even if you were targeting a single compute
capability. This is partially due to the fact that the linker does not
remove the device code for older architectures when linking.

One way around this would be to use nvprune to remove the architectures
that are not targeted by the selected compute capability when building. This
approach is not yet implemented.
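The unimplemented nvprune idea could look roughly like the sketch below. The library path and the sm_75 target are illustrative assumptions, not values from this PR; check `nvprune --help` in your CUDA toolkit for the exact option spelling.

```shell
# Hypothetical sketch of the nvprune approach described above: strip device
# code for every architecture except the one actually being targeted
# (sm_75 here, chosen only for illustration) before statically linking.
nvprune --gpu-architecture sm_75 \
    /usr/local/cuda/lib64/libcublas_static.a \
    -o libcublas_static_sm75.a
```

The pruned archive would then be handed to the linker in place of the full static library, keeping only one architecture's device code in libafcuda.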

This commit reverts to dynamically linking the CUDA numeric
libraries by default. You can still select the old behavior by setting
the AF_WITH_STATIC_CUDA_NUMERIC_LIBS option in CMake.
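A minimal sketch of configuring the two linking modes with CMake. The `AF_BUILD_CUDA` flag and the `-S`/`-B` layout are assumptions based on typical ArrayFire build invocations, not taken from this PR; only `AF_WITH_STATIC_CUDA_NUMERIC_LIBS` is named in the change itself.

```shell
# Default after this change: CUDA numeric libraries are linked dynamically.
cmake -S . -B build -DAF_BUILD_CUDA=ON

# Opt back into the previous behavior of statically linking the
# CUDA numeric libraries (produces the much larger libafcuda):
cmake -S . -B build-static -DAF_BUILD_CUDA=ON \
    -DAF_WITH_STATIC_CUDA_NUMERIC_LIBS=ON
```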

Changes to Users

The binary sizes are significantly smaller, but this requires you to have
the CUDA libraries in the library paths. This is only an issue when building
installers.

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • [ ] Functions added to unified API
  • [ ] Functions documented

@WilliamTambellini
Contributor

@umar456 is there any way for afcuda to dynamically load such NVIDIA libs only when needed (especially cufft, ...)?

@umar456
Member Author

umar456 commented Jan 13, 2022

It could be done, but it would require a good bit of work. We have done this for other libraries, but not for things like cublas and cufft.

@9prady9
Member

9prady9 commented Jan 20, 2022

This is unfortunate; is this still the same with CUDA 11.6?

umar456 added 2 commits March 22, 2022 00:28
@umar456 umar456 merged commit e3f9559 into arrayfire:master Mar 22, 2022