Skip to content

Conversation

frasercrmck
Copy link
Contributor

This commit introduces a new compiler option which lets the target infer the name of the libclc builtins library to link with. This is contrasted with the previous option where the user explicitly provides a namespec.

For the AMDGPU target, the general form of libclc libraries is gfxXYZ-amdgcn--.bc or gfxXYZ-amdgcn-mesa-mesa3d.bc. The target will try to link with either of these libraries, based on the triple and GPU architecture.

Note that this patch doesn't guarantee that a library will necessary be found for all CPUs or GFX architectures. For example, some libclc libraries use another naming scheme, for example 'tonga' instead of 'gfx802'. These are not accounted for. If an inferred library isn't found then the compiler will error as if the user had provided the path themselves. Future patches will focus on libclc and standardize the naming system to reduce the likelihood of this happening.

This commit introduces a new compiler option which lets the target infer
the name of the libclc builtins library to link with. This is contrasted
with the previous option where the user explicitly provides a namespec.

For the AMDGPU target, the general form of libclc libraries is
gfxXYZ-amdgcn--.bc or gfxXYZ-amdgcn-mesa-mesa3d.bc. The target will
try to link with either of these libraries, based on the triple and GPU
architecture.

Note that this patch doesn't guarantee that a library will necessary be
found for all CPUs or GFX architectures. For example, some libclc
libraries use another naming scheme, for example 'tonga' instead of
'gfx802'. These are not accounted for. If an inferred library isn't
found then the compiler will error as if the user had provided the path
themselves. Future patches will focus on libclc and standardize the
naming system to reduce the likelihood of this happening.
@frasercrmck frasercrmck requested review from arsenm and jhuber6 August 12, 2025 10:24
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 12, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 12, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Fraser Cormack (frasercrmck)

Changes

This commit introduces a new compiler option which lets the target infer the name of the libclc builtins library to link with. This is contrasted with the previous option where the user explicitly provides a namespec.

For the AMDGPU target, the general form of libclc libraries is gfxXYZ-amdgcn--.bc or gfxXYZ-amdgcn-mesa-mesa3d.bc. The target will try to link with either of these libraries, based on the triple and GPU architecture.

Note that this patch doesn't guarantee that a library will necessary be found for all CPUs or GFX architectures. For example, some libclc libraries use another naming scheme, for example 'tonga' instead of 'gfx802'. These are not accounted for. If an inferred library isn't found then the compiler will error as if the user had provided the path themselves. Future patches will focus on libclc and standardize the naming system to reduce the likelihood of this happening.


Full diff: https://github.com/llvm/llvm-project/pull/153162.diff

5 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+2)
  • (modified) clang/include/clang/Driver/CommonArgs.h (+2-1)
  • (modified) clang/include/clang/Driver/Options.td (+2)
  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+13-2)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+21-6)
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 0f17f4aa761ea..adbfbfed12da5 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -395,6 +395,8 @@ def warn_drv_fraw_string_literals_in_cxx11 : Warning<
   InGroup<UnusedCommandLineArgument>;
 
 def err_drv_libclc_not_found : Error<"no libclc library '%0' found in the clang resource directory">;
+def err_drv_libclc_not_inferred : Error<
+  "could not infer a libclc library for this target; try providing a namespec or full path">;
 
 def err_drv_invalid_malign_branch_EQ : Error<
   "invalid argument '%0' to -malign-branch=; each element must be one of: %1">;
diff --git a/clang/include/clang/Driver/CommonArgs.h b/clang/include/clang/Driver/CommonArgs.h
index 1464ce4e1b31b..ea4b73082623a 100644
--- a/clang/include/clang/Driver/CommonArgs.h
+++ b/clang/include/clang/Driver/CommonArgs.h
@@ -218,7 +218,8 @@ void addOpenMPDeviceRTL(const Driver &D, const llvm::opt::ArgList &DriverArgs,
                         const ToolChain &HostTC);
 
 void addOpenCLBuiltinsLib(const Driver &D, const llvm::opt::ArgList &DriverArgs,
-                          llvm::opt::ArgStringList &CC1Args);
+                          llvm::opt::ArgStringList &CC1Args,
+                          const StringRef InferredLibclcLibName = "");
 
 void addOutlineAtomicsArgs(const Driver &D, const ToolChain &TC,
                            const llvm::opt::ArgList &Args,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 6aab43c9ed57f..1dc688d25b9b3 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1435,6 +1435,8 @@ def openacc_macro_override_EQ
 
 // End Clang specific/exclusive options for OpenACC.
 
+def libclc_lib : Joined<["--"], "libclc-lib">, Group<opencl_Group>,
+  HelpText<"Link with the libclc OpenCL bitcode library">;
 def libclc_lib_EQ : Joined<["--"], "libclc-lib=">, Group<opencl_Group>,
   HelpText<"Namespec of libclc OpenCL bitcode library to link">;
 def libomptarget_amdgpu_bc_path_EQ : Joined<["--"], "libomptarget-amdgpu-bc-path=">, Group<i_Group>,
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 5e463b9c98687..f1bcfd47831a3 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -858,8 +858,19 @@ void AMDGPUToolChain::addClangTargetOptions(
     CC1Args.push_back("-fapply-global-visibility-to-externs");
   }
 
-  if (DeviceOffloadingKind == Action::OFK_None)
-    addOpenCLBuiltinsLib(getDriver(), DriverArgs, CC1Args);
+  if (DeviceOffloadingKind == Action::OFK_None) {
+    auto GpuArch = getGPUArch(DriverArgs);
+    auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
+    const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind);
+    SmallString<128> InferredLibclcLibName(CanonArch);
+    if (getTriple().getOS() != llvm::Triple::Mesa3D)
+      InferredLibclcLibName += "-amdgcn--";
+    else
+      InferredLibclcLibName += "-amdgcn-mesa-mesa3d";
+    InferredLibclcLibName += ".bc";
+    addOpenCLBuiltinsLib(getDriver(), DriverArgs, CC1Args,
+                         InferredLibclcLibName);
+  }
 }
 
 void AMDGPUToolChain::addClangWarningOptions(ArgStringList &CC1Args) const {
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index eb88f7bd4b469..5be3babc100d2 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -2999,27 +2999,42 @@ void tools::addHIPRuntimeLibArgs(const ToolChain &TC, Compilation &C,
 
 void tools::addOpenCLBuiltinsLib(const Driver &D,
                                  const llvm::opt::ArgList &DriverArgs,
-                                 llvm::opt::ArgStringList &CC1Args) {
+                                 llvm::opt::ArgStringList &CC1Args,
+                                 const StringRef InferredLibclcLibName) {
   // Check whether user specifies a libclc bytecode library
-  const Arg *A = DriverArgs.getLastArg(options::OPT_libclc_lib_EQ);
+  const Arg *A = DriverArgs.getLastArg(options::OPT_libclc_lib,
+                                       options::OPT_libclc_lib_EQ);
   if (!A)
     return;
 
+  bool IsEQOpt = A->getOption().matches(options::OPT_libclc_lib_EQ);
+
   // Find device libraries in <LLVM_DIR>/lib/clang/<ver>/lib/libclc/
   SmallString<128> LibclcPath(D.ResourceDir);
   llvm::sys::path::append(LibclcPath, "lib", "libclc");
 
   // If the namespec is of the form :filename, search for that file.
-  StringRef LibclcNamespec(A->getValue());
-  bool FilenameSearch = LibclcNamespec.consume_front(":");
-  SmallString<128> LibclcTargetFile(LibclcNamespec);
+  bool FilenameSearch = false;
+  SmallString<128> LibclcTargetFile;
+
+  if (IsEQOpt) {
+    StringRef LibclcNamespec(A->getValue());
+    FilenameSearch = LibclcNamespec.consume_front(":");
+    LibclcTargetFile = LibclcNamespec;
+  } else {
+    if (InferredLibclcLibName.empty()) {
+      D.Diag(diag::err_drv_libclc_not_inferred);
+      return;
+    }
+    LibclcTargetFile = InferredLibclcLibName;
+  }
 
   if (FilenameSearch && llvm::sys::fs::exists(LibclcTargetFile)) {
     CC1Args.push_back("-mlink-builtin-bitcode");
     CC1Args.push_back(DriverArgs.MakeArgString(LibclcTargetFile));
   } else {
     // Search the library paths for the file
-    if (!FilenameSearch)
+    if (!FilenameSearch && IsEQOpt)
       LibclcTargetFile += ".bc";
 
     llvm::sys::path::append(LibclcPath, LibclcTargetFile);

@llvmbot
Copy link
Member

llvmbot commented Aug 12, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Fraser Cormack (frasercrmck)

Changes

This commit introduces a new compiler option which lets the target infer the name of the libclc builtins library to link with. This is contrasted with the previous option where the user explicitly provides a namespec.

For the AMDGPU target, the general form of libclc libraries is gfxXYZ-amdgcn--.bc or gfxXYZ-amdgcn-mesa-mesa3d.bc. The target will try to link with either of these libraries, based on the triple and GPU architecture.

Note that this patch doesn't guarantee that a library will necessary be found for all CPUs or GFX architectures. For example, some libclc libraries use another naming scheme, for example 'tonga' instead of 'gfx802'. These are not accounted for. If an inferred library isn't found then the compiler will error as if the user had provided the path themselves. Future patches will focus on libclc and standardize the naming system to reduce the likelihood of this happening.


Full diff: https://github.com/llvm/llvm-project/pull/153162.diff

5 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+2)
  • (modified) clang/include/clang/Driver/CommonArgs.h (+2-1)
  • (modified) clang/include/clang/Driver/Options.td (+2)
  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+13-2)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+21-6)
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 0f17f4aa761ea..adbfbfed12da5 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -395,6 +395,8 @@ def warn_drv_fraw_string_literals_in_cxx11 : Warning<
   InGroup<UnusedCommandLineArgument>;
 
 def err_drv_libclc_not_found : Error<"no libclc library '%0' found in the clang resource directory">;
+def err_drv_libclc_not_inferred : Error<
+  "could not infer a libclc library for this target; try providing a namespec or full path">;
 
 def err_drv_invalid_malign_branch_EQ : Error<
   "invalid argument '%0' to -malign-branch=; each element must be one of: %1">;
diff --git a/clang/include/clang/Driver/CommonArgs.h b/clang/include/clang/Driver/CommonArgs.h
index 1464ce4e1b31b..ea4b73082623a 100644
--- a/clang/include/clang/Driver/CommonArgs.h
+++ b/clang/include/clang/Driver/CommonArgs.h
@@ -218,7 +218,8 @@ void addOpenMPDeviceRTL(const Driver &D, const llvm::opt::ArgList &DriverArgs,
                         const ToolChain &HostTC);
 
 void addOpenCLBuiltinsLib(const Driver &D, const llvm::opt::ArgList &DriverArgs,
-                          llvm::opt::ArgStringList &CC1Args);
+                          llvm::opt::ArgStringList &CC1Args,
+                          const StringRef InferredLibclcLibName = "");
 
 void addOutlineAtomicsArgs(const Driver &D, const ToolChain &TC,
                            const llvm::opt::ArgList &Args,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 6aab43c9ed57f..1dc688d25b9b3 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1435,6 +1435,8 @@ def openacc_macro_override_EQ
 
 // End Clang specific/exclusive options for OpenACC.
 
+def libclc_lib : Joined<["--"], "libclc-lib">, Group<opencl_Group>,
+  HelpText<"Link with the libclc OpenCL bitcode library">;
 def libclc_lib_EQ : Joined<["--"], "libclc-lib=">, Group<opencl_Group>,
   HelpText<"Namespec of libclc OpenCL bitcode library to link">;
 def libomptarget_amdgpu_bc_path_EQ : Joined<["--"], "libomptarget-amdgpu-bc-path=">, Group<i_Group>,
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 5e463b9c98687..f1bcfd47831a3 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -858,8 +858,19 @@ void AMDGPUToolChain::addClangTargetOptions(
     CC1Args.push_back("-fapply-global-visibility-to-externs");
   }
 
-  if (DeviceOffloadingKind == Action::OFK_None)
-    addOpenCLBuiltinsLib(getDriver(), DriverArgs, CC1Args);
+  if (DeviceOffloadingKind == Action::OFK_None) {
+    auto GpuArch = getGPUArch(DriverArgs);
+    auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
+    const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind);
+    SmallString<128> InferredLibclcLibName(CanonArch);
+    if (getTriple().getOS() != llvm::Triple::Mesa3D)
+      InferredLibclcLibName += "-amdgcn--";
+    else
+      InferredLibclcLibName += "-amdgcn-mesa-mesa3d";
+    InferredLibclcLibName += ".bc";
+    addOpenCLBuiltinsLib(getDriver(), DriverArgs, CC1Args,
+                         InferredLibclcLibName);
+  }
 }
 
 void AMDGPUToolChain::addClangWarningOptions(ArgStringList &CC1Args) const {
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index eb88f7bd4b469..5be3babc100d2 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -2999,27 +2999,42 @@ void tools::addHIPRuntimeLibArgs(const ToolChain &TC, Compilation &C,
 
 void tools::addOpenCLBuiltinsLib(const Driver &D,
                                  const llvm::opt::ArgList &DriverArgs,
-                                 llvm::opt::ArgStringList &CC1Args) {
+                                 llvm::opt::ArgStringList &CC1Args,
+                                 const StringRef InferredLibclcLibName) {
   // Check whether user specifies a libclc bytecode library
-  const Arg *A = DriverArgs.getLastArg(options::OPT_libclc_lib_EQ);
+  const Arg *A = DriverArgs.getLastArg(options::OPT_libclc_lib,
+                                       options::OPT_libclc_lib_EQ);
   if (!A)
     return;
 
+  bool IsEQOpt = A->getOption().matches(options::OPT_libclc_lib_EQ);
+
   // Find device libraries in <LLVM_DIR>/lib/clang/<ver>/lib/libclc/
   SmallString<128> LibclcPath(D.ResourceDir);
   llvm::sys::path::append(LibclcPath, "lib", "libclc");
 
   // If the namespec is of the form :filename, search for that file.
-  StringRef LibclcNamespec(A->getValue());
-  bool FilenameSearch = LibclcNamespec.consume_front(":");
-  SmallString<128> LibclcTargetFile(LibclcNamespec);
+  bool FilenameSearch = false;
+  SmallString<128> LibclcTargetFile;
+
+  if (IsEQOpt) {
+    StringRef LibclcNamespec(A->getValue());
+    FilenameSearch = LibclcNamespec.consume_front(":");
+    LibclcTargetFile = LibclcNamespec;
+  } else {
+    if (InferredLibclcLibName.empty()) {
+      D.Diag(diag::err_drv_libclc_not_inferred);
+      return;
+    }
+    LibclcTargetFile = InferredLibclcLibName;
+  }
 
   if (FilenameSearch && llvm::sys::fs::exists(LibclcTargetFile)) {
     CC1Args.push_back("-mlink-builtin-bitcode");
     CC1Args.push_back(DriverArgs.MakeArgString(LibclcTargetFile));
   } else {
     // Search the library paths for the file
-    if (!FilenameSearch)
+    if (!FilenameSearch && IsEQOpt)
       LibclcTargetFile += ".bc";
 
     llvm::sys::path::append(LibclcPath, LibclcTargetFile);

Comment on lines +866 to +869
if (getTriple().getOS() != llvm::Triple::Mesa3D)
InferredLibclcLibName += "-amdgcn--";
else
InferredLibclcLibName += "-amdgcn-mesa-mesa3d";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just directly take the triple value? In particular this ignores the amdhsa case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be good. It would work for the Mesa3D case but the only amdhsa we have is amdgcn--amdhsa.bc, without any gfx prefixes.

I don't understand the various AMDGPU architectures nor the libclc naming scheme well enough, I'm afraid.

Is using a amdgcn-amd-amdhsa triple compatible with libclc?

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we really back to having thirty different versions of the same library? Do the libclc libraries need something that target specific that you can't just keep it as generic?

@arsenm
Copy link
Contributor

arsenm commented Aug 12, 2025

Are we really back to having thirty different versions of the same library? Do the libclc libraries need something that target specific that you can't just keep it as generic?

The build currently creates a bunch of symlinks, and I don't think it currently tries to reproduce the ISA version checks device libs has (or maybe rather, those were ripped out in the distance past).

The full amdgcn bitcode files we end up with appear to be:
amdgcn--amdhsa.bc, tahiti-amdgcn--.bc, and tahiti-amdgcn-mesa-mesa3d.bc

The tahiti prefixed ones are in fact infected with target-cpu=gfx600, so overall the build is pretty busted

@jhuber6
Copy link
Contributor

jhuber6 commented Aug 12, 2025

Using bitcode without -mcpu should work in the general case, but we're doing -mlink-builtin-bitcode here which propagates attributes anyway. I don't see why we'd need multiple builds unless they're using some ISA specific behavior that can't be put through a builtin.

@frasercrmck
Copy link
Contributor Author

frasercrmck commented Aug 13, 2025

I'm not aware of any ISA-specific logic in libclc, but I may well be missing some subtleties. The attribute pollution is unfortunate - we certainly don't want to be building libclc any more times than we need to.

Downstream we have modified the libclc "prepare-builtins" utility to remove target-cpu and target-features attributes from amdgcn libraries. I've not looked closely into this, but is that a possible solution for this problem? It's obviously a hack. I was made vaguely aware of plans to allow AMDGPU to generate "generic" bytecode - is that on the horizon?

@jhuber6 this should indeed work without -mcpu - thanks for pointing that out. I'll push a fix.

@jhuber6
Copy link
Contributor

jhuber6 commented Aug 13, 2025

I'm not aware of any ISA-specific logic in libclc, but I may well be missing some subtleties. The attribute pollution is unfortunate - we certainly don't want to be building libclc any more times than we need to.

Downstream we have modified the libclc "prepare-builtins" utility to remove target-cpu and target-features attributes from amdgcn libraries. I've not looked closely into this, but is that a possible solution for this problem? It's obviously a hack. I was made vaguely aware of plans to allow AMDGPU to generate "generic" bytecode - is that on the horizon?

@jhuber6 this should indeed work without -mcpu - thanks for pointing that out. I'll push a fix.

We've always had """generic""" bitcode, it's just required a little hand holding. The ROCm Device Libs generate IR that gets fed in through -mlink-builtin-bitcode which is designed to propagate target attributes. The libc libraries just compile without -mcpu and intentionally do not use anything that's ISA specific.

@frasercrmck
Copy link
Contributor Author

Getting back to this, I tried removing -mcpu=tahiti from the build (so tahiti-amdgcn--.bc -> amdgcn--.bc and removing all of the gfx* aliases entirely) and I am seeing IR changes.

Most notably, tahiti uses llvm.ldexp.f32.i32 and llvm.fma.*f32 intrinsics where the unadorned one doesn't. I think it's because __HAS_FMAF__ and __HAS_LDEXPF__ are both false. Is this expected? I would assume that we want libclc to use these intrinsics for amdgcn. Any advice? I see that they're deprecated but have been so since at least 2018. Could we unconditionally emit these intrinsics?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants