[lit] Refactor available `ptxas` features #154439

justinfargnoli · 2025-08-19T23:30:03Z

ToT lit currently assumes that a given ptxas version supports all capabilities of prior ptxas releases. This approach was flexible enough to handle the removal of 32-bit address compilation from ptxas in CUDA 12.1, but it struggles with the removal of compilation for Volta and earlier architectures in CUDA 13.0.
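
To illustrate the assumption described above, here is a minimal sketch of a monotonic, version-based feature scheme. It is not the code in lit.cfg.py: `KNOWN_VERSIONS` and `monotonic_features` are hypothetical names, and the version list is made up for illustration. Only the feature names (`ptxas`, `ptxas-X.Y`) and the guard `ptxas && !ptxas-12.0` come from the RUN lines in this PR.

```python
# Hypothetical list of ptxas releases the test suite knows about.
KNOWN_VERSIONS = [(11, 0), (11, 4), (11, 8), (12, 0), (12, 8), (13, 0)]


def monotonic_features(version):
    """Return 'ptxas' plus 'ptxas-X.Y' for every known release <= version."""
    features = {"ptxas"}
    for major, minor in KNOWN_VERSIONS:
        if (major, minor) <= version:
            features.add(f"ptxas-{major}.{minor}")
    return features


features = monotonic_features((13, 0))

# A removal tied to a version can be expressed by negating a version feature:
# the guard `ptxas && !ptxas-12.0` is false here, so 32-bit checks are skipped.
can_run_32bit = "ptxas" in features and "ptxas-12.0" not in features

# A dropped architecture cannot: a bare `ptxas` guard stays true, so a check
# that passes -arch=sm_70 would still be attempted.
runs_sm70_check = "ptxas" in features
```

Under this scheme every newly removed capability would need another negated version guard on each affected RUN line, which is the maintenance burden this PR is trying to avoid.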

To deal with this, this PR refactors how lit defines the set of features available for a given ptxas version. It invokes ptxas not just to get its version, but also to get the list of supported SMs, supported PTX ISA versions, and support for 32-bit compilation.

This approach should be flexible enough to handle the changing support matrix of ptxas going forward. One obvious downside is that it relies on parsing the stdout of ptxas, which is inherently unstable. But, IMO, we can fix that as needed.
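
The capability-based approach described above can be sketched roughly as follows. This is an assumption-laden illustration, not the implementation in lit.cfg.py: `SAMPLE_OUTPUT` is invented text rather than real ptxas output, `features_from_output` is a hypothetical helper, and 32-bit support is passed in as a flag because how the PR detects it is not shown here. The feature names (`ptxas-sm_90`, `ptxas-isa-v7.8`, `ptxas-32`) are the ones used in the updated RUN lines.

```python
import re

# Invented capability text for illustration; real ptxas output is formatted
# differently and would be captured by running the executable.
SAMPLE_OUTPUT = """\
supported gpu names: sm_75 sm_80 sm_86 sm_90 sm_90a sm_100
supported ptx isa versions: 6.0 6.2 6.3 6.4 7.0 7.1 7.4 7.8 8.8
"""


def features_from_output(text, supports_32bit):
    """Build lit-style feature names from (hypothetical) ptxas output."""
    features = {"ptxas"}
    # One feature per SM name found anywhere in the text, e.g. ptxas-sm_90.
    for sm in re.findall(r"\bsm_\d+[a-z]?\b", text):
        features.add(f"ptxas-{sm}")
    # One feature per ISA version on lines that mention the ISA.
    for line in text.splitlines():
        if "isa" in line.lower():
            for isa in re.findall(r"\b\d+\.\d+\b", line):
                features.add(f"ptxas-isa-v{isa}")
    if supports_32bit:
        features.add("ptxas-32")
    return features


features = features_from_output(SAMPLE_OUTPUT, supports_32bit=False)
```

With a feature set like this, a guard such as `ptxas-sm_70 && ptxas-isa-v6.3 && ptxas-32` evaluates to false and the corresponding `%ptxas-verify` step is skipped, while `ptxas-sm_90 && ptxas-isa-v7.8` still runs. The regular expressions are also where the fragility mentioned above lives: if the output format changes, they need updating.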

github-actions · 2025-08-19T23:34:42Z

✅ With the latest revision this PR passed the Python code formatter.

llvm/test/lit.cfg.py

justinfargnoli · 2025-08-21T16:57:51Z

Note: I've only tested this PR on public CUDA 13.0 and internal ToT ptxas.

Trying to see if I can trigger a build with the public build bot.

justinfargnoli · 2025-08-21T17:02:32Z

Trying to see if I can trigger a build with the public build bot.

https://lab.llvm.org/buildbot/#/buildrequests/5249048?redirect_to_build=true

llvmbot · 2025-08-21T18:12:40Z

@llvm/pr-subscribers-backend-nvptx

Author: Justin Fargnoli (justinfargnoli)

Changes

Patch is 154.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/154439.diff

172 Files Affected:

(modified) llvm/test/CodeGen/NVPTX/access-non-generic.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/activemask.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/addrspacecast.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/alias.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/annotations.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/applypriority.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/arithmetic-fp-sm20.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/arithmetic-int.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/async-copy.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/atomicrmw-expand.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/atomics-sm60.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/atomics-sm70.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/atomics-sm90.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/atomics-with-scope.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/b52037.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/barrier.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/bf16-instructions.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/bmsk.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/bswap.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/byval-arg-vectorize.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/byval-const-global.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/calling-conv.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cluster-dim.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/clusterlaunchcontrol-multicast.ll (+6-6)
(modified) llvm/test/CodeGen/NVPTX/clusterlaunchcontrol.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cmpxchg-sm60.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cmpxchg-sm70.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cmpxchg-sm90.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cmpxchg.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/combine-mad.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/combine-min-max.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/common-linkage.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/compare-int.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-call-to-indirect.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-fp.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-int-sm20.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-sm100.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-sm100a.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/convert-sm80.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-sm89.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/convert-sm90.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-s2g-sm100.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-1cta.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-2cta.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-cta-sm100.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-cta-sm100a.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-cta-sm90.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-gather4.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-im2colw.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s-im2colw128.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-prefetch-sm100a.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-prefetch.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-reduce.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-s2g-scatter4.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-s2g.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/cp-async-bulk.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/discard.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/distributed-shared-cluster.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/dynamic_stackalloc.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/elect.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/f16-abs.ll (+4-4)
(modified) llvm/test/CodeGen/NVPTX/f16-ex2.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/f16-instructions.ll (+4-4)
(modified) llvm/test/CodeGen/NVPTX/f16x2-instructions.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/f32-ex2.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/f32-lg2.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/f32x2-instructions.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/fabs-intrinsics.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fence-cluster.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fence-nocluster.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/fence-proxy-tensormap.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fexp2.ll (+3-3)
(modified) llvm/test/CodeGen/NVPTX/flog2.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fma-disable.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/fminimum-fmaximum.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fns.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fold-movs.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/fp-contract-f32x2.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/global-addrspace.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/global-ordering.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/griddepcontrol.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/i16x2-instructions.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/idioms.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/indirect_byval.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/inline-asm-b128-test1.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/inline-asm-b128-test2.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/inline-asm-b128-test3.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/intrinsic-old.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/intrinsics.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/kernel-param-align.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/ld-addrspace.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/ld-generic.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/ld-st-addrrspace.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/ldg-invariant-256.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/load-store-256-addressing-invariant.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/load-store-256-addressing.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/load-store-scalars.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/load-store-sm-70.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/load-store-sm-90.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/load-store-vectors-256.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/local-stack-frame.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/managed.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/match.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins-sm53-ptx42.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-autoupgrade.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72-autoupgrade.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/math-intrins.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/mbarrier.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/nanosleep.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/nofunc.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/noreturn.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/packed-aggr.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/param-overalign.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/pr126337.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/pr13291-i1-store.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/prefetch-inferas-test.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/prefetch.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/reduction-intrinsics.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/redux-sync-f32.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/redux-sync.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/reg-types.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/setmaxnreg.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/sext-setcc.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/shfl-sync-p.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/shfl-sync.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/short-ptr.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/simple-call.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/st-addrspace.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/st-generic.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/st-param-imm.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/st_bulk.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/stacksaverestore.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/surf-tex.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/symbol-naming.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/szext.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/tanhf.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-alloc.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-commit.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-cp.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-fence.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-ld.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-shift.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/tcgen05-st.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/trunc-setcc.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/trunc-tofp.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/unreachable.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/vaargs.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/variadics-backend.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/vector-compare.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/vector-select.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/vote.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/weak-global.ll (+2-2)
(modified) llvm/test/CodeGen/NVPTX/wgmma-sm90a-fence.ll (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx60-sm70.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx63-sm72.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx63-sm75.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx64-sm70.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx65-sm75.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx71-sm80.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx78-sm90.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx86-sm100a.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx86-sm101a.py (+1-1)
(modified) llvm/test/CodeGen/NVPTX/wmma-ptx86-sm120a.py (+1-1)
(modified) llvm/test/lit.cfg.py (+66-64)

diff --git a/llvm/test/CodeGen/NVPTX/access-non-generic.ll b/llvm/test/CodeGen/NVPTX/access-non-generic.ll
index 601a35288f54d..be8d00a10108a 100644
--- a/llvm/test/CodeGen/NVPTX/access-non-generic.ll
+++ b/llvm/test/CodeGen/NVPTX/access-non-generic.ll
@@ -2,7 +2,7 @@
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s --check-prefix PTX
 ; RUN: opt -mtriple=nvptx-- < %s -S -passes=infer-address-spaces | FileCheck %s --check-prefix IR
 ; RUN: opt -mtriple=nvptx64-- < %s -S -passes=infer-address-spaces | FileCheck %s --check-prefix IR
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
+; RUN: %if ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
 
 @array = internal addrspace(3) global [10 x float] zeroinitializer, align 4
diff --git a/llvm/test/CodeGen/NVPTX/activemask.ll b/llvm/test/CodeGen/NVPTX/activemask.ll
index aa3c5819d7f91..96df904290e68 100644
--- a/llvm/test/CodeGen/NVPTX/activemask.ll
+++ b/llvm/test/CodeGen/NVPTX/activemask.ll
@@ -1,5 +1,5 @@
 ; RUN: llc < %s -mtriple=nvptx64 -O2 -mcpu=sm_52 -mattr=+ptx62 | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_52 -mattr=+ptx62 | %ptxas-verify %}
+; RUN: %if ptxas-isa-v6.2 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_52 -mattr=+ptx62 | %ptxas-verify %}
 
 declare i32 @llvm.nvvm.activemask()
 
diff --git a/llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll b/llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll
index 00b17896d2c9e..1b9dc6ab1e122 100644
--- a/llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll
+++ b/llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll
@@ -1,8 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | FileCheck %s -check-prefixes=NOPTRCONV
 ; RUN: llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 --nvptx-short-ptr | FileCheck %s -check-prefixes=PTRCONV
-; RUN: %if ptxas-12.8 %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
-; RUN: %if ptxas-12.8 %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 --nvptx-short-ptr | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 --nvptx-short-ptr | %ptxas-verify -arch=sm_90 %}
 
 ; ALL-LABEL: conv_shared_cluster_to_generic
 define i32 @conv_shared_cluster_to_generic(ptr addrspace(7) %ptr) {
diff --git a/llvm/test/CodeGen/NVPTX/addrspacecast.ll b/llvm/test/CodeGen/NVPTX/addrspacecast.ll
index 86008a1b70058..546d22ca0e691 100644
--- a/llvm/test/CodeGen/NVPTX/addrspacecast.ll
+++ b/llvm/test/CodeGen/NVPTX/addrspacecast.ll
@@ -1,7 +1,7 @@
 ; RUN: llc -O0 < %s -mtriple=nvptx -mcpu=sm_20 | FileCheck %s -check-prefixes=ALL,CLS32
 ; RUN: llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s -check-prefixes=ALL,NOPTRCONV,CLS64
 ; RUN: llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_20 --nvptx-short-ptr | FileCheck %s -check-prefixes=ALL,PTRCONV,CLS64
-; RUN: %if ptxas && !ptxas-12.0 %{ llc -O0 < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
+; RUN: %if ptxas-32 %{ llc -O0 < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_20 --nvptx-short-ptr | %ptxas-verify %}
 
diff --git a/llvm/test/CodeGen/NVPTX/alias.ll b/llvm/test/CodeGen/NVPTX/alias.ll
index 01761c21ab103..4b22df225e33f 100644
--- a/llvm/test/CodeGen/NVPTX/alias.ll
+++ b/llvm/test/CodeGen/NVPTX/alias.ll
@@ -1,5 +1,5 @@
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx64 | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx64 | %ptxas-verify %}
+; RUN: %if ptxas-isa-v6.4 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx64 | %ptxas-verify %}
 
 define i32 @a() { ret i32 0 }
 @b = internal alias i32 (), ptr @a
diff --git a/llvm/test/CodeGen/NVPTX/annotations.ll b/llvm/test/CodeGen/NVPTX/annotations.ll
index 5360e8988777b..e4aa0552e8420 100644
--- a/llvm/test/CodeGen/NVPTX/annotations.ll
+++ b/llvm/test/CodeGen/NVPTX/annotations.ll
@@ -1,6 +1,6 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_20 | FileCheck %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
+; RUN: %if ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
 
 @texture = internal addrspace(1) global i64 0, align 8
diff --git a/llvm/test/CodeGen/NVPTX/applypriority.ll b/llvm/test/CodeGen/NVPTX/applypriority.ll
index 23b1bda9a32bf..9ac52c8bfe188 100644
--- a/llvm/test/CodeGen/NVPTX/applypriority.ll
+++ b/llvm/test/CodeGen/NVPTX/applypriority.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx74| FileCheck --check-prefixes=CHECK-PTX64 %s
-; RUN: %if ptxas-11.4 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx74| %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.4 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx74| %ptxas-verify -arch=sm_80 %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/arithmetic-fp-sm20.ll b/llvm/test/CodeGen/NVPTX/arithmetic-fp-sm20.ll
index ce71d3a78c0de..e88d0396f0858 100644
--- a/llvm/test/CodeGen/NVPTX/arithmetic-fp-sm20.ll
+++ b/llvm/test/CodeGen/NVPTX/arithmetic-fp-sm20.ll
@@ -1,6 +1,6 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_20 -fp-contract=fast | FileCheck %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 -fp-contract=fast | FileCheck %s
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 -fp-contract=fast | %ptxas-verify %}
+; RUN: %if ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 -fp-contract=fast | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -fp-contract=fast | %ptxas-verify %}
 
 ;; These tests should run for all targets
diff --git a/llvm/test/CodeGen/NVPTX/arithmetic-int.ll b/llvm/test/CodeGen/NVPTX/arithmetic-int.ll
index 1fbfd0a987d7a..9e41e9e240902 100644
--- a/llvm/test/CodeGen/NVPTX/arithmetic-int.ll
+++ b/llvm/test/CodeGen/NVPTX/arithmetic-int.ll
@@ -1,6 +1,6 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_20 | FileCheck %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
+; RUN: %if ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_20 | %ptxas-verify %}
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
 
 ;; These tests should run for all targets
diff --git a/llvm/test/CodeGen/NVPTX/async-copy.ll b/llvm/test/CodeGen/NVPTX/async-copy.ll
index cefb8ede9fa58..97b6b5d4d0097 100644
--- a/llvm/test/CodeGen/NVPTX/async-copy.ll
+++ b/llvm/test/CodeGen/NVPTX/async-copy.ll
@@ -1,7 +1,7 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_80 -mattr=+ptx70 | FileCheck -check-prefixes=CHECK,CHECK_PTX32 %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx70 | FileCheck -check-prefixes=CHECK,CHECK_PTX64 %s
-; RUN: %if ptxas-11.0 && ! ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_80 -mattr=+ptx70 | %ptxas-verify -arch=sm_80 %}
-; RUN: %if ptxas-11.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx70 | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.0 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_80 -mattr=+ptx70 | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx70 | %ptxas-verify -arch=sm_80 %}
 
 declare void @llvm.nvvm.cp.async.wait.group(i32)
 
diff --git a/llvm/test/CodeGen/NVPTX/atomicrmw-expand.ll b/llvm/test/CodeGen/NVPTX/atomicrmw-expand.ll
index 94b3f0a2e1c3e..88fae7a3f78a0 100644
--- a/llvm/test/CodeGen/NVPTX/atomicrmw-expand.ll
+++ b/llvm/test/CodeGen/NVPTX/atomicrmw-expand.ll
@@ -1,7 +1,7 @@
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_30 | FileCheck %s --check-prefixes=ALL,SM30
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_60 | FileCheck %s --check-prefixes=ALL,SM60
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_30 | %ptxas-verify %}
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
+; RUN: %if ptxas-sm_60 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
 
 ; CHECK-LABEL: fadd_double
 define void @fadd_double(ptr %0, double %1) {
diff --git a/llvm/test/CodeGen/NVPTX/atomics-sm60.ll b/llvm/test/CodeGen/NVPTX/atomics-sm60.ll
index 2e11323d1b3e1..5a7a1823cb2a0 100644
--- a/llvm/test/CodeGen/NVPTX/atomics-sm60.ll
+++ b/llvm/test/CodeGen/NVPTX/atomics-sm60.ll
@@ -1,7 +1,7 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_60 | FileCheck %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_60 | FileCheck %s
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
+; RUN: %if ptxas-sm_60 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
+; RUN: %if ptxas-sm_60 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
 
 ; CHECK-LABEL: .func test(
 define void @test(ptr %dp0, ptr addrspace(1) %dp1, ptr addrspace(3) %dp3, double %d) {
diff --git a/llvm/test/CodeGen/NVPTX/atomics-sm70.ll b/llvm/test/CodeGen/NVPTX/atomics-sm70.ll
index f710d7f883a1b..e1a69d2e3db20 100644
--- a/llvm/test/CodeGen/NVPTX/atomics-sm70.ll
+++ b/llvm/test/CodeGen/NVPTX/atomics-sm70.ll
@@ -2,9 +2,9 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx63 | FileCheck %s --check-prefixes=CHECK
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_70 -mattr=+ptx63 | FileCheck %s --check-prefixes=CHECK64
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx62 | FileCheck %s --check-prefixes=CHECKPTX62
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx63 | %ptxas-verify -arch=sm_70 %}
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_70 -mattr=+ptx63 | %ptxas-verify -arch=sm_70 %}
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx62 | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 && ptxas-isa-v6.3 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx63 | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 && ptxas-isa-v6.3 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_70 -mattr=+ptx63 | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 && ptxas-isa-v6.2 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_70 -mattr=+ptx62 | %ptxas-verify -arch=sm_70 %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/atomics-sm90.ll b/llvm/test/CodeGen/NVPTX/atomics-sm90.ll
index f96fd30019025..79e12025ba614 100644
--- a/llvm/test/CodeGen/NVPTX/atomics-sm90.ll
+++ b/llvm/test/CodeGen/NVPTX/atomics-sm90.ll
@@ -2,9 +2,9 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_90 -mattr=+ptx78 | FileCheck %s --check-prefixes=CHECK
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | FileCheck %s --check-prefixes=CHECK64
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_86 -mattr=+ptx71 | FileCheck %s --check-prefixes=CHECKPTX71
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_86 -mattr=+ptx71 | %ptxas-verify -arch=sm_86 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_86 && ptxas-isa-v7.1 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_86 -mattr=+ptx71 | %ptxas-verify -arch=sm_86 %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/atomics-with-scope.ll b/llvm/test/CodeGen/NVPTX/atomics-with-scope.ll
index e6636d706b49d..9e30519b31cc3 100644
--- a/llvm/test/CodeGen/NVPTX/atomics-with-scope.ll
+++ b/llvm/test/CodeGen/NVPTX/atomics-with-scope.ll
@@ -1,7 +1,7 @@
 ; RUN: llc < %s -mtriple=nvptx -mcpu=sm_60 | FileCheck %s -check-prefixes=CHECK,CHECK32
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_60 | FileCheck %s
-; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -mtriple=nvptx -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
+; RUN: %if ptxas-sm_60 && ptxas-32 %{ llc < %s -mtriple=nvptx -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
+; RUN: %if ptxas-sm_60 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_60 | %ptxas-verify -arch=sm_60 %}
 
 ; CHECK-LABEL: .func test_atomics_scope(
 define void @test_atomics_scope(ptr %fp, float %f,
diff --git a/llvm/test/CodeGen/NVPTX/b52037.ll b/llvm/test/CodeGen/NVPTX/b52037.ll
index b6317dfb28597..268a8972ebd22 100644
--- a/llvm/test/CodeGen/NVPTX/b52037.ll
+++ b/llvm/test/CodeGen/NVPTX/b52037.ll
@@ -4,7 +4,7 @@
 ; https://bugs.llvm.org/show_bug.cgi?id=52037 for the gory details.
 ;
 ; RUN: llc -mtriple=nvptx64-nvidia-cuda -mcpu=sm_70 -O3 -o - %s | FileCheck %s
-; RUN: %if ptxas %{ llc -mtriple=nvptx64-nvidia-cuda -mcpu=sm_70 -O3 -o - %s | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 %{ llc -mtriple=nvptx64-nvidia-cuda -mcpu=sm_70 -O3 -o - %s | %ptxas-verify -arch=sm_70 %}
 
 ; CHECK-LABEL: .visible .entry barney(
 ; CHECK-NOT:  .local{{.*}}__local_depot
diff --git a/llvm/test/CodeGen/NVPTX/barrier.ll b/llvm/test/CodeGen/NVPTX/barrier.ll
index a3b0d21f098f2..f209bdd0cfae7 100644
--- a/llvm/test/CodeGen/NVPTX/barrier.ll
+++ b/llvm/test/CodeGen/NVPTX/barrier.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx60 | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx60 | %ptxas-verify %}
+; RUN: %if ptxas-isa-v6.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_30 -mattr=+ptx60 | %ptxas-verify %}
 
 declare void @llvm.nvvm.bar.warp.sync(i32)
 declare void @llvm.nvvm.barrier.cta.sync.aligned.all(i32)
diff --git a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
index aee58a044a986..835e09b9a38e0 100644
--- a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
@@ -3,9 +3,9 @@
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | FileCheck --check-prefixes=CHECK,SM80 %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 -denormal-fp-math-f32=preserve-sign | FileCheck --check-prefixes=CHECK,SM80-FTZ %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | FileCheck --check-prefixes=CHECK,SM90 %s
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 -denormal-fp-math-f32=preserve-sign | %ptxas-verify -arch=sm_80 %}
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.1 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.1 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 -denormal-fp-math-f32=preserve-sign | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll b/llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll
index e1d4ef1073a78..60a5abf03e19f 100644
--- a/llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll
+++ b/llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | FileCheck --check-prefixes=CHECK %s
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.1 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
diff --git a/llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll b/llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll
index b4641d01eb927..4ea8ffc727b56 100644
--- a/llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll
@@ -1,8 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | FileCheck --check-prefixes=CHECK,SM80 %s
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | FileCheck --check-prefixes=CHECK,SM90 %s
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
-; RUN: %if ptxas-11.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
+; RUN: %if ptxas-sm_80 && ptxas-isa-v7.1 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_80 -mattr=+ptx71 | %ptxas-verify -arch=sm_80 %}
+; RUN: %if ptxas-sm_90 && ptxas-isa-v7.8 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx78 | %ptxas-verify -arch=sm_90 %}
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
diff --git a/llvm/test/CodeGen/NVPTX/bmsk.ll b/llvm/test/CodeGen/NVPTX/bmsk.ll
index d5b278657bd52..b9404f2a160cd 100644
--- a/llvm/test/CodeGen/NVPTX/bmsk.ll
+++ b/llvm/test/CodeGen/NVPTX/bmsk.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc -o - < %s -mcpu=sm_70 -mattr=+ptx76 | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -mcpu=sm_70 -mattr=+ptx76 | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 && ptxas-isa-v7.6 %{ llc < %s -mcpu=sm_70 -mattr=+ptx76 | %ptxas-verify -arch=sm_70 %}
 
 target triple = "nvptx64-unknown-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/bswap.ll b/llvm/test/CodeGen/NVPTX/bswap.ll
index 0d1d6da4ba2b6..b913b9a03c553 100644
--- a/llvm/test/CodeGen/NVPTX/bswap.ll
+++ b/llvm/test/CodeGen/NVPTX/bswap.ll
@@ -1,9 +1,9 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx70 | FileCheck -check-prefixes CHECK,PTX70 %s
 ; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
-; RUN: %if ptxas-11.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx70 | %ptxas-verify %}
+; RUN: %if ptxas-isa-v7.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx70 | %ptxas-verify %}
 ; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx71 | FileCheck -check-prefixes CHECK,PTX71 %s
-; RUN: %if ptxas-11.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx71 | %ptxas-verify %}
+; RUN: %if ptxas-isa-v7.1 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx71 | %ptxas-verify %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/byval-arg-vectorize.ll b/llvm/test/CodeGen/NVPTX/byval-arg-vectorize.ll
index 9988d5b122cc1..e7f71f4ad52ea 100644
--- a/llvm/test/CodeGen/NVPTX/byval-arg-vectorize.ll
+++ b/llvm/test/CodeGen/NVPTX/byval-arg-vectorize.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc < %s -mcpu=sm_70 | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -mcpu=sm_70 | %ptxas-verify -arch=sm_70 %}
+; RUN: %if ptxas-sm_70 %{ llc < %s -mcpu=sm_70 | %ptxas-verify -arch=sm_70 %}
 
 target triple = "nvptx64-nvidia-cuda"
 
diff --git a/llvm/test/CodeGen/NVPTX/byval-const-global.ll b/llvm/test/CodeGen/NVPTX/byval-const-global.ll
index b4934e1a94d1b..81e7edfd8602e 100644
--- a/llvm/test/CodeGen/NVPTX/byval-const-global.ll
+++ b/llvm/test/CodeGen/NVPTX/byval-const-global.ll
@@ -1,6 +1,6 @@
...
[truncated]
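
The hunks above follow one pattern: the `%if` guard names the SM passed to `%ptxas-verify -arch=` and the PTX ISA version implied by `-mattr=+ptxNN`. Below is a minimal Python sketch of that mapping, assuming the feature spellings shown in this revision of the diff (`ptxas-sm_NN`, `ptxas-isa-vX.Y`). The helper name `ptxas_guard` is hypothetical and not part of lit.

```python
import re


def ptxas_guard(run_line):
    """Derive a lit %if condition for a RUN line that pipes llc into ptxas.

    Hypothetical helper, not part of lit. Feature spellings follow the
    hunks shown in this diff and may differ in later revisions.
    """
    features = []

    # The SM comes from the -arch flag given to %ptxas-verify, if any.
    arch = re.search(r"%ptxas-verify\b[^|%]*?-arch=(sm_\w+)", run_line)
    if arch:
        features.append("ptxas-" + arch.group(1))

    # The ISA version comes from llc's -mattr=+ptxNN: ptx71 means 7.1.
    isa = re.search(r"-mattr=\+ptx(\d+)(\d)\b", run_line)
    if isa:
        features.append("ptxas-isa-v{}.{}".format(isa.group(1), isa.group(2)))

    # With neither flag present, only the presence of ptxas is required.
    return " && ".join(features) or "ptxas"
```

Note that `-mcpu` alone does not add an SM feature in this sketch, which mirrors the `bswap.ll` hunk: its `%ptxas-verify` has no `-arch=`, so the guard there names only the ISA version.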

AlexMaclean

This will be very useful. I've personally run afoul of issues with versions of ptxas not working with a given configuration with %ptxas-verify several times. I would consider tweaking the naming of these a little bit. I'd recommend that we remove the "v" in the ISA features and rename "32" -> "ptr32". I'd also lean towards using () or {} to represent these (i.e. ptxas(sm_90), ptxas(isa-8.4)) but that is more a matter of personal taste so if you or others disagree I'm happy to leave as is.

llvm/test/lit.cfg.py

justinfargnoli · 2025-08-25T22:00:52Z

> I'd also lean towards using () or {} to represent these (i.e. ptxas(sm_90), ptxas(isa-8.4)) but that is more a matter of personal taste so if you or others disagree I'm happy to leave as is.

It looks like lit doesn't like the use of (), {}, or <>, so I went back to using -.
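
One likely reason, stated here as an assumption about lit's boolean-expression parser rather than something confirmed in this thread: `%if` conditions treat `(` and `)` as grouping, and identifiers are limited to a small character set. Below is a minimal sketch of a check for candidate feature names, assuming the allowed set is letters, digits, and `-+=._`. The helper `is_plain_lit_feature` is hypothetical.

```python
import re

# Assumed identifier character set for lit boolean expressions; adjust it
# if the lit version in use accepts something different.
_PLAIN_FEATURE = re.compile(r"[-+=._a-zA-Z0-9]+")


def is_plain_lit_feature(name):
    """Return True if `name` uses only the assumed identifier characters."""
    return _PLAIN_FEATURE.fullmatch(name) is not None
```

Under that assumption, `ptxas-sm_90` and `ptxas-isa-8.4` pass the check, while `ptxas(sm_90)`, `ptxas{isa-8.4}`, and `ptxas<sm_90>` do not.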

llvm/test/lit.cfg.py

Copilot

Pull Request Overview

This PR refactors how lit determines available ptxas features by directly querying ptxas capabilities instead of assuming backward compatibility. The refactoring addresses limitations with the previous approach when CUDA removes support for older architectures.

Key changes:

- Replace version-based feature detection with dynamic capability querying
- Query ptxas for supported SM architectures, PTX ISA versions, and 32-bit compilation support
- Update test conditions across hundreds of test files to use specific capability checks

Reviewed Changes

Copilot reviewed 174 out of 174 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| llvm/test/lit.cfg.py | Core refactoring from version-based to capability-based ptxas feature detection |
| llvm/test/CodeGen/NVPTX/*.ll | Updated test conditions to use new capability-based features like `ptxas-sm_XX` and `ptxas-isa-X.Y` |
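
As a rough illustration of the capability-based approach summarized above, the sketch below turns already-parsed capability lists into lit feature names. It deliberately does not invoke ptxas or parse its output, since the exact flags and output format are not shown in this excerpt. The function name, its parameters, and the `ptxas-ptr32` spelling (taken from the review suggestion to rename "32" to "ptr32") are assumptions.

```python
def ptxas_features(supported_sms, supported_isa_versions, supports_ptr32):
    """Build lit feature names from capabilities reported by one ptxas.

    Hypothetical helper: the inputs are assumed to come from parsing ptxas
    output elsewhere. Names follow the `ptxas-sm_XX` / `ptxas-isa-X.Y`
    pattern; `ptxas-ptr32` is an assumed spelling.
    """
    features = {"ptxas"}
    # One feature per supported SM, e.g. "sm_90" -> "ptxas-sm_90".
    features.update("ptxas-{}".format(sm) for sm in supported_sms)
    # One feature per supported PTX ISA version, e.g. (8, 4) -> "ptxas-isa-8.4".
    features.update(
        "ptxas-isa-{}.{}".format(major, minor)
        for major, minor in supported_isa_versions
    )
    # 32-bit address compilation is a separate capability.
    if supports_ptr32:
        features.add("ptxas-ptr32")
    return features
```

In a lit config, each returned name would then be registered with `config.available_features.add(name)`. Because every capability is listed explicitly, a ptxas that drops an older SM simply stops advertising that feature, and the guarded RUN lines are skipped instead of failing.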

AlexMaclean

Nice, LGTM

llvm/test/lit.cfg.py

vvereschaka · 2025-08-28T19:09:17Z

Hi @justinfargnoli ,

https://lab.llvm.org/buildbot/#/builders/54/builds/12220

llvm-lit.py: C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\utils\lit\lit\llvm\config.py:57: note: using lit tools: C:\Program Files\Git\usr\bin
llvm-lit.py: C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\utils\lit\lit\TestingConfig.py:157: fatal: unable to parse config file 'C:\\buildbot\\as-builder-8\\llvm-nvptx-nvidia-win\\llvm-project\\llvm\\test/lit.cfg.py', traceback: Traceback (most recent call last):
  File "C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\utils\lit\lit\TestingConfig.py", line 145, in load_from_path
    exec(compile(data, path, "exec"), cfg_globals, None)
  File "C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\test/lit.cfg.py", line 438, in <module>
    enable_ptxas(ptxas_executable)
  File "C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\test/lit.cfg.py", line 430, in enable_ptxas
    if ptxas_supports_address_size_32(ptxas_executable):
  File "C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\test/lit.cfg.py", line 407, in ptxas_supports_address_size_32
    raise RuntimeError(f"Unexpected ptxas output: {result.stderr}")
RuntimeError: Unexpected ptxas output:

same for https://lab.llvm.org/buildbot/#/builders/155

would you take care of it?

justinfargnoli · 2025-08-28T19:58:19Z

Will do

This reverts commit d77cf57.

justinfargnoli · 2025-08-28T20:11:38Z

@vvereschaka I put up #155912 as a fix-forward for the issue. If that doesn't work for whatever reason, I also opened #155914 to revert the change.

vvereschaka · 2025-08-28T20:24:08Z

@justinfargnoli got it, thank you. I'm checking #155912 currently. I'll let you know about the results.

llvm-ci · 2025-08-28T20:27:08Z

LLVM Buildbot has detected a new failure on builder llvm-nvptx64-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/23833

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: DebugInfo/NVPTX/debug-ptx-symbols.ll' FAILED ********************
Exit Code: 255

Command Output (stderr):
--
/home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll -mcpu=sm_60 | /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll # RUN: at line 1
+ /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/llc -mcpu=sm_60
+ /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll
/home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll -mcpu=sm_60 | /usr/local/cuda/bin/ptxas -c - # RUN: at line 2
+ /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/bin/llc -mcpu=sm_60
+ /usr/local/cuda/bin/ptxas -c -
ptxas fatal   : SM version specified by .target is higher than default SM version assumed

--

********************

llvm-ci · 2025-08-28T20:31:22Z

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/23977

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: DebugInfo/NVPTX/debug-ptx-symbols.ll' FAILED ********************
Exit Code: 255

Command Output (stderr):
--
/home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll -mcpu=sm_60 | /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll # RUN: at line 1
+ /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/llc -mcpu=sm_60
+ /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll
/home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/DebugInfo/NVPTX/debug-ptx-symbols.ll -mcpu=sm_60 | /usr/local/cuda/bin/ptxas -c - # RUN: at line 2
+ /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/bin/llc -mcpu=sm_60
+ /usr/local/cuda/bin/ptxas -c -
ptxas fatal   : SM version specified by .target is higher than default SM version assumed

--

********************

Reverts #154439 in order to resolve #154439 (comment).

justinfargnoli requested a review from Copilot August 19, 2025 23:30

justinfargnoli self-assigned this Aug 20, 2025

justinfargnoli requested a review from Copilot August 20, 2025 23:57

squash: all changes on dev/jf/upstream-lit

813da96

justinfargnoli force-pushed the dev/jf/upstream-lit branch from 2b19c39 to 813da96 Compare August 21, 2025 16:29

justinfargnoli changed the title ~~[lit] Refactor ptxas usage~~ [lit] Refactor available ptxas features Aug 21, 2025

justinfargnoli requested review from Artem-B, AlexMaclean, schwarzschild-radius and durga4github August 21, 2025 16:47

justinfargnoli marked this pull request as ready for review August 21, 2025 18:11

llvmbot added the backend:NVPTX label Aug 21, 2025

AlexMaclean reviewed Aug 21, 2025

AlexMaclean approved these changes Aug 21, 2025

Artem-B reviewed Aug 21, 2025

justinfargnoli added 3 commits August 22, 2025 22:29

Address review comments

26548bd

Manually set isa version pre CUDA 13.0

82efb6b

Use - instead of ()

6146027

justinfargnoli added 3 commits August 25, 2025 22:09

Use format strings

c54ec75

Address review comments

2102b1b

Test with ptxas from CUDA 12.9

4ad7ecc

justinfargnoli enabled auto-merge (squash) August 26, 2025 19:46

AlexMaclean reviewed Aug 26, 2025

AlexMaclean disabled auto-merge August 26, 2025 19:49

Address comments

c3dd89e

AlexMaclean reviewed Aug 26, 2025

Address comments

e0ced20

justinfargnoli requested review from Copilot and AlexMaclean August 26, 2025 20:44

Copilot AI reviewed Aug 26, 2025

justinfargnoli requested a review from Artem-B August 26, 2025 20:48

AlexMaclean approved these changes Aug 27, 2025

Address comments

7c984ce

justinfargnoli requested a review from AlexMaclean August 27, 2025 16:11

format

858747e

AlexMaclean approved these changes Aug 28, 2025

justinfargnoli merged commit d77cf57 into llvm:main Aug 28, 2025
9 checks passed

justinfargnoli mentioned this pull request Aug 28, 2025

[lit] Use .format() over format strings literals #155912

Closed

justinfargnoli added a commit that referenced this pull request Aug 28, 2025

Revert "[lit] Refactor available ptxas features (#154439)"

755a773

This reverts commit d77cf57.

justinfargnoli mentioned this pull request Aug 28, 2025

Revert "[lit] Refactor available ptxas features" #155914

Merged

justinfargnoli added a commit that referenced this pull request Aug 28, 2025

Revert "[lit] Refactor available ptxas features" (#155914)

826780a

Reverts #154439 in order to resolve #154439 (comment).

justinfargnoli mentioned this pull request Aug 28, 2025

Reland "[lit] Refactor available ptxas features"" #155923

Draft

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 28, 2025

Automerge: Revert "[lit] Refactor available ptxas features" (#155914)

93d3ca4

Reverts llvm/llvm-project#154439 in order to resolve llvm/llvm-project#154439 (comment).
