Skip to content

Conversation

s-perron
Copy link
Contributor

The wrapper used to hold the handle for resource type has just the
default copy constructor and assignment operator. This causes clang to
insert memcpys when it does an assignment of a resource type. The
memcpy then cause optimizations to fail when the memcpy is turned into a
load and store of an i64.

To fix this, we should define copying of a resource type by adding the
operator= and copy constructor.

Partially fixes #154669

The wrapper used to hold the handle for resource type has just the
default copy constructor and assignment operator. This causes clang to
insert memcpys when it does an assignment of a resource type. The
memcpy then cause optimizations to fail when the memcpy is turned into a
load and store of an i64.

To fix this, we should define copying of a resource type by adding the
operator= and copy constructor.

This exposed an issue in the hanlding of OpaqueValueExpr. If there were
OpaqueValueExpr in the sub tree of an OpaqueValueExpr, there were not
getting evaluated correctly, where causing an assert. This fixs that,
creating an exception for HLSLOutArgExpr nodes. Their OpaqueValueExpr
need to be handled different, and are already handled correctly.

Partially fixes llvm#154669
@s-perron s-perron requested a review from hekota September 2, 2025 19:48
@s-perron s-perron marked this pull request as ready for review September 2, 2025 19:49
Copy link
Member

@hekota hekota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add test showing the new constructors AST structure to:

clang/test/AST/HLSL/TypedBuffers-AST.hlsl
clang/test/AST/HLSL/StructuredBuffers-AST.hlsl
clang/test/AST/HLSL/TypedBuffers-AST.hlsl

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. HLSL HLSL Language Support labels Sep 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-hlsl

Author: Steven Perron (s-perron)

Changes

The wrapper used to hold the handle for resource type has just the
default copy constructor and assignment operator. This causes clang to
insert memcpys when it does an assignment of a resource type. The
memcpy then cause optimizations to fail when the memcpy is turned into a
load and store of an i64.

To fix this, we should define copying of a resource type by adding the
operator= and copy constructor.

Partially fixes #154669


Patch is 32.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156075.diff

10 Files Affected:

  • (modified) clang/lib/CodeGen/CGHLSLRuntime.cpp (+18-2)
  • (modified) clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp (+65-1)
  • (modified) clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h (+2)
  • (modified) clang/lib/Sema/HLSLExternalSemaSource.cpp (+2)
  • (modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/debug/rwbuffer_debug_info.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl (+65-49)
  • (modified) clang/test/CodeGenHLSL/resources/res-array-local1.hlsl (+64-64)
  • (modified) clang/test/CodeGenHLSL/resources/res-array-local3.hlsl (+62-62)
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 442d8505cc0ed..c5660bc518712 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -828,11 +828,27 @@ llvm::Instruction *CGHLSLRuntime::getConvergenceToken(BasicBlock &BB) {
 
 class OpaqueValueVisitor : public RecursiveASTVisitor<OpaqueValueVisitor> {
 public:
-  llvm::SmallPtrSet<OpaqueValueExpr *, 8> OVEs;
+  llvm::SmallVector<OpaqueValueExpr *, 8> OVEs;
+  llvm::SmallPtrSet<OpaqueValueExpr *, 8> Visited;
   OpaqueValueVisitor() {}
 
+  bool VisitHLSLOutArgExpr(HLSLOutArgExpr *) {
+    // These need to be bound in CodeGenFunction::EmitHLSLOutArgLValues
+    // or CodeGenFunction::EmitHLSLOutArgExpr. If they are part of this
+    // traversal, the temporary containing the copy out will not have
+    // been created yet.
+    return false;
+  }
+
   bool VisitOpaqueValueExpr(OpaqueValueExpr *E) {
-    OVEs.insert(E);
+    // Traverse the source expression first.
+    if (E->getSourceExpr())
+      TraverseStmt(E->getSourceExpr());
+
+    // Then add this OVE if we haven't seen it before.
+    if (Visited.insert(E).second)
+      OVEs.push_back(E);
+
     return true;
   }
 };
diff --git a/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp b/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
index 806800cb7b213..26e48a18b2bf5 100644
--- a/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
+++ b/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
@@ -156,6 +156,9 @@ struct BuiltinTypeMethodBuilder {
   BuiltinTypeDeclBuilder &finalize();
   Expr *getResourceHandleExpr();
 
+  BuiltinTypeMethodBuilder &getResourceHandle(PlaceHolder ResourceRecord);
+  BuiltinTypeMethodBuilder &returnThis();
+
 private:
   void createDecl();
 
@@ -332,7 +335,7 @@ Expr *BuiltinTypeMethodBuilder::convertPlaceholder(PlaceHolder PH) {
   return DeclRefExpr::Create(
       AST, NestedNameSpecifierLoc(), SourceLocation(), ParamDecl, false,
       DeclarationNameInfo(ParamDecl->getDeclName(), SourceLocation()),
-      ParamDecl->getType(), VK_PRValue);
+      ParamDecl->getType().getNonReferenceType(), VK_PRValue);
 }
 
 BuiltinTypeMethodBuilder::BuiltinTypeMethodBuilder(BuiltinTypeDeclBuilder &DB,
@@ -431,6 +434,29 @@ Expr *BuiltinTypeMethodBuilder::getResourceHandleExpr() {
                                     OK_Ordinary);
 }
 
+BuiltinTypeMethodBuilder &
+BuiltinTypeMethodBuilder::getResourceHandle(PlaceHolder ResourceRecord) {
+  ensureCompleteDecl();
+
+  Expr *ResourceExpr = convertPlaceholder(ResourceRecord);
+
+  ASTContext &AST = DeclBuilder.SemaRef.getASTContext();
+  FieldDecl *HandleField = DeclBuilder.getResourceHandleField();
+  MemberExpr *HandleExpr = MemberExpr::CreateImplicit(
+      AST, ResourceExpr, false, HandleField, HandleField->getType(), VK_LValue,
+      OK_Ordinary);
+  StmtsList.push_back(HandleExpr);
+  return *this;
+}
+
+BuiltinTypeMethodBuilder &BuiltinTypeMethodBuilder::returnThis() {
+  ASTContext &AST = DeclBuilder.SemaRef.getASTContext();
+  CXXThisExpr *ThisExpr = CXXThisExpr::Create(
+      AST, SourceLocation(), Method->getFunctionObjectParameterType(), true);
+  StmtsList.push_back(ThisExpr);
+  return *this;
+}
+
 template <typename... Ts>
 BuiltinTypeMethodBuilder &
 BuiltinTypeMethodBuilder::callBuiltin(StringRef BuiltinName,
@@ -676,6 +702,44 @@ BuiltinTypeDeclBuilder::addHandleConstructorFromImplicitBinding() {
       .finalize();
 }
 
+BuiltinTypeDeclBuilder &BuiltinTypeDeclBuilder::addCopyConstructor() {
+  if (Record->isCompleteDefinition())
+    return *this;
+
+  ASTContext &AST = SemaRef.getASTContext();
+  QualType RecordType = AST.getCanonicalTagType(Record);
+  QualType ConstRecordType = RecordType.withConst();
+  QualType ConstRecordRefType = AST.getLValueReferenceType(ConstRecordType);
+
+  using PH = BuiltinTypeMethodBuilder::PlaceHolder;
+
+  return BuiltinTypeMethodBuilder(*this, "", AST.VoidTy, false, true)
+      .addParam("other", ConstRecordRefType)
+      .getResourceHandle(PH::_0)
+      .assign(PH::Handle, PH::LastStmt)
+      .finalize();
+}
+
+BuiltinTypeDeclBuilder &BuiltinTypeDeclBuilder::addCopyAssignmentOperator() {
+  if (Record->isCompleteDefinition())
+    return *this;
+
+  ASTContext &AST = SemaRef.getASTContext();
+  QualType RecordType = AST.getCanonicalTagType(Record);
+  QualType ConstRecordType = RecordType.withConst();
+  QualType ConstRecordRefType = AST.getLValueReferenceType(ConstRecordType);
+  QualType RecordRefType = AST.getLValueReferenceType(RecordType);
+
+  using PH = BuiltinTypeMethodBuilder::PlaceHolder;
+  DeclarationName Name = AST.DeclarationNames.getCXXOperatorName(OO_Equal);
+  return BuiltinTypeMethodBuilder(*this, Name, RecordRefType, false, false)
+      .addParam("other", ConstRecordRefType)
+      .getResourceHandle(PH::_0)
+      .assign(PH::Handle, PH::LastStmt)
+      .returnThis()
+      .finalize();
+}
+
 BuiltinTypeDeclBuilder &BuiltinTypeDeclBuilder::addArraySubscriptOperators() {
   ASTContext &AST = Record->getASTContext();
   DeclarationName Subscript =
diff --git a/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h b/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h
index 098b72692bd3a..4c4c2083a8440 100644
--- a/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h
+++ b/clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h
@@ -80,6 +80,8 @@ class BuiltinTypeDeclBuilder {
   BuiltinTypeDeclBuilder &addDefaultHandleConstructor();
   BuiltinTypeDeclBuilder &addHandleConstructorFromBinding();
   BuiltinTypeDeclBuilder &addHandleConstructorFromImplicitBinding();
+  BuiltinTypeDeclBuilder &addCopyConstructor();
+  BuiltinTypeDeclBuilder &addCopyAssignmentOperator();
 
   // Builtin types methods
   BuiltinTypeDeclBuilder &addLoadMethods();
diff --git a/clang/lib/Sema/HLSLExternalSemaSource.cpp b/clang/lib/Sema/HLSLExternalSemaSource.cpp
index 726581d131623..8c893c0c30baf 100644
--- a/clang/lib/Sema/HLSLExternalSemaSource.cpp
+++ b/clang/lib/Sema/HLSLExternalSemaSource.cpp
@@ -132,6 +132,8 @@ static BuiltinTypeDeclBuilder setupBufferType(CXXRecordDecl *Decl, Sema &S,
   return BuiltinTypeDeclBuilder(S, Decl)
       .addHandleMember(RC, IsROV, RawBuffer)
       .addDefaultHandleConstructor()
+      .addCopyConstructor()
+      .addCopyAssignmentOperator()
       .addHandleConstructorFromBinding()
       .addHandleConstructorFromImplicitBinding();
 }
diff --git a/clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl b/clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl
index 75d9fb8582b3d..116de8eb857a4 100644
--- a/clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl
@@ -26,9 +26,9 @@ void fb(CustomResource a) {
     CustomResource b = a;
 }
 
-// CHECK: define hidden void @_Z2fcN4hlsl8RWBufferIDv4_fEE(ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %a)
-// CHECK: call void @_Z4foo2N4hlsl8RWBufferIDv4_fEE(ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %agg.tmp)
-// CHECK: declare hidden void @_Z4foo2N4hlsl8RWBufferIDv4_fEE(ptr noundef byval(%"class.hlsl::RWBuffer") align 4)
+// CHECK: define hidden void @_Z2fcN4hlsl8RWBufferIDv4_fEE(ptr dead_on_return noundef %a)
+// CHECK: call void @_Z4foo2N4hlsl8RWBufferIDv4_fEE(ptr dead_on_return noundef %{{.*}})
+// CHECK: declare hidden void @_Z4foo2N4hlsl8RWBufferIDv4_fEE(ptr dead_on_return noundef)
 void foo2(RWBuffer<float4> buf);
 
 void fc(RWBuffer<float4> a) {
@@ -44,9 +44,9 @@ struct MyStruct {
   int2 i;
 };
 
-// CHECK: define hidden void @_Z2feN4hlsl16StructuredBufferI8MyStructEE(ptr noundef byval(%"class.hlsl::StructuredBuffer") align 4 %a)
-// CHECK: call void @_Z4foo3N4hlsl16StructuredBufferI8MyStructEE(ptr noundef byval(%"class.hlsl::StructuredBuffer") align 4 %agg.tmp)
-// CHECK: declare hidden void @_Z4foo3N4hlsl16StructuredBufferI8MyStructEE(ptr noundef byval(%"class.hlsl::StructuredBuffer") align 4)
+// CHECK: define hidden void @_Z2feN4hlsl16StructuredBufferI8MyStructEE(ptr dead_on_return noundef %a)
+// CHECK: call void @_Z4foo3N4hlsl16StructuredBufferI8MyStructEE(ptr dead_on_return noundef %{{.*}})
+// CHECK: declare hidden void @_Z4foo3N4hlsl16StructuredBufferI8MyStructEE(ptr dead_on_return noundef)
 void foo3(StructuredBuffer<MyStruct> buf);
 
 void fe(StructuredBuffer<MyStruct> a) {
diff --git a/clang/test/CodeGenHLSL/debug/rwbuffer_debug_info.hlsl b/clang/test/CodeGenHLSL/debug/rwbuffer_debug_info.hlsl
index db0388e41eae9..76ea87fd534c1 100644
--- a/clang/test/CodeGenHLSL/debug/rwbuffer_debug_info.hlsl
+++ b/clang/test/CodeGenHLSL/debug/rwbuffer_debug_info.hlsl
@@ -1,11 +1,11 @@
 // RUN: %clang_cc1 -triple dxil-pc-shadermodel6.6-compute -x hlsl -emit-llvm -disable-llvm-passes -o - -hlsl-entry main %s -debug-info-kind=standalone -dwarf-version=4  | FileCheck %s
 
 
-// CHECK: [[DWTag:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "RWBuffer<float>", 
+// CHECK: [[DWTag:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "RWBuffer<float>",
+// CHECK: [[thisType:![0-9]+]] = !DIDerivedType(tag: DW_TAG_reference_type, baseType: [[DWTag]], size: 32)
 // CHECK: [[RWBuffer:![0-9]+]] = distinct !DISubprogram(name: "RWBuffer",
 // CHECK-SAME: scope: [[DWTag]]
 // CHECK: [[FirstThis:![0-9]+]] = !DILocalVariable(name: "this", arg: 1, scope: [[RWBuffer]], type: [[thisType:![0-9]+]]
-// CHECK: [[thisType]] = !DIDerivedType(tag: DW_TAG_reference_type, baseType: [[DWTag]], size: 32)
 RWBuffer<float> Out : register(u7, space4);
 
 [numthreads(8,1,1)]
diff --git a/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
index 60238cbf8eff5..0c96b73ac6cb1 100644
--- a/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
+++ b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
@@ -31,7 +31,7 @@ uint Find(Node SortedTree[MAX], uint key) {
 }
 
 // CHECK: Function Attrs:{{.*}}norecurse
-// CHECK: define noundef i1 @_Z8InitTreeA100_4NodeN4hlsl8RWBufferIDv4_jEEj(ptr noundef byval([100 x %struct.Node]) align 1 %tree, ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %encodedTree, i32 noundef %maxDepth) [[Attr:\#[0-9]+]]
+// CHECK: define noundef i1 @_Z8InitTreeA100_4NodeN4hlsl8RWBufferIDv4_jEEj(ptr noundef byval([100 x %struct.Node]) align 1 %tree, ptr dead_on_return noundef %encodedTree, i32 noundef %maxDepth) [[Attr:\#[0-9]+]]
 // CHECK: ret i1
 // Initialize tree with given buffer
 // Imagine the inout works
diff --git a/clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl b/clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl
index d803882fd2f72..e5d28d659febe 100644
--- a/clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl
+++ b/clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl
@@ -1,49 +1,65 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.6-compute -finclude-default-header \
-// RUN:   -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s
-
-// This test verifies handling of multi-dimensional local arrays of resources
-// when used as a function argument and local variable.
-
-// CHECK: @_ZL1A = internal global %"class.hlsl::RWBuffer" poison, align 4
-// CHECK: @_ZL1B = internal global %"class.hlsl::RWBuffer" poison, align 4
-
-RWBuffer<float> A : register(u10);
-RWBuffer<float> B : register(u20);
-RWStructuredBuffer<float> Out;
-
-// NOTE: _ZN4hlsl8RWBufferIfEixEj is the subscript operator for RWBuffer<float> and
-//       _ZN4hlsl18RWStructuredBufferIfEixEj is the subscript operator for RWStructuredBuffer<float>
-
-// CHECK: define {{.*}} float @_Z3fooA2_A2_N4hlsl8RWBufferIfEE(ptr noundef byval([2 x [2 x %"class.hlsl::RWBuffer"]]) align 4 %Arr)
-// CHECK-NEXT: entry:
-float foo(RWBuffer<float> Arr[2][2]) {
-// CHECK-NEXT: %[[Arr_1_Ptr:.*]] = getelementptr inbounds [2 x [2 x %"class.hlsl::RWBuffer"]], ptr %Arr, i32 0, i32 1
-// CHECK-NEXT: %[[Arr_1_1_Ptr:.*]] = getelementptr inbounds [2 x %"class.hlsl::RWBuffer"], ptr %[[Arr_1_Ptr]], i32 0, i32 1
-// CHECK-NEXT: %[[BufPtr:.*]] = call {{.*}} ptr @_ZN4hlsl8RWBufferIfEixEj(ptr {{.*}} %[[Arr_1_1_Ptr]], i32 noundef 0)
-// CHECK-NEXT: %[[Value:.*]] = load float, ptr %[[BufPtr]], align 4
-// CHECK-NEXT: ret float %[[Value]]
-  return Arr[1][1][0];
-}
-
-// CHECK: define internal void @_Z4mainv()
-// CHECK-NEXT: entry:
-[numthreads(4,1,1)]
-void main() {
-// CHECK-NEXT: %L = alloca [2 x [2 x %"class.hlsl::RWBuffer"]], align 4
-// CHECK-NEXT: %[[Tmp:.*]] = alloca [2 x [2 x %"class.hlsl::RWBuffer"]], align 4
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %L, ptr align 4 @_ZL1A, i32 4, i1 false)
-// CHECK-NEXT: %[[Ptr1:.*]] = getelementptr inbounds %"class.hlsl::RWBuffer", ptr %L, i32 1
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %[[Ptr1]], ptr align 4 @_ZL1B, i32 4, i1 false)
-// CHECK-NEXT: %[[Ptr2:.*]] = getelementptr inbounds [2 x %"class.hlsl::RWBuffer"], ptr %L, i32 1
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %[[Ptr2]], ptr align 4 @_ZL1A, i32 4, i1 false)
-// CHECK-NEXT: %[[Ptr3:.*]] = getelementptr inbounds %"class.hlsl::RWBuffer", ptr %[[Ptr2]], i32 1
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %[[Ptr3]], ptr align 4 @_ZL1B, i32 4, i1 false)
-  RWBuffer<float> L[2][2] = { { A, B }, { A, B } };
-
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %[[Tmp]], ptr align 4 %L, i32 16, i1 false)
-// CHECK-NEXT: %[[ReturnedValue:.*]] = call {{.*}}float @_Z3fooA2_A2_N4hlsl8RWBufferIfEE(ptr noundef byval([2 x [2 x %"class.hlsl::RWBuffer"]]) align 4 %[[Tmp]])
-// CHECK-NEXT: %[[OutBufPtr:.*]] = call {{.*}} ptr @_ZN4hlsl18RWStructuredBufferIfEixEj(ptr {{.*}} @_ZL3Out, i32 noundef 0)
-// CHECK-NEXT: store float %[[ReturnedValue]], ptr %[[OutBufPtr]], align 4
-// CHECK-NEXT: ret void
-  Out[0] = foo(L);
-}
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.6-compute -finclude-default-header \
+// RUN:   -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s
+
+// This test verifies handling of multi-dimensional local arrays of resources
+// when used as a function argument and local variable.
+
+// CHECK: @_ZL1A = internal global %"class.hlsl::RWBuffer" poison, align 4
+// CHECK: @_ZL1B = internal global %"class.hlsl::RWBuffer" poison, align 4
+
+RWBuffer<float> A : register(u10);
+RWBuffer<float> B : register(u20);
+RWStructuredBuffer<float> Out;
+
+// NOTE: _ZN4hlsl8RWBufferIfEixEj is the subscript operator for RWBuffer<float> and
+//       _ZN4hlsl18RWStructuredBufferIfEixEj is the subscript operator for RWStructuredBuffer<float>
+
+// CHECK: define {{.*}} float @_Z3fooA2_A2_N4hlsl8RWBufferIfEE(ptr noundef byval([2 x [2 x %"class.hlsl::RWBuffer"]]) align 4 %Arr)
+// CHECK-NEXT: entry:
+float foo(RWBuffer<float> Arr[2][2]) {
+// CHECK-NEXT: %[[Arr_1_Ptr:.*]] = getelementptr inbounds [2 x [2 x %"class.hlsl::RWBuffer"]], ptr %Arr, i32 0, i32 1
+// CHECK-NEXT: %[[Arr_1_1_Ptr:.*]] = getelementptr inbounds [2 x %"class.hlsl::RWBuffer"], ptr %[[Arr_1_Ptr]], i32 0, i32 1
+// CHECK-NEXT: %[[BufPtr:.*]] = call {{.*}} ptr @_ZN4hlsl8RWBufferIfEixEj(ptr {{.*}} %[[Arr_1_1_Ptr]], i32 noundef 0)
+// CHECK-NEXT: %[[Value:.*]] = load float, ptr %[[BufPtr]], align 4
+// CHECK-NEXT: ret float %[[Value]]
+  return Arr[1][1][0];
+}
+
+// CHECK: define internal void @_Z4mainv()
+// CHECK-NEXT: entry:
+[numthreads(4,1,1)]
+void main() {
+// CHECK: %L = alloca [2 x [2 x %"class.hlsl::RWBuffer"]], align 4
+// CHECK: %[[ref_tmp:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp1:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp2:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp3:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp4:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp5:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp6:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[ref_tmp7:.*]] = alloca %"class.hlsl::RWBuffer", align 4
+// CHECK: %[[agg_tmp:.*]] = alloca [2 x [2 x %"class.hlsl::RWBuffer"]], align 4
+// CHECK: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr {{.*}} %[[ref_tmp]], ptr {{.*}} @_ZL1A)
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp1]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp]])
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp2]], ptr noundef nonnull align 4 dereferenceable(4) @_ZL1B)
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp3]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp2]])
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp4]], ptr noundef nonnull align 4 dereferenceable(4) @_ZL1A)
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp5]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp4]])
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp6]], ptr noundef nonnull align 4 dereferenceable(4) @_ZL1B)
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp7]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp6]])
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %L, ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp1]])
+// CHECK-NEXT: %[[arrayinit_element:.*]] = getelementptr inbounds %"class.hlsl::RWBuffer", ptr %L, i32 1
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[arrayinit_element]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp3]])
+// CHECK-NEXT: %[[arrayinit_element8:.*]] = getelementptr inbounds [2 x %"class.hlsl::RWBuffer"], ptr %L, i32 1
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[arrayinit_element8]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp5]])
+// CHECK-NEXT: %[[arrayinit_element9:.*]] = getelementptr inbounds %"class.hlsl::RWBuffer", ptr %[[arrayinit_element8]], i32 1
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIfEC1ERKS1_(ptr noundef nonnull align 4 dereferenceable(4) %[[arrayinit_element9]], ptr noundef nonnull align 4 dereferenceable(4) %[[ref_tmp7]])
+  RWBuffer<float> L[2][2] = { { A, B }, { A, B } };
+
+// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %[[agg_tmp]], ptr align 4 %L, i32 16, i1 false)
+// CHECK-NEXT: %[[ReturnedValue:.*]] = call {{.*}}float @_Z3fooA2_A2_N4hlsl8RWBufferIfEE(ptr noundef byval([2 x [2 x %"class.hlsl::RWBuffer"]]) align 4 %[[agg_tmp]])
+// CHECK-NEXT: %[[OutBufPtr:.*]] = call {{.*}} ptr @_ZN4hlsl18RWStructuredBufferIfEixEj(ptr {{.*}} @_ZL3Out, i32 noundef 0)
+// CHECK-NEXT: store float %[[ReturnedValue]], ptr %[[OutBufPtr]], align 4
+// CHECK-NEXT: ret void
+  Out[0] = foo(L);
+}
diff --git a/clang/test/CodeGenHLSL/resources/res-array-local1.hlsl b/clang/test/CodeGenHLSL/resources/res-array-local1.hlsl
index c0d508b1395c3..9e31f4f150c52 100644
--- a/clang/test/CodeGenHLSL/resources/res-array-local1.hlsl
+++ b/clang/test/CodeGenHLSL/resources/res-array-local1.hlsl
@@ -1,64 +1,64 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.6-compute -finclude-default-header \
-// RUN:   -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s
-
-// This test verifies local arrays of resources in HLSL.
-
-// CHECK: @_ZL1A = internal global %"class.hlsl::RWBuffer" poison, align 4
-// CHECK: @_ZL1B = internal global %"class.hlsl::RWBuffer" poison, align 4
-// CHECK: @_ZL1C = internal global %"class.hlsl::RWBuffer" poison, align 4
-
-RWBuffer<float> A : register(u1);
-RWBuffer<float> B : register(u2);
-RWBuffer<float> C : register(u3);
-RWStructuredBuffer<float> Out : register(u0);
-
-// CHECK: define internal void @_Z4mainv()
-// CHECK-NEXT: entry:
-[numthreads(4,1,1)]
-void main() {
-// CHECK-NEXT:  %First = alloca [3 x %"class.hlsl::RWBuffer"], align 4
-// CHECK-NEXT:  %Second = alloca [4 x %"class.hlsl::RWBuffer"], align 4
-  RWBuffer<float> First[3] = { A, B, C };
-  RWBuffer<float> Second[4];
-
-// Verify initialization of First array from an initialization list
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 %First, ptr align 4 @_ZL1A, i32 4, i1 false)
-// CHECK-NEXT: %[[Ptr1:.*]] = getelementptr inbounds %"class.hlsl::...
[truncated]

s-perron added a commit to s-perron/llvm-project that referenced this pull request Sep 3, 2025
The OpaqueValueVisitor was not correctly traversing the AST to find all
OpaqueValueExprs. This resulted in some expressions not being correctly
initialized. This change fixes the visitor to correctly traverse the AST.

A test for this change will be included as part of the work in
llvm#156075.

See discussion in
https://github.com/llvm/llvm-project/pull/156075/files#r2317231524.
@s-perron s-perron requested a review from hekota September 4, 2025 14:55
Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly minor comments

# Conflicts:
#	clang/test/CodeGenHLSL/resources/res-array-local-multi-dim.hlsl
#	clang/test/CodeGenHLSL/resources/res-array-local1.hlsl
#	clang/test/CodeGenHLSL/resources/res-array-local3.hlsl
Copy link
Member

@hekota hekota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@s-perron s-perron enabled auto-merge (squash) September 6, 2025 11:23
@s-perron s-perron merged commit 54ed459 into llvm:main Sep 6, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SPIR-V] Hitting llvm_unreachable in SPIRVLegalizePointerCast when compiling local resource arrays
4 participants