@ruiling ruiling commented Aug 12, 2025

Our GPU compiler usually constructs pointers through inttoptr. The memory is pre-allocated before the shader function executes and remains valid throughout its execution. This restores the expected instruction hoisting in the test hoist-speculatable-load.ll, which was broken by #126117.


llvmbot commented Aug 12, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ruiling, Song (ruiling)

Changes

Our GPU compiler usually constructs pointers through inttoptr. The memory is pre-allocated before the shader function executes and remains valid throughout its execution. This restores the expected instruction hoisting in the test hoist-speculatable-load.ll, which was broken by #126117.


Full diff: https://github.com/llvm/llvm-project/pull/153149.diff

6 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+15-1)
  • (modified) llvm/include/llvm/IR/FixedMetadataKinds.def (+1)
  • (modified) llvm/lib/IR/Value.cpp (+3)
  • (modified) llvm/lib/IR/Verifier.cpp (+13)
  • (modified) llvm/test/Transforms/LICM/hoist-speculatable-load.ll (+3-3)
  • (added) llvm/test/Verifier/nofree_metadata.ll (+15)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 162208fb1c81c..2c22190079e28 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -8481,6 +8481,14 @@ specific. The behavior is undefined if the runtime memory address does
 resolve to an object defined in one of the indicated address spaces.
 
 
+'``nofree``' Metadata
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``nofree`` metadata indicates that the memory pointed to by the pointer will not be
+freed during the execution of the function. This is analogous to the ``nofree``
+function argument attribute.
+
+
 Module Flags Metadata
 =====================
 
@@ -12592,7 +12600,7 @@ Syntax:
 
 ::
 
-      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
+      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !nofree !<empty_node>]            ; yields ty2
 
 Overview:
 """""""""
@@ -12617,6 +12625,12 @@ metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
 ``i64`` entry.
 See ``dereferenceable_or_null`` metadata.
 
+The optional ``!nofree`` metadata must reference a single metadata name
+``<empty_node>`` corresponding to a metadata node with no entries.
+The existence of the ``!nofree`` metadata on the instruction tells the optimizer
+that the memory pointed by the pointer will not be freed during the execution of
+the function.
+
 Semantics:
 """"""""""
 
diff --git a/llvm/include/llvm/IR/FixedMetadataKinds.def b/llvm/include/llvm/IR/FixedMetadataKinds.def
index 90276eae13e4b..d09cc15d65ff6 100644
--- a/llvm/include/llvm/IR/FixedMetadataKinds.def
+++ b/llvm/include/llvm/IR/FixedMetadataKinds.def
@@ -54,3 +54,4 @@ LLVM_FIXED_MD_KIND(MD_coro_outside_frame, "coro.outside.frame", 39)
 LLVM_FIXED_MD_KIND(MD_mmra, "mmra", 40)
 LLVM_FIXED_MD_KIND(MD_noalias_addrspace, "noalias.addrspace", 41)
 LLVM_FIXED_MD_KIND(MD_callee_type, "callee_type", 42)
+LLVM_FIXED_MD_KIND(MD_nofree, "nofree", 43)
diff --git a/llvm/lib/IR/Value.cpp b/llvm/lib/IR/Value.cpp
index 5928c89029b87..4e8f359481b81 100644
--- a/llvm/lib/IR/Value.cpp
+++ b/llvm/lib/IR/Value.cpp
@@ -836,6 +836,9 @@ bool Value::canBeFreed() const {
       return false;
   }
 
+  if (isa<IntToPtrInst>(this) && getMetadata(LLVMContext::MD_nofree))
+    return false;
+
   const Function *F = nullptr;
   if (auto *I = dyn_cast<Instruction>(this))
     F = I->getFunction();
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 1d3c379f461fa..70349e0000796 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -527,6 +527,7 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport {
   void visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty);
   void visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range, Type *Ty);
   void visitDereferenceableMetadata(Instruction &I, MDNode *MD);
+  void visitNofreeMetadata(Instruction &I, MDNode *MD);
   void visitProfMetadata(Instruction &I, MDNode *MD);
   void visitCallStackMetadata(MDNode *MD);
   void visitMemProfMetadata(Instruction &I, MDNode *MD);
@@ -5022,6 +5023,15 @@ void Verifier::visitDereferenceableMetadata(Instruction& I, MDNode* MD) {
         &I);
 }
 
+void Verifier::visitNofreeMetadata(Instruction &I, MDNode *MD) {
+  Check(I.getType()->isPointerTy(), "nofree applies only to pointer types", &I);
+  Check((isa<IntToPtrInst>(I)),
+        "nofree applies only to inttoptr instruction,"
+        " use attributes for calls or invokes",
+        &I);
+  Check(MD->getNumOperands() == 0, "nofree metadata must be empty", &I);
+}
+
 void Verifier::visitProfMetadata(Instruction &I, MDNode *MD) {
   auto GetBranchingTerminatorNumOperands = [&]() {
     unsigned ExpectedNumOperands = 0;
@@ -5497,6 +5507,9 @@ void Verifier::visitInstruction(Instruction &I) {
   if (MDNode *MD = I.getMetadata(LLVMContext::MD_dereferenceable_or_null))
     visitDereferenceableMetadata(I, MD);
 
+  if (MDNode *MD = I.getMetadata(LLVMContext::MD_nofree))
+    visitNofreeMetadata(I, MD);
+
   if (MDNode *TBAA = I.getMetadata(LLVMContext::MD_tbaa))
     TBAAVerifyHelper.visitTBAAMetadata(I, TBAA);
 
diff --git a/llvm/test/Transforms/LICM/hoist-speculatable-load.ll b/llvm/test/Transforms/LICM/hoist-speculatable-load.ll
index a4a38c2eaadc3..31236e8f29d60 100644
--- a/llvm/test/Transforms/LICM/hoist-speculatable-load.ll
+++ b/llvm/test/Transforms/LICM/hoist-speculatable-load.ll
@@ -4,19 +4,19 @@
 define void @f(i32 %ptr_i, ptr %ptr2, i1 %cond) {
 ; CHECK-LABEL: @f(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[PTR:%.*]] = inttoptr i32 [[PTR_I:%.*]] to ptr
+; CHECK-NEXT:    [[PTR:%.*]] = inttoptr i32 [[PTR_I:%.*]] to ptr, !nofree [[META0:![0-9]+]]
 ; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(ptr [[PTR]], i32 16), "dereferenceable"(ptr [[PTR]], i32 16) ]
 ; CHECK-NEXT:    br i1 [[COND:%.*]], label [[FOR_BODY_LR_PH:%.*]], label [[IF0:%.*]]
 ; CHECK:       if0:
 ; CHECK-NEXT:    store i32 0, ptr [[PTR2:%.*]], align 4
 ; CHECK-NEXT:    br label [[FOR_BODY_LR_PH]]
 ; CHECK:       for.body.lr.ph:
+; CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[PTR]], align 4
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[I_08:%.*]] = phi i32 [ 0, [[FOR_BODY_LR_PH]] ], [ [[INC:%.*]], [[IF_END:%.*]] ]
 ; CHECK-NEXT:    br i1 [[COND]], label [[IF_END]], label [[IF:%.*]]
 ; CHECK:       if:
-; CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[PTR]], align 4, !invariant.load [[META0:![0-9]+]]
 ; CHECK-NEXT:    store i32 [[TMP0]], ptr [[PTR2]], align 4
 ; CHECK-NEXT:    br label [[IF_END]]
 ; CHECK:       if.end:
@@ -27,7 +27,7 @@ define void @f(i32 %ptr_i, ptr %ptr2, i1 %cond) {
 ; CHECK-NEXT:    ret void
 ;
 entry:
-  %ptr = inttoptr i32 %ptr_i to ptr
+  %ptr = inttoptr i32 %ptr_i to ptr, !nofree !{}
   call void @llvm.assume(i1 true) [ "align"(ptr %ptr, i32 16), "dereferenceable"(ptr %ptr, i32 16) ]
   br i1 %cond, label %for.body.lr.ph, label %if0
 
diff --git a/llvm/test/Verifier/nofree_metadata.ll b/llvm/test/Verifier/nofree_metadata.ll
new file mode 100644
index 0000000000000..e4db00987355a
--- /dev/null
+++ b/llvm/test/Verifier/nofree_metadata.ll
@@ -0,0 +1,15 @@
+; RUN: not llvm-as < %s 2>&1 | FileCheck %s
+
+declare ptr @dummy()
+
+; CHECK: nofree applies only to inttoptr instruction, use attributes for calls or invokes
+define void @test_not_inttoptr() {
+  call ptr @dummy(), !nofree !{}
+  ret void
+}
+
+; CHECK: nofree metadata must be empty
+define void @test_invalid_arg(i32 %p) {
+  inttoptr i32 %p to ptr, !nofree !{i32 0}
+  ret void
+}
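
As a rough illustration of the rules this diff adds, the following standalone C++ sketch models the three verifier checks and the early exit in `Value::canBeFreed`. It does not use LLVM's API: `ToyInst`, `Opcode`, `verifyNofree`, and the toy `canBeFreed` are hypothetical names invented for this sketch, the error strings are shortened, and the other cases the real `canBeFreed` handles are collapsed into a conservative `true`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction; this is NOT llvm::Instruction.
enum class Opcode { IntToPtr, Call, Load };

struct ToyInst {
  Opcode op;
  bool producesPointer;     // result type is a pointer
  bool hasNofree;           // a !nofree node is attached
  unsigned nofreeOperands;  // operand count of the attached node
};

// Models the three checks in visitNofreeMetadata, collecting messages
// instead of reporting through the verifier.
std::vector<std::string> verifyNofree(const ToyInst &I) {
  std::vector<std::string> errs;
  if (!I.hasNofree)
    return errs;  // nothing to verify
  if (!I.producesPointer)
    errs.push_back("nofree applies only to pointer types");
  if (I.op != Opcode::IntToPtr)
    errs.push_back("nofree applies only to inttoptr instruction");
  if (I.nofreeOperands != 0)
    errs.push_back("nofree metadata must be empty");
  return errs;
}

// Models only the early exit added to Value::canBeFreed. Everything else is
// treated as "may be freed", which is a simplification made for this sketch.
bool canBeFreed(const ToyInst &I) {
  if (I.op == Opcode::IntToPtr && I.hasNofree)
    return false;
  return true;
}
```

In this model, an `inttoptr` with an empty `!nofree` node passes verification and reports that it cannot be freed, while a `call` carrying `!nofree` or an `inttoptr` with a non-empty node is rejected, matching the two cases in `nofree_metadata.ll`.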


llvmbot commented Aug 12, 2025

@llvm/pr-subscribers-llvm-ir


ruiling commented Aug 19, 2025

ping

``<empty_node>`` corresponding to a metadata node with no entries.
The existence of the ``!nofree`` metadata on the instruction tells the optimizer
that the memory pointed by the pointer will not be freed during the execution of
the function.
Contributor

"will not be freed during the execution of the function" seems like a problematic definition to me: Doesn't this mean that this metadata must be dropped during inlining, otherwise it may extend the scope that "nofree" applies to?

Contributor Author

Yes, good thinking! Our use case is actually a GPU entry-point function, which is already fully inlined and will not itself be inlined into other functions. Is there a good way to address the concern?

Contributor Author

Is it OK to say: "The memory pointed by the pointer will not be freed through the lifetime of the pointer"? I think that best matches our need and would not introduce the issue you mentioned.

Contributor

> Is it ok to say: "The memory pointed by the pointer will not be freed through the lifetime of the pointer"?

Assuming "lifetime of the pointer" is the time between allocation and free, this would always be true :)

I'm having trouble coming up with a good definition for this. Possibly we can just say "will not be freed after this point"? I guess that's true enough from the perspective of the device code.
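
To make the difference between the two wordings concrete, here is a small standalone C++ sketch over a toy linear trace of executed instructions. It is only a model under stated assumptions: `Trace`, `holdsFunctionScoped`, `holdsAfterThisPoint`, and `inlineIntoCaller` are hypothetical names, positions are plain indices, and inlining is modeled simply as the enclosing function's span growing to the caller's span.

```cpp
#include <cstddef>

// Toy model: positions are indices into a linear trace of executed
// instructions. None of this is LLVM API.
struct Trace {
  std::size_t funcBegin;  // start of the function containing the inttoptr
  std::size_t funcEnd;    // one past the end of that function
  std::size_t annotated;  // position of the inttoptr carrying !nofree
  bool hasFree;           // whether the memory is ever freed in the trace
  std::size_t freeAt;     // position of the free (meaningful if hasFree)
};

// Wording A: "will not be freed during the execution of the function".
bool holdsFunctionScoped(const Trace &t) {
  return !t.hasFree || t.freeAt < t.funcBegin || t.freeAt >= t.funcEnd;
}

// Wording B: "will not be freed after this point".
bool holdsAfterThisPoint(const Trace &t) {
  return !t.hasFree || t.freeAt < t.annotated;
}

// Inlining modeled as the enclosing function becoming the caller.
Trace inlineIntoCaller(Trace t, std::size_t callerBegin,
                       std::size_t callerEnd) {
  t.funcBegin = callerBegin;
  t.funcEnd = callerEnd;
  return t;
}
```

With a callee spanning positions 10 to 20, the annotated instruction at 12, and a free in the caller at 25, wording A holds before inlining and stops holding once the callee is inlined into a caller spanning 0 to 30, which is why the metadata would have to be dropped. Wording B gives the same answer before and after inlining, and both wordings hold when the memory is never freed in the trace, as in the pre-allocated GPU case.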

Contributor Author

"will not be freed after this point" is much simpler and clearer. Yes, the memory is preallocated by the host and stays alive until the device code finishes execution.


jayfoad commented Aug 20, 2025

> This brings back the expected behavior of instruction hoisting for the test hoist-speculatable-load.ll, which was broken by #126117.

See also #126117 (comment) - I don't think #126117 really needs nofree.


ruiling commented Aug 21, 2025

> > This brings back the expected behavior of instruction hoisting for the test hoist-speculatable-load.ll, which was broken by #126117.
>
> See also #126117 (comment) - I don't think #126117 really needs nofree.

It's about not breaking some corner case (#120962 (comment)).


ruiling commented Aug 27, 2025

ping


@nikic nikic left a comment


LGTM, but I'd appreciate a second opinion on this one.

^^^^^^^^^^^^^^^^^^^^^

The ``nofree`` metadata indicates the memory pointed by the pointer will not be
freed after the attached instruction. This is analogous to the ``nofree``
Contributor

I'd probably drop the mention of the nofree attribute here, as that one has different semantics (function-local).

Check(I.getType()->isPointerTy(), "nofree applies only to pointer types", &I);
Check((isa<IntToPtrInst>(I)),
"nofree applies only to inttoptr instruction,"
" use attributes for calls or invokes",
Contributor

Call/invoke do not support a nofree return value attribute right now (only param or function), so I don't think we should mention it for now.
