[RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags #152121

mylai-mtk · 2025-08-05T11:06:22Z

Expected Behavior:

Stop from adopting control flow integrity (CFI) behaviors, e.g. generating LPAD/LUI insns, use software guarded jumps, etc., by testing if the target has the Zicfilp extension and instead, do CFI if the LLVM module has all of these flags:

cf-protection-branch: Needs to be a non-zero integer (which means true)
cf-branch-label-scheme: Needs to be unlabeled or func-sig
- // LPAD generation is only supported with unlabeled as of now

Context:

In clang, Zicfilp-based CFI (the unlabeled scheme) can now be enabled by giving the -fcf-protection=branch -mcf-branch-label-scheme=unlabeled options. With these options, the clang frontend adds the above-mentioned flags to LLVM modules. Here we want to align compiler CFI behaviors to be enabled by the semantics of those LLVM module flags, instead of relying on the inaccurate indicator of whether the Zicfilp extension is available, so the toolchain's behavior is more streamlined and expected.

Also, since LPAD insns can be executed regardless of whether Zicfilp extension is available on target or not (due to LPAD insn being encoded as a standard hint insn), clang accepts the above-mentioned CLI options even if Zicfilp is not available and expects backend to still generate LPAD insns. This patch enables LPAD insn generation and other CFI-tailored compiler changes in such cases.

Expected Behavior: Stop codegening LPAD insns by testing if the target has the Zicfilp extension and instead, codegen LPAD insns if the LLVM module has all of these flags: + `cf-protection-branch`: Needs to be a non-zero integer (which means `true`) + `cf-branch-label-scheme`: Needs to be `unlabeled` Context: In clang, Zicfilp-based control flow integrity (the `unlabeled` scheme) can now be enabled by giving the `-fcf-protection=branch -mcf-branch-label-scheme=unlabeled` options. With these options, the clang frontend adds the above-mentioned flags to LLVM modules. Here we want to align LPAD insn codegen to be enabled by the semantics of those LLVM module flags, instead of relying on the inaccurate indicator of whether the Zicfilp extension is available, so the toolchain's behavior is more streamlined and expected. Also, since LPAD insns can be executed regardless of whether Zicfilp is available in target or not (due to LPAD insn being encoded as a standard hint insn), clang accepts the above-mentioned CLI options even if Zicfilp is not enabled and expects backend to still generate LPAD insns. This patch enables LPAD insn generation in such cases.

llvmbot · 2025-08-05T11:06:54Z

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-backend-risc-v

Author: Ming-Yi Lai (mylai-mtk)

Changes

Expected Behavior:

Stop codegening LPAD insns by testing if the target has the Zicfilp extension and instead, codegen LPAD insns if the LLVM module has all of these flags:

cf-protection-branch: Needs to be a non-zero integer (which means true)
cf-branch-label-scheme: Needs to be unlabeled

Context:

In clang, Zicfilp-based control flow integrity (the unlabeled scheme) can now be enabled by giving the -fcf-protection=branch -mcf-branch-label-scheme=unlabeled options. With these options, the clang frontend adds the above-mentioned flags to LLVM modules. Here we want to align LPAD insn codegen to be enabled by the semantics of those LLVM module flags, instead of relying on the inaccurate indicator of whether the Zicfilp extension is available, so the toolchain's behavior is more streamlined and expected.

Also, since LPAD insns can be executed regardless of whether Zicfilp is available in target or not (due to LPAD insn being encoded as a standard hint insn), clang accepts the above-mentioned CLI options even if Zicfilp is not enabled and expects backend to still generate LPAD insns. This patch enables LPAD insn generation in such cases.

Full diff: https://github.com/llvm/llvm-project/pull/152121.diff

3 Files Affected:

(modified) llvm/include/llvm/Support/RISCVISAUtils.h (+19)
(modified) llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp (+20-1)
(modified) llvm/test/CodeGen/RISCV/lpad.ll (+5)

diff --git a/llvm/include/llvm/Support/RISCVISAUtils.h b/llvm/include/llvm/Support/RISCVISAUtils.h
index 165bb08d66431..e2dd833de74fd 100644
--- a/llvm/include/llvm/Support/RISCVISAUtils.h
+++ b/llvm/include/llvm/Support/RISCVISAUtils.h
@@ -14,7 +14,9 @@
 #define LLVM_SUPPORT_RISCVISAUTILS_H
 
 #include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/Support/Compiler.h"
+#include <cassert>
 #include <map>
 #include <string>
 
@@ -43,6 +45,23 @@ struct ExtensionComparator {
 typedef std::map<std::string, ExtensionVersion, ExtensionComparator>
     OrderedExtensionMap;
 
+enum class ZicfilpLabelSchemeKind {
+  Invalid,
+  Unlabeled,
+  FuncSig,
+};
+
+// See clang::getCFBranchLabelSchemeFlagVal() for possible CFBranchLabelScheme
+inline ZicfilpLabelSchemeKind
+getZicfilpLabelScheme(const StringRef CFBranchLabelScheme) {
+  const auto Ret = StringSwitch<ZicfilpLabelSchemeKind>(CFBranchLabelScheme)
+                       .Case("unlabeled", ZicfilpLabelSchemeKind::Unlabeled)
+                       .Case("func-sig", ZicfilpLabelSchemeKind::FuncSig)
+                       .Default(ZicfilpLabelSchemeKind::Invalid);
+  assert(Ret != ZicfilpLabelSchemeKind::Invalid);
+  return Ret;
+}
+
 } // namespace RISCVISAUtils
 
 } // namespace llvm
diff --git a/llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp b/llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp
index 43621b8f0f33d..554f7d5494731 100644
--- a/llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp
+++ b/llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp
@@ -19,11 +19,14 @@
 #include "llvm/CodeGen/MachineFunctionPass.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Support/RISCVISAUtils.h"
 
 #define DEBUG_TYPE "riscv-indrect-branch-tracking"
 #define PASS_NAME "RISC-V Indirect Branch Tracking"
 
 using namespace llvm;
+using namespace llvm::RISCVISAUtils;
 
 cl::opt<uint32_t> PreferredLandingPadLabel(
     "riscv-landing-pad-label", cl::ReallyHidden,
@@ -64,9 +67,25 @@ static void emitLpad(MachineBasicBlock &MBB, const RISCVInstrInfo *TII,
 bool RISCVIndirectBranchTracking::runOnMachineFunction(MachineFunction &MF) {
   const auto &Subtarget = MF.getSubtarget<RISCVSubtarget>();
   const RISCVInstrInfo *TII = Subtarget.getInstrInfo();
-  if (!Subtarget.hasStdExtZicfilp())
+
+  const Module *const M = MF.getFunction().getParent();
+  if (!M)
+    return false;
+  if (const Metadata *const Flag = M->getModuleFlag("cf-protection-branch");
+      !Flag || mdconst::extract<ConstantInt>(Flag)->isZero())
     return false;
 
+  StringRef CFBranchLabelScheme;
+  if (const Metadata *const MD = M->getModuleFlag("cf-branch-label-scheme"))
+    CFBranchLabelScheme = cast<MDString>(MD)->getString();
+  else
+    report_fatal_error("missing cf-branch-label-scheme module flag");
+
+  const ZicfilpLabelSchemeKind Scheme =
+      getZicfilpLabelScheme(CFBranchLabelScheme);
+  if (Scheme != ZicfilpLabelSchemeKind::Unlabeled)
+    report_fatal_error("unsupported cf-branch-label-scheme module flag");
+
   uint32_t FixedLabel = 0;
   if (PreferredLandingPadLabel.getNumOccurrences() > 0) {
     if (!isUInt<20>(PreferredLandingPadLabel))
diff --git a/llvm/test/CodeGen/RISCV/lpad.ll b/llvm/test/CodeGen/RISCV/lpad.ll
index 93eda6f10eedb..f69b4dad068b5 100644
--- a/llvm/test/CodeGen/RISCV/lpad.ll
+++ b/llvm/test/CodeGen/RISCV/lpad.ll
@@ -289,3 +289,8 @@ define void @interrupt() "interrupt"="machine" {
 ; FIXED-ONE-NEXT:    mret
   ret void
 }
+
+!llvm.module.flags = !{!0, !1}
+
+!0 = !{i32 8, !"cf-protection-branch", i32 1}
+!1 = !{i32 1, !"cf-branch-label-scheme", !"unlabeled"}

llvm/include/llvm/Support/RISCVISAUtils.h

topperc · 2025-08-05T15:28:01Z

llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp

+  if (const Metadata *const MD = M->getModuleFlag("cf-branch-label-scheme"))
+    CFBranchLabelScheme = cast<MDString>(MD)->getString();
+  else
+    report_fatal_error("missing cf-branch-label-scheme module flag");


Use reportFatalUsageError which won't generate a stack trace.

This one is inlined to somewhere (getSubtargetImpl()) that have multiple possible call paths, so I think it's better to print stack for this one.

llvm/lib/Target/RISCV/RISCVIndirectBranchTracking.cpp

topperc · 2025-08-05T21:32:44Z

Do you plan to also address the code that uses Subtarget.hasStdExtZicfilp() in RISCVISelLowering and these patterns in RISCVInstrInfo.td

let Predicates = [NoStdExtZicfilp],                                              
    isBarrier = 1, isBranch = 1, isIndirectBranch = 1, isTerminator = 1 in       
def PseudoBRIND : Pseudo<(outs), (ins GPRJALR:$rs1, simm12:$imm12), []>,         
                  PseudoInstExpansion<(JALR X0, GPR:$rs1, simm12:$imm12)>;       
                                                                                 
let Predicates = [HasStdExtZicfilp],                                             
    isBarrier = 1, isBranch = 1, isIndirectBranch = 1, isTerminator = 1 in {     
def PseudoBRINDNonX7 : Pseudo<(outs), (ins GPRJALRNonX7:$rs1, simm12:$imm12), []>,
                       PseudoInstExpansion<(JALR X0, GPR:$rs1, simm12:$imm12)>;  
def PseudoBRINDX7 : Pseudo<(outs), (ins GPRX7:$rs1, simm12:$imm12), []>,         
                    PseudoInstExpansion<(JALR X0, GPR:$rs1, simm12:$imm12)>;     
}                                                                                
                                                                                 
// For Zicfilp, need to avoid using X7/T2 for indirect branches which need       
// landing pad.                                                                  
let Predicates = [HasStdExtZicfilp] in {                                         
def : Pat<(brind GPRJALRNonX7:$rs1), (PseudoBRINDNonX7 GPRJALRNonX7:$rs1, 0)>;   
def : Pat<(brind (add GPRJALRNonX7:$rs1, simm12:$imm12)),                        
          (PseudoBRINDNonX7 GPRJALRNonX7:$rs1, simm12:$imm12)>;                  
                                                                                 
def : Pat<(riscv_sw_guarded_brind GPRX7:$rs1), (PseudoBRINDX7 GPRX7:$rs1, 0)>;   
def : Pat<(riscv_sw_guarded_brind (add GPRX7:$rs1, simm12:$imm12)),              
          (PseudoBRINDX7 GPRX7:$rs1, simm12:$imm12)>;                            
}                                                                                
                                                                                 
let Predicates = [NoStdExtZicfilp] in {                                          
def : Pat<(brind GPRJALR:$rs1), (PseudoBRIND GPRJALR:$rs1, 0)>;                  
def : Pat<(brind (add GPRJALR:$rs1, simm12:$imm12)),                             
          (PseudoBRIND GPRJALR:$rs1, simm12:$imm12)>;                            
}

…adoption instead of Zicfilp extension availability

…re adoption instead of Zicfilp extension availability

…p CFI feature adoption instead of Zicfilp extension availability

…racking pass

mylai-mtk · 2025-08-07T08:48:13Z

@topperc

Do you plan to also address the code that uses Subtarget.hasStdExtZicfilp() in RISCVISelLowering and these patterns in RISCVInstrInfo.td

It looks like I have to, otherwise tests would just fail. As a result, I reviewed all behaviors that depended on hasStdExtZicfilp() and changed them to hasZicfilpCFI(), which is a new predicate also defined on the RISCVSubtarget class but has its returned value determined solely by the Function's (Module's) cf flags (Zicfilp ext availability is not taken into consideration). I guess I would need to rewrite the PR commit messages as the modified scope is now significantly larger.

llvm/lib/Target/RISCV/RISCVSubtarget.h

lenary · 2025-08-07T22:12:04Z

As soon as you have module flags, you need to ensure that when LTO merges two modules, that you get the right errors or merge behaviour.

I understand that, unlike subtarget features, when linking all modules (and functions inside that module) need the same protection configuration, right? This semantic is probably the easiest to implement.

Please write a test for this and put it in llvm/test/LTO/RISCV

mylai-mtk · 2025-08-08T10:02:11Z

@lenary

you need to ensure that when LTO merges two modules, that you get the right errors or merge behaviour. ... Please write a test for this ...

Thanks for pointing this out. I had added the LTO tests.

mylai-mtk · 2025-08-21T08:48:58Z

Update: Model ZICFILP-func-sig and ZICFILP-unlabeled as real target features

Reason: During my prototyping of some ELF header field emission (cf. riscv-non-isa/riscv-elf-psabi-doc#474), and I found that in the emission function only "real" target features could be accessed, so I think it's better to just consolidate the module flags into real target features.

…sZicfilpCFI() even if new Subtarget is not created

lenary

I think this is broadly correct, but I would like another reviewer.

My comments below are details, not a fundamental problem with the overall approach.

lenary · 2025-08-22T19:06:45Z

llvm/lib/Target/RISCV/RISCVFeatures.td

+def FeatureZicfilpUnlabeled
+    : SubtargetFeature<
+          "zicfilp-unlabeled", "HasZicfilpUnlabeled", "true",
+          "Enforce forward-edge control-flow integrity with ZICFILP-unlabeled">;
+def FeatureZicfilpFuncSig
+    : SubtargetFeature<
+          "zicfilp-func-sig", "HasZicfilpFuncSig", "true",
+          "Enforce forward-edge control-flow integrity with ZICFILP-func-sig">;


You can do a target feature that's an enum, not a boolean. See the RISCVProcFamily enum for how this is done - fewer methods are automatically generated, but it's clearer IMO.

I'm not sure if the "enum" (or in the TableGen backend implementation, "integer") is a proper fit here. It looks like the integer version of SubtargetFeature actually models features with "leveled" semantics, e.g. -target-feature +feature-10 would override -target-feature +feature-0. This feels wrong as the leveled semantics does not suit ZicfilpCFI in that different options should err instead of override. The current TableGen usages of this integer-based SubtargetFeature may work, but I don't think its semantics guarantees future soundness.

// Value - Value the XXXSubtarget field to be set to by feature.
//
// A value of "true" or "false" implies the field is a bool. Otherwise,
// it is assumed to be an integer. the integer value may be the name of an
// enum constant. If multiple features use the same integer field, the
// field will be set to the maximum value of all enabled features that
// share the field.
//
string Value = v;

(

llvm-project/llvm/include/llvm/Target/Target.td

Line 499 in 0263c15

// Value - Value the XXXSubtarget field to be set to by feature.

)

llvm/lib/Target/RISCV/RISCVFeatures.td

llvm/test/CodeGen/RISCV/zicfilp-disabled-indirect-branch.ll

mylai-mtk · 2025-08-25T12:08:38Z

Hi, I have been modeling the cf-protection module flags as target features, and I found some problems so I'm not sure whether I should continue pursuing this path:

By modeling Zicfilp CFI module flags as subtarget features, I hope the following could be achieved:

Provide easy access to whether Zicfilp CFI is enabled.

-> References to MCSubtargetInfo is more widely available throughout the code base; Modules, on the hand, are somehow harder to obtain.

Avoid repeated parsing of module flags. (I mean no duplicated and scattered codes)
Provide correct and immutable indication of whether Zicfilp CFI is enabled since the very beginning of the backend pipeline.

-> By enforcing the consistency between the feature bits and module flags when creating RISCVSubtargets, these 2 goals could be achieved.

The problem arose when I was trying to handle the module-wide application of the Zicfilp CFI module flags: In TargetMachine, there's also a feature bitmap contained in the STI field (typed MCSubtargetInfo). This TM.STI does not belong to any Function, and is created without the influence of a specific Module instance, so I cannot sync (enforce) the information of module flags into it at creation, thus the above goal is somehow, broken.

In my experience with LLVM, I always assumed that TM.STI->hasFeature(XXX) represents whether a feature is enabled for the given Module we're compiling. However, the inability to sync module flags to TM.STI triggered a challenge to this assumption: Could a TargetMachine instance be reused to compile multiple Modules, and TM.STI actually represents what the target machine is capable of, not how it behaves? If this is indeed the correct understanding of TargetMachine, then it would be natural that no module flags could be synced to TM.STI, since module flags controls the Module's behavior, but not necessarily the TargetMachine's, and different Modules compiled with the same TargetMachine instance could exhibit different behaviors. It would further declare that my goal of "Avoid repeated parsing of module flags" is not suitable to implement with TM.STI, since it does not convey Module behaviors. (and thus parsing module flags when implementing module-wide behavior is preferred)

So in a nutshell, I want to ask:

Whether modeling per-Function behavior with RISCVSubtarget is suitable?
Whether modeling per-Module behavior with TM.STI is not suitable, since TM.STI is never intended to be in sync with and serve only one specific Module?

(By the way, I considered recommending this approach to Zicfiss in #152251, but given the uncertainty I'm currently facing, I think it's better that I get approval for this approach beforehand.)

mylai-mtk requested review from jaidTw and kito-cheng August 5, 2025 11:06

llvmbot added backend:RISC-V llvm:support labels Aug 5, 2025

topperc reviewed Aug 5, 2025

View reviewed changes

llvm/include/llvm/Support/RISCVISAUtils.h Outdated Show resolved Hide resolved

topperc reviewed Aug 5, 2025

View reviewed changes

mylai-mtk added 7 commits August 7, 2025 15:10

fixup: Change SelectionDAG patterns to depend on Zicfilp CFI feature …

1f761cd

…adoption instead of Zicfilp extension availability

fixup: Codegen software-guarded long calls based on Zicfilp CFI featu…

968e250

…re adoption instead of Zicfilp extension availability

fixup: Codegen software-guarded jumps for jump tables based on Zicfil…

8f601dd

…p CFI feature adoption instead of Zicfilp extension availability

fixup: Fix codegen of nested functions

876c7ca

fixup: Use RISCVSubtarget::ZicfilpLabelScheme in RISCVIndirectBranchT…

32ee08d

…racking pass

fixup: Codegen landing pad setup insns based on cf flags

fc0bfdf

fixup: Fix misc Zicfilp-based tests to base on cf flags

c8a374d

mylai-mtk commented Aug 7, 2025

View reviewed changes

llvm/lib/Target/RISCV/RISCVSubtarget.h Outdated Show resolved Hide resolved

mylai-mtk changed the title ~~[RISCV][Zicfilp] Codegen LPAD insns by looking at module flags~~ [RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags Aug 7, 2025

mylai-mtk added 2 commits August 8, 2025 11:07

fixup: Misc improvements

3b92223

fixup: Add LTO tests

cb9940d

fixup: Model ZICFILP-func-sig and ZICFILP-unlabeled as real features

f9f6bba

mylai-mtk force-pushed the zicfilp-codegen-by-cf-flag branch from e0044a7 to f9f6bba Compare August 21, 2025 09:03

fixup: Enforce module flag cf-protection-branch != 0 <-> Subtarget.ha…

ef32dbc

…sZicfilpCFI() even if new Subtarget is not created

lenary reviewed Aug 22, 2025

View reviewed changes

mylai-mtk added 2 commits August 25, 2025 17:11

fixup: Rename HasNoZicfilpCFI to NoZicfilpCFI

20db445

fixup: Fix utils/update_llc_test_checks.py annotations

c583a62

mylai-mtk added 2 commits August 27, 2025 11:41

fixup: Implement +zicfilp-func-sig and +zicfilp-unlabeled using enums

303f8a7

fixup: factor out reportBadZicfilpUsage

314fef8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags #152121

[RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags #152121

Uh oh!

mylai-mtk commented Aug 5, 2025 •

edited

Loading

Uh oh!

llvmbot commented Aug 5, 2025 •

edited

Loading

Uh oh!

Uh oh!

topperc Aug 5, 2025

Uh oh!

mylai-mtk Aug 7, 2025

Uh oh!

Uh oh!

topperc commented Aug 5, 2025

Uh oh!

mylai-mtk commented Aug 7, 2025

Uh oh!

Uh oh!

lenary commented Aug 7, 2025

Uh oh!

mylai-mtk commented Aug 8, 2025

Uh oh!

mylai-mtk commented Aug 21, 2025

Uh oh!

lenary left a comment

Uh oh!

lenary Aug 22, 2025

Uh oh!

mylai-mtk Aug 25, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

mylai-mtk commented Aug 25, 2025

Uh oh!

Uh oh!

[RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags #152121

Are you sure you want to change the base?

[RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags #152121

Uh oh!

Conversation

mylai-mtk commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

topperc Aug 5, 2025

Choose a reason for hiding this comment

Uh oh!

mylai-mtk Aug 7, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

topperc commented Aug 5, 2025

Uh oh!

mylai-mtk commented Aug 7, 2025

Uh oh!

Uh oh!

lenary commented Aug 7, 2025

Uh oh!

mylai-mtk commented Aug 8, 2025

Uh oh!

mylai-mtk commented Aug 21, 2025

Uh oh!

lenary left a comment

Choose a reason for hiding this comment

Uh oh!

lenary Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

mylai-mtk Aug 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mylai-mtk commented Aug 25, 2025

Uh oh!

Uh oh!

mylai-mtk commented Aug 5, 2025 •

edited

Loading

llvmbot commented Aug 5, 2025 •

edited

Loading

mylai-mtk Aug 25, 2025 •

edited

Loading