svs-quic
Contributor

QC.E.LI is used to materialise addresses only in the small code model and only when Xqcili is enabled, where addresses would otherwise use LUI/ADDI. Other code models need to use pc-relative addressing and are unchanged. This patch applies the change to global addresses, block addresses, constant pools and jump tables.

Overall, this gives a better code size saving, because a single QC.E.LI is easier to relax (to QC.LI, LI, etc.) than a LUI/ADDI pair, especially when the LUI is shared between several uses or the pair has been split apart. QC.E.LI has the RISCV_QC_E_32 local relocation attached to it.
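
As a rough illustration of the selection rule described above, here is a minimal standalone sketch of the non-PIC choice. The enum and function names are hypothetical and are not LLVM's API; the real logic lives in `RISCVTargetLowering::getAddr` in the diff below, and only the small and medium code models are modelled here.

```cpp
// Hypothetical, simplified model of the non-PIC address-lowering choice.
// None of these names exist in LLVM; they only mirror the rule in the text.
enum class CodeModel { Small, Medium };
enum class AddrSeq {
  QcELi,     // qc.e.li rd, sym
  LuiAddi,   // lui rd, %hi(sym); addi rd, rd, %lo(sym)
  AuipcAddi  // auipc rd, %pcrel_hi(sym); addi rd, rd, %pcrel_lo(label)
};

constexpr AddrSeq addressSequence(CodeModel CM, bool HasXqcili) {
  // Only the small code model uses absolute addressing, so it is the only
  // place where QC.E.LI can replace LUI/ADDI. The medium code model needs
  // pc-relative addressing whether or not Xqcili is enabled.
  return CM == CodeModel::Small
             ? (HasXqcili ? AddrSeq::QcELi : AddrSeq::LuiAddi)
             : AddrSeq::AuipcAddi;
}
```

For example, `addressSequence(CodeModel::Small, true)` yields `AddrSeq::QcELi`, while the medium code model always yields `AddrSeq::AuipcAddi`.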

@llvmbot
Member

llvmbot commented Aug 28, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Sudharsan Veeravalli (svs-quic)


Full diff: https://github.com/llvm/llvm-project/pull/155819.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+9-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td (+9)
  • (modified) llvm/test/CodeGen/RISCV/codemodel-lowering.ll (+70)
  • (modified) llvm/test/CodeGen/RISCV/jumptable.ll (+81)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 4c39bcf8494a4..59e66ce140f9c 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -8822,7 +8822,15 @@ SDValue RISCVTargetLowering::getAddr(NodeTy *N, SelectionDAG &DAG,
     reportFatalUsageError("Unsupported code model for lowering");
   case CodeModel::Small: {
     // Generate a sequence for accessing addresses within the first 2 GiB of
-    // address space. This generates the pattern (addi (lui %hi(sym)) %lo(sym)).
+    // address space.
+    if (Subtarget.hasVendorXqcili()) {
+      // Use QC.E.LI to generate the address, as this is easier to relax than
+      // LUI/ADDI.
+      SDValue Addr = getTargetNode(N, DL, Ty, DAG, 0);
+      return DAG.getNode(RISCVISD::QC_E_LI, DL, Ty, Addr);
+    }
+
+    // This generates the pattern (addi (lui %hi(sym)) %lo(sym)).
     SDValue AddrHi = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_HI);
     SDValue AddrLo = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_LO);
     SDValue MNHi = DAG.getNode(RISCVISD::HI, DL, Ty, AddrHi);
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
index 2c64b0c220fba..cd021b1c37a30 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
@@ -22,6 +22,8 @@ def SDT_SetMultiple : SDTypeProfile<0, 4, [SDTCisSameAs<0, 1>,
 def qc_setwmi : RVSDNode<"QC_SETWMI", SDT_SetMultiple,
                          [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 
+def qc_e_li : RVSDNode<"QC_E_LI", SDTIntUnaryOp>;
+
 def uimm5nonzero : RISCVOp<XLenVT>,
                    ImmLeaf<XLenVT, [{return (Imm != 0) && isUInt<5>(Imm);}]> {
   let ParserMatchClass = UImmAsmOperand<5, "NonZero">;
@@ -1605,6 +1607,13 @@ def : Pat<(qc_setwmi GPR:$rs3, GPR:$rs1, tuimm5nonzero:$uimm5, tuimm7_lsb00:$uim
           (QC_SETWMI GPR:$rs3, GPR:$rs1, tuimm5nonzero:$uimm5, tuimm7_lsb00:$uimm7)>;
 } // Predicates = [HasVendorXqcilsm, IsRV32]
 
+let Predicates = [HasVendorXqcili, IsRV32] in {
+def: Pat<(qc_e_li tglobaladdr:$A),   (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tblockaddress:$A),   (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tjumptable:$A),   (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tconstpool:$A),   (QC_E_LI bare_simm32:$A)>;
+} // Predicates = [HasVendorXqcili, IsRV32]
+
 //===----------------------------------------------------------------------===/i
 // Compress Instruction tablegen backend.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/codemodel-lowering.ll b/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
index 086c3ac181521..94f8d7cab9b95 100644
--- a/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
+++ b/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=small -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=RV32I-SMALL,RV32F-SMALL
+; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV32IXQCILI-SMALL,RV32FXQCILI-SMALL
 ; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=medium -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=RV32I-MEDIUM,RV32F-MEDIUM
 ; RUN: llc -mtriple=riscv64 -mattr=+f,+zfh -target-abi=lp64f -code-model=small -verify-machineinstrs < %s \
@@ -11,6 +13,8 @@
 ; RUN:   | FileCheck %s -check-prefixes=RV64I-LARGE,RV64F-LARGE
 ; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=small -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=RV32I-SMALL,RV32FINX-SMALL
+; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV32IXQCILI-SMALL,RV32FINXXQCILI-SMALL
 ; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=medium -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=RV32I-MEDIUM,RV32FINX-MEDIUM
 ; RUN: llc -mtriple=riscv64 -mattr=+zfinx,+zhinx -target-abi=lp64 -code-model=small -verify-machineinstrs < %s \
@@ -30,6 +34,12 @@ define i32 @lower_global(i32 %a) nounwind {
 ; RV32I-SMALL-NEXT:    lw a0, %lo(G)(a0)
 ; RV32I-SMALL-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-LABEL: lower_global:
+; RV32IXQCILI-SMALL:       # %bb.0:
+; RV32IXQCILI-SMALL-NEXT:    qc.e.li a0, G
+; RV32IXQCILI-SMALL-NEXT:    lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT:    ret
+;
 ; RV32I-MEDIUM-LABEL: lower_global:
 ; RV32I-MEDIUM:       # %bb.0:
 ; RV32I-MEDIUM-NEXT:  .Lpcrel_hi0:
@@ -73,6 +83,13 @@ define void @lower_blockaddress() nounwind {
 ; RV32I-SMALL-NEXT:    sw a1, %lo(addr)(a0)
 ; RV32I-SMALL-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-LABEL: lower_blockaddress:
+; RV32IXQCILI-SMALL:       # %bb.0:
+; RV32IXQCILI-SMALL-NEXT:    qc.e.li a0, addr
+; RV32IXQCILI-SMALL-NEXT:    li a1, 1
+; RV32IXQCILI-SMALL-NEXT:    sw a1, 0(a0)
+; RV32IXQCILI-SMALL-NEXT:    ret
+;
 ; RV32I-MEDIUM-LABEL: lower_blockaddress:
 ; RV32I-MEDIUM:       # %bb.0:
 ; RV32I-MEDIUM-NEXT:  .Lpcrel_hi1:
@@ -135,6 +152,26 @@ define signext i32 @lower_blockaddress_displ(i32 signext %w) nounwind {
 ; RV32I-SMALL-NEXT:    addi sp, sp, 16
 ; RV32I-SMALL-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-LABEL: lower_blockaddress_displ:
+; RV32IXQCILI-SMALL:       # %bb.0: # %entry
+; RV32IXQCILI-SMALL-NEXT:    addi sp, sp, -16
+; RV32IXQCILI-SMALL-NEXT:    qc.e.li a1, .Ltmp0
+; RV32IXQCILI-SMALL-NEXT:    li a2, 101
+; RV32IXQCILI-SMALL-NEXT:    sw a1, 8(sp)
+; RV32IXQCILI-SMALL-NEXT:    blt a0, a2, .LBB2_3
+; RV32IXQCILI-SMALL-NEXT:  # %bb.1: # %if.then
+; RV32IXQCILI-SMALL-NEXT:    lw a0, 8(sp)
+; RV32IXQCILI-SMALL-NEXT:    jr a0
+; RV32IXQCILI-SMALL-NEXT:  .Ltmp0: # Block address taken
+; RV32IXQCILI-SMALL-NEXT:  .LBB2_2: # %return
+; RV32IXQCILI-SMALL-NEXT:    li a0, 4
+; RV32IXQCILI-SMALL-NEXT:    addi sp, sp, 16
+; RV32IXQCILI-SMALL-NEXT:    ret
+; RV32IXQCILI-SMALL-NEXT:  .LBB2_3: # %return.clone
+; RV32IXQCILI-SMALL-NEXT:    li a0, 3
+; RV32IXQCILI-SMALL-NEXT:    addi sp, sp, 16
+; RV32IXQCILI-SMALL-NEXT:    ret
+;
 ; RV32I-MEDIUM-LABEL: lower_blockaddress_displ:
 ; RV32I-MEDIUM:       # %bb.0: # %entry
 ; RV32I-MEDIUM-NEXT:    addi sp, sp, -16
@@ -255,6 +292,13 @@ define float @lower_constantpool(float %a) nounwind {
 ; RV32F-SMALL-NEXT:    fadd.s fa0, fa0, fa5
 ; RV32F-SMALL-NEXT:    ret
 ;
+; RV32FXQCILI-SMALL-LABEL: lower_constantpool:
+; RV32FXQCILI-SMALL:       # %bb.0:
+; RV32FXQCILI-SMALL-NEXT:    qc.e.li a0, 1065355264
+; RV32FXQCILI-SMALL-NEXT:    fmv.w.x fa5, a0
+; RV32FXQCILI-SMALL-NEXT:    fadd.s fa0, fa0, fa5
+; RV32FXQCILI-SMALL-NEXT:    ret
+;
 ; RV32F-MEDIUM-LABEL: lower_constantpool:
 ; RV32F-MEDIUM:       # %bb.0:
 ; RV32F-MEDIUM-NEXT:  .Lpcrel_hi3:
@@ -293,6 +337,12 @@ define float @lower_constantpool(float %a) nounwind {
 ; RV32FINX-SMALL-NEXT:    fadd.s a0, a0, a1
 ; RV32FINX-SMALL-NEXT:    ret
 ;
+; RV32FINXXQCILI-SMALL-LABEL: lower_constantpool:
+; RV32FINXXQCILI-SMALL:       # %bb.0:
+; RV32FINXXQCILI-SMALL-NEXT:    qc.e.li a1, 1065355264
+; RV32FINXXQCILI-SMALL-NEXT:    fadd.s a0, a0, a1
+; RV32FINXXQCILI-SMALL-NEXT:    ret
+;
 ; RV32FINX-MEDIUM-LABEL: lower_constantpool:
 ; RV32FINX-MEDIUM:       # %bb.0:
 ; RV32FINX-MEDIUM-NEXT:    lui a1, 260097
@@ -334,6 +384,12 @@ define i32 @lower_extern_weak(i32 %a) nounwind {
 ; RV32I-SMALL-NEXT:    lw a0, %lo(W)(a0)
 ; RV32I-SMALL-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-LABEL: lower_extern_weak:
+; RV32IXQCILI-SMALL:       # %bb.0:
+; RV32IXQCILI-SMALL-NEXT:    qc.e.li a0, W
+; RV32IXQCILI-SMALL-NEXT:    lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT:    ret
+;
 ; RV32F-MEDIUM-LABEL: lower_extern_weak:
 ; RV32F-MEDIUM:       # %bb.0:
 ; RV32F-MEDIUM-NEXT:  .Lpcrel_hi4:
@@ -401,6 +457,13 @@ define half @lower_global_half(half %a) nounwind {
 ; RV32F-SMALL-NEXT:    fadd.h fa0, fa0, fa5
 ; RV32F-SMALL-NEXT:    ret
 ;
+; RV32FXQCILI-SMALL-LABEL: lower_global_half:
+; RV32FXQCILI-SMALL:       # %bb.0:
+; RV32FXQCILI-SMALL-NEXT:    qc.e.li a0, X
+; RV32FXQCILI-SMALL-NEXT:    flh fa5, 0(a0)
+; RV32FXQCILI-SMALL-NEXT:    fadd.h fa0, fa0, fa5
+; RV32FXQCILI-SMALL-NEXT:    ret
+;
 ; RV32F-MEDIUM-LABEL: lower_global_half:
 ; RV32F-MEDIUM:       # %bb.0:
 ; RV32F-MEDIUM-NEXT:  .Lpcrel_hi5:
@@ -440,6 +503,13 @@ define half @lower_global_half(half %a) nounwind {
 ; RV32FINX-SMALL-NEXT:    fadd.h a0, a0, a1
 ; RV32FINX-SMALL-NEXT:    ret
 ;
+; RV32FINXXQCILI-SMALL-LABEL: lower_global_half:
+; RV32FINXXQCILI-SMALL:       # %bb.0:
+; RV32FINXXQCILI-SMALL-NEXT:    qc.e.li a1, X
+; RV32FINXXQCILI-SMALL-NEXT:    lh a1, 0(a1)
+; RV32FINXXQCILI-SMALL-NEXT:    fadd.h a0, a0, a1
+; RV32FINXXQCILI-SMALL-NEXT:    ret
+;
 ; RV32FINX-MEDIUM-LABEL: lower_global_half:
 ; RV32FINX-MEDIUM:       # %bb.0:
 ; RV32FINX-MEDIUM-NEXT:  .Lpcrel_hi4:
diff --git a/llvm/test/CodeGen/RISCV/jumptable.ll b/llvm/test/CodeGen/RISCV/jumptable.ll
index 8584579b81384..a838d54ad5e9b 100644
--- a/llvm/test/CodeGen/RISCV/jumptable.ll
+++ b/llvm/test/CodeGen/RISCV/jumptable.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32I-SMALL
+; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili  < %s \
+; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32IXQCILI-SMALL
 ; RUN: llc -mtriple=riscv32 -code-model=medium -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32I-MEDIUM
 ; RUN: llc -mtriple=riscv32 -relocation-model=pic -verify-machineinstrs < %s \
@@ -13,6 +15,8 @@
 ; RUN:   | FileCheck %s -check-prefixes=CHECK,RV64I-PIC
 ; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
 ; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32I-SMALL-7-ENTRIES
+; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -riscv-min-jump-table-entries=7 -mattr=+experimental-xqcili < %s \
+; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32IXQCILI-SMALL-7-ENTRIES
 ; RUN: llc -mtriple=riscv32 -code-model=medium -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
 ; RUN:   | FileCheck %s -check-prefixes=CHECK,RV32I-MEDIUM-7-ENTRIES
 ; RUN: llc -mtriple=riscv32 -relocation-model=pic -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
@@ -114,6 +118,39 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
 ; RV32I-SMALL-NEXT:  .LBB1_9: # %exit
 ; RV32I-SMALL-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-LABEL: above_threshold:
+; RV32IXQCILI-SMALL:       # %bb.0: # %entry
+; RV32IXQCILI-SMALL-NEXT:    addi a0, a0, -1
+; RV32IXQCILI-SMALL-NEXT:    li a2, 5
+; RV32IXQCILI-SMALL-NEXT:    bltu a2, a0, .LBB1_9
+; RV32IXQCILI-SMALL-NEXT:  # %bb.1: # %entry
+; RV32IXQCILI-SMALL-NEXT:    slli a0, a0, 2
+; RV32IXQCILI-SMALL-NEXT:    qc.e.li a2, .LJTI1_0
+; RV32IXQCILI-SMALL-NEXT:    add a0, a0, a2
+; RV32IXQCILI-SMALL-NEXT:    lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT:    jr a0
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_2: # %bb1
+; RV32IXQCILI-SMALL-NEXT:    li a0, 4
+; RV32IXQCILI-SMALL-NEXT:    j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_3: # %bb5
+; RV32IXQCILI-SMALL-NEXT:    li a0, 100
+; RV32IXQCILI-SMALL-NEXT:    j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_4: # %bb3
+; RV32IXQCILI-SMALL-NEXT:    li a0, 2
+; RV32IXQCILI-SMALL-NEXT:    j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_5: # %bb4
+; RV32IXQCILI-SMALL-NEXT:    li a0, 1
+; RV32IXQCILI-SMALL-NEXT:    j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_6: # %bb2
+; RV32IXQCILI-SMALL-NEXT:    li a0, 3
+; RV32IXQCILI-SMALL-NEXT:    j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_7: # %bb6
+; RV32IXQCILI-SMALL-NEXT:    li a0, 200
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_8: # %exit
+; RV32IXQCILI-SMALL-NEXT:    sw a0, 0(a1)
+; RV32IXQCILI-SMALL-NEXT:  .LBB1_9: # %exit
+; RV32IXQCILI-SMALL-NEXT:    ret
+;
 ; RV32I-MEDIUM-LABEL: above_threshold:
 ; RV32I-MEDIUM:       # %bb.0: # %entry
 ; RV32I-MEDIUM-NEXT:    addi a0, a0, -1
@@ -334,6 +371,50 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
 ; RV32I-SMALL-7-ENTRIES-NEXT:  .LBB1_14: # %exit
 ; RV32I-SMALL-7-ENTRIES-NEXT:    ret
 ;
+; RV32IXQCILI-SMALL-7-ENTRIES-LABEL: above_threshold:
+; RV32IXQCILI-SMALL-7-ENTRIES:       # %bb.0: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    blt a2, a0, .LBB1_5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.1: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    beq a0, a2, .LBB1_9
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.2: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    beq a0, a2, .LBB1_11
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.3: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    bne a0, a2, .LBB1_14
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.4: # %bb3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_5: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    beq a0, a2, .LBB1_10
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.6: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    beq a0, a2, .LBB1_12
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.7: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a2, 6
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    bne a0, a2, .LBB1_14
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  # %bb.8: # %bb6
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 200
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_9: # %bb1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_10: # %bb4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_11: # %bb2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_12: # %bb5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    li a0, 100
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_13: # %exit
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    sw a0, 0(a1)
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:  .LBB1_14: # %exit
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT:    ret
+;
 ; RV32I-MEDIUM-7-ENTRIES-LABEL: above_threshold:
 ; RV32I-MEDIUM-7-ENTRIES:       # %bb.0: # %entry
 ; RV32I-MEDIUM-7-ENTRIES-NEXT:    li a2, 3

svs-quic requested review from asb, lenary, topperc and pgodeq on August 28, 2025 14:04
@lenary
Member

lenary commented Aug 28, 2025

@topperc it would be good to understand your feelings about this (even without a full review). I've been wary of changing the materialisation sequences so far, but this gets us a measurable code size improvement - customer images are seeing a 4-8% improvement (sorry, we were comparing over a few commits, rather than just this one).

These numbers are using our linker which has relaxations for qc.e.li (to qc.li, c.lui, or a gp-relative addi) - but not relaxations that turn lui+addi into qc.e.li or something smaller (which end up needing lots of constraints, including what effectively is a one-use check on the lui).
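
To make the relaxation targets mentioned above concrete, here is a hedged sketch of how a linker might pick a replacement for `qc.e.li` once the symbol's address is known. This is not the actual linker's code: the names are hypothetical, the immediate ranges (signed 20-bit for `qc.li`, signed 12-bit for `addi`, a non-zero 4 KiB-aligned value within signed 18 bits for `c.lui`) are assumptions taken from the instruction encodings, register constraints on `c.lui` are ignored, and the preference between the two 4-byte forms is arbitrary.

```cpp
#include <cstdint>

// Hypothetical relaxation targets for a resolved `qc.e.li rd, sym`.
enum class Relaxed { QcELi, QcLi, CLui, GpAddi };

// True if V is representable as a signed immediate of the given width.
constexpr bool fitsSigned(int64_t V, unsigned Bits) {
  return V >= -(int64_t(1) << (Bits - 1)) && V < (int64_t(1) << (Bits - 1));
}

constexpr Relaxed relaxQcELi(int64_t Addr, int64_t GP) {
  // 2 bytes: c.lui needs a non-zero value with the low 12 bits clear.
  if (Addr != 0 && Addr % 4096 == 0 && fitsSigned(Addr, 18))
    return Relaxed::CLui;
  // 4 bytes: qc.li with a signed 20-bit immediate.
  if (fitsSigned(Addr, 20))
    return Relaxed::QcLi;
  // 4 bytes: addi rd, gp, offset when the symbol is close to GP.
  if (fitsSigned(Addr - GP, 12))
    return Relaxed::GpAddi;
  // Otherwise keep the 6-byte qc.e.li.
  return Relaxed::QcELi;
}
```

Each decision depends only on the one instruction and the resolved address, which is the contrast with relaxing a lui+addi pair, where the linker would also have to reason about every other user of the `lui` result.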

@topperc
Collaborator

topperc commented Aug 28, 2025

This patch looks like it would lose out on folding the ADDI into load/store instructions. Is there another patch for that or do you get code size savings without it?

@lenary
Member

lenary commented Aug 28, 2025

We get code size savings with exactly this patch, with no changes to MergeBaseOffset. My hypothesis is that, before relaxation, you get many more loads/stores with zero offsets, and these compress. Previously the loads/stores carried a symbol reference, so the symbol had to be close to GP or zero to get any savings after relaxation. The images then fit within 20 bits, so you mostly end up with (for example) qc.li; c.lw (6 bytes) where you would otherwise have got lui; lw (8 bytes).
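
The byte counts in the hypothesis above can be checked with a little arithmetic. The sketch below assumes the usual encoding lengths (4 bytes for `lui`, `lw` and `qc.li`, 2 bytes for `c.lw`, 6 bytes for `qc.e.li`) and assumes the load uses registers and a zero offset that `c.lw` can encode; the constant names are illustrative only.

```cpp
// Assumed instruction lengths in bytes.
constexpr unsigned LuiBytes = 4, LwBytes = 4, CLwBytes = 2;
constexpr unsigned QcLiBytes = 4, QcELiBytes = 6;

// Baseline: lui a0, %hi(G); lw a0, %lo(G)(a0). The %lo on the load keeps
// it from being compressed.
constexpr unsigned BaselineBytes = LuiBytes + LwBytes;

// This patch before relaxation: qc.e.li a0, G; c.lw a0, 0(a0).
constexpr unsigned UnrelaxedBytes = QcELiBytes + CLwBytes;

// After the linker relaxes qc.e.li to qc.li (address fits in 20 bits).
constexpr unsigned RelaxedBytes = QcLiBytes + CLwBytes;

// Saving per global access relative to the baseline, in percent.
constexpr unsigned SavingPercent =
    (BaselineBytes - RelaxedBytes) * 100 / BaselineBytes;

static_assert(BaselineBytes == 8 && UnrelaxedBytes == 8 && RelaxedBytes == 6,
              "sizes quoted in the comment above");
```

Under these assumptions the unrelaxed sequence is no larger than the baseline, and each relaxed access saves 2 of 8 bytes. That per-access figure is not comparable to the 4-8% image-level number, which covers whole images and several commits.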
