[RISCV] Use QC_E_LI to materialise addresses #155819
This is only used in the small code model and when Xqcili is enabled, where addresses would otherwise use LUI/ADDI. Other code models need to use pc-relative addressing. Overall, this gives a better code size saving as QC.E.LI is easier to relax than LUI/ADDI (especially when the LUI/ADDI might have sharing, or be split apart).
@llvm/pr-subscribers-backend-risc-v
Author: Sudharsan Veeravalli (svs-quic)
Full diff: https://github.com/llvm/llvm-project/pull/155819.diff
4 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 4c39bcf8494a4..59e66ce140f9c 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -8822,7 +8822,15 @@ SDValue RISCVTargetLowering::getAddr(NodeTy *N, SelectionDAG &DAG,
reportFatalUsageError("Unsupported code model for lowering");
case CodeModel::Small: {
// Generate a sequence for accessing addresses within the first 2 GiB of
- // address space. This generates the pattern (addi (lui %hi(sym)) %lo(sym)).
+ // address space.
+ if (Subtarget.hasVendorXqcili()) {
+ // Use QC.E.LI to generate the address, as this is easier to relax than
+ // LUI/ADDI.
+ SDValue Addr = getTargetNode(N, DL, Ty, DAG, 0);
+ return DAG.getNode(RISCVISD::QC_E_LI, DL, Ty, Addr);
+ }
+
+ // This generates the pattern (addi (lui %hi(sym)) %lo(sym)).
SDValue AddrHi = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_HI);
SDValue AddrLo = getTargetNode(N, DL, Ty, DAG, RISCVII::MO_LO);
SDValue MNHi = DAG.getNode(RISCVISD::HI, DL, Ty, AddrHi);
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
index 2c64b0c220fba..cd021b1c37a30 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
@@ -22,6 +22,8 @@ def SDT_SetMultiple : SDTypeProfile<0, 4, [SDTCisSameAs<0, 1>,
def qc_setwmi : RVSDNode<"QC_SETWMI", SDT_SetMultiple,
[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
+def qc_e_li : RVSDNode<"QC_E_LI", SDTIntUnaryOp>;
+
def uimm5nonzero : RISCVOp<XLenVT>,
ImmLeaf<XLenVT, [{return (Imm != 0) && isUInt<5>(Imm);}]> {
let ParserMatchClass = UImmAsmOperand<5, "NonZero">;
@@ -1605,6 +1607,13 @@ def : Pat<(qc_setwmi GPR:$rs3, GPR:$rs1, tuimm5nonzero:$uimm5, tuimm7_lsb00:$uim
(QC_SETWMI GPR:$rs3, GPR:$rs1, tuimm5nonzero:$uimm5, tuimm7_lsb00:$uimm7)>;
} // Predicates = [HasVendorXqcilsm, IsRV32]
+let Predicates = [HasVendorXqcili, IsRV32] in {
+def: Pat<(qc_e_li tglobaladdr:$A), (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tblockaddress:$A), (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tjumptable:$A), (QC_E_LI bare_simm32:$A)>;
+def: Pat<(qc_e_li tconstpool:$A), (QC_E_LI bare_simm32:$A)>;
+} // Predicates = [HasVendorXqcili, IsRV32]
+
//===----------------------------------------------------------------------===/i
// Compress Instruction tablegen backend.
//===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/codemodel-lowering.ll b/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
index 086c3ac181521..94f8d7cab9b95 100644
--- a/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
+++ b/llvm/test/CodeGen/RISCV/codemodel-lowering.ll
@@ -1,6 +1,8 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=small -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=RV32I-SMALL,RV32F-SMALL
+; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili < %s \
+; RUN: | FileCheck %s -check-prefixes=RV32IXQCILI-SMALL,RV32FXQCILI-SMALL
; RUN: llc -mtriple=riscv32 -mattr=+f,+zfh -target-abi=ilp32f -code-model=medium -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=RV32I-MEDIUM,RV32F-MEDIUM
; RUN: llc -mtriple=riscv64 -mattr=+f,+zfh -target-abi=lp64f -code-model=small -verify-machineinstrs < %s \
@@ -11,6 +13,8 @@
; RUN: | FileCheck %s -check-prefixes=RV64I-LARGE,RV64F-LARGE
; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=small -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=RV32I-SMALL,RV32FINX-SMALL
+; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili < %s \
+; RUN: | FileCheck %s -check-prefixes=RV32IXQCILI-SMALL,RV32FINXXQCILI-SMALL
; RUN: llc -mtriple=riscv32 -mattr=+zfinx,+zhinx -target-abi=ilp32 -code-model=medium -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=RV32I-MEDIUM,RV32FINX-MEDIUM
; RUN: llc -mtriple=riscv64 -mattr=+zfinx,+zhinx -target-abi=lp64 -code-model=small -verify-machineinstrs < %s \
@@ -30,6 +34,12 @@ define i32 @lower_global(i32 %a) nounwind {
; RV32I-SMALL-NEXT: lw a0, %lo(G)(a0)
; RV32I-SMALL-NEXT: ret
;
+; RV32IXQCILI-SMALL-LABEL: lower_global:
+; RV32IXQCILI-SMALL: # %bb.0:
+; RV32IXQCILI-SMALL-NEXT: qc.e.li a0, G
+; RV32IXQCILI-SMALL-NEXT: lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT: ret
+;
; RV32I-MEDIUM-LABEL: lower_global:
; RV32I-MEDIUM: # %bb.0:
; RV32I-MEDIUM-NEXT: .Lpcrel_hi0:
@@ -73,6 +83,13 @@ define void @lower_blockaddress() nounwind {
; RV32I-SMALL-NEXT: sw a1, %lo(addr)(a0)
; RV32I-SMALL-NEXT: ret
;
+; RV32IXQCILI-SMALL-LABEL: lower_blockaddress:
+; RV32IXQCILI-SMALL: # %bb.0:
+; RV32IXQCILI-SMALL-NEXT: qc.e.li a0, addr
+; RV32IXQCILI-SMALL-NEXT: li a1, 1
+; RV32IXQCILI-SMALL-NEXT: sw a1, 0(a0)
+; RV32IXQCILI-SMALL-NEXT: ret
+;
; RV32I-MEDIUM-LABEL: lower_blockaddress:
; RV32I-MEDIUM: # %bb.0:
; RV32I-MEDIUM-NEXT: .Lpcrel_hi1:
@@ -135,6 +152,26 @@ define signext i32 @lower_blockaddress_displ(i32 signext %w) nounwind {
; RV32I-SMALL-NEXT: addi sp, sp, 16
; RV32I-SMALL-NEXT: ret
;
+; RV32IXQCILI-SMALL-LABEL: lower_blockaddress_displ:
+; RV32IXQCILI-SMALL: # %bb.0: # %entry
+; RV32IXQCILI-SMALL-NEXT: addi sp, sp, -16
+; RV32IXQCILI-SMALL-NEXT: qc.e.li a1, .Ltmp0
+; RV32IXQCILI-SMALL-NEXT: li a2, 101
+; RV32IXQCILI-SMALL-NEXT: sw a1, 8(sp)
+; RV32IXQCILI-SMALL-NEXT: blt a0, a2, .LBB2_3
+; RV32IXQCILI-SMALL-NEXT: # %bb.1: # %if.then
+; RV32IXQCILI-SMALL-NEXT: lw a0, 8(sp)
+; RV32IXQCILI-SMALL-NEXT: jr a0
+; RV32IXQCILI-SMALL-NEXT: .Ltmp0: # Block address taken
+; RV32IXQCILI-SMALL-NEXT: .LBB2_2: # %return
+; RV32IXQCILI-SMALL-NEXT: li a0, 4
+; RV32IXQCILI-SMALL-NEXT: addi sp, sp, 16
+; RV32IXQCILI-SMALL-NEXT: ret
+; RV32IXQCILI-SMALL-NEXT: .LBB2_3: # %return.clone
+; RV32IXQCILI-SMALL-NEXT: li a0, 3
+; RV32IXQCILI-SMALL-NEXT: addi sp, sp, 16
+; RV32IXQCILI-SMALL-NEXT: ret
+;
; RV32I-MEDIUM-LABEL: lower_blockaddress_displ:
; RV32I-MEDIUM: # %bb.0: # %entry
; RV32I-MEDIUM-NEXT: addi sp, sp, -16
@@ -255,6 +292,13 @@ define float @lower_constantpool(float %a) nounwind {
; RV32F-SMALL-NEXT: fadd.s fa0, fa0, fa5
; RV32F-SMALL-NEXT: ret
;
+; RV32FXQCILI-SMALL-LABEL: lower_constantpool:
+; RV32FXQCILI-SMALL: # %bb.0:
+; RV32FXQCILI-SMALL-NEXT: qc.e.li a0, 1065355264
+; RV32FXQCILI-SMALL-NEXT: fmv.w.x fa5, a0
+; RV32FXQCILI-SMALL-NEXT: fadd.s fa0, fa0, fa5
+; RV32FXQCILI-SMALL-NEXT: ret
+;
; RV32F-MEDIUM-LABEL: lower_constantpool:
; RV32F-MEDIUM: # %bb.0:
; RV32F-MEDIUM-NEXT: .Lpcrel_hi3:
@@ -293,6 +337,12 @@ define float @lower_constantpool(float %a) nounwind {
; RV32FINX-SMALL-NEXT: fadd.s a0, a0, a1
; RV32FINX-SMALL-NEXT: ret
;
+; RV32FINXXQCILI-SMALL-LABEL: lower_constantpool:
+; RV32FINXXQCILI-SMALL: # %bb.0:
+; RV32FINXXQCILI-SMALL-NEXT: qc.e.li a1, 1065355264
+; RV32FINXXQCILI-SMALL-NEXT: fadd.s a0, a0, a1
+; RV32FINXXQCILI-SMALL-NEXT: ret
+;
; RV32FINX-MEDIUM-LABEL: lower_constantpool:
; RV32FINX-MEDIUM: # %bb.0:
; RV32FINX-MEDIUM-NEXT: lui a1, 260097
@@ -334,6 +384,12 @@ define i32 @lower_extern_weak(i32 %a) nounwind {
; RV32I-SMALL-NEXT: lw a0, %lo(W)(a0)
; RV32I-SMALL-NEXT: ret
;
+; RV32IXQCILI-SMALL-LABEL: lower_extern_weak:
+; RV32IXQCILI-SMALL: # %bb.0:
+; RV32IXQCILI-SMALL-NEXT: qc.e.li a0, W
+; RV32IXQCILI-SMALL-NEXT: lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT: ret
+;
; RV32F-MEDIUM-LABEL: lower_extern_weak:
; RV32F-MEDIUM: # %bb.0:
; RV32F-MEDIUM-NEXT: .Lpcrel_hi4:
@@ -401,6 +457,13 @@ define half @lower_global_half(half %a) nounwind {
; RV32F-SMALL-NEXT: fadd.h fa0, fa0, fa5
; RV32F-SMALL-NEXT: ret
;
+; RV32FXQCILI-SMALL-LABEL: lower_global_half:
+; RV32FXQCILI-SMALL: # %bb.0:
+; RV32FXQCILI-SMALL-NEXT: qc.e.li a0, X
+; RV32FXQCILI-SMALL-NEXT: flh fa5, 0(a0)
+; RV32FXQCILI-SMALL-NEXT: fadd.h fa0, fa0, fa5
+; RV32FXQCILI-SMALL-NEXT: ret
+;
; RV32F-MEDIUM-LABEL: lower_global_half:
; RV32F-MEDIUM: # %bb.0:
; RV32F-MEDIUM-NEXT: .Lpcrel_hi5:
@@ -440,6 +503,13 @@ define half @lower_global_half(half %a) nounwind {
; RV32FINX-SMALL-NEXT: fadd.h a0, a0, a1
; RV32FINX-SMALL-NEXT: ret
;
+; RV32FINXXQCILI-SMALL-LABEL: lower_global_half:
+; RV32FINXXQCILI-SMALL: # %bb.0:
+; RV32FINXXQCILI-SMALL-NEXT: qc.e.li a1, X
+; RV32FINXXQCILI-SMALL-NEXT: lh a1, 0(a1)
+; RV32FINXXQCILI-SMALL-NEXT: fadd.h a0, a0, a1
+; RV32FINXXQCILI-SMALL-NEXT: ret
+;
; RV32FINX-MEDIUM-LABEL: lower_global_half:
; RV32FINX-MEDIUM: # %bb.0:
; RV32FINX-MEDIUM-NEXT: .Lpcrel_hi4:
diff --git a/llvm/test/CodeGen/RISCV/jumptable.ll b/llvm/test/CodeGen/RISCV/jumptable.ll
index 8584579b81384..a838d54ad5e9b 100644
--- a/llvm/test/CodeGen/RISCV/jumptable.ll
+++ b/llvm/test/CodeGen/RISCV/jumptable.ll
@@ -1,6 +1,8 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=CHECK,RV32I-SMALL
+; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -mattr=+experimental-xqcili < %s \
+; RUN: | FileCheck %s -check-prefixes=CHECK,RV32IXQCILI-SMALL
; RUN: llc -mtriple=riscv32 -code-model=medium -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefixes=CHECK,RV32I-MEDIUM
; RUN: llc -mtriple=riscv32 -relocation-model=pic -verify-machineinstrs < %s \
@@ -13,6 +15,8 @@
; RUN: | FileCheck %s -check-prefixes=CHECK,RV64I-PIC
; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
; RUN: | FileCheck %s -check-prefixes=CHECK,RV32I-SMALL-7-ENTRIES
+; RUN: llc -mtriple=riscv32 -code-model=small -verify-machineinstrs -riscv-min-jump-table-entries=7 -mattr=+experimental-xqcili < %s \
+; RUN: | FileCheck %s -check-prefixes=CHECK,RV32IXQCILI-SMALL-7-ENTRIES
; RUN: llc -mtriple=riscv32 -code-model=medium -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
; RUN: | FileCheck %s -check-prefixes=CHECK,RV32I-MEDIUM-7-ENTRIES
; RUN: llc -mtriple=riscv32 -relocation-model=pic -verify-machineinstrs -riscv-min-jump-table-entries=7 < %s \
@@ -114,6 +118,39 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
; RV32I-SMALL-NEXT: .LBB1_9: # %exit
; RV32I-SMALL-NEXT: ret
;
+; RV32IXQCILI-SMALL-LABEL: above_threshold:
+; RV32IXQCILI-SMALL: # %bb.0: # %entry
+; RV32IXQCILI-SMALL-NEXT: addi a0, a0, -1
+; RV32IXQCILI-SMALL-NEXT: li a2, 5
+; RV32IXQCILI-SMALL-NEXT: bltu a2, a0, .LBB1_9
+; RV32IXQCILI-SMALL-NEXT: # %bb.1: # %entry
+; RV32IXQCILI-SMALL-NEXT: slli a0, a0, 2
+; RV32IXQCILI-SMALL-NEXT: qc.e.li a2, .LJTI1_0
+; RV32IXQCILI-SMALL-NEXT: add a0, a0, a2
+; RV32IXQCILI-SMALL-NEXT: lw a0, 0(a0)
+; RV32IXQCILI-SMALL-NEXT: jr a0
+; RV32IXQCILI-SMALL-NEXT: .LBB1_2: # %bb1
+; RV32IXQCILI-SMALL-NEXT: li a0, 4
+; RV32IXQCILI-SMALL-NEXT: j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT: .LBB1_3: # %bb5
+; RV32IXQCILI-SMALL-NEXT: li a0, 100
+; RV32IXQCILI-SMALL-NEXT: j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT: .LBB1_4: # %bb3
+; RV32IXQCILI-SMALL-NEXT: li a0, 2
+; RV32IXQCILI-SMALL-NEXT: j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT: .LBB1_5: # %bb4
+; RV32IXQCILI-SMALL-NEXT: li a0, 1
+; RV32IXQCILI-SMALL-NEXT: j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT: .LBB1_6: # %bb2
+; RV32IXQCILI-SMALL-NEXT: li a0, 3
+; RV32IXQCILI-SMALL-NEXT: j .LBB1_8
+; RV32IXQCILI-SMALL-NEXT: .LBB1_7: # %bb6
+; RV32IXQCILI-SMALL-NEXT: li a0, 200
+; RV32IXQCILI-SMALL-NEXT: .LBB1_8: # %exit
+; RV32IXQCILI-SMALL-NEXT: sw a0, 0(a1)
+; RV32IXQCILI-SMALL-NEXT: .LBB1_9: # %exit
+; RV32IXQCILI-SMALL-NEXT: ret
+;
; RV32I-MEDIUM-LABEL: above_threshold:
; RV32I-MEDIUM: # %bb.0: # %entry
; RV32I-MEDIUM-NEXT: addi a0, a0, -1
@@ -334,6 +371,50 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
; RV32I-SMALL-7-ENTRIES-NEXT: .LBB1_14: # %exit
; RV32I-SMALL-7-ENTRIES-NEXT: ret
;
+; RV32IXQCILI-SMALL-7-ENTRIES-LABEL: above_threshold:
+; RV32IXQCILI-SMALL-7-ENTRIES: # %bb.0: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: blt a2, a0, .LBB1_5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.1: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: beq a0, a2, .LBB1_9
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.2: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: beq a0, a2, .LBB1_11
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.3: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: bne a0, a2, .LBB1_14
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.4: # %bb3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_5: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: beq a0, a2, .LBB1_10
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.6: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: beq a0, a2, .LBB1_12
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.7: # %entry
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a2, 6
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: bne a0, a2, .LBB1_14
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: # %bb.8: # %bb6
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 200
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_9: # %bb1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_10: # %bb4
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 1
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_11: # %bb2
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 3
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: j .LBB1_13
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_12: # %bb5
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: li a0, 100
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_13: # %exit
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: sw a0, 0(a1)
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: .LBB1_14: # %exit
+; RV32IXQCILI-SMALL-7-ENTRIES-NEXT: ret
+;
; RV32I-MEDIUM-7-ENTRIES-LABEL: above_threshold:
; RV32I-MEDIUM-7-ENTRIES: # %bb.0: # %entry
; RV32I-MEDIUM-7-ENTRIES-NEXT: li a2, 3
@topperc it would be good to understand your feelings about this (even without a full review). I've been wary of changing the materialisation sequences so far, but this gets us a measurable code size improvement - customer images are seeing a 4-8% improvement (sorry, we were comparing over a few commits, rather than just this one). These numbers are using our linker which has relaxations for
This patch looks like it would lose out on folding the ADDI into load/store instructions. Is there another patch for that or do you get code size savings without it?
We get code size savings with exactly this patch, and no modifications/changes to MergeBaseOffset. My hypothesis is that before relaxation, you get lots more loads/stores with zero offsets, which compress (whereas before, the loads/stores contained a symbol reference so had to be close to GP or zero to get any savings after relaxation), and then the images fit within 20 bits, so you mostly end up with (for example)
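The compression half of that hypothesis can be illustrated with a small sketch. It assumes the standard C.LW encoding constraints (both registers in x8..x15, and a constant offset that is a multiple of 4 in the range 0..124) and that the compressed extension is enabled; the helper names are hypothetical and are not part of LLVM. A load such as `lw a0, 0(a0)` meets these constraints at compile time, whereas `lw a0, %lo(G)(a0)` has an offset that is unknown until link time.

```cpp
#include <cstdint>

// Hypothetical helper: true if X is one of the registers x8..x15, the only
// ones reachable through the 3-bit register fields of C.LW.
constexpr bool isCompressedReg(unsigned X) { return X >= 8 && X <= 15; }

// Hypothetical helper: true if `lw rd, Offset(rs1)` can use the 16-bit C.LW
// encoding, assuming Offset is already a known constant (no relocation).
constexpr bool canUseCLW(unsigned Rd, unsigned Rs1, int64_t Offset) {
  return isCompressedReg(Rd) && isCompressedReg(Rs1) && Offset >= 0 &&
         Offset <= 124 && Offset % 4 == 0;
}
```

Under this model, `canUseCLW(10, 10, 0)` holds for `lw a0, 0(a0)` (a0 is x10), so the zero-offset loads produced after QC.E.LI are candidates for compression before any linker relaxation runs.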
This is only used in the small code model and when Xqcili is enabled, where addresses would otherwise use LUI/ADDI. Other code models need to use pc-relative addressing. This patch does this for global/block addresses, constant pools and jumptables.

Overall, this gives a better code size saving as QC.E.LI is easier to relax to QC.LI/LI etc than LUI/ADDI (especially when the LUI/ADDI might have sharing, or be split apart).

QC.E.LI has the RISCV_QC_E_32 local relocation attached to it.
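As a rough illustration of the size argument, the sketch below models a single relaxation step. It is an assumption-laden model, not the linker's actual logic: it takes QC.E.LI to be a 48-bit instruction, QC.LI to be a 32-bit instruction with a 20-bit signed immediate (consistent with the "fit within 20 bits" remark in the conversation), and LUI/ADDI to be two uncompressed 32-bit instructions. The function and constant names are hypothetical.

```cpp
#include <cstdint>

// Assumed encoding sizes in bytes.
constexpr unsigned kQcELiBytes = 6;   // QC.E.LI, 48-bit
constexpr unsigned kQcLiBytes = 4;    // QC.LI, 32-bit
constexpr unsigned kLuiAddiBytes = 8; // LUI + ADDI, both uncompressed

// True if Value fits in a signed immediate that is Bits wide.
constexpr bool fitsSigned(int64_t Value, unsigned Bits) {
  const int64_t Min = -(int64_t{1} << (Bits - 1));
  const int64_t Max = (int64_t{1} << (Bits - 1)) - 1;
  return Value >= Min && Value <= Max;
}

// Hypothetical model: once the final address is known, a QC.E.LI whose value
// fits in 20 signed bits shrinks to QC.LI; otherwise it stays at 6 bytes.
constexpr unsigned relaxedQcELiBytes(int64_t Address) {
  return fitsSigned(Address, 20) ? kQcLiBytes : kQcELiBytes;
}
```

In this model a single instruction carries the whole address, so the relaxation decision is local to that instruction. A LUI/ADDI pair may be shared between several uses or scheduled apart, which is the difficulty the description refers to.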