Skip to content

Conversation

pcc
Copy link
Contributor

@pcc pcc commented Mar 28, 2025

The R_AARCH64_PATCHINST relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

Part of the AArch64 psABI extension:
ARM-software/abi-aa#340

pcc and others added 2 commits March 28, 2025 15:33
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-lld

Author: Peter Collingbourne (pcc)

Changes

The R_AARCH64_INST32 relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

TODO:

  • Agree on semantics and relocation type number.
  • Add tests.

Full diff: https://github.com/llvm/llvm-project/pull/133534.diff

2 Files Affected:

  • (modified) lld/ELF/Arch/AArch64.cpp (+8)
  • (modified) llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def (+1)
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 9538dd4a70bae..110d087230a9c 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -135,6 +135,7 @@ RelExpr AArch64::getRelExpr(RelType type, const Symbol &s,
   switch (type) {
   case R_AARCH64_ABS16:
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_ABS64:
   case R_AARCH64_ADD_ABS_LO12_NC:
   case R_AARCH64_LDST128_ABS_LO12_NC:
@@ -278,6 +279,7 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_PREL16:
     return SignExtend64<16>(read16(ctx, buf));
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_PREL32:
     return SignExtend64<32>(read32(ctx, buf));
   case R_AARCH64_ABS64:
@@ -505,6 +507,12 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     checkIntUInt(ctx, loc, val, 32, rel);
     write32(ctx, loc, val);
     break;
+  case R_AARCH64_INST32:
+    if (!rel.sym->isUndefined()) {
+      checkIntUInt(ctx, loc, val, 32, rel);
+      write32(ctx, loc, val);
+    }
+    break;
   case R_AARCH64_PLT32:
   case R_AARCH64_GOTPCREL32:
     checkInt(ctx, loc, val, 32, rel);
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
index 05b79eae573f7..c9f17ee4e0c7a 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
@@ -61,6 +61,7 @@ ELF_RELOC(R_AARCH64_LD64_GOT_LO12_NC,                0x138)
 ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15,               0x139)
 ELF_RELOC(R_AARCH64_PLT32,                           0x13a)
 ELF_RELOC(R_AARCH64_GOTPCREL32,                      0x13b)
+ELF_RELOC(R_AARCH64_INST32,                          0x13c)
 // General dynamic TLS relocations
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PREL21,                0x200)
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PAGE21,                0x201)

@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-lld-elf

Author: Peter Collingbourne (pcc)

Changes

The R_AARCH64_INST32 relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

TODO:

  • Agree on semantics and relocation type number.
  • Add tests.

Full diff: https://github.com/llvm/llvm-project/pull/133534.diff

2 Files Affected:

  • (modified) lld/ELF/Arch/AArch64.cpp (+8)
  • (modified) llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def (+1)
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 9538dd4a70bae..110d087230a9c 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -135,6 +135,7 @@ RelExpr AArch64::getRelExpr(RelType type, const Symbol &s,
   switch (type) {
   case R_AARCH64_ABS16:
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_ABS64:
   case R_AARCH64_ADD_ABS_LO12_NC:
   case R_AARCH64_LDST128_ABS_LO12_NC:
@@ -278,6 +279,7 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_PREL16:
     return SignExtend64<16>(read16(ctx, buf));
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_PREL32:
     return SignExtend64<32>(read32(ctx, buf));
   case R_AARCH64_ABS64:
@@ -505,6 +507,12 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     checkIntUInt(ctx, loc, val, 32, rel);
     write32(ctx, loc, val);
     break;
+  case R_AARCH64_INST32:
+    if (!rel.sym->isUndefined()) {
+      checkIntUInt(ctx, loc, val, 32, rel);
+      write32(ctx, loc, val);
+    }
+    break;
   case R_AARCH64_PLT32:
   case R_AARCH64_GOTPCREL32:
     checkInt(ctx, loc, val, 32, rel);
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
index 05b79eae573f7..c9f17ee4e0c7a 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
@@ -61,6 +61,7 @@ ELF_RELOC(R_AARCH64_LD64_GOT_LO12_NC,                0x138)
 ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15,               0x139)
 ELF_RELOC(R_AARCH64_PLT32,                           0x13a)
 ELF_RELOC(R_AARCH64_GOTPCREL32,                      0x13b)
+ELF_RELOC(R_AARCH64_INST32,                          0x13c)
 // General dynamic TLS relocations
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PREL21,                0x200)
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PAGE21,                0x201)

@smithp35
Copy link
Collaborator

Apologies for the delay in responding, just come back from vacation. Not had a chance to look through the Discourse posts yet, will do so this week.

While it may seem a bit premature, would you be able to open an issue in the AArch64 ABI (https://github.com/ARM-software/abi-aa/issues) with a request to add a new relocation directive, citing the Discourse posts and PRs?

@pcc
Copy link
Contributor Author

pcc commented Mar 31, 2025

Apologies for the delay in responding, just come back from vacation. Not had a chance to look through the Discourse posts yet, will do so this week.

While it may seem a bit premature, would you be able to open an issue in the AArch64 ABI (https://github.com/ARM-software/abi-aa/issues) with a request to add a new relocation directive, citing the Discourse posts and PRs?

Sure, created ARM-software/abi-aa#317

pcc added a commit to pcc/llvm-project that referenced this pull request Apr 3, 2025
The R_AARCH64_INST32 relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

TODO:
- Agree on semantics and relocation type number.
- Add tests.

Pull Request: llvm#133534
pcc added a commit to pcc/llvm-project that referenced this pull request Apr 4, 2025
The R_AARCH64_INST32 relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

TODO:
- Agree on semantics and relocation type number.
- Add tests.

Pull Request: llvm#133534
pcc and others added 4 commits May 12, 2025 21:37
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
pcc and others added 4 commits July 8, 2025 21:38
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
@pcc pcc changed the title ELF: Add support for R_AARCH64_INST32 relocation. ELF: Introduce R_AARCH64_PATCHINST relocation type. Jul 10, 2025
@llvmbot llvmbot added backend:AArch64 llvm:mc Machine (object) code labels Jul 10, 2025
@pcc
Copy link
Contributor Author

pcc commented Jul 10, 2025

Renamed to R_AARCH64_PATCHINST and changed semantics as discussed in ABI thread and added test.

@smithp35 @MaskRay PTAL.

@@ -137,6 +137,7 @@ RelExpr AArch64::getRelExpr(RelType type, const Symbol &s,
switch (type) {
case R_AARCH64_ABS16:
case R_AARCH64_ABS32:
case R_AARCH64_PATCHINST:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the relocation is used as described in the RFC, with the target symbol absolute all looks well. If a non ABS symbol is used in a position independent context I think you'll get a recompile with PIC error message. If not position independent then I think the relocation will pass through and give a likely unpredictable value to patch the instruction.

Could it be worth putting some checking (likely in Relocation.cpp) to make sure the target symbol is SHN_ABS?

A related question is whether non-zero addends should be supported? For most instructions the addend doesn't make logical sense. There could be some instructions where the immediate field is in the right place to make it work, but I think these cases may not be worth supporting.

An alternative formulation for RELA is to completely ignore the symbol value and just use the bottom 32-bits of the addend field. That would give the object producer that defines the relocations more control over what is patched in. Not got a strong opinion on whether that is better though.

Copy link
Contributor Author

@pcc pcc Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be worth putting some checking (likely in Relocation.cpp) to make sure the target symbol is SHN_ABS?

That makes sense. getRelExpr in Arch/AArch64.cpp seems like a slightly better place since it's arch-specific semantics, so I added it there.

A related question is whether non-zero addends should be supported? For most instructions the addend doesn't make logical sense. There could be some instructions where the immediate field is in the right place to make it work, but I think these cases may not be worth supporting.

An alternative formulation for RELA is to completely ignore the symbol value and just use the bottom 32-bits of the addend field. That would give the object producer that defines the relocations more control over what is patched in. Not got a strong opinion on whether that is better though.

This could be useful for something like an hypothetical PAC variant that takes three operands like most other instructions:

pacda x0, x1, x2

meaning sign x1 using x2 and store the result in x0. Then we could encode the instruction

mov x0, x1

in the relocation addend and have the object defining the deactivation symbol define it to absolute 0. And then of course you can have the replacement instruction use whichever registers the original instruction did (or replace it entirely with NOP if the source and destination registers are the same). This doesn't seem useful for PAC given how the existing instructions work but maybe there's a non-PAC use case for this.

It doesn't seem harmful to have the flexibility of letting the instruction be defined by the symbol, the relocation or both. On the psABI issue we discussed restrictions for things like erratum fixes but for that purpose it doesn't seem to matter how the instruction word is computed. We can make it the responsibility of the object file producer(s) to ensure that the instruction formed by value+addend follows the rules.

add x0, x1, x2
# This instruction has two relocations: the DS relocation and the JUMP26 to f1.
# Make sure that the DS relocation takes precedence.
.reloc ., R_AARCH64_PATCHINST, ds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth a test with emit-relocs to show both relocations coming out.

Thinking of Bolt, which relies on emit-relocs, I expect that it would just ignore the R_AARCH_PATCHINST relocations on their own as it wouldn't know how to recreate the original value [1]. It would have to discard any relocation at the same location. We're hoping to create a binary analysis ABI supplement soon to document conventions that binary analysis tools are using. First step ARM-software/abi-aa#333

[1] In theory if we did want to let Bolt reverse a patch, the emit-relocs output could give the reverse patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

pcc and others added 3 commits July 11, 2025 16:54
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
AlexVlx and others added 4 commits July 29, 2025 21:51
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
pcc added a commit to pcc/llvm-project that referenced this pull request Aug 1, 2025
The R_AARCH64_PATCHINST relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

TODO:
- Agree on semantics and relocation type number.

Pull Request: llvm#133534
@pcc
Copy link
Contributor Author

pcc commented Aug 5, 2025

psABI update landed, this is now unblocked.

// CHECK: R_AARCH64_JUMP26
// CHECK-NEXT: R_AARCH64_PATCHINST
.reloc ., R_AARCH64_PATCHINST, ds
b f1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve the test to test that with more than one fragments and more than one PATCHINST, the relocations are still ordered.

.reloc ., R_AARCH64_PATCHINST, ds
b f1
.balign 8
.reloc ., R_AARCH64_PATCHINST, ds
b f2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 55e71c0

@@ -0,0 +1,87 @@
# RUN: rm -rf %t && split-file %s %t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding && cd %t so that we can remove %t/ below, which clutter up the commands...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this form so that the commands are copy-pastable into the shell.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assembler part (BinaryFormat / Target/AArch64 changes) look good. The linker change should be made separate. But thank for combining this in a single PR, making the full picture clear:)

@@ -61,6 +61,7 @@ ELF_RELOC(R_AARCH64_LD64_GOT_LO12_NC, 0x138)
ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15, 0x139)
ELF_RELOC(R_AARCH64_PLT32, 0x13a)
ELF_RELOC(R_AARCH64_GOTPCREL32, 0x13b)
ELF_RELOC(R_AARCH64_PATCHINST, 0x13c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a test to tools/llvm-readobj/ELF/reloc-types-aarch64.test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@MaskRay MaskRay requested a review from smithp35 August 9, 2025 18:07
@MaskRay
Copy link
Member

MaskRay commented Aug 9, 2025

The R_AARCH64_PATCHINST relocation type is to support deactivation symbols. For more information, see the RFC: discourse.llvm.org/t/rfc-deactivation-symbols/85556

An AArch64 psABI extension proposal has been made: ARM-software/abi-aa#329

The RFC is quite long and doesn't come up with an assembly example. Adding some description of how the relocation works will help future readers.

@MaskRay MaskRay requested a review from davemgreen August 9, 2025 18:14
@MaskRay
Copy link
Member

MaskRay commented Aug 9, 2025

Need an aarch64 maintainer's signoff on the llvm part.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left my approval on the AArch64 side, the relocation id matches the ABI. I'm not a an official maintainer, so please leave a few days to see if @davemgreen has any objections.

@davemgreen
Copy link
Collaborator

I'm not a an official maintainer,

Yeah you are. If you say this LGTY then it SGTM :)

pcc added 2 commits August 11, 2025 11:32
Created using spr 1.3.6-beta.1

[skip ci]
Created using spr 1.3.6-beta.1
@pcc pcc changed the title ELF: Introduce R_AARCH64_PATCHINST relocation type. ELF: Add support for R_AARCH64_PATCHINST relocation type. Aug 11, 2025
@pcc pcc changed the base branch from users/pcc/spr/main.elf-add-support-for-r_aarch64_inst32-relocation to main August 11, 2025 18:33
@pcc pcc merged commit 53c41f1 into main Aug 11, 2025
6 of 12 checks passed
@pcc pcc deleted the users/pcc/spr/elf-add-support-for-r_aarch64_inst32-relocation branch August 11, 2025 18:33
@pcc
Copy link
Contributor Author

pcc commented Aug 11, 2025

The assembler part (BinaryFormat / Target/AArch64 changes) look good. The linker change should be made separate. But thank for combining this in a single PR, making the full picture clear:)

MC parts were moved to a922731, LLD parts were landed in this PR.

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 11, 2025
The R_AARCH64_PATCHINST relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

Part of the AArch64 psABI extension:
ARM-software/abi-aa#340

Reviewers: smithp35, davemgreen, MaskRay

Reviewed By: MaskRay, davemgreen, smithp35

Pull Request: llvm/llvm-project#133534
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

8 participants