Skip to content

Conversation

mykouHW
Copy link

@mykouHW mykouHW commented Sep 1, 2025

fixes #156414 : There is a compatibility issue with filename matching. When input files use absolute paths, the matching results from lld do not meet expectations.

Copy link

github-actions bot commented Sep 1, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Sep 1, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: None (mykouHW)

Changes

Fixing the vulnerability in LLVM lld regarding file matching in linker scripts: There is a compatibility issue with filename matching. When input files use absolute paths, the matching results from mc lld do not meet expectations.


Full diff: https://github.com/llvm/llvm-project/pull/156353.diff

3 Files Affected:

  • (modified) lld/ELF/LinkerScript.cpp (+24-7)
  • (modified) lld/ELF/LinkerScript.h (+1-1)
  • (added) lld/test/ELF/linkerscript/abs-path-match.s (+56)
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index 921128dae2bdb..c325f18616fe3 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -411,7 +411,17 @@ void LinkerScript::assignSymbol(SymbolAssignment *cmd, bool inSec) {
   cmd->sym->type = v.type;
 }
 
-bool InputSectionDescription::matchesFile(const InputFile &file) const {
+// Convert an absolute address to a filename
+static inline StringRef getExtractFilename(StringRef filename) {
+  size_t pos = filename.rfind("/");
+  if (pos != std::string::npos) {
+    return filename.substr(pos + 1);
+  }
+  return filename;
+}
+
+bool InputSectionDescription::matchesFile(const InputFile &file,
+                                          bool ExtractFlag) const {
   if (filePat.isTrivialMatchAll())
     return true;
 
@@ -419,10 +429,17 @@ bool InputSectionDescription::matchesFile(const InputFile &file) const {
     if (matchType == MatchType::WholeArchive) {
       matchesFileCache.emplace(&file, filePat.match(file.archiveName));
     } else {
-      if (matchType == MatchType::ArchivesExcluded && !file.archiveName.empty())
+      if (matchType == MatchType::ArchivesExcluded && !file.archiveName.empty()){
         matchesFileCache.emplace(&file, false);
-      else
-        matchesFileCache.emplace(&file, filePat.match(file.getNameForScript()));
+      } else {
+        bool MatchFilename = filePat.match(file.getNameForScript());
+        StringRef ExtractFilename = getExtractFilename(file.getNameForScript());
+        // only use for computeInputSections
+        if (ExtractFlag) {
+          MatchFilename = MatchFilename || filePat.match(ExtractFilename);
+        }
+        matchesFileCache.emplace(&file, MatchFilename);
+      }
     }
   }
 
@@ -442,7 +459,7 @@ bool SectionPattern::excludesFile(const InputFile &file) const {
 
 bool LinkerScript::shouldKeep(InputSectionBase *s) {
   for (InputSectionDescription *id : keptSections)
-    if (id->matchesFile(*s->file))
+    if (id->matchesFile(*s->file, false))
       for (SectionPattern &p : id->sectionPatterns)
         if (p.sectionPat.match(s->name) &&
             (s->flags & id->withFlags) == id->withFlags &&
@@ -571,8 +588,8 @@ LinkerScript::computeInputSections(const InputSectionDescription *cmd,
         if (!pat.sectionPat.match(sec->name))
           continue;
 
-        if (!cmd->matchesFile(*sec->file) || pat.excludesFile(*sec->file) ||
-            !flagsMatch(sec))
+        if (!cmd->matchesFile(*sec->file, true) ||
+            pat.excludesFile(*sec->file) || !flagsMatch(sec))
           continue;
 
         if (sec->parent) {
diff --git a/lld/ELF/LinkerScript.h b/lld/ELF/LinkerScript.h
index 80c4f564afabc..452cfbcd9b777 100644
--- a/lld/ELF/LinkerScript.h
+++ b/lld/ELF/LinkerScript.h
@@ -227,7 +227,7 @@ class InputSectionDescription : public SectionCommand {
     return c->kind == InputSectionKind;
   }
 
-  bool matchesFile(const InputFile &file) const;
+  bool matchesFile(const InputFile &file, bool ExtractFilename) const;
 
   // Input sections that matches at least one of SectionPatterns
   // will be associated with this InputSectionDescription.
diff --git a/lld/test/ELF/linkerscript/abs-path-match.s b/lld/test/ELF/linkerscript/abs-path-match.s
new file mode 100644
index 0000000000000..cc31dcd1e8031
--- /dev/null
+++ b/lld/test/ELF/linkerscript/abs-path-match.s
@@ -0,0 +1,56 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && mkdir -p %t
+# RUN: split-file %s %t && cd %t
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 main.s -o main.o
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 foo.s -o foo.o
+# RUN: llvm-objcopy --rename-section .text=.text_foo  foo.o foo.o
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 bar.s -o bar.o
+# RUN: llvm-objcopy --rename-section .text=.text_bar  bar.o bar.o
+
+# RUN: ld.lld -r main.o %t/foo.o %t/bar.o -T script.ld -o main_abs.o
+
+# RUN: llvm-objdump -S main_abs.o > main_abs
+# RUN: llvm-objdump -S main_abs.o | FileCheck %s
+# CHECK: Disassembly of section .goo:
+
+
+#--- foo.s
+    .text
+    .globl	foo
+    .p2align	4
+    .type	foo,@function
+foo:
+    nop
+
+
+#--- bar.s
+    .text
+    .globl	bar
+    .p2align	4
+    .type	bar,@function
+bar:      
+    nop
+
+
+#--- main.s
+	.text
+	.globl	main
+	.p2align	4
+	.type	main,@function
+main:
+	callq	foo@PLT
+	callq	bar@PLT
+	retq
+
+
+#--- script.ld
+SECTIONS {
+  .text : { *(.text) }
+  .goo : {
+    bar.o(.text_bar);
+    foo.o(.text_foo);
+  }
+}
\ No newline at end of file

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the description difficult to follow.

When input files use absolute paths, the matching results from mc lld do not meet expectations.

Can you say what your expectations are and why?

If I've understood your patch correctly, you expect lld to match the file part of file.o(section) with an object with a filename of /path/to/file.o

When I look at https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html I can't see anything that says that when a full path is given the linker is expected to extract the filename. I think the recommended pattern in this case is *file.o(section)

Trying to make a similar example with GNU ld, I got an error message:
ld: cannot find foo.o: No such file or directory

So we definitely have a discrepancy between GNU ld and lld, but if we think it is important I'd prefer that we fix it to match GNU ld behaviour. I think there could be problems with automatically discarding path information, particularly when there are multiple object files with the same name, but different paths path1/foo.o path2/foo.o

As an aside mc lld, doesn't exist as a name, it is just lld.

# RUN: llvm-mc -filetype=obj -triple=x86_64 main.s -o main.o

# RUN: llvm-mc -filetype=obj -triple=x86_64 foo.s -o foo.o
# RUN: llvm-objcopy --rename-section .text=.text_foo foo.o foo.o
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using objcopy could you use .section .text_foo, "ax", %progbits instead of .text


# RUN: ld.lld -r main.o %t/foo.o %t/bar.o -T script.ld -o main_abs.o

# RUN: llvm-objdump -S main_abs.o > main_abs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't appear to be used in the test


# RUN: llvm-objdump -S main_abs.o > main_abs
# RUN: llvm-objdump -S main_abs.o | FileCheck %s
# CHECK: Disassembly of section .goo:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent is to show that .text_foo and .text_bar have been placed in .goo I think checking the output of the linker -Map is more direct. You wouldn't need to use llvm-objdump at all then.

@@ -411,18 +411,35 @@ void LinkerScript::assignSymbol(SymbolAssignment *cmd, bool inSec) {
cmd->sym->type = v.type;
}

bool InputSectionDescription::matchesFile(const InputFile &file) const {
// Convert an absolute address to a filename
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "address" here? I think this attempting to extract the filename from an absolute path.

Assuming this is the right thing to do, then there are probably utility functions that exist already. Can you see if there is anything in llvm::sys::path .

Fixing the compatibility issue with filename matching in linker scripts. When input files use absolute paths, the matching results from lld do not meet expectations.
@mykouHW
Copy link
Author

mykouHW commented Sep 4, 2025

I found the description difficult to follow.

When input files use absolute paths, the matching results from mc lld do not meet expectations.

Can you say what your expectations are and why?

If I've understood your patch correctly, you expect lld to match the file part of file.o(section) with an object with a filename of /path/to/file.o

When I look at https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html I can't see anything that says that when a full path is given the linker is expected to extract the filename. I think the recommended pattern in this case is *file.o(section)

Trying to make a similar example with GNU ld, I got an error message: ld: cannot find foo.o: No such file or directory

So we definitely have a discrepancy between GNU ld and lld, but if we think it is important I'd prefer that we fix it to match GNU ld behaviour. I think there could be problems with automatically discarding path information, particularly when there are multiple object files with the same name, but different paths path1/foo.o path2/foo.o

As an aside mc lld, doesn't exist as a name, it is just lld.

Thanks for your comment!

When the absolute path is used to input the object file in the ld.lld command, the expectation of "merging bar and foo into the .goo section" in the linker script is not achieved.

For example, if use the command like this:
ld.lld main.o %t/foo.o %t/bar.o -T script.ld -o main_abs.o -Map=main_abs.map,
the sections .text_foo and .text_bar will be set as OrphanSection and no .goo section will be generated.
abs

However, if use the command like this:
ld.lld main.o foo.o bar.o -T script.ld -o main_abs.o -Map=main_abs.map,
no .text_foo and .text_bar sections will be generated, only a .goo section containing bar and foo will be generated.
rel

@smithp35
Copy link
Collaborator

smithp35 commented Sep 4, 2025

Thank you for the clarification.

I think lld is behaving as I would expect given the linker script. As mentioned in the first document, the documentation https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html and GNU ld [1] (which lld intends to be a drop in replacement for) require the full path name to be matched by the file-pattern.

While I can see that the design decision over whether to match object.o to /path/to/object.o could be made either way. I think lld should stick to GNU ld behaviour unless there is a good reason not to. There's also the chance that this could alter the behaviour of existing linker scripts see [2].

[1] To give an example with GNU ld
foo.s

.text
nop

bar.s

.text
nop

tmp/foo.s

.section .text_foo, "ax", %progbits
nop

tmp/bar.s

.section .text_bar, "ax", %progbits
nop

sec.lds

SECTIONS {
         .goo : {
         foo.o(.text_foo)
         bar.o(.text_bar)
         }
         .text :
         {
           *(.text)
         }
}

clang -c foo.s bar.s
clang -c tmp/foo.s -o tmp/foo.o
clang -c tmp/bar.s -o tmp/bar.o
ld.bfd foo.o bar.o tmp/foo.o tmp/bar.o -T sec.lds --orphan-handling=warn --print-map

...
ld.bfd: warning: orphan section `.text_foo' from `tmp/foo.o' being placed in section `.text_foo'
ld.bfd: warning: orphan section `.text_bar' from `tmp/bar.o' being placed in section `.text_bar'
...
.goo
 foo.o(.text_foo)
 bar.o(.text_bar)

.text           0x0000000000000000        0x8
 *(.text)
 .text          0x0000000000000000        0x1 foo.o
 *fill*         0x0000000000000001        0x3
 .text          0x0000000000000004        0x1 bar.o
 *fill*         0x0000000000000005        0x3
 .text          0x0000000000000008        0x0 tmp/foo.o
 .text          0x0000000000000008        0x0 tmp/bar.o

[2] Example with behaviour changed
sec2.lds

SECTIONS {
         .goo : {
         foo.o(.text_foo)
         bar.o(.text_bar)
         }
         .text :
         {
           *(.text)
         }
         .hoo : {
         *foo.o(.text_foo)
         *bar.o(.text_bar)
         }
}

ld.bfd foo.o bar.o tmp/foo.o tmp/bar.o -T sec2.lds --orphan-handling=warn --print-map

.goo
 foo.o(.text_foo)
 bar.o(.text_bar)

.text           0x0000000000000000        0x8
 *(.text)
 .text          0x0000000000000000        0x1 foo.o
 *fill*         0x0000000000000001        0x3
 .text          0x0000000000000004        0x1 bar.o
 *fill*         0x0000000000000005        0x3
 .text          0x0000000000000008        0x0 tmp/foo.o
 .text          0x0000000000000008        0x0 tmp/bar.o

.iplt           0x0000000000000008        0x0
 .iplt          0x0000000000000008        0x0 foo.o

.hoo            0x0000000000000008        0x2
 *foo.o(.text_foo)
 .text_foo      0x0000000000000008        0x1 tmp/foo.o
 *bar.o(.text_bar)
 .text_bar      0x0000000000000009        0x1 tmp/bar.o

The .text_foo and .text_bar selection to .hoo would be changed to .goo with this PR implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[LLD] lld linker script does not support absolute path
3 participants