-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[lld][ELF] Enable link script to support absolute path matching #156353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-elf Author: None (mykouHW) ChangesFixing the vulnerability in LLVM lld regarding file matching in linker scripts: There is a compatibility issue with filename matching. When input files use absolute paths, the matching results from mc lld do not meet expectations. Full diff: https://github.com/llvm/llvm-project/pull/156353.diff 3 Files Affected:
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index 921128dae2bdb..c325f18616fe3 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -411,7 +411,17 @@ void LinkerScript::assignSymbol(SymbolAssignment *cmd, bool inSec) {
cmd->sym->type = v.type;
}
-bool InputSectionDescription::matchesFile(const InputFile &file) const {
+// Convert an absolute address to a filename
+static inline StringRef getExtractFilename(StringRef filename) {
+ size_t pos = filename.rfind("/");
+ if (pos != std::string::npos) {
+ return filename.substr(pos + 1);
+ }
+ return filename;
+}
+
+bool InputSectionDescription::matchesFile(const InputFile &file,
+ bool ExtractFlag) const {
if (filePat.isTrivialMatchAll())
return true;
@@ -419,10 +429,17 @@ bool InputSectionDescription::matchesFile(const InputFile &file) const {
if (matchType == MatchType::WholeArchive) {
matchesFileCache.emplace(&file, filePat.match(file.archiveName));
} else {
- if (matchType == MatchType::ArchivesExcluded && !file.archiveName.empty())
+ if (matchType == MatchType::ArchivesExcluded && !file.archiveName.empty()){
matchesFileCache.emplace(&file, false);
- else
- matchesFileCache.emplace(&file, filePat.match(file.getNameForScript()));
+ } else {
+ bool MatchFilename = filePat.match(file.getNameForScript());
+ StringRef ExtractFilename = getExtractFilename(file.getNameForScript());
+ // only use for computeInputSections
+ if (ExtractFlag) {
+ MatchFilename = MatchFilename || filePat.match(ExtractFilename);
+ }
+ matchesFileCache.emplace(&file, MatchFilename);
+ }
}
}
@@ -442,7 +459,7 @@ bool SectionPattern::excludesFile(const InputFile &file) const {
bool LinkerScript::shouldKeep(InputSectionBase *s) {
for (InputSectionDescription *id : keptSections)
- if (id->matchesFile(*s->file))
+ if (id->matchesFile(*s->file, false))
for (SectionPattern &p : id->sectionPatterns)
if (p.sectionPat.match(s->name) &&
(s->flags & id->withFlags) == id->withFlags &&
@@ -571,8 +588,8 @@ LinkerScript::computeInputSections(const InputSectionDescription *cmd,
if (!pat.sectionPat.match(sec->name))
continue;
- if (!cmd->matchesFile(*sec->file) || pat.excludesFile(*sec->file) ||
- !flagsMatch(sec))
+ if (!cmd->matchesFile(*sec->file, true) ||
+ pat.excludesFile(*sec->file) || !flagsMatch(sec))
continue;
if (sec->parent) {
diff --git a/lld/ELF/LinkerScript.h b/lld/ELF/LinkerScript.h
index 80c4f564afabc..452cfbcd9b777 100644
--- a/lld/ELF/LinkerScript.h
+++ b/lld/ELF/LinkerScript.h
@@ -227,7 +227,7 @@ class InputSectionDescription : public SectionCommand {
return c->kind == InputSectionKind;
}
- bool matchesFile(const InputFile &file) const;
+ bool matchesFile(const InputFile &file, bool ExtractFilename) const;
// Input sections that matches at least one of SectionPatterns
// will be associated with this InputSectionDescription.
diff --git a/lld/test/ELF/linkerscript/abs-path-match.s b/lld/test/ELF/linkerscript/abs-path-match.s
new file mode 100644
index 0000000000000..cc31dcd1e8031
--- /dev/null
+++ b/lld/test/ELF/linkerscript/abs-path-match.s
@@ -0,0 +1,56 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && mkdir -p %t
+# RUN: split-file %s %t && cd %t
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 main.s -o main.o
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 foo.s -o foo.o
+# RUN: llvm-objcopy --rename-section .text=.text_foo foo.o foo.o
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 bar.s -o bar.o
+# RUN: llvm-objcopy --rename-section .text=.text_bar bar.o bar.o
+
+# RUN: ld.lld -r main.o %t/foo.o %t/bar.o -T script.ld -o main_abs.o
+
+# RUN: llvm-objdump -S main_abs.o > main_abs
+# RUN: llvm-objdump -S main_abs.o | FileCheck %s
+# CHECK: Disassembly of section .goo:
+
+
+#--- foo.s
+ .text
+ .globl foo
+ .p2align 4
+ .type foo,@function
+foo:
+ nop
+
+
+#--- bar.s
+ .text
+ .globl bar
+ .p2align 4
+ .type bar,@function
+bar:
+ nop
+
+
+#--- main.s
+ .text
+ .globl main
+ .p2align 4
+ .type main,@function
+main:
+ callq foo@PLT
+ callq bar@PLT
+ retq
+
+
+#--- script.ld
+SECTIONS {
+ .text : { *(.text) }
+ .goo : {
+ bar.o(.text_bar);
+ foo.o(.text_foo);
+ }
+}
\ No newline at end of file
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the description difficult to follow.
When input files use absolute paths, the matching results from mc lld do not meet expectations.
Can you say what your expectations are and why?
If I've understood your patch correctly, you expect lld to match the file part of file.o(section)
with an object with a filename of /path/to/file.o
When I look at https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html I can't see anything that says that when a full path is given the linker is expected to extract the filename. I think the recommended pattern in this case is *file.o(section)
Trying to make a similar example with GNU ld, I got an error message:
ld: cannot find foo.o: No such file or directory
So we definitely have a discrepancy between GNU ld and lld, but if we think it is important I'd prefer that we fix it to match GNU ld behaviour. I think there could be problems with automatically discarding path information, particularly when there are multiple object files with the same name, but different paths path1/foo.o
path2/foo.o
As an aside mc lld, doesn't exist as a name, it is just lld.
# RUN: llvm-mc -filetype=obj -triple=x86_64 main.s -o main.o | ||
|
||
# RUN: llvm-mc -filetype=obj -triple=x86_64 foo.s -o foo.o | ||
# RUN: llvm-objcopy --rename-section .text=.text_foo foo.o foo.o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using objcopy could you use .section .text_foo, "ax", %progbits
instead of .text
|
||
# RUN: ld.lld -r main.o %t/foo.o %t/bar.o -T script.ld -o main_abs.o | ||
|
||
# RUN: llvm-objdump -S main_abs.o > main_abs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't appear to be used in the test
|
||
# RUN: llvm-objdump -S main_abs.o > main_abs | ||
# RUN: llvm-objdump -S main_abs.o | FileCheck %s | ||
# CHECK: Disassembly of section .goo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the intent is to show that .text_foo
and .text_bar
have been placed in .goo I think checking the output of the linker -Map
is more direct. You wouldn't need to use llvm-objdump at all then.
lld/ELF/LinkerScript.cpp
Outdated
@@ -411,18 +411,35 @@ void LinkerScript::assignSymbol(SymbolAssignment *cmd, bool inSec) { | |||
cmd->sym->type = v.type; | |||
} | |||
|
|||
bool InputSectionDescription::matchesFile(const InputFile &file) const { | |||
// Convert an absolute address to a filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "address" here? I think this attempting to extract the filename from an absolute path.
Assuming this is the right thing to do, then there are probably utility functions that exist already. Can you see if there is anything in llvm::sys::path
.
Fixing the compatibility issue with filename matching in linker scripts. When input files use absolute paths, the matching results from lld do not meet expectations.
fa49fbe
to
21474f0
Compare
Thanks for your comment! When the absolute path is used to input the object file in the ld.lld command, the expectation of "merging bar and foo into the .goo section" in the linker script is not achieved. For example, if use the command like this: However, if use the command like this: |
Thank you for the clarification. I think lld is behaving as I would expect given the linker script. As mentioned in the first document, the documentation https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html and GNU ld [1] (which lld intends to be a drop in replacement for) require the full path name to be matched by the file-pattern. While I can see that the design decision over whether to match object.o to /path/to/object.o could be made either way. I think lld should stick to GNU ld behaviour unless there is a good reason not to. There's also the chance that this could alter the behaviour of existing linker scripts see [2]. [1] To give an example with GNU ld
bar.s
tmp/foo.s
tmp/bar.s
sec.lds
clang -c foo.s bar.s
[2] Example with behaviour changed
ld.bfd foo.o bar.o tmp/foo.o tmp/bar.o -T sec2.lds --orphan-handling=warn --print-map
The .text_foo and .text_bar selection to .hoo would be changed to .goo with this PR implemented. |
fixes #156414 : There is a compatibility issue with filename matching. When input files use absolute paths, the matching results from lld do not meet expectations.