-
Notifications
You must be signed in to change notification settings - Fork 14.9k
big archive recognition by the llvm-symbolizer #150401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
big archive recognition by the llvm-symbolizer #150401
Conversation
…des7/llvm-project into midhun7/big-archive-recognition
@llvm/pr-subscribers-llvm-binary-utilities Author: Midhunesh (midhuncodes7) ChangesThis PR implements big archive recognition by the symbolizer. Patch is 26.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150401.diff 10 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-symbolizer.rst b/llvm/docs/CommandGuide/llvm-symbolizer.rst
index 2da1b2470a83e..8f3a132139fe9 100644
--- a/llvm/docs/CommandGuide/llvm-symbolizer.rst
+++ b/llvm/docs/CommandGuide/llvm-symbolizer.rst
@@ -535,16 +535,20 @@ MACH-O SPECIFIC OPTIONS
.. option:: --default-arch <arch>
If a binary contains object files for multiple architectures (e.g. it is a
- Mach-O universal binary), symbolize the object file for a given architecture.
- You can also specify the architecture by writing ``binary_name:arch_name`` in
- the input (see example below). If the architecture is not specified in either
- way, the address will not be symbolized. Defaults to empty string.
+ Mach-O universal binary or an AIX archive with architecture variants),
+ symbolize the object file for a given architecture. You can also specify
+ the architecture by writing ``binary_name:arch_name`` in the input (see
+ example below). For AIX archives, the format ``archive.a(member.o):arch``
+ is also supported. If the architecture is not specified in either way,
+ the address will not be symbolized. Defaults to empty string.
.. code-block:: console
$ cat addr.txt
/tmp/mach_universal_binary:i386 0x1f84
/tmp/mach_universal_binary:x86_64 0x100000f24
+ /tmp/archive.a(member.o):ppc 0x1000
+ /tmp/archive.a(member.o):ppc64 0x2000
$ llvm-symbolizer < addr.txt
_main
@@ -553,6 +557,12 @@ MACH-O SPECIFIC OPTIONS
_main
/tmp/source_x86_64.cc:8
+ _foo
+ /tmp/source_ppc.cc:12
+
+ _foo
+ /tmp/source_ppc64.cc:12
+
.. option:: --dsym-hint <path/to/file.dSYM>
If the debug info for a binary isn't present in the default location, look for
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index fb8f3d8af6b1b..5144085f3e23c 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -29,6 +29,12 @@
#include <utility>
#include <vector>
+#if defined(_AIX)
+# define SYMBOLIZE_AIX 1
+#else
+# define SYMBOLIZE_AIX 0
+#endif
+
namespace llvm {
namespace object {
class ELFObjectFileBase;
@@ -202,6 +208,12 @@ class LLVMSymbolizer {
Expected<ObjectFile *> getOrCreateObject(const std::string &Path,
const std::string &ArchName);
+ /// Return a pointer to object file at specified path, for a specified
+ /// architecture that is present inside an archive file
+ Expected<ObjectFile *> getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName);
+
/// Update the LRU cache order when a binary is accessed.
void recordAccess(CachedBinary &Bin);
@@ -226,6 +238,20 @@ class LLVMSymbolizer {
std::map<std::pair<std::string, std::string>, std::unique_ptr<ObjectFile>>
ObjectForUBPathAndArch;
+ struct ArchiveCacheKey {
+ std::string ArchivePath; // Storage for StringRef
+ std::string MemberName; // Storage for StringRef
+ std::string ArchName; // Storage for StringRef
+
+ // Required for map comparison
+ bool operator<(const ArchiveCacheKey &Other) const {
+ return std::tie(ArchivePath, MemberName, ArchName) <
+ std::tie(Other.ArchivePath, Other.MemberName, Other.ArchName);
+ }
+ };
+
+ std::map<ArchiveCacheKey, std::unique_ptr<ObjectFile>> ObjectForArchivePathAndArch;
+
Options Opts;
std::unique_ptr<BuildIDFetcher> BIDFetcher;
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 56527719da51f..6dddc3a709239 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -33,6 +33,7 @@
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Path.h"
+#include "llvm/Object/Archive.h"
#include <cassert>
#include <cstring>
@@ -286,6 +287,7 @@ LLVMSymbolizer::findSymbol(ArrayRef<uint8_t> BuildID, StringRef Symbol,
void LLVMSymbolizer::flush() {
ObjectForUBPathAndArch.clear();
+ ObjectForArchivePathAndArch.clear();
LRUBinaries.clear();
CacheSize = 0;
BinaryForPath.clear();
@@ -321,7 +323,7 @@ bool checkFileCRC(StringRef Path, uint32_t CRCHash) {
bool getGNUDebuglinkContents(const ObjectFile *Obj, std::string &DebugName,
uint32_t &CRCHash) {
- if (!Obj)
+ if (!Obj || !isa<ObjectFile>(Obj))
return false;
for (const SectionRef &Section : Obj->sections()) {
StringRef Name;
@@ -557,19 +559,101 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
if (!DbgObj)
DbgObj = Obj;
ObjectPair Res = std::make_pair(Obj, DbgObj);
- std::string DbgObjPath = DbgObj->getFileName().str();
auto Pair =
ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res);
- BinaryForPath.find(DbgObjPath)->second.pushEvictor([this, I = Pair.first]() {
+ std::string DbgObjPath = DbgObj->getFileName().str();
+ auto BinIter = BinaryForPath.find(DbgObjPath);
+ if (BinIter != BinaryForPath.end()) {
+ BinIter->second.pushEvictor([this, I = Pair.first]() {
ObjectPairForPathArch.erase(I);
- });
+ });
+ }
return Res;
}
+Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName) {
+ Binary *Bin = nullptr;
+ auto Pair = BinaryForPath.emplace(ArchivePath.str(), OwningBinary<Binary>());
+ if (!Pair.second) {
+ Bin = Pair.first->second->getBinary();
+ recordAccess(Pair.first->second);
+ } else {
+ Expected<OwningBinary<Binary>> ArchiveOrErr = createBinary(ArchivePath);
+ if (!ArchiveOrErr) {
+ return ArchiveOrErr.takeError();
+ }
+
+ CachedBinary &CachedBin = Pair.first->second;
+ CachedBin = std::move(ArchiveOrErr.get());
+ CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); });
+ LRUBinaries.push_back(CachedBin);
+ CacheSize += CachedBin.size();
+ Bin = CachedBin->getBinary();
+ }
+
+ if (!Bin || !isa<object::Archive>(Bin))
+ return errorCodeToError(object_error::invalid_file_type);
+
+ object::Archive *Archive = cast<object::Archive>(Bin);
+ Error Err = Error::success();
+
+ // On AIX, archives can contain multiple members with same name but different types
+ // We need to check all matches and find one that matches both name and architecture
+ for (auto &Child : Archive->children(Err, /*SkipInternal=*/true)) {
+ Expected<StringRef> NameOrErr = Child.getName();
+ if (!NameOrErr)
+ continue;
+ if (*NameOrErr == llvm::sys::path::filename(MemberName)) {
+ Expected<std::unique_ptr<object::Binary>> MemberOrErr = Child.getAsBinary();
+ if (!MemberOrErr)
+ continue;
+
+ std::unique_ptr<object::Binary> Binary = std::move(*MemberOrErr);
+ if (auto *Obj = dyn_cast<object::ObjectFile>(Binary.get())) {
+#if defined(_AIX)
+ Triple::ArchType ObjArch = Obj->makeTriple().getArch();
+ Triple RequestedTriple;
+ RequestedTriple.setArch(Triple::getArchTypeForLLVMName(ArchName));
+ if (ObjArch != RequestedTriple.getArch())
+ continue;
+#endif
+ ArchiveCacheKey CacheKey{ArchivePath.str(), MemberName.str(), ArchName};
+ auto I = ObjectForArchivePathAndArch.find(CacheKey);
+ if (I != ObjectForArchivePathAndArch.end())
+ return I->second.get();
+
+ auto CachedObj = std::unique_ptr<ObjectFile>(Obj);
+ auto NewEntry = ObjectForArchivePathAndArch.emplace(
+ CacheKey, std::move(CachedObj));
+ Binary.release();
+ BinaryForPath.find(ArchivePath.str())->second.pushEvictor(
+ [this, Iter = NewEntry.first]() { ObjectForArchivePathAndArch.erase(Iter); });
+ return NewEntry.first->second.get();
+ }
+ }
+ }
+ if (Err)
+ return std::move(Err);
+ return errorCodeToError(object_error::arch_not_found);
+}
+
Expected<ObjectFile *>
LLVMSymbolizer::getOrCreateObject(const std::string &Path,
const std::string &ArchName) {
- Binary *Bin;
+ // First check for archive(member) format - more efficient to check closing paren first
+ size_t CloseParen = Path.rfind(')');
+ if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) {
+ size_t OpenParen = Path.rfind('(', CloseParen);
+ if (OpenParen != std::string::npos) {
+ StringRef ArchivePath = StringRef(Path).substr(0, OpenParen);
+ StringRef MemberName = StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1);
+ return getOrCreateObjectFromArchive(ArchivePath, MemberName, ArchName);
+ }
+ }
+
+ Binary *Bin = nullptr;
auto Pair = BinaryForPath.emplace(Path, OwningBinary<Binary>());
if (!Pair.second) {
Bin = Pair.first->second->getBinary();
@@ -648,7 +732,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
auto I = Modules.find(ModuleName);
if (I != Modules.end()) {
- recordAccess(BinaryForPath.find(BinaryName)->second);
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ recordAccess(BinIter->second);
return I->second.get();
}
@@ -716,7 +802,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
createModuleInfo(Objects.first, std::move(Context), ModuleName);
if (ModuleOrErr) {
auto I = Modules.find(ModuleName);
- BinaryForPath.find(BinaryName)->second.pushEvictor([this, I]() {
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ BinIter->second.pushEvictor([this, I]() {
Modules.erase(I);
});
}
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-32.yaml b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
new file mode 100644
index 0000000000000..2080607a1a88c
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
@@ -0,0 +1,119 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1DF
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xA0
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0x64
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x1C
+ Size: 0xC
+ FileOffsetToData: 0x80
+ FileOffsetToRelocations: 0x8C
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000002800000000'
+ Relocations:
+ - Address: 0x1C
+ Symbol: 0x5
+ Info: 0x1F
+ Type: 0x0
+ - Address: 0x20
+ Symbol: 0x9
+ Info: 0x1F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (145c02cece3630765e6412e6820bc446ddb4e138)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 25
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 3
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: foo
+ Value: 0x1C
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_DS
+ SectionOrLength: 12
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: TOC
+ Value: 0x28
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLength: 0
+ StabInfoIndex: 0
+ StabSectNum: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-64.yaml b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
new file mode 100644
index 0000000000000..9bbb1107555e0
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
@@ -0,0 +1,115 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1F7
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xF8
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0xA8
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x20
+ Size: 0x18
+ FileOffsetToData: 0xC4
+ FileOffsetToRelocations: 0xDC
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000000000000000000000380000000000000000'
+ Relocations:
+ - Address: 0x20
+ Symbol: 0x5
+ Info: 0x3F
+ Type: 0x0
+ - Address: 0x28
+ Symbol: 0x9
+ Info: 0x3F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (5ca72bc8d2e87445649eab1825dffd2a047440ba)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 25
+ SectionOrLengthHi: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 3
+ SectionOrLengthHi: 0
+ - Name: foo
+ Value: 0x20
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 3
+ StorageMappingClass: XMC_DS
+ SectionOrLengthLo: 24
+ SectionOrLengthHi: 0
+ - Name: TOC
+ Value: 0x38
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLengthLo: 0
+ SectionOrLengthHi: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
new file mode 100644
index 0000000000000..8e5c929e82878
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
@@ -0,0 +1,68 @@
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_REL
+ Machine: EM_PPC64
+ Flags: [ ]
+ SectionHeaderStringTable: .strtab
+Sections:
+ - Name: .text
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ AddressAlign: 0x10
+ Content: '2000804E000000000000000000000000'
+ - Name: .comment
+ Type: SHT_PROGBITS
+ Flags: [ SHF_MERGE, SHF_STRINGS ]
+ AddressAlign: 0x1
+ EntSize: 0x1
+ Content: 0049424D204F70656E20584C20432F432B2B20666F72204C696E7578206F6E20506F7765722031372E312E322028353732352D4337322C20353736352D4A3230292C2076657273696F6E2031372E312E322E302C20636C616E672076657273696F6E2032312E302E306769742028676974406769746875622E69626D2E636F6D3A636F6D70696C65722F6C6C766D2D70726F6A6563742E67697420653165653233663838333532623937333563363735386661396335653035313366626234393361322900
+ - Name: .note.GNU-stack
+ Type: SHT_PROGBITS
+ AddressAlign: 0x1
+ - Name: .eh_frame
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC ]
+ AddressAlign: 0x8
+ Content: 1000000000000000017A5200047841011B0C01001000000018000000000000001000000000000000
+ - Name: .rela.eh_frame
+ Type: SHT_RELA
+ Flags: [ SHF_INFO_LINK ]
+ Link: .symtab
+ AddressAlign: 0x8
+ Info: .eh_frame
+ Relocations:
+ - Offset: 0x1C
+ Symbol: .text
+ Type: R_PPC64_REL32
+ - Name: .llvm_addrsig
+ Type: SHT_LLVM_ADDRSIG
+ Flags: [ SHF_EXCLUDE ]
+ Link: .symtab
+ AddressAlign: 0x1
+ Offset: 0x1B8
+ Symbols: [ ]
+ - Type: SectionHeaderTable
+ Sections:
+ - Name: .strtab
+ - Name: .text
+ - Name: .comment
+ - Name: .note.GNU-stack
+ - Name: .eh_frame
+ - Name: .rela.eh_frame
+ - Name: .llvm_addrsig
+ - Name: .symtab
+Symbols:
+ - Name: foo1.c
+ Type: STT_FILE
+ Index: SHN_ABS
+ - Name: .text
+ Type: STT_SECTION
+ Section: .text
+ - Name: foo1
+ Type: STT_FUNC
+ Section: .text
+ Binding: STB_GLOBAL
+ Size: 0x10
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml
new file mode 100644
index 00000000...
[truncated]
|
@llvm/pr-subscribers-debuginfo Author: Midhunesh (midhuncodes7) ChangesThis PR implements big archive recognition by the symbolizer. Patch is 26.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150401.diff 10 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-symbolizer.rst b/llvm/docs/CommandGuide/llvm-symbolizer.rst
index 2da1b2470a83e..8f3a132139fe9 100644
--- a/llvm/docs/CommandGuide/llvm-symbolizer.rst
+++ b/llvm/docs/CommandGuide/llvm-symbolizer.rst
@@ -535,16 +535,20 @@ MACH-O SPECIFIC OPTIONS
.. option:: --default-arch <arch>
If a binary contains object files for multiple architectures (e.g. it is a
- Mach-O universal binary), symbolize the object file for a given architecture.
- You can also specify the architecture by writing ``binary_name:arch_name`` in
- the input (see example below). If the architecture is not specified in either
- way, the address will not be symbolized. Defaults to empty string.
+ Mach-O universal binary or an AIX archive with architecture variants),
+ symbolize the object file for a given architecture. You can also specify
+ the architecture by writing ``binary_name:arch_name`` in the input (see
+ example below). For AIX archives, the format ``archive.a(member.o):arch``
+ is also supported. If the architecture is not specified in either way,
+ the address will not be symbolized. Defaults to empty string.
.. code-block:: console
$ cat addr.txt
/tmp/mach_universal_binary:i386 0x1f84
/tmp/mach_universal_binary:x86_64 0x100000f24
+ /tmp/archive.a(member.o):ppc 0x1000
+ /tmp/archive.a(member.o):ppc64 0x2000
$ llvm-symbolizer < addr.txt
_main
@@ -553,6 +557,12 @@ MACH-O SPECIFIC OPTIONS
_main
/tmp/source_x86_64.cc:8
+ _foo
+ /tmp/source_ppc.cc:12
+
+ _foo
+ /tmp/source_ppc64.cc:12
+
.. option:: --dsym-hint <path/to/file.dSYM>
If the debug info for a binary isn't present in the default location, look for
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index fb8f3d8af6b1b..5144085f3e23c 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -29,6 +29,12 @@
#include <utility>
#include <vector>
+#if defined(_AIX)
+# define SYMBOLIZE_AIX 1
+#else
+# define SYMBOLIZE_AIX 0
+#endif
+
namespace llvm {
namespace object {
class ELFObjectFileBase;
@@ -202,6 +208,12 @@ class LLVMSymbolizer {
Expected<ObjectFile *> getOrCreateObject(const std::string &Path,
const std::string &ArchName);
+ /// Return a pointer to object file at specified path, for a specified
+ /// architecture that is present inside an archive file
+ Expected<ObjectFile *> getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName);
+
/// Update the LRU cache order when a binary is accessed.
void recordAccess(CachedBinary &Bin);
@@ -226,6 +238,20 @@ class LLVMSymbolizer {
std::map<std::pair<std::string, std::string>, std::unique_ptr<ObjectFile>>
ObjectForUBPathAndArch;
+ struct ArchiveCacheKey {
+ std::string ArchivePath; // Storage for StringRef
+ std::string MemberName; // Storage for StringRef
+ std::string ArchName; // Storage for StringRef
+
+ // Required for map comparison
+ bool operator<(const ArchiveCacheKey &Other) const {
+ return std::tie(ArchivePath, MemberName, ArchName) <
+ std::tie(Other.ArchivePath, Other.MemberName, Other.ArchName);
+ }
+ };
+
+ std::map<ArchiveCacheKey, std::unique_ptr<ObjectFile>> ObjectForArchivePathAndArch;
+
Options Opts;
std::unique_ptr<BuildIDFetcher> BIDFetcher;
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 56527719da51f..6dddc3a709239 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -33,6 +33,7 @@
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Path.h"
+#include "llvm/Object/Archive.h"
#include <cassert>
#include <cstring>
@@ -286,6 +287,7 @@ LLVMSymbolizer::findSymbol(ArrayRef<uint8_t> BuildID, StringRef Symbol,
void LLVMSymbolizer::flush() {
ObjectForUBPathAndArch.clear();
+ ObjectForArchivePathAndArch.clear();
LRUBinaries.clear();
CacheSize = 0;
BinaryForPath.clear();
@@ -321,7 +323,7 @@ bool checkFileCRC(StringRef Path, uint32_t CRCHash) {
bool getGNUDebuglinkContents(const ObjectFile *Obj, std::string &DebugName,
uint32_t &CRCHash) {
- if (!Obj)
+ if (!Obj || !isa<ObjectFile>(Obj))
return false;
for (const SectionRef &Section : Obj->sections()) {
StringRef Name;
@@ -557,19 +559,101 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
if (!DbgObj)
DbgObj = Obj;
ObjectPair Res = std::make_pair(Obj, DbgObj);
- std::string DbgObjPath = DbgObj->getFileName().str();
auto Pair =
ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res);
- BinaryForPath.find(DbgObjPath)->second.pushEvictor([this, I = Pair.first]() {
+ std::string DbgObjPath = DbgObj->getFileName().str();
+ auto BinIter = BinaryForPath.find(DbgObjPath);
+ if (BinIter != BinaryForPath.end()) {
+ BinIter->second.pushEvictor([this, I = Pair.first]() {
ObjectPairForPathArch.erase(I);
- });
+ });
+ }
return Res;
}
+Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName) {
+ Binary *Bin = nullptr;
+ auto Pair = BinaryForPath.emplace(ArchivePath.str(), OwningBinary<Binary>());
+ if (!Pair.second) {
+ Bin = Pair.first->second->getBinary();
+ recordAccess(Pair.first->second);
+ } else {
+ Expected<OwningBinary<Binary>> ArchiveOrErr = createBinary(ArchivePath);
+ if (!ArchiveOrErr) {
+ return ArchiveOrErr.takeError();
+ }
+
+ CachedBinary &CachedBin = Pair.first->second;
+ CachedBin = std::move(ArchiveOrErr.get());
+ CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); });
+ LRUBinaries.push_back(CachedBin);
+ CacheSize += CachedBin.size();
+ Bin = CachedBin->getBinary();
+ }
+
+ if (!Bin || !isa<object::Archive>(Bin))
+ return errorCodeToError(object_error::invalid_file_type);
+
+ object::Archive *Archive = cast<object::Archive>(Bin);
+ Error Err = Error::success();
+
+ // On AIX, archives can contain multiple members with same name but different types
+ // We need to check all matches and find one that matches both name and architecture
+ for (auto &Child : Archive->children(Err, /*SkipInternal=*/true)) {
+ Expected<StringRef> NameOrErr = Child.getName();
+ if (!NameOrErr)
+ continue;
+ if (*NameOrErr == llvm::sys::path::filename(MemberName)) {
+ Expected<std::unique_ptr<object::Binary>> MemberOrErr = Child.getAsBinary();
+ if (!MemberOrErr)
+ continue;
+
+ std::unique_ptr<object::Binary> Binary = std::move(*MemberOrErr);
+ if (auto *Obj = dyn_cast<object::ObjectFile>(Binary.get())) {
+#if defined(_AIX)
+ Triple::ArchType ObjArch = Obj->makeTriple().getArch();
+ Triple RequestedTriple;
+ RequestedTriple.setArch(Triple::getArchTypeForLLVMName(ArchName));
+ if (ObjArch != RequestedTriple.getArch())
+ continue;
+#endif
+ ArchiveCacheKey CacheKey{ArchivePath.str(), MemberName.str(), ArchName};
+ auto I = ObjectForArchivePathAndArch.find(CacheKey);
+ if (I != ObjectForArchivePathAndArch.end())
+ return I->second.get();
+
+ auto CachedObj = std::unique_ptr<ObjectFile>(Obj);
+ auto NewEntry = ObjectForArchivePathAndArch.emplace(
+ CacheKey, std::move(CachedObj));
+ Binary.release();
+ BinaryForPath.find(ArchivePath.str())->second.pushEvictor(
+ [this, Iter = NewEntry.first]() { ObjectForArchivePathAndArch.erase(Iter); });
+ return NewEntry.first->second.get();
+ }
+ }
+ }
+ if (Err)
+ return std::move(Err);
+ return errorCodeToError(object_error::arch_not_found);
+}
+
Expected<ObjectFile *>
LLVMSymbolizer::getOrCreateObject(const std::string &Path,
const std::string &ArchName) {
- Binary *Bin;
+ // First check for archive(member) format - more efficient to check closing paren first
+ size_t CloseParen = Path.rfind(')');
+ if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) {
+ size_t OpenParen = Path.rfind('(', CloseParen);
+ if (OpenParen != std::string::npos) {
+ StringRef ArchivePath = StringRef(Path).substr(0, OpenParen);
+ StringRef MemberName = StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1);
+ return getOrCreateObjectFromArchive(ArchivePath, MemberName, ArchName);
+ }
+ }
+
+ Binary *Bin = nullptr;
auto Pair = BinaryForPath.emplace(Path, OwningBinary<Binary>());
if (!Pair.second) {
Bin = Pair.first->second->getBinary();
@@ -648,7 +732,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
auto I = Modules.find(ModuleName);
if (I != Modules.end()) {
- recordAccess(BinaryForPath.find(BinaryName)->second);
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ recordAccess(BinIter->second);
return I->second.get();
}
@@ -716,7 +802,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
createModuleInfo(Objects.first, std::move(Context), ModuleName);
if (ModuleOrErr) {
auto I = Modules.find(ModuleName);
- BinaryForPath.find(BinaryName)->second.pushEvictor([this, I]() {
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ BinIter->second.pushEvictor([this, I]() {
Modules.erase(I);
});
}
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-32.yaml b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
new file mode 100644
index 0000000000000..2080607a1a88c
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
@@ -0,0 +1,119 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1DF
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xA0
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0x64
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x1C
+ Size: 0xC
+ FileOffsetToData: 0x80
+ FileOffsetToRelocations: 0x8C
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000002800000000'
+ Relocations:
+ - Address: 0x1C
+ Symbol: 0x5
+ Info: 0x1F
+ Type: 0x0
+ - Address: 0x20
+ Symbol: 0x9
+ Info: 0x1F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (145c02cece3630765e6412e6820bc446ddb4e138)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 25
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 3
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: foo
+ Value: 0x1C
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_DS
+ SectionOrLength: 12
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: TOC
+ Value: 0x28
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLength: 0
+ StabInfoIndex: 0
+ StabSectNum: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-64.yaml b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
new file mode 100644
index 0000000000000..9bbb1107555e0
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
@@ -0,0 +1,115 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1F7
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xF8
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0xA8
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x20
+ Size: 0x18
+ FileOffsetToData: 0xC4
+ FileOffsetToRelocations: 0xDC
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000000000000000000000380000000000000000'
+ Relocations:
+ - Address: 0x20
+ Symbol: 0x5
+ Info: 0x3F
+ Type: 0x0
+ - Address: 0x28
+ Symbol: 0x9
+ Info: 0x3F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (5ca72bc8d2e87445649eab1825dffd2a047440ba)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 25
+ SectionOrLengthHi: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 3
+ SectionOrLengthHi: 0
+ - Name: foo
+ Value: 0x20
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 3
+ StorageMappingClass: XMC_DS
+ SectionOrLengthLo: 24
+ SectionOrLengthHi: 0
+ - Name: TOC
+ Value: 0x38
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLengthLo: 0
+ SectionOrLengthHi: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
new file mode 100644
index 0000000000000..8e5c929e82878
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
@@ -0,0 +1,68 @@
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_REL
+ Machine: EM_PPC64
+ Flags: [ ]
+ SectionHeaderStringTable: .strtab
+Sections:
+ - Name: .text
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ AddressAlign: 0x10
+ Content: '2000804E000000000000000000000000'
+ - Name: .comment
+ Type: SHT_PROGBITS
+ Flags: [ SHF_MERGE, SHF_STRINGS ]
+ AddressAlign: 0x1
+ EntSize: 0x1
+ Content: 0049424D204F70656E20584C20432F432B2B20666F72204C696E7578206F6E20506F7765722031372E312E322028353732352D4337322C20353736352D4A3230292C2076657273696F6E2031372E312E322E302C20636C616E672076657273696F6E2032312E302E306769742028676974406769746875622E69626D2E636F6D3A636F6D70696C65722F6C6C766D2D70726F6A6563742E67697420653165653233663838333532623937333563363735386661396335653035313366626234393361322900
+ - Name: .note.GNU-stack
+ Type: SHT_PROGBITS
+ AddressAlign: 0x1
+ - Name: .eh_frame
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC ]
+ AddressAlign: 0x8
+ Content: 1000000000000000017A5200047841011B0C01001000000018000000000000001000000000000000
+ - Name: .rela.eh_frame
+ Type: SHT_RELA
+ Flags: [ SHF_INFO_LINK ]
+ Link: .symtab
+ AddressAlign: 0x8
+ Info: .eh_frame
+ Relocations:
+ - Offset: 0x1C
+ Symbol: .text
+ Type: R_PPC64_REL32
+ - Name: .llvm_addrsig
+ Type: SHT_LLVM_ADDRSIG
+ Flags: [ SHF_EXCLUDE ]
+ Link: .symtab
+ AddressAlign: 0x1
+ Offset: 0x1B8
+ Symbols: [ ]
+ - Type: SectionHeaderTable
+ Sections:
+ - Name: .strtab
+ - Name: .text
+ - Name: .comment
+ - Name: .note.GNU-stack
+ - Name: .eh_frame
+ - Name: .rela.eh_frame
+ - Name: .llvm_addrsig
+ - Name: .symtab
+Symbols:
+ - Name: foo1.c
+ Type: STT_FILE
+ Index: SHN_ABS
+ - Name: .text
+ Type: STT_SECTION
+ Section: .text
+ - Name: foo1
+ Type: STT_FUNC
+ Section: .text
+ Binding: STB_GLOBAL
+ Size: 0x10
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml
new file mode 100644
index 00000000...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Expected<ObjectFile *> | ||
LLVMSymbolizer::getOrCreateObject(const std::string &Path, | ||
const std::string &ArchName) { | ||
Binary *Bin; | ||
// First check for archive(member) format - more efficient to check closing paren first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// First check for archive(member) format - more efficient to check closing paren first | |
// First check for archive(member) format - more efficient to check closing paren first. |
Nit.
This behaviour change isn't guarded to Big Archive format as stated in the documentation. It looks to me like a UNIX archive would be impacted by this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping. The second part of this comment hasn't been addressed as far as I can tell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we intend to allow naming archive members in non-AIX contexts (as that may be helpful). The documentation will be made more generic like how it used to be before.
…des7/llvm-project into midhun7/big-archive-recognition
I'm still catching up on reviews left over from my time off last week. I'll be taking a look again at this hopefully tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't mark conversation threads that I have initiated as resolved. I need the "resolved" status to help me spot things I have and have not checked, because it's not unheard of for people to mark things as resolved without actually addressing them. You can find more context on https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178.
@@ -1,5 +1,5 @@ | |||
// Test archive member recognition by name (ELF format) | |||
// UNSUPPORTED: target={{.*}}-aix{{.*}} | |||
// REQUIRES: system-linux || system-aix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why won't this test work on Windows or Mach-O machines?
In general, tests should be allowed to run on all platforms and build configurations that they functionally work on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aix linux specific requirement removed
@@ -1,27 +1,31 @@ | |||
// Test big archive recognition and error handling in llvm-symbolizer | |||
// REQUIRES: system-aix, target={{.*}}-aix{{.*}} | |||
// REQUIRES: system-linux || system-aix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aix linux specific requirement removed
symbolize the object file for a given architecture. You can also specify | ||
the architecture by writing ``binary_name:arch_name`` in the input (see | ||
example below). For AIX archives, the format ``archive.a(member.o):arch`` | ||
is also supported. If the architecture is not specified in either way, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is also supported. If the architecture is not specified in either way, | |
is also supported. If the architecture is not specified, |
"in either way" no longer makes sense given there are three ways of specifying things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
Expected<ObjectFile *> getOrCreateObject(const std::string &InputPath, | ||
const std::string &DefaultArchName); | ||
|
||
/// Return a pointer to object file at specified path, for a specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Return a pointer to object file at specified path, for a specified | |
/// Return a pointer to the object file at specified path, for a specified |
(the comment above also has this issue).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
const std::string &DefaultArchName); | ||
|
||
/// Return a pointer to object file at specified path, for a specified | ||
/// architecture that is present inside an archive file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As previously stated, all comments should end in ".", per LLVM style guide. Please review all comments you've added, both in regular source and test code, and update accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
std::string ArchivePath; // Storage for StringRef | ||
std::string MemberName; // Storage for StringRef | ||
std::string ArchName; // Storage for StringRef |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do these need the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
} | ||
}; | ||
|
||
std::map<ArchiveCacheKey, std::unique_ptr<ObjectFile>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking at this and wondering if there's a way to fold this and ObjectForUBPathAndArch
together. The only difference is that one has an additional key. Could you, for example, use the three-part key version for both cases, but just leave that additional key as an empty string when only using the two parts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored code to remove duplicates
|
||
object::Archive *Archive = dyn_cast_if_present<object::Archive>(Bin); | ||
if (!Archive) | ||
return errorCodeToError(object_error::invalid_file_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd create a manually written error, using e.g. createStringError(...)
, which allows you to tailor the error message more precisely and provide any appropriate additional context, e.g. the name of the archive and member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added more detailed error message
|
||
Error Err = Error::success(); | ||
|
||
// On AIX, archives can contain multiple members with same name but different |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// On AIX, archives can contain multiple members with same name but different | |
// On AIX, archives can contain multiple members with the same name but different |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
return Res; | ||
} | ||
|
||
Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone back over the code a bit more and can't help but feel that this function heavily overlaps/is significantly a duplicate of other code in this file. Would it be possible to find ways to avoid this duplication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored code to remove duplicates
@midhuncodes7, it looks like you're still making regular changes to this branch. Please let me know when you're finished and ready for further review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178, please don't resolve conversations that I have initiated. I will resolve them when I'm satisfied with a response. This helps me track what I've commented on.
/// Helper function to load binary. | ||
Expected<object::Binary *> loadOrGetBinary(const std::string &Path); | ||
|
||
/// Helper function to find and get object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Helper function to find and get object | |
/// Helper function to find and get object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
object::Archive *Archive = dyn_cast_if_present<object::Archive>(Bin); | ||
if (!Archive) | ||
return errorCodeToError(object_error::invalid_file_type); | ||
return createStringError(std::errc::invalid_argument, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
} | ||
} | ||
} | ||
if (Err) | ||
return std::move(Err); | ||
return errorCodeToError(object_error::arch_not_found); | ||
return createStringError(std::errc::invalid_argument, | ||
"No matching member '%s' with arch '%s' in '%s'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM coding standards for errors in new code is to use lower-case for the first word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed
Expected<ObjectFile *> getOrCreateObject(const std::string &InputPath, | ||
const std::string &DefaultArchName); | ||
|
||
/// Return a pointer to the object file at specified path, for a specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming back to this, the first part of this comment needs improving.
/// Return a pointer to the object file at specified path, for a specified | |
/// Return a pointer to the object file with the specified name, for a specified |
@@ -436,13 +437,13 @@ bool LLVMSymbolizer::findDebugBinary(const std::string &OrigPath, | |||
SmallString<16> OrigDir(OrigPath); | |||
llvm::sys::path::remove_filename(OrigDir); | |||
SmallString<16> DebugPath = OrigDir; | |||
// Try relative/path/to/original_binary/debuglink_name | |||
// Try relative/path/to/original_binary/debuglink_name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to fix/update comments unrelated to your patch, as it just muddies the git blame output. Same goes throughout. Only fix comments you otherwise need to change or are adjacent to other changes you're making.
}); | ||
std::string DbgObjPath = DbgObj->getFileName().str(); | ||
auto BinIter = BinaryForPath.find(DbgObjPath); | ||
if (BinIter != BinaryForPath.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I get an explanation for why this behaviour has changed, please. Also, I assume you have test coverage for it?
Modules.erase(I); | ||
}); | ||
auto BinIter = BinaryForPath.find(BinaryName); | ||
if (BinIter != BinaryForPath.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment: what's the purpose of the logic change and what test coverage is there for it?
@@ -648,7 +741,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) { | |||
|
|||
auto I = Modules.find(ModuleName); | |||
if (I != Modules.end()) { | |||
recordAccess(BinaryForPath.find(BinaryName)->second); | |||
auto BinIter = BinaryForPath.find(BinaryName); | |||
if (BinIter != BinaryForPath.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment: what's the purpose of the logic change and what test coverage is there for it?
Bin = CachedBin->getBinary(); | ||
Expected<OwningBinary<Binary>> BinOrErr = createBinary(Path); | ||
if (!BinOrErr) { | ||
BinaryForPath.erase(Pair.first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't here before. Why is it here now?
const ArchiveCacheKey &Key, | ||
llvm::function_ref<Expected<std::unique_ptr<ObjectFile>>()> Loader, | ||
const std::string &PathForBinaryCache) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: don't have blank lines at the start of a function.
size_t CloseParen = Path.rfind(')'); | ||
if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't efficient at all, especially for the case where there is no closing paren (likely the common case), but even for the case where you have one. The only case you're interested in is if CloseParen
is the last character in Path
, so why not just do if (Path.back() == ')')
(but be careful you don't have an empty string)?
if (OpenParen != std::string::npos) { | ||
StringRef ArchivePath = StringRef(Path).substr(0, OpenParen); | ||
StringRef MemberName = | ||
StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with the case where you have a string like foo.a()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It errors out as "no matching member"
Expected<StringRef> NameOrErr = Child.getName(); | ||
if (!NameOrErr) | ||
continue; | ||
if (*NameOrErr == sys::path::filename(MemberName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the sys::path::filename
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MemberName
may sometimes contain the entire path to the member. In those cases this condition will fail as NameOrErr
will contain only the member name. In order to deal with such cases as well, sys::path::filename
call is used
This PR implements big archive recognition by the symbolizer.
The archive input format should be in archive.a(member.o) format