-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[WIP] [lldb][TypeSystemClang] Create clang::SourceLocation from DWARF and attach to AST #127829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lldb Author: Michael Buch (Michael137) ChangesThis patch revives https://reviews.llvm.org/D78095 with some changes. This patch uses the
We would see:
There's a couple of quirks in the output still, mainly because Clang we doesn't emit This is currently WIP. It's lacking tests and we probably also want to make sure we only use the file I split up the patch into 3 separate commits:
The three main differences to https://reviews.llvm.org/D78095 are:
Patch is 20.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/127829.diff 5 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 2d4d22559963f..abf3b22b0ae15 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -1372,7 +1372,7 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
ignore_containing_context ? m_ast.GetTranslationUnitDecl()
: containing_decl_ctx,
GetOwningClangModule(die), name, clang_type, attrs.storage,
- attrs.is_inline);
+ attrs.is_inline, attrs.decl);
std::free(name_buf);
if (has_template_params) {
@@ -1382,11 +1382,11 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
ignore_containing_context ? m_ast.GetTranslationUnitDecl()
: containing_decl_ctx,
GetOwningClangModule(die), attrs.name.GetStringRef(), clang_type,
- attrs.storage, attrs.is_inline);
+ attrs.storage, attrs.is_inline, attrs.decl);
clang::FunctionTemplateDecl *func_template_decl =
m_ast.CreateFunctionTemplateDecl(
containing_decl_ctx, GetOwningClangModule(die),
- template_function_decl, template_param_infos);
+ template_function_decl, template_param_infos, attrs.decl);
m_ast.CreateFunctionTemplateSpecializationInfo(
template_function_decl, func_template_decl, template_param_infos);
}
@@ -1858,7 +1858,7 @@ DWARFASTParserClang::ParseStructureLikeDIE(const SymbolContext &sc,
clang::ClassTemplateSpecializationDecl *class_specialization_decl =
m_ast.CreateClassTemplateSpecializationDecl(
containing_decl_ctx, GetOwningClangModule(die), class_template_decl,
- tag_decl_kind, template_param_infos);
+ tag_decl_kind, template_param_infos, attrs.decl);
clang_type =
m_ast.CreateClassTemplateSpecializationType(class_specialization_decl);
@@ -1870,7 +1870,7 @@ DWARFASTParserClang::ParseStructureLikeDIE(const SymbolContext &sc,
clang_type = m_ast.CreateRecordType(
containing_decl_ctx, GetOwningClangModule(die), attrs.accessibility,
attrs.name.GetCString(), tag_decl_kind, attrs.class_language, metadata,
- attrs.exports_symbols);
+ attrs.exports_symbols, attrs.decl);
}
TypeSP type_sp = dwarf->MakeType(
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
index 5d4b22d08b111..ecb1a7dc571b4 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
@@ -1128,7 +1128,7 @@ void PdbAstBuilder::CreateFunctionParameters(PdbCompilandSymId func_id,
CompilerType param_type_ct = m_clang.GetType(qt);
clang::ParmVarDecl *param = m_clang.CreateParameterDeclaration(
&function_decl, OptionalClangModuleID(), param_name.str().c_str(),
- param_type_ct, clang::SC_None, true);
+ param_type_ct, clang::SC_None, clang::SourceLocation(), true);
lldbassert(m_uid_to_decl.count(toOpaqueUid(param_uid)) == 0);
m_uid_to_decl[toOpaqueUid(param_uid)] = param;
diff --git a/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp b/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
index c6dd72e22fb4c..e98ccf87cc2f7 100644
--- a/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
+++ b/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
@@ -969,7 +969,8 @@ PDBASTParser::GetDeclForSymbol(const llvm::pdb::PDBSymbol &symbol) {
clang::ParmVarDecl *param = m_ast.CreateParameterDeclaration(
decl, OptionalClangModuleID(), nullptr,
- arg_type->GetForwardCompilerType(), clang::SC_None, true);
+ arg_type->GetForwardCompilerType(), clang::SC_None,
+ clang::SourceLocation(), true);
if (param)
params.push_back(param);
}
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 1e0c7f0514941..e03bb8b952e39 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -13,6 +13,7 @@
#include "llvm/Support/Casting.h"
#include "llvm/Support/FormatAdapters.h"
#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
#include <mutex>
#include <memory>
@@ -361,6 +362,39 @@ static void SetMemberOwningModule(clang::Decl *member,
}
}
+/// Creates a dummy main file for the given SourceManager.
+/// This file only serves as a container for include locations to other
+/// FileIDs that are put into this type system (either by the ASTImporter
+/// or when TypeSystemClang generates source locations for declarations).
+/// This file is not reflected to disk.
+static clang::FileID CreateDummyMainFile(clang::SourceManager &sm,
+ clang::FileManager &fm) {
+ llvm::StringRef main_file_path = "<LLDB Dummy Main File>";
+ // The file contents are empty and should never be seen by the user. The new
+ // line is just there to not throw off any line counting logic that might
+ // expect files to end with a newline.
+ llvm::StringRef main_file_contents = "\n";
+ const time_t mod_time = 0;
+ const off_t file_size = static_cast<off_t>(main_file_contents.size());
+
+ // Create a virtual FileEntry for our dummy file.
+ auto fe = fm.getVirtualFileRef(main_file_path, file_size, mod_time);
+
+ // Overwrite the file buffer with our empty file contents.
+ llvm::SmallVector<char, 64> buffer;
+ buffer.append(main_file_contents.begin(), main_file_contents.end());
+ auto file_contents = std::make_unique<llvm::SmallVectorMemoryBuffer>(
+ std::move(buffer), main_file_path);
+ sm.overrideFileContents(fe, std::move(file_contents));
+
+ // Create the actual file id for the FileEntry and set it as the main file.
+ clang::FileID fid =
+ sm.createFileID(fe, SourceLocation(), clang::SrcMgr::C_User);
+ sm.setMainFileID(fid);
+
+ return fid;
+}
+
char TypeSystemClang::ID;
bool TypeSystemClang::IsOperator(llvm::StringRef name,
@@ -692,6 +726,11 @@ void TypeSystemClang::CreateASTContext() {
m_diagnostic_consumer_up = std::make_unique<NullDiagnosticConsumer>();
m_ast_up->getDiagnostics().setClient(m_diagnostic_consumer_up.get(), false);
+ // Set up the MainFileID of this ASTContext. All other FileIDs created
+ // by this TypeSystem will act as-if included into this dummy main file.
+ auto fid = CreateDummyMainFile(*m_source_manager_up, *m_file_manager_up);
+ assert(m_ast_up->getSourceManager().getMainFileID() == fid);
+
// This can be NULL if we don't know anything about the architecture or if
// the target for an architecture isn't enabled in the llvm/clang that we
// built
@@ -1243,7 +1282,7 @@ CompilerType TypeSystemClang::CreateRecordType(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
AccessType access_type, llvm::StringRef name, int kind,
LanguageType language, std::optional<ClangASTMetadata> metadata,
- bool exports_symbols) {
+ bool exports_symbols, const Declaration &declaration) {
ASTContext &ast = getASTContext();
if (decl_ctx == nullptr)
@@ -1298,6 +1337,10 @@ CompilerType TypeSystemClang::CreateRecordType(
decl->setAnonymousStructOrUnion(true);
}
+ auto location = GetLocForDecl(declaration);
+ decl->setLocStart(location);
+ decl->setLocation(location);
+
if (metadata)
SetMetadata(decl, *metadata);
@@ -1415,7 +1458,8 @@ static TemplateParameterList *CreateTemplateParameterList(
clang::FunctionTemplateDecl *TypeSystemClang::CreateFunctionTemplateDecl(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
clang::FunctionDecl *func_decl,
- const TemplateParameterInfos &template_param_infos) {
+ const TemplateParameterInfos &template_param_infos,
+ const Declaration &declaration) {
// /// Create a function template node.
ASTContext &ast = getASTContext();
@@ -1429,6 +1473,7 @@ clang::FunctionTemplateDecl *TypeSystemClang::CreateFunctionTemplateDecl(
func_tmpl_decl->setDeclName(func_decl->getDeclName());
func_tmpl_decl->setTemplateParameters(template_param_list);
func_tmpl_decl->init(func_decl);
+ func_tmpl_decl->setLocation(GetLocForDecl(declaration));
SetOwningModule(func_tmpl_decl, owning_module);
for (size_t i = 0, template_param_decl_count = template_param_decls.size();
@@ -1654,7 +1699,8 @@ ClassTemplateSpecializationDecl *
TypeSystemClang::CreateClassTemplateSpecializationDecl(
DeclContext *decl_ctx, OptionalClangModuleID owning_module,
ClassTemplateDecl *class_template_decl, int kind,
- const TemplateParameterInfos &template_param_infos) {
+ const TemplateParameterInfos &template_param_infos,
+ const Declaration &declaration) {
ASTContext &ast = getASTContext();
llvm::SmallVector<clang::TemplateArgument, 2> args(
template_param_infos.Size() +
@@ -1689,6 +1735,8 @@ TypeSystemClang::CreateClassTemplateSpecializationDecl(
class_template_specialization_decl->setSpecializationKind(
TSK_ExplicitSpecialization);
+ class_template_specialization_decl->setLocation(GetLocForDecl(declaration));
+
return class_template_specialization_decl;
}
@@ -2149,7 +2197,8 @@ std::string TypeSystemClang::GetTypeNameForDecl(const NamedDecl *named_decl,
FunctionDecl *TypeSystemClang::CreateFunctionDeclaration(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
llvm::StringRef name, const CompilerType &function_clang_type,
- clang::StorageClass storage, bool is_inline) {
+ clang::StorageClass storage, bool is_inline,
+ const Declaration &declaration) {
FunctionDecl *func_decl = nullptr;
ASTContext &ast = getASTContext();
if (!decl_ctx)
@@ -2170,6 +2219,11 @@ FunctionDecl *TypeSystemClang::CreateFunctionDeclaration(
func_decl->setConstexprKind(isConstexprSpecified
? ConstexprSpecKind::Constexpr
: ConstexprSpecKind::Unspecified);
+
+ const clang::SourceLocation location = GetLocForDecl(declaration);
+ func_decl->setLocation(location);
+ func_decl->setRangeEnd(location);
+
SetOwningModule(func_decl, owning_module);
decl_ctx->addDecl(func_decl);
@@ -2219,7 +2273,7 @@ CompilerType TypeSystemClang::CreateFunctionType(
ParmVarDecl *TypeSystemClang::CreateParameterDeclaration(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
const char *name, const CompilerType ¶m_type, int storage,
- bool add_decl) {
+ clang::SourceLocation loc, bool add_decl) {
ASTContext &ast = getASTContext();
auto *decl = ParmVarDecl::CreateDeserialized(ast, GlobalDeclID());
decl->setDeclContext(decl_ctx);
@@ -2227,6 +2281,7 @@ ParmVarDecl *TypeSystemClang::CreateParameterDeclaration(
decl->setDeclName(&ast.Idents.get(name));
decl->setType(ClangUtil::GetQualType(param_type));
decl->setStorageClass(static_cast<clang::StorageClass>(storage));
+ decl->setLocation(loc);
SetOwningModule(decl, owning_module);
if (add_decl)
decl_ctx->addDecl(decl);
@@ -2316,10 +2371,10 @@ CompilerType TypeSystemClang::CreateEnumerationType(
OptionalClangModuleID owning_module, const Declaration &decl,
const CompilerType &integer_clang_type, bool is_scoped,
std::optional<clang::EnumExtensibilityAttr::Kind> enum_kind) {
- // TODO: Do something intelligent with the Declaration object passed in
- // like maybe filling in the SourceLocation with it...
ASTContext &ast = getASTContext();
+ auto location = GetLocForDecl(decl);
+
// TODO: ask about these...
// const bool IsFixed = false;
EnumDecl *enum_decl = EnumDecl::CreateDeserialized(ast, GlobalDeclID());
@@ -2329,6 +2384,8 @@ CompilerType TypeSystemClang::CreateEnumerationType(
enum_decl->setScoped(is_scoped);
enum_decl->setScopedUsingClassTag(is_scoped);
enum_decl->setFixed(false);
+ enum_decl->setLocation(location);
+ enum_decl->setLocStart(location);
SetOwningModule(enum_decl, owning_module);
if (decl_ctx)
decl_ctx->addDecl(enum_decl);
@@ -7755,10 +7812,10 @@ TypeSystemClang::CreateParameterDeclarations(
llvm::StringRef name =
!parameter_names.empty() ? parameter_names[param_index] : "";
- auto *param =
- CreateParameterDeclaration(func, /*owning_module=*/{}, name.data(),
- GetType(prototype.getParamType(param_index)),
- clang::SC_None, /*add_decl=*/false);
+ auto *param = CreateParameterDeclaration(
+ func, /*owning_module=*/{}, name.data(),
+ GetType(prototype.getParamType(param_index)), clang::SC_None,
+ func->getLocation(), /*add_decl=*/false);
assert(param);
params.push_back(param);
@@ -7771,7 +7828,8 @@ clang::CXXMethodDecl *TypeSystemClang::AddMethodToCXXRecordType(
lldb::opaque_compiler_type_t type, llvm::StringRef name,
const char *mangled_name, const CompilerType &method_clang_type,
lldb::AccessType access, bool is_virtual, bool is_static, bool is_inline,
- bool is_explicit, bool is_attr_used, bool is_artificial) {
+ bool is_explicit, bool is_attr_used, bool is_artificial,
+ const Declaration &declaration) {
if (!type || !method_clang_type.IsValid() || name.empty())
return nullptr;
@@ -7912,6 +7970,10 @@ clang::CXXMethodDecl *TypeSystemClang::AddMethodToCXXRecordType(
cxx_method_decl->setParams(CreateParameterDeclarations(
cxx_method_decl, *method_function_prototype, /*parameter_names=*/{}));
+ const clang::SourceLocation location = GetLocForDecl(declaration);
+ cxx_method_decl->setLocation(location);
+ cxx_method_decl->setRangeEnd(location);
+
AddAccessSpecifierDecl(cxx_record_decl, getASTContext(),
GetCXXRecordDeclAccess(cxx_record_decl),
access_specifier);
@@ -9882,3 +9944,33 @@ void TypeSystemClang::LogCreation() const {
LLDB_LOG(log, "Created new TypeSystem for (ASTContext*){0:x} '{1}'",
&getASTContext(), getDisplayName());
}
+
+clang::SourceLocation TypeSystemClang::GetLocForDecl(const Declaration &decl) {
+ // If the Declaration is invalid there is nothing to do.
+ if (!decl.IsValid())
+ return {};
+
+ clang::SourceManager &sm = getASTContext().getSourceManager();
+ clang::FileManager &fm = sm.getFileManager();
+
+ auto fe = fm.getFileRef(decl.GetFile().GetPath());
+ if (!fe)
+ return {};
+
+ clang::FileID fid = sm.translateFile(*fe);
+ if (fid.isInvalid()) {
+ // We see the file for the first time, so create a dummy file for it now.
+
+ // Connect the new dummy file to the main file via some fake include
+ // location. This is necessary as all file's in the SourceManager need to be
+ // reachable via an include chain from the main file.
+ SourceLocation ToIncludeLocOrFakeLoc;
+ assert(sm.getMainFileID().isValid());
+ ToIncludeLocOrFakeLoc = sm.getLocForStartOfFile(sm.getMainFileID());
+ fid = sm.createFileID(*fe, ToIncludeLocOrFakeLoc, clang::SrcMgr::C_User);
+ }
+
+ // Clang requires column numbers to be >= 1..
+ return sm.translateLineCol(fid, decl.GetLine(),
+ std::max<uint16_t>(decl.GetColumn(), 1));
+}
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
index 99d9becffd128..f59eaee6705f9 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
@@ -326,13 +326,12 @@ class TypeSystemClang : public TypeSystem {
bool is_framework = false,
bool is_explicit = false);
- CompilerType
- CreateRecordType(clang::DeclContext *decl_ctx,
- OptionalClangModuleID owning_module,
- lldb::AccessType access_type, llvm::StringRef name, int kind,
- lldb::LanguageType language,
- std::optional<ClangASTMetadata> metadata = std::nullopt,
- bool exports_symbols = false);
+ CompilerType CreateRecordType(
+ clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
+ lldb::AccessType access_type, llvm::StringRef name, int kind,
+ lldb::LanguageType language,
+ std::optional<ClangASTMetadata> metadata = std::nullopt,
+ bool exports_symbols = false, const Declaration &declaration = {});
class TemplateParameterInfos {
public:
@@ -420,7 +419,8 @@ class TypeSystemClang : public TypeSystem {
clang::FunctionTemplateDecl *CreateFunctionTemplateDecl(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
- clang::FunctionDecl *func_decl, const TemplateParameterInfos &infos);
+ clang::FunctionDecl *func_decl, const TemplateParameterInfos &infos,
+ const Declaration &declaration);
void CreateFunctionTemplateSpecializationInfo(
clang::FunctionDecl *func_decl, clang::FunctionTemplateDecl *Template,
@@ -437,7 +437,7 @@ class TypeSystemClang : public TypeSystem {
clang::ClassTemplateSpecializationDecl *CreateClassTemplateSpecializationDecl(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
clang::ClassTemplateDecl *class_template_decl, int kind,
- const TemplateParameterInfos &infos);
+ const TemplateParameterInfos &infos, const Declaration &declaration);
CompilerType
CreateClassTemplateSpecializationType(clang::ClassTemplateSpecializationDecl *
@@ -476,7 +476,8 @@ class TypeSystemClang : public TypeSystem {
clang::FunctionDecl *CreateFunctionDeclaration(
clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
llvm::StringRef name, const CompilerType &function_Type,
- clang::StorageClass storage, bool is_inline);
+ clang::StorageClass storage, bool is_inline,
+ const Declaration &declaration = {});
CompilerType
CreateFunctionType(const CompilerType &result_type, const CompilerType *args,
@@ -484,11 +485,10 @@ class TypeSystemClang : public TypeSystem {
clang::CallingConv cc = clang::CC_C,
clang::RefQualifierKind ref_qual = clang::RQ_None);
- clang::ParmVarDecl *
- CreateParameterDeclaration(clang::DeclContext *decl_ctx,
- OptionalClangModuleID owning_module,
- const char *name, const CompilerType ¶m_type,
- int storage, bool add_decl = false);
+ clang::ParmVarDecl *CreateParameterDeclaration(
+ clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
+ const char *name, const CompilerType ¶m_type, int storage,
+ clang::SourceLocation loc, bool add_decl = false);
CompilerType CreateBlockPointerType(const CompilerType &function_type);
@@ -996,7 +996,8 @@ class TypeSystemClang : public TypeSystem {
lldb::opaque_compiler_type_t type, llvm::StringRef name,
const char *mangled_name, const CompilerType &method_type,
lldb::AccessType access, bool is_virtual, bool is_static, bool is_inline,
- bool is_explicit, bool is_attr_used, bool is_artificial);
+ bool is_explicit, bool is_attr_used, bool is_artificial,
+ const Declaration &declaration = {});
void AddMethodOverridesForCXXRecordType(lldb::opaque_compiler_type_t type);
@@ -1188,6 +1189,15 @@ class TypeSystemClang : public TypeSystem {
std::optional<uint64_t> GetObjCBitSize(clang::QualType qual_type,
ExecutionContextScope *exe_scope);
+ /// Turns the given \c decl into a \c clang::SourceLocation.
+ ///
+ /// Will create a \c FileID in this \c DWARFASTParserClang's \c ASTContext
+ /// if necessary.
+ ///
+ /// If no \c FileID could be found/created, returns an empty \c
+ /// SourceLocation.
+ clang::SourceLocation GetLocForDecl(const lldb_private::Declaration &decl);
+
// Classes that inherit from TypeSystemClang can see and modify these
std::string m_target_tr...
[truncated]
|
5dae80a
to
c998ea3
Compare
c998ea3
to
a05a2ca
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
2da2b4e
to
3f095fc
Compare
Sounds like a nice feature to have. I'm not really familiar with these APIs, so I don't have much to say in the way of specifics. One high level question I have (I can't tell from the patch) is how are these files actually handled. Do we need to (preemptively) read them into some clang-owned buffer, or can we say something like "there is a file here, come back to me if you need it". I have two reasons for asking this: I'm wondering if this will have any impact on our memory footprint; and I'm wondering how will this behave if the file is changed (or deleted) during the course of the debug session. |
3f095fc
to
8eb01f6
Compare
Yea I wasn't sure what the best answer to this is yet. Currently we create a
From my limited understanding of these APIs, I don't think there's currently such a customization point. LLDB would have to somehow store that column/line info somewhere until Clang asks for it. We could maybe stuff But I guess we'd first have to decide whether opening the source files is an unacceptable increase in the memory footprint for the common debugging use-case. I can collect some memory usage numbers when debugging clang/lldb as one data-point |
I looked at the memory footprint of LLDB while debugging LLDB and there wasn't noticeable difference in resident memory usage between top-of-tree and this patch. I made sure to evaluate expressions that open up large source files. I do agree it feels a bit weird to be paying for opening source files if we only ever use it in diagnostics (not the common case). But maybe that's a cost worth paying? |
Do we only pay the cost when a diagnostic is generated, or every time a DW_AT_file is parsed? |
In the former case I have no concerns about this. |
Currently it's on creation of the decl |
Yeah, I expect this could be a bit of a problem on several fronts:
With the current implementation, I expect that we would need this to be controlled by some sort of a setting, although I'd really rather not do that. I wonder if we could cheat by pointing all files to some fake memory buffer (containing a bunch of newlines or something), and then hijacking the error printing logic to look up the "real" contents of the file (taking into account the current source map, file checksums and everything). |
Why do we need to touch source files to add a source file attribution from debug info to a declaration? We don't require the actual presence of source files in order to make source file attributions anywhere else in the debug info handling.
Jim
… On Feb 25, 2025, at 2:06 AM, Pavel Labath ***@***.***> wrote:
labath
left a comment
(llvm/llvm-project#127829)
Yeah, I expect this could be a bit of a problem on several fronts:
even if the memory usage is low enough to not matter, the fact that we have to hit the filesystem to load the file is most likely going to slow things down, particularly when you're using a remote filesystem (like we are). Normally, all one needs to parse debug info is the DWARF data, but now we would be fetching potentially thousands of files without actually using most most of them
this setup also makes it hard to display the source files in the same way that the rest of lldb does. In particular the target.source-map setting is inaccessible to the debug info code, and if I'm not mistaken the source manager in lldb always takes the current value of the setting (whereas the mapping in the clang ast is fixed at construction time)
mmapping the files (I suspect this is the reason this doesn't have a big impact on memory footprint) also has some unfortunate side effects on various systems. On windows, it prevents the file from being deleted (some editors implement "editing" as deleting and recreating a file), and on most unix systems (with the exception of Darwin which has MAP_RESILIENT_MEDIA) accessing the file after it has been truncated (which is the other way to implement "editing a file") can cause a SIGBUS.
I wonder whether we can exhaust the SourceLocation space in this way (and what happens if we do). 4GB sounds like a lot, but I know that some people manage to do that with just a single compile unit (I think it involves #includeing the same file repeatedly), so I wouldn't be surprised if we hit it after placing the entire large binary into a single AST context. (Maybe it's not as acute as we're not modelling #includes, but still..)
With the current implementation, I expect that we would need this to be controlled by some sort of a setting, although I'd really rather not do that. I wonder if we could cheat by pointing all files to some fake memory buffer (containing a bunch of newlines or something), and then hijacking the error printing logic to look up the "real" contents of the file (taking into account the current source map, file checksums and everything).
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are on a team that was mentioned.
<#127829 (comment)> <https://github.com/notifications/unsubscribe-auth/ADUPVW4LRQST7324NFPME6D2RQ6C7AVCNFSM6AAAAABXOWNWGGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBRGQZDMMJTGQ>
labath
left a comment
(llvm/llvm-project#127829)
<#127829 (comment)>
Yeah, I expect this could be a bit of a problem on several fronts:
even if the memory usage is low enough to not matter, the fact that we have to hit the filesystem to load the file is most likely going to slow things down, particularly when you're using a remote filesystem (like we are). Normally, all one needs to parse debug info is the DWARF data, but now we would be fetching potentially thousands of files without actually using most most of them
this setup also makes it hard to display the source files in the same way that the rest of lldb does. In particular the target.source-map setting is inaccessible to the debug info code, and if I'm not mistaken the source manager in lldb always takes the current value of the setting (whereas the mapping in the clang ast is fixed at construction time)
mmapping the files (I suspect this is the reason this doesn't have a big impact on memory footprint) also has some unfortunate side effects on various systems. On windows, it prevents the file from being deleted (some editors implement "editing" as deleting and recreating a file), and on most unix systems (with the exception of Darwin which has MAP_RESILIENT_MEDIA) accessing the file after it has been truncated (which is the other way to implement "editing a file") can cause a SIGBUS.
I wonder whether we can exhaust the SourceLocation space in this way (and what happens if we do). 4GB sounds like a lot, but I know that some people manage to do that with just a single compile unit (I think it involves #includeing the same file repeatedly), so I wouldn't be surprised if we hit it after placing the entire large binary into a single AST context. (Maybe it's not as acute as we're not modelling #includes, but still..)
With the current implementation, I expect that we would need this to be controlled by some sort of a setting, although I'd really rather not do that. I wonder if we could cheat by pointing all files to some fake memory buffer (containing a bunch of newlines or something), and then hijacking the error printing logic to look up the "real" contents of the file (taking into account the current source map, file checksums and everything).
—
Reply to this email directly, view it on GitHub <#127829 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADUPVW4LRQST7324NFPME6D2RQ6C7AVCNFSM6AAAAABXOWNWGGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBRGQZDMMJTGQ>.
You are receiving this because you are on a team that was mentioned.
|
A |
It might be worth exploring a SourceManager subclass that has in addition to a MemoryBuffer-backed location also a source file type that is a file/line variant that is only converted into a MemoryBuffer-backed location if we do an operation that really needs it. |
This patch revives https://reviews.llvm.org/D78095 with some changes.
This patch uses the
DW_AT_decl_XXX
DWARF attributes to createclang::SourceLocation
s and attach them to the Clang decls that LLDB creates. The primary motivation for this is better expression evaluator diagnostics. Instead of:We would see:
There's a couple of quirks in the output still, mainly because Clang doesn't emit
DW_AT_decl_column
, so Clang has to place the caret on column1
. But this is something we can probably configure (either as a pre-requisite to this patch or as a follow-up).This is currently WIP. It's lacking tests and we probably also want to make sure we only use
DW_AT_decl_file
if the file hasn't changed since the debug-info was generated (i think there's some checksum we can use for this, like we do in the SupportFiles?). Another open question is whether we want to adjust our typename printer to omit the locations of unnamed/anonymous structures/enums (i.e.,(unnamed struct at /path/to/some/file.cpp:12:1)
vs.(unnamed struct)
...this can be configured inclang::PrintingPolicy::AnonymousTagLocations
).Just looking for early concerns/feedback.
I split up the patch into 3 separate commits:
MainFileID
. LLDB will pretend that this dummy file is the main source file, and all other FileIDs (those that we create in commit 3) are included into itsetLocation
in all the appropriate places.GetLocForDecl
which turns alldb_private::Declaration
intoclang::SourceLocation
The three main differences to https://reviews.llvm.org/D78095 are:
DW_AT_decl_file
for displaying source snippetsclang::TextDiagnosticPrinter
for printing these source snippets. In theory it would be nice to do this, because we could configure it separately (e.g., to hide the column number). I didn't carry that over because (1) settingShowCarets
tofalse
causes the source snippet to not be printed (probably something we could change in Clang), and (2) we somehow need to differentiate whether aFileID
was created by LLDB versus Clang, which got tricky because we need to have cross-DWARFASTParserClang
state, which also meant that we have to copy over this metadata to the expression context. All of that didn't seem worth it, but we could definitely add this backrdar://8670536