Skip to content

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Feb 19, 2025

This patch revives https://reviews.llvm.org/D78095 with some changes.

This patch uses the DW_AT_decl_XXX DWARF attributes to create clang::SourceLocations and attach them to the Clang decls that LLDB creates. The primary motivation for this is better expression evaluator diagnostics. Instead of:

(lldb) expr func()
            ˄
            ╰─ error: no matching function for call to 'func'
note: note: candidate function not viable: requires 1 argument, but 0 were provided
note: note: candidate function not viable: requires 1 argument, but 0 were provided
note: note: candidate function not viable: requires single argument 'x', but no arguments were provided
note: note: candidate function not viable: requires single argument 'y', but no arguments were provided
note: note: candidate function not viable: requires 2 arguments, but 0 were provided

We would see:

(lldb) expr func()
            ˄
            ╰─ error: no matching function for call to 'func'
note: /Users/michaelbuch/source-locs/lib.h:4:1: candidate function not viable: requires 1 argument, but 0 were provided
    4 | void func(char const* c, char) {}; void func(bool) {}
      | ^
note: /Users/michaelbuch/source-locs/lib.h:4:1: candidate function not viable: requires 2 arguments, but 0 were provided
    4 | void func(char const* c, char) {}; void func(bool) {}
      | ^
note: /Users/michaelbuch/source-locs/main.cpp:5:1: candidate function not viable: requires single argument 'x', but no arguments were provided
    5 | void func(int x) { var = 6; }
      | ^
note: /Users/michaelbuch/source-locs/main.cpp:6:1: candidate function not viable: requires single argument 'y', but no arguments were provided
    6 | void func(float y) { var = 6; }
      | ^

There's a couple of quirks in the output still, mainly because Clang doesn't emit DW_AT_decl_column, so Clang has to place the caret on column 1. But this is something we can probably configure (either as a pre-requisite to this patch or as a follow-up).

This is currently WIP. It's lacking tests and we probably also want to make sure we only use DW_AT_decl_file if the file hasn't changed since the debug-info was generated (i think there's some checksum we can use for this, like we do in the SupportFiles?). Another open question is whether we want to adjust our typename printer to omit the locations of unnamed/anonymous structures/enums (i.e., (unnamed struct at /path/to/some/file.cpp:12:1) vs. (unnamed struct)...this can be configured in clang::PrintingPolicy::AnonymousTagLocations).

Just looking for early concerns/feedback.

I split up the patch into 3 separate commits:

  1. Commit 1: makes sure each ASTContext that LLDB creates (apart from the scratch AST), has a MainFileID. LLDB will pretend that this dummy file is the main source file, and all other FileIDs (those that we create in commit 3) are included into it
  2. Commit 2: makes sure we call setLocation in all the appropriate places.
  3. Implements GetLocForDecl which turns a lldb_private::Declaration into clang::SourceLocation

The three main differences to https://reviews.llvm.org/D78095 are:

  1. we no longer hide this feature behind a flag
  2. the original patch didn't yet support using DW_AT_decl_file for displaying source snippets
  3. we no longer have a separate clang::TextDiagnosticPrinter for printing these source snippets. In theory it would be nice to do this, because we could configure it separately (e.g., to hide the column number). I didn't carry that over because (1) setting ShowCarets to false causes the source snippet to not be printed (probably something we could change in Clang), and (2) we somehow need to differentiate whether a FileID was created by LLDB versus Clang, which got tricky because we need to have cross-DWARFASTParserClang state, which also meant that we have to copy over this metadata to the expression context. All of that didn't seem worth it, but we could definitely add this back

rdar://8670536

@llvmbot
Copy link
Member

llvmbot commented Feb 19, 2025

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

This patch revives https://reviews.llvm.org/D78095 with some changes.

This patch uses the DW_AT_decl_XXX DWARF attributes to create clang::SourceLocations and attach them to the Clang decls that LLDB creates. The primary motivation for this is better expression evaluator diagnostics. Instead of:

(lldb) expr func()
            ˄
            ╰─ error: no matching function for call to 'func'
note: note: candidate function not viable: requires 1 argument, but 0 were provided
note: note: candidate function not viable: requires 1 argument, but 0 were provided
note: note: candidate function not viable: requires single argument 'x', but no arguments were provided
note: note: candidate function not viable: requires single argument 'y', but no arguments were provided
note: note: candidate function not viable: requires 2 arguments, but 0 were provided

We would see:

(lldb) expr func()
            ˄
            ╰─ error: no matching function for call to 'func'
note: /Users/michaelbuch/source-locs/lib.h:4:1: candidate function not viable: requires 1 argument, but 0 were provided
    4 | void func(char const* c, char) {}; void func(bool) {}
      | ^
note: /Users/michaelbuch/source-locs/lib.h:4:1: candidate function not viable: requires 2 arguments, but 0 were provided
    4 | void func(char const* c, char) {}; void func(bool) {}
      | ^
note: /Users/michaelbuch/source-locs/main.cpp:5:1: candidate function not viable: requires single argument 'x', but no arguments were provided
    5 | void func(int x) { var = 6; }
      | ^
note: /Users/michaelbuch/source-locs/main.cpp:6:1: candidate function not viable: requires single argument 'y', but no arguments were provided
    6 | void func(float y) { var = 6; }
      | ^

There's a couple of quirks in the output still, mainly because Clang we doesn't emit DW_AT_decl_column, so Clang has to place the caret on column 1. But this is something we can probably configure (either as a pre-requisite to this patch or as a follow-up).

This is currently WIP. It's lacking tests and we probably also want to make sure we only use the file DW_AT_decl_file hasn't changed since the debug-info was generated (i think there's some checksum we can use for this, like we do in the SupportFiles?). Just looking for early concerns/feedback.

I split up the patch into 3 separate commits:

  1. Commit 1: makes sure each ASTContext that LLDB creates (apart from the scratch AST), has a MainFileID. LLDB will pretend that this dummy file is the main source file, and all other FileIDs (those that we create in commit 3) are included into it
  2. Commit 2: makes sure we call setLocation in all the appropriate places.
  3. Implements GetLocForDecl which turns a lldb_private::Declaration into clang::SourceLocation

The three main differences to https://reviews.llvm.org/D78095 are:

  1. we no longer hide this feature behind a flag
  2. the original patch didn't yet support using DW_AT_decl_file for displaying source snippets
  3. we no longer have a separate clang::TextDiagnosticPrinter for printing these source snippets. In theory it would be nice to do this, because we could configure it separately (e.g., to hide the column number). I didn't carry that over because (1) setting ShowCarets to false causes the source snippet to not be printed (probably something we could change in Clang), and (2) we somehow need to differentiate whether a FileID was created by LLDB versus Clang, which got tricky because we need to have cross-DWARFASTParserClang state, which also meant that we have to copy over this metadata to the expression context. All of that didn't seem worth it, but we could definitely add this back

Patch is 20.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/127829.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+5-5)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp (+2-1)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+104-12)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h (+26-16)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 2d4d22559963f..abf3b22b0ae15 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -1372,7 +1372,7 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
             ignore_containing_context ? m_ast.GetTranslationUnitDecl()
                                       : containing_decl_ctx,
             GetOwningClangModule(die), name, clang_type, attrs.storage,
-            attrs.is_inline);
+            attrs.is_inline, attrs.decl);
         std::free(name_buf);
 
         if (has_template_params) {
@@ -1382,11 +1382,11 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
               ignore_containing_context ? m_ast.GetTranslationUnitDecl()
                                         : containing_decl_ctx,
               GetOwningClangModule(die), attrs.name.GetStringRef(), clang_type,
-              attrs.storage, attrs.is_inline);
+              attrs.storage, attrs.is_inline, attrs.decl);
           clang::FunctionTemplateDecl *func_template_decl =
               m_ast.CreateFunctionTemplateDecl(
                   containing_decl_ctx, GetOwningClangModule(die),
-                  template_function_decl, template_param_infos);
+                  template_function_decl, template_param_infos, attrs.decl);
           m_ast.CreateFunctionTemplateSpecializationInfo(
               template_function_decl, func_template_decl, template_param_infos);
         }
@@ -1858,7 +1858,7 @@ DWARFASTParserClang::ParseStructureLikeDIE(const SymbolContext &sc,
     clang::ClassTemplateSpecializationDecl *class_specialization_decl =
         m_ast.CreateClassTemplateSpecializationDecl(
             containing_decl_ctx, GetOwningClangModule(die), class_template_decl,
-            tag_decl_kind, template_param_infos);
+            tag_decl_kind, template_param_infos, attrs.decl);
     clang_type =
         m_ast.CreateClassTemplateSpecializationType(class_specialization_decl);
 
@@ -1870,7 +1870,7 @@ DWARFASTParserClang::ParseStructureLikeDIE(const SymbolContext &sc,
     clang_type = m_ast.CreateRecordType(
         containing_decl_ctx, GetOwningClangModule(die), attrs.accessibility,
         attrs.name.GetCString(), tag_decl_kind, attrs.class_language, metadata,
-        attrs.exports_symbols);
+        attrs.exports_symbols, attrs.decl);
   }
 
   TypeSP type_sp = dwarf->MakeType(
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
index 5d4b22d08b111..ecb1a7dc571b4 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
@@ -1128,7 +1128,7 @@ void PdbAstBuilder::CreateFunctionParameters(PdbCompilandSymId func_id,
     CompilerType param_type_ct = m_clang.GetType(qt);
     clang::ParmVarDecl *param = m_clang.CreateParameterDeclaration(
         &function_decl, OptionalClangModuleID(), param_name.str().c_str(),
-        param_type_ct, clang::SC_None, true);
+        param_type_ct, clang::SC_None, clang::SourceLocation(), true);
     lldbassert(m_uid_to_decl.count(toOpaqueUid(param_uid)) == 0);
 
     m_uid_to_decl[toOpaqueUid(param_uid)] = param;
diff --git a/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp b/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
index c6dd72e22fb4c..e98ccf87cc2f7 100644
--- a/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
+++ b/lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
@@ -969,7 +969,8 @@ PDBASTParser::GetDeclForSymbol(const llvm::pdb::PDBSymbol &symbol) {
 
           clang::ParmVarDecl *param = m_ast.CreateParameterDeclaration(
               decl, OptionalClangModuleID(), nullptr,
-              arg_type->GetForwardCompilerType(), clang::SC_None, true);
+              arg_type->GetForwardCompilerType(), clang::SC_None,
+              clang::SourceLocation(), true);
           if (param)
             params.push_back(param);
         }
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 1e0c7f0514941..e03bb8b952e39 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -13,6 +13,7 @@
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/FormatAdapters.h"
 #include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 
 #include <mutex>
 #include <memory>
@@ -361,6 +362,39 @@ static void SetMemberOwningModule(clang::Decl *member,
     }
 }
 
+/// Creates a dummy main file for the given SourceManager.
+/// This file only serves as a container for include locations to other
+/// FileIDs that are put into this type system (either by the ASTImporter
+/// or when TypeSystemClang generates source locations for declarations).
+/// This file is not reflected to disk.
+static clang::FileID CreateDummyMainFile(clang::SourceManager &sm,
+                                         clang::FileManager &fm) {
+  llvm::StringRef main_file_path = "<LLDB Dummy Main File>";
+  // The file contents are empty and should never be seen by the user. The new
+  // line is just there to not throw off any line counting logic that might
+  // expect files to end with a newline.
+  llvm::StringRef main_file_contents = "\n";
+  const time_t mod_time = 0;
+  const off_t file_size = static_cast<off_t>(main_file_contents.size());
+
+  // Create a virtual FileEntry for our dummy file.
+  auto fe = fm.getVirtualFileRef(main_file_path, file_size, mod_time);
+
+  // Overwrite the file buffer with our empty file contents.
+  llvm::SmallVector<char, 64> buffer;
+  buffer.append(main_file_contents.begin(), main_file_contents.end());
+  auto file_contents = std::make_unique<llvm::SmallVectorMemoryBuffer>(
+      std::move(buffer), main_file_path);
+  sm.overrideFileContents(fe, std::move(file_contents));
+
+  // Create the actual file id for the FileEntry and set it as the main file.
+  clang::FileID fid =
+      sm.createFileID(fe, SourceLocation(), clang::SrcMgr::C_User);
+  sm.setMainFileID(fid);
+
+  return fid;
+}
+
 char TypeSystemClang::ID;
 
 bool TypeSystemClang::IsOperator(llvm::StringRef name,
@@ -692,6 +726,11 @@ void TypeSystemClang::CreateASTContext() {
   m_diagnostic_consumer_up = std::make_unique<NullDiagnosticConsumer>();
   m_ast_up->getDiagnostics().setClient(m_diagnostic_consumer_up.get(), false);
 
+  // Set up the MainFileID of this ASTContext. All other FileIDs created
+  // by this TypeSystem will act as-if included into this dummy main file.
+  auto fid = CreateDummyMainFile(*m_source_manager_up, *m_file_manager_up);
+  assert(m_ast_up->getSourceManager().getMainFileID() == fid);
+
   // This can be NULL if we don't know anything about the architecture or if
   // the target for an architecture isn't enabled in the llvm/clang that we
   // built
@@ -1243,7 +1282,7 @@ CompilerType TypeSystemClang::CreateRecordType(
     clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
     AccessType access_type, llvm::StringRef name, int kind,
     LanguageType language, std::optional<ClangASTMetadata> metadata,
-    bool exports_symbols) {
+    bool exports_symbols, const Declaration &declaration) {
   ASTContext &ast = getASTContext();
 
   if (decl_ctx == nullptr)
@@ -1298,6 +1337,10 @@ CompilerType TypeSystemClang::CreateRecordType(
       decl->setAnonymousStructOrUnion(true);
   }
 
+  auto location = GetLocForDecl(declaration);
+  decl->setLocStart(location);
+  decl->setLocation(location);
+
   if (metadata)
     SetMetadata(decl, *metadata);
 
@@ -1415,7 +1458,8 @@ static TemplateParameterList *CreateTemplateParameterList(
 clang::FunctionTemplateDecl *TypeSystemClang::CreateFunctionTemplateDecl(
     clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
     clang::FunctionDecl *func_decl,
-    const TemplateParameterInfos &template_param_infos) {
+    const TemplateParameterInfos &template_param_infos,
+    const Declaration &declaration) {
   //    /// Create a function template node.
   ASTContext &ast = getASTContext();
 
@@ -1429,6 +1473,7 @@ clang::FunctionTemplateDecl *TypeSystemClang::CreateFunctionTemplateDecl(
   func_tmpl_decl->setDeclName(func_decl->getDeclName());
   func_tmpl_decl->setTemplateParameters(template_param_list);
   func_tmpl_decl->init(func_decl);
+  func_tmpl_decl->setLocation(GetLocForDecl(declaration));
   SetOwningModule(func_tmpl_decl, owning_module);
 
   for (size_t i = 0, template_param_decl_count = template_param_decls.size();
@@ -1654,7 +1699,8 @@ ClassTemplateSpecializationDecl *
 TypeSystemClang::CreateClassTemplateSpecializationDecl(
     DeclContext *decl_ctx, OptionalClangModuleID owning_module,
     ClassTemplateDecl *class_template_decl, int kind,
-    const TemplateParameterInfos &template_param_infos) {
+    const TemplateParameterInfos &template_param_infos,
+    const Declaration &declaration) {
   ASTContext &ast = getASTContext();
   llvm::SmallVector<clang::TemplateArgument, 2> args(
       template_param_infos.Size() +
@@ -1689,6 +1735,8 @@ TypeSystemClang::CreateClassTemplateSpecializationDecl(
   class_template_specialization_decl->setSpecializationKind(
       TSK_ExplicitSpecialization);
 
+  class_template_specialization_decl->setLocation(GetLocForDecl(declaration));
+
   return class_template_specialization_decl;
 }
 
@@ -2149,7 +2197,8 @@ std::string TypeSystemClang::GetTypeNameForDecl(const NamedDecl *named_decl,
 FunctionDecl *TypeSystemClang::CreateFunctionDeclaration(
     clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
     llvm::StringRef name, const CompilerType &function_clang_type,
-    clang::StorageClass storage, bool is_inline) {
+    clang::StorageClass storage, bool is_inline,
+    const Declaration &declaration) {
   FunctionDecl *func_decl = nullptr;
   ASTContext &ast = getASTContext();
   if (!decl_ctx)
@@ -2170,6 +2219,11 @@ FunctionDecl *TypeSystemClang::CreateFunctionDeclaration(
   func_decl->setConstexprKind(isConstexprSpecified
                                   ? ConstexprSpecKind::Constexpr
                                   : ConstexprSpecKind::Unspecified);
+
+  const clang::SourceLocation location = GetLocForDecl(declaration);
+  func_decl->setLocation(location);
+  func_decl->setRangeEnd(location);
+
   SetOwningModule(func_decl, owning_module);
   decl_ctx->addDecl(func_decl);
 
@@ -2219,7 +2273,7 @@ CompilerType TypeSystemClang::CreateFunctionType(
 ParmVarDecl *TypeSystemClang::CreateParameterDeclaration(
     clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
     const char *name, const CompilerType &param_type, int storage,
-    bool add_decl) {
+    clang::SourceLocation loc, bool add_decl) {
   ASTContext &ast = getASTContext();
   auto *decl = ParmVarDecl::CreateDeserialized(ast, GlobalDeclID());
   decl->setDeclContext(decl_ctx);
@@ -2227,6 +2281,7 @@ ParmVarDecl *TypeSystemClang::CreateParameterDeclaration(
     decl->setDeclName(&ast.Idents.get(name));
   decl->setType(ClangUtil::GetQualType(param_type));
   decl->setStorageClass(static_cast<clang::StorageClass>(storage));
+  decl->setLocation(loc);
   SetOwningModule(decl, owning_module);
   if (add_decl)
     decl_ctx->addDecl(decl);
@@ -2316,10 +2371,10 @@ CompilerType TypeSystemClang::CreateEnumerationType(
     OptionalClangModuleID owning_module, const Declaration &decl,
     const CompilerType &integer_clang_type, bool is_scoped,
     std::optional<clang::EnumExtensibilityAttr::Kind> enum_kind) {
-  // TODO: Do something intelligent with the Declaration object passed in
-  // like maybe filling in the SourceLocation with it...
   ASTContext &ast = getASTContext();
 
+  auto location = GetLocForDecl(decl);
+
   // TODO: ask about these...
   //    const bool IsFixed = false;
   EnumDecl *enum_decl = EnumDecl::CreateDeserialized(ast, GlobalDeclID());
@@ -2329,6 +2384,8 @@ CompilerType TypeSystemClang::CreateEnumerationType(
   enum_decl->setScoped(is_scoped);
   enum_decl->setScopedUsingClassTag(is_scoped);
   enum_decl->setFixed(false);
+  enum_decl->setLocation(location);
+  enum_decl->setLocStart(location);
   SetOwningModule(enum_decl, owning_module);
   if (decl_ctx)
     decl_ctx->addDecl(enum_decl);
@@ -7755,10 +7812,10 @@ TypeSystemClang::CreateParameterDeclarations(
     llvm::StringRef name =
         !parameter_names.empty() ? parameter_names[param_index] : "";
 
-    auto *param =
-        CreateParameterDeclaration(func, /*owning_module=*/{}, name.data(),
-                                   GetType(prototype.getParamType(param_index)),
-                                   clang::SC_None, /*add_decl=*/false);
+    auto *param = CreateParameterDeclaration(
+        func, /*owning_module=*/{}, name.data(),
+        GetType(prototype.getParamType(param_index)), clang::SC_None,
+        func->getLocation(), /*add_decl=*/false);
     assert(param);
 
     params.push_back(param);
@@ -7771,7 +7828,8 @@ clang::CXXMethodDecl *TypeSystemClang::AddMethodToCXXRecordType(
     lldb::opaque_compiler_type_t type, llvm::StringRef name,
     const char *mangled_name, const CompilerType &method_clang_type,
     lldb::AccessType access, bool is_virtual, bool is_static, bool is_inline,
-    bool is_explicit, bool is_attr_used, bool is_artificial) {
+    bool is_explicit, bool is_attr_used, bool is_artificial,
+    const Declaration &declaration) {
   if (!type || !method_clang_type.IsValid() || name.empty())
     return nullptr;
 
@@ -7912,6 +7970,10 @@ clang::CXXMethodDecl *TypeSystemClang::AddMethodToCXXRecordType(
   cxx_method_decl->setParams(CreateParameterDeclarations(
       cxx_method_decl, *method_function_prototype, /*parameter_names=*/{}));
 
+  const clang::SourceLocation location = GetLocForDecl(declaration);
+  cxx_method_decl->setLocation(location);
+  cxx_method_decl->setRangeEnd(location);
+
   AddAccessSpecifierDecl(cxx_record_decl, getASTContext(),
                          GetCXXRecordDeclAccess(cxx_record_decl),
                          access_specifier);
@@ -9882,3 +9944,33 @@ void TypeSystemClang::LogCreation() const {
     LLDB_LOG(log, "Created new TypeSystem for (ASTContext*){0:x} '{1}'",
              &getASTContext(), getDisplayName());
 }
+
+clang::SourceLocation TypeSystemClang::GetLocForDecl(const Declaration &decl) {
+  // If the Declaration is invalid there is nothing to do.
+  if (!decl.IsValid())
+    return {};
+
+  clang::SourceManager &sm = getASTContext().getSourceManager();
+  clang::FileManager &fm = sm.getFileManager();
+
+  auto fe = fm.getFileRef(decl.GetFile().GetPath());
+  if (!fe)
+    return {};
+
+  clang::FileID fid = sm.translateFile(*fe);
+  if (fid.isInvalid()) {
+    // We see the file for the first time, so create a dummy file for it now.
+
+    // Connect the new dummy file to the main file via some fake include
+    // location. This is necessary as all file's in the SourceManager need to be
+    // reachable via an include chain from the main file.
+    SourceLocation ToIncludeLocOrFakeLoc;
+    assert(sm.getMainFileID().isValid());
+    ToIncludeLocOrFakeLoc = sm.getLocForStartOfFile(sm.getMainFileID());
+    fid = sm.createFileID(*fe, ToIncludeLocOrFakeLoc, clang::SrcMgr::C_User);
+  }
+
+  // Clang requires column numbers to be >= 1..
+  return sm.translateLineCol(fid, decl.GetLine(),
+                             std::max<uint16_t>(decl.GetColumn(), 1));
+}
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
index 99d9becffd128..f59eaee6705f9 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
@@ -326,13 +326,12 @@ class TypeSystemClang : public TypeSystem {
                                                bool is_framework = false,
                                                bool is_explicit = false);
 
-  CompilerType
-  CreateRecordType(clang::DeclContext *decl_ctx,
-                   OptionalClangModuleID owning_module,
-                   lldb::AccessType access_type, llvm::StringRef name, int kind,
-                   lldb::LanguageType language,
-                   std::optional<ClangASTMetadata> metadata = std::nullopt,
-                   bool exports_symbols = false);
+  CompilerType CreateRecordType(
+      clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
+      lldb::AccessType access_type, llvm::StringRef name, int kind,
+      lldb::LanguageType language,
+      std::optional<ClangASTMetadata> metadata = std::nullopt,
+      bool exports_symbols = false, const Declaration &declaration = {});
 
   class TemplateParameterInfos {
   public:
@@ -420,7 +419,8 @@ class TypeSystemClang : public TypeSystem {
 
   clang::FunctionTemplateDecl *CreateFunctionTemplateDecl(
       clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
-      clang::FunctionDecl *func_decl, const TemplateParameterInfos &infos);
+      clang::FunctionDecl *func_decl, const TemplateParameterInfos &infos,
+      const Declaration &declaration);
 
   void CreateFunctionTemplateSpecializationInfo(
       clang::FunctionDecl *func_decl, clang::FunctionTemplateDecl *Template,
@@ -437,7 +437,7 @@ class TypeSystemClang : public TypeSystem {
   clang::ClassTemplateSpecializationDecl *CreateClassTemplateSpecializationDecl(
       clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
       clang::ClassTemplateDecl *class_template_decl, int kind,
-      const TemplateParameterInfos &infos);
+      const TemplateParameterInfos &infos, const Declaration &declaration);
 
   CompilerType
   CreateClassTemplateSpecializationType(clang::ClassTemplateSpecializationDecl *
@@ -476,7 +476,8 @@ class TypeSystemClang : public TypeSystem {
   clang::FunctionDecl *CreateFunctionDeclaration(
       clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
       llvm::StringRef name, const CompilerType &function_Type,
-      clang::StorageClass storage, bool is_inline);
+      clang::StorageClass storage, bool is_inline,
+      const Declaration &declaration = {});
 
   CompilerType
   CreateFunctionType(const CompilerType &result_type, const CompilerType *args,
@@ -484,11 +485,10 @@ class TypeSystemClang : public TypeSystem {
                      clang::CallingConv cc = clang::CC_C,
                      clang::RefQualifierKind ref_qual = clang::RQ_None);
 
-  clang::ParmVarDecl *
-  CreateParameterDeclaration(clang::DeclContext *decl_ctx,
-                             OptionalClangModuleID owning_module,
-                             const char *name, const CompilerType &param_type,
-                             int storage, bool add_decl = false);
+  clang::ParmVarDecl *CreateParameterDeclaration(
+      clang::DeclContext *decl_ctx, OptionalClangModuleID owning_module,
+      const char *name, const CompilerType &param_type, int storage,
+      clang::SourceLocation loc, bool add_decl = false);
 
   CompilerType CreateBlockPointerType(const CompilerType &function_type);
 
@@ -996,7 +996,8 @@ class TypeSystemClang : public TypeSystem {
       lldb::opaque_compiler_type_t type, llvm::StringRef name,
       const char *mangled_name, const CompilerType &method_type,
       lldb::AccessType access, bool is_virtual, bool is_static, bool is_inline,
-      bool is_explicit, bool is_attr_used, bool is_artificial);
+      bool is_explicit, bool is_attr_used, bool is_artificial,
+      const Declaration &declaration = {});
 
   void AddMethodOverridesForCXXRecordType(lldb::opaque_compiler_type_t type);
 
@@ -1188,6 +1189,15 @@ class TypeSystemClang : public TypeSystem {
   std::optional<uint64_t> GetObjCBitSize(clang::QualType qual_type,
                                          ExecutionContextScope *exe_scope);
 
+  /// Turns the given \c decl into a \c clang::SourceLocation.
+  ///
+  /// Will create a \c FileID in this \c DWARFASTParserClang's \c ASTContext
+  /// if necessary.
+  ///
+  /// If no \c FileID could be found/created, returns an empty \c
+  /// SourceLocation.
+  clang::SourceLocation GetLocForDecl(const lldb_private::Declaration &decl);
+
   // Classes that inherit from TypeSystemClang can see and modify these
   std::string m_target_tr...
[truncated]

@Michael137 Michael137 force-pushed the lldb/source-locations-v1 branch from 5dae80a to c998ea3 Compare February 19, 2025 17:24
@Michael137 Michael137 force-pushed the lldb/source-locations-v1 branch from c998ea3 to a05a2ca Compare February 20, 2025 18:31
Copy link

github-actions bot commented Feb 20, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Michael137 Michael137 force-pushed the lldb/source-locations-v1 branch 2 times, most recently from 2da2b4e to 3f095fc Compare February 21, 2025 00:14
@labath
Copy link
Collaborator

labath commented Feb 21, 2025

Sounds like a nice feature to have. I'm not really familiar with these APIs, so I don't have much to say in the way of specifics. One high level question I have (I can't tell from the patch) is how are these files actually handled. Do we need to (preemptively) read them into some clang-owned buffer, or can we say something like "there is a file here, come back to me if you need it". I have two reasons for asking this: I'm wondering if this will have any impact on our memory footprint; and I'm wondering how will this behave if the file is changed (or deleted) during the course of the debug session.

@Michael137 Michael137 force-pushed the lldb/source-locations-v1 branch from 3f095fc to 8eb01f6 Compare February 21, 2025 10:26
@Michael137
Copy link
Member Author

Michael137 commented Feb 21, 2025

Sounds like a nice feature to have. I'm not really familiar with these APIs, so I don't have much to say in the way of specifics. One high level question I have (I can't tell from the patch) is how are these files actually handled. Do we need to (preemptively) read them into some clang-owned buffer, or can we say something like "there is a file here, come back to me if you need it". I have two reasons for asking this: I'm wondering if this will have any impact on our memory footprint; and I'm wondering how will this behave if the file is changed (or deleted) during the course of the debug session.

Yea I wasn't sure what the best answer to this is yet. Currently we create a SourceLocation (which is just a compact <FileID, offset> pair, where offset is the position into the file) using translateLineCol, which will mmap the file if we haven't done so yet, and then calculate the offset based on the line/column info.

"there is a file here, come back to me if you need it"

From my limited understanding of these APIs, I don't think there's currently such a customization point. LLDB would have to somehow store that column/line info somewhere until Clang asks for it. We could maybe stuff FileID+column+line into the SourceLocation, and steal a bit as an indicator that this is a "lazy source location"? I'd have to investigate this a bit more.

But I guess we'd first have to decide whether opening the source files is an unacceptable increase in the memory footprint for the common debugging use-case. I can collect some memory usage numbers when debugging clang/lldb as one data-point

@Michael137
Copy link
Member Author

I looked at the memory footprint of LLDB while debugging LLDB and there wasn't noticeable difference in resident memory usage between top-of-tree and this patch. I made sure to evaluate expressions that open up large source files. I do agree it feels a bit weird to be paying for opening source files if we only ever use it in diagnostics (not the common case). But maybe that's a cost worth paying?

@adrian-prantl
Copy link
Collaborator

Do we only pay the cost when a diagnostic is generated, or every time a DW_AT_file is parsed?

@adrian-prantl
Copy link
Collaborator

In the former case I have no concerns about this.

@Michael137
Copy link
Member Author

Do we only pay the cost when a diagnostic is generated, or every time a DW_AT_file is parsed?

Currently it's on creation of the decl

@labath
Copy link
Collaborator

labath commented Feb 25, 2025

Yeah, I expect this could be a bit of a problem on several fronts:

  • even if the memory usage is low enough to not matter, the fact that we have to hit the filesystem to load the file is most likely going to slow things down, particularly when you're using a remote filesystem (like we are). Normally, all one needs to parse debug info is the DWARF data, but now we would be fetching potentially thousands of files without actually using most most of them
  • this setup also makes it hard to display the source files in the same way that the rest of lldb does. In particular the target.source-map setting is inaccessible to the debug info code, and if I'm not mistaken the source manager in lldb always takes the current value of the setting (whereas the mapping in the clang ast is fixed at construction time)
  • mmapping the files (I suspect this is the reason this doesn't have a big impact on memory footprint) also has some unfortunate side effects on various systems. On windows, it prevents the file from being deleted (some editors implement "editing" as deleting and recreating a file), and on most unix systems (with the exception of Darwin which has MAP_RESILIENT_MEDIA) accessing the file after it has been truncated (which is the other way to implement "editing a file") can cause a SIGBUS.
  • I wonder whether we can exhaust the SourceLocation space in this way (and what happens if we do). 4GB sounds like a lot, but I know that some people manage to do that with just a single compile unit (I think it involves #includeing the same file repeatedly), so I wouldn't be surprised if we hit it after placing the entire large binary into a single AST context. (Maybe it's not as acute as we're not modelling #includes, but still..)

With the current implementation, I expect that we would need this to be controlled by some sort of a setting, although I'd really rather not do that. I wonder if we could cheat by pointing all files to some fake memory buffer (containing a bunch of newlines or something), and then hijacking the error printing logic to look up the "real" contents of the file (taking into account the current source map, file checksums and everything).

@jimingham
Copy link
Collaborator

jimingham commented Feb 25, 2025 via email

@Michael137
Copy link
Member Author

Why do we need to touch source files to add a source file attribution from debug info to a declaration? We don't require the actual presence of source files in order to make source file attributions anywhere else in the debug info handling. Jim

A clang::SourceLocation is an offset into the file. From DWARF we only have column and line numbers. So when we turn the DW_AT_decl_line/DW_AT_decl_column values into a SourceLocation, we use clang to do that transformation, which requires opening the file and inspecting in the contents. Haven't found a good way to do this lazily yet

@adrian-prantl
Copy link
Collaborator

It might be worth exploring a SourceManager subclass that has in addition to a MemoryBuffer-backed location also a source file type that is a file/line variant that is only converted into a MemoryBuffer-backed location if we do an operation that really needs it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants