Skip to content

Conversation

Nerixyz
Copy link
Contributor

@Nerixyz Nerixyz commented Sep 3, 2025

To find global variables, SymbolFileNativePDB used to search the globals stream for the name passed to FindGlobalVariables. However, the symbols in the globals stream contain the fully qualified name and FindGlobalVariables only gets the basename. The approach here is similar to the one for types and functions.

As we already search the globals stream for functions, we can cache the basenames for global variables there as well.

This makes the expressions.test from the DIA PDB plugin pass with the native one (#114906).

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Changes

To find global variables, SymbolFileNativePDB used to search the globals stream for the name passed to FindGlobalVariables. However, the symbols in the globals stream contain the fully qualified name and FindGlobalVariables only gets the basename. The approach here is similar to the one for types and functions.

As we already search the globals stream for functions, we can cache the basenames for global variables there as well.

This makes the expressions.test from the DIA PDB plugin pass with the native one (#114906).


Full diff: https://github.com/llvm/llvm-project/pull/156736.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+72-28)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h (+7-1)
  • (modified) lldb/test/Shell/SymbolFile/PDB/expressions.test (+2-1)
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 112eb06e462fc..330e839a7ca13 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1655,22 +1655,62 @@ void SymbolFileNativePDB::DumpClangAST(Stream &s, llvm::StringRef filter) {
   clang->GetNativePDBParser()->Dump(s, filter);
 }
 
-void SymbolFileNativePDB::CacheFunctionNames() {
-  if (!m_func_full_names.IsEmpty())
+void SymbolFileNativePDB::CacheGlobalBaseNames() {
+  if (!m_func_full_names.IsEmpty() || !m_global_variable_base_names.IsEmpty())
     return;
 
   // (segment, code offset) -> gid
-  std::map<std::pair<uint16_t, uint32_t>, uint32_t> addr_ids;
+  std::map<std::pair<uint16_t, uint32_t>, uint32_t> func_addr_ids;
 
-  // First, find all function references in the globals table.
+  // First, look through all items in the globals table.
   for (const uint32_t gid : m_index->globals().getGlobalsTable()) {
-    CVSymbol ref_sym = m_index->symrecords().readRecord(gid);
-    auto kind = ref_sym.kind();
+    CVSymbol sym = m_index->symrecords().readRecord(gid);
+    auto kind = sym.kind();
+
+    // If this is a global variable, we only need to look at the name
+    llvm::StringRef name;
+    switch (kind) {
+    case SymbolKind::S_GDATA32:
+    case SymbolKind::S_LDATA32: {
+      DataSym data =
+          llvm::cantFail(SymbolDeserializer::deserializeAs<DataSym>(sym));
+      name = data.Name;
+      break;
+    }
+    case SymbolKind::S_GTHREAD32:
+    case SymbolKind::S_LTHREAD32: {
+      ThreadLocalDataSym data = llvm::cantFail(
+          SymbolDeserializer::deserializeAs<ThreadLocalDataSym>(sym));
+      name = data.Name;
+      break;
+    }
+    case SymbolKind::S_CONSTANT: {
+      ConstantSym data =
+          llvm::cantFail(SymbolDeserializer::deserializeAs<ConstantSym>(sym));
+      name = data.Name;
+      break;
+    }
+    default:
+      break;
+    }
+
+    if (!name.empty()) {
+      llvm::StringRef base = MSVCUndecoratedNameParser::DropScope(name);
+      if (base.empty())
+        base = name;
+
+      m_global_variable_base_names.Append(ConstString(base), gid);
+      continue;
+    }
+
     if (kind != S_PROCREF && kind != S_LPROCREF)
       continue;
 
+    // For functions, we need to follow the reference to the procedure and look
+    // at the type
+
     ProcRefSym ref =
-        llvm::cantFail(SymbolDeserializer::deserializeAs<ProcRefSym>(ref_sym));
+        llvm::cantFail(SymbolDeserializer::deserializeAs<ProcRefSym>(sym));
     if (ref.Name.empty())
       continue;
 
@@ -1694,7 +1734,7 @@ void SymbolFileNativePDB::CacheFunctionNames() {
     // The function/procedure symbol only contains the demangled name.
     // The mangled names are in the publics table. Save the address of this
     // function to lookup the mangled name later.
-    addr_ids.emplace(std::make_pair(proc.Segment, proc.CodeOffset), gid);
+    func_addr_ids.emplace(std::make_pair(proc.Segment, proc.CodeOffset), gid);
 
     llvm::StringRef basename = MSVCUndecoratedNameParser::DropScope(proc.Name);
     if (basename.empty())
@@ -1729,8 +1769,8 @@ void SymbolFileNativePDB::CacheFunctionNames() {
       continue;
 
     // Check if this symbol is for one of our functions.
-    auto it = addr_ids.find({pub.Segment, pub.Offset});
-    if (it != addr_ids.end())
+    auto it = func_addr_ids.find({pub.Segment, pub.Offset});
+    if (it != func_addr_ids.end())
       m_func_full_names.Append(ConstString(pub.Name), it->second);
   }
 
@@ -1741,31 +1781,35 @@ void SymbolFileNativePDB::CacheFunctionNames() {
   m_func_method_names.SizeToFit();
   m_func_base_names.Sort();
   m_func_base_names.SizeToFit();
+  m_global_variable_base_names.Sort(std::less<uint32_t>());
+  m_global_variable_base_names.SizeToFit();
 }
 
 void SymbolFileNativePDB::FindGlobalVariables(
     ConstString name, const CompilerDeclContext &parent_decl_ctx,
     uint32_t max_matches, VariableList &variables) {
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
-  using SymbolAndOffset = std::pair<uint32_t, llvm::codeview::CVSymbol>;
 
-  std::vector<SymbolAndOffset> results = m_index->globals().findRecordsByName(
-      name.GetStringRef(), m_index->symrecords());
-  for (const SymbolAndOffset &result : results) {
-    switch (result.second.kind()) {
-    case SymbolKind::S_GDATA32:
-    case SymbolKind::S_LDATA32:
-    case SymbolKind::S_GTHREAD32:
-    case SymbolKind::S_LTHREAD32:
-    case SymbolKind::S_CONSTANT: {
-      PdbGlobalSymId global(result.first, false);
-      if (VariableSP var = GetOrCreateGlobalVariable(global))
-        variables.AddVariable(var);
-      break;
-    }
-    default:
+  CacheGlobalBaseNames();
+
+  std::vector<uint32_t> results;
+  m_global_variable_base_names.GetValues(name, results);
+
+  size_t n_matches = 0;
+  for (uint32_t gid : results) {
+    PdbGlobalSymId global(gid, false);
+
+    if (parent_decl_ctx.IsValid() &&
+        GetDeclContextContainingUID(toOpaqueUid(global)) != parent_decl_ctx)
       continue;
-    }
+
+    VariableSP var = GetOrCreateGlobalVariable(global);
+    if (!var)
+      continue;
+    variables.AddVariable(var);
+
+    if (++n_matches >= max_matches)
+      break;
   }
 }
 
@@ -1783,7 +1827,7 @@ void SymbolFileNativePDB::FindFunctions(
         name_type_mask & eFunctionNameTypeBase ||
         name_type_mask & eFunctionNameTypeMethod))
     return;
-  CacheFunctionNames();
+  CacheGlobalBaseNames();
 
   std::set<uint32_t> resolved_ids; // avoid duplicate lookups
   auto resolve_from = [&](UniqueCStringMap<uint32_t> &Names) {
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
index cfa00416d9673..57db1a7ad9bb0 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
@@ -260,7 +260,10 @@ class SymbolFileNativePDB : public SymbolFileCommon {
 
   std::vector<CompilerContext> GetContextForType(llvm::codeview::TypeIndex ti);
 
-  void CacheFunctionNames();
+  /// Caches the basenames of symbols found in the globals stream.
+  ///
+  /// This includes functions and global variables
+  void CacheGlobalBaseNames();
 
   void CacheUdtDeclarations();
   llvm::Expected<Declaration> ResolveUdtDeclaration(PdbTypeSymId type_id);
@@ -306,6 +309,9 @@ class SymbolFileNativePDB : public SymbolFileCommon {
   lldb_private::UniqueCStringMap<uint32_t> m_func_base_names;
   /// method basename -> Global ID(s)
   lldb_private::UniqueCStringMap<uint32_t> m_func_method_names;
+
+  /// basename -> Global ID(s)
+  lldb_private::UniqueCStringMap<uint32_t> m_global_variable_base_names;
 };
 
 } // namespace npdb
diff --git a/lldb/test/Shell/SymbolFile/PDB/expressions.test b/lldb/test/Shell/SymbolFile/PDB/expressions.test
index 1932be74ca878..33b02e2d67965 100644
--- a/lldb/test/Shell/SymbolFile/PDB/expressions.test
+++ b/lldb/test/Shell/SymbolFile/PDB/expressions.test
@@ -1,6 +1,7 @@
 REQUIRES: target-windows, msvc
 RUN: %build --compiler=msvc --nodefaultlib --output=%t.exe %S/Inputs/ExpressionsTest.cpp
-RUN: not %lldb -b -s %S/Inputs/ExpressionsTest0.script -s %S/Inputs/ExpressionsTest1.script -s %S/Inputs/ExpressionsTest2.script -- %t.exe 2>&1 | FileCheck %s
+RUN: env LLDB_USE_NATIVE_PDB_READER=0 not %lldb -b -s %S/Inputs/ExpressionsTest0.script -s %S/Inputs/ExpressionsTest1.script -s %S/Inputs/ExpressionsTest2.script -- %t.exe 2>&1 | FileCheck %s
+RUN: env LLDB_USE_NATIVE_PDB_READER=1 not %lldb -b -s %S/Inputs/ExpressionsTest0.script -s %S/Inputs/ExpressionsTest1.script -s %S/Inputs/ExpressionsTest2.script -- %t.exe 2>&1 | FileCheck %s
 
 // Check the variable value through `expression`
 CHECK: (lldb) expression result

Copy link
Contributor

@ZequanWu ZequanWu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Co-authored-by: Zequan Wu <zequanwu@google.com>
@Nerixyz Nerixyz merged commit bdb9283 into llvm:main Sep 4, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants