kazutakahirata
Contributor

I'm planning to remove the V2 support. Now, HotColdSchema is one
feature that's tested in V2 but not in V4.

This patch derives:

RecordSerializationRoundTripVersion4HotColdSchema

from:

RecordSerializationRoundTripVersion2HotColdSchema

and then adds DummyMap, a V4-specific item.
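
To illustrate why the V4 test compares against a record built from DummyMap values rather than the original call stack IDs, here is a minimal standalone sketch. It does not use the LLVM MemProf API: `toySerialize`, `toyDeserialize`, and the two type aliases are hypothetical stand-ins, and the only assumption taken from the test is that the V4 round trip hands back the linear IDs from the map.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the LLVM types; not the real MemProf definitions.
using CallStackId = std::uint64_t;
using LinearCallStackId = std::uint32_t;

// Toy "serialize": emit the linear ID assigned to each call stack ID.
std::vector<LinearCallStackId>
toySerialize(const std::vector<CallStackId> &Ids,
             const std::unordered_map<CallStackId, LinearCallStackId> &Map) {
  std::vector<LinearCallStackId> Out;
  Out.reserve(Ids.size());
  for (CallStackId Id : Ids) {
    auto It = Map.find(Id);
    assert(It != Map.end() && "every call stack ID needs a linear ID");
    Out.push_back(It->second);
  }
  return Out;
}

// Toy "deserialize": the reader only sees linear IDs, widened to CallStackId.
std::vector<CallStackId>
toyDeserialize(const std::vector<LinearCallStackId> &Buf) {
  return std::vector<CallStackId>(Buf.begin(), Buf.end());
}
```

In this sketch, round-tripping `{0x123, 0x456}` with the map `{{0x123, 1}, {0x456, 2}}` yields `{1, 2}`, which mirrors how the new test builds ExpectedRecord from `DummyMap[CSId]` instead of comparing GotRecord against Record directly.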

@llvmbot llvmbot added the PGO Profile Guided Optimizations label Sep 4, 2025
@llvmbot
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-pgo

Author: Kazu Hirata (kazutakahirata)

Changes

I'm planning to remove the V2 support. Now, HotColdSchema is one
feature that's tested in V2 but not in V4.

This patch derives:

RecordSerializationRoundTripVersion4HotColdSchema

from:

RecordSerializationRoundTripVersion2HotColdSchema

and then adds DummyMap, a V4-specific item.


Full diff: https://github.com/llvm/llvm-project/pull/156783.diff

1 Files Affected:

  • (modified) llvm/unittests/ProfileData/MemProfTest.cpp (+69)
diff --git a/llvm/unittests/ProfileData/MemProfTest.cpp b/llvm/unittests/ProfileData/MemProfTest.cpp
index b57567e29f1c8..abe36bc759658 100644
--- a/llvm/unittests/ProfileData/MemProfTest.cpp
+++ b/llvm/unittests/ProfileData/MemProfTest.cpp
@@ -404,6 +404,75 @@ TEST(MemProf, RecordSerializationRoundTripVersion2HotColdSchema) {
   EXPECT_EQ(Record, GotRecord);
 }
 
+TEST(MemProf, RecordSerializationRoundTripVersion4HotColdSchema) {
+  const auto Schema = getHotColdSchema();
+
+  MemInfoBlock Info;
+  Info.AllocCount = 11;
+  Info.TotalSize = 22;
+  Info.TotalLifetime = 33;
+  Info.TotalLifetimeAccessDensity = 44;
+
+  llvm::SmallVector<CallStackId> CallStackIds = {0x123, 0x456};
+
+  llvm::SmallVector<CallStackId> CallSiteIds = {0x333, 0x444};
+
+  IndexedMemProfRecord Record;
+  for (const auto &CSId : CallStackIds) {
+    // Use the same info block for both allocation sites.
+    Record.AllocSites.emplace_back(CSId, Info, Schema);
+  }
+  for (auto CSId : CallSiteIds)
+    Record.CallSites.push_back(IndexedCallSiteInfo(CSId));
+
+  std::bitset<llvm::to_underlying(Meta::Size)> SchemaBitSet;
+  for (auto Id : Schema)
+    SchemaBitSet.set(llvm::to_underlying(Id));
+
+  // Verify that SchemaBitSet has the fields we expect and nothing else, which
+  // we check with count().
+  EXPECT_EQ(SchemaBitSet.count(), 4U);
+  EXPECT_TRUE(SchemaBitSet[llvm::to_underlying(Meta::AllocCount)]);
+  EXPECT_TRUE(SchemaBitSet[llvm::to_underlying(Meta::TotalSize)]);
+  EXPECT_TRUE(SchemaBitSet[llvm::to_underlying(Meta::TotalLifetime)]);
+  EXPECT_TRUE(
+      SchemaBitSet[llvm::to_underlying(Meta::TotalLifetimeAccessDensity)]);
+
+  // Verify that Schema has propagated all the way to the Info field in each
+  // IndexedAllocationInfo.
+  ASSERT_THAT(Record.AllocSites, SizeIs(2));
+  EXPECT_EQ(Record.AllocSites[0].Info.getSchema(), SchemaBitSet);
+  EXPECT_EQ(Record.AllocSites[1].Info.getSchema(), SchemaBitSet);
+
+  std::string Buffer;
+  llvm::raw_string_ostream OS(Buffer);
+  // Need a dummy map for V4 serialization
+  llvm::DenseMap<CallStackId, LinearCallStackId> DummyMap = {
+      {0x123, 1}, {0x456, 2}, {0x333, 3}, {0x444, 4}};
+  Record.serialize(Schema, OS, Version4, &DummyMap);
+
+  const IndexedMemProfRecord GotRecord = IndexedMemProfRecord::deserialize(
+      Schema, reinterpret_cast<const unsigned char *>(Buffer.data()), Version4);
+
+  // Verify that Schema comes back correctly after deserialization. Technically,
+  // the comparison between Record and GotRecord below includes the comparison
+  // of their Schemas, but we'll verify the Schemas on our own.
+  ASSERT_THAT(GotRecord.AllocSites, SizeIs(2));
+  EXPECT_EQ(GotRecord.AllocSites[0].Info.getSchema(), SchemaBitSet);
+  EXPECT_EQ(GotRecord.AllocSites[1].Info.getSchema(), SchemaBitSet);
+
+  // Create the expected record using the linear IDs from the dummy map.
+  IndexedMemProfRecord ExpectedRecord;
+  for (const auto &CSId : CallStackIds) {
+    ExpectedRecord.AllocSites.emplace_back(DummyMap[CSId], Info, Schema);
+  }
+  for (const auto &CSId : CallSiteIds) {
+    ExpectedRecord.CallSites.emplace_back(DummyMap[CSId]);
+  }
+
+  EXPECT_EQ(ExpectedRecord, GotRecord);
+}
+
 TEST(MemProf, SymbolizationFilter) {
   auto Symbolizer = std::make_unique<MockSymbolizer>();
 

@kazutakahirata kazutakahirata merged commit cc19bc7 into llvm:main Sep 5, 2025
11 checks passed
@kazutakahirata kazutakahirata deleted the memprof_RecordSerializationRoundTripVersion4HotCold branch September 5, 2025 02:58
@llvm-ci
Collaborator

llvm-ci commented Sep 5, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/22201

Here is the relevant piece of the build log for reference:
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/243/332' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-3715499-243-332.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=332 GTEST_SHARD_INDEX=243 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Note: This is test shard 244 of 332.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CompletionStringTest
[ RUN      ] CompletionStringTest.Documentation
[       OK ] CompletionStringTest.Documentation (0 ms)
[----------] 1 test from CompletionStringTest (0 ms total)

[----------] 1 test from FuzzyMatch
[ RUN      ] FuzzyMatch.Matches
[       OK ] FuzzyMatch.Matches (7 ms)
[----------] 1 test from FuzzyMatch (7 ms total)

[----------] 1 test from CrossFileRenameTests
[ RUN      ] CrossFileRenameTests.WithUpToDateIndex
ASTWorker building file /clangd-test/foo.h version null with command 
[/clangd-test]
clang -xobjective-c++ /clangd-test/foo.h
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name foo.h -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test -fcoverage-compilation-dir=/clangd-test -resource-dir lib/clang/22 -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fobjc-runtime=gcc -fobjc-encode-cxx-class-template-spec -fobjc-exceptions -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x objective-c++ /clangd-test/foo.h
Building first preamble for /clangd-test/foo.h version null
Built preamble of size 820948 for file /clangd-test/foo.h version null in 0.30 seconds
indexed preamble AST for /clangd-test/foo.h version null:
  symbol slab: 0 symbols, 120 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
indexed file AST for /clangd-test/foo.h version null:
  symbol slab: 3 symbols, 4912 bytes
  ref slab: 3 symbols, 5 refs, 4320 bytes
  relations slab: 0 relations, 24 bytes
Build dynamic index for main-file symbols with estimated memory usage of 12648 bytes
ASTWorker building file /clangd-test/foo.cc version null with command 
[/clangd-test]
clang -xobjective-c++ /clangd-test/foo.cc
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name foo.cc -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test -fcoverage-compilation-dir=/clangd-test -resource-dir lib/clang/22 -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fobjc-runtime=gcc -fobjc-encode-cxx-class-template-spec -fobjc-exceptions -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x objective-c++ /clangd-test/foo.cc
Building first preamble for /clangd-test/foo.cc version null
Built preamble of size 824492 for file /clangd-test/foo.cc version null in 0.21 seconds
indexed preamble AST for /clangd-test/foo.cc version null:
  symbol slab: 3 symbols, 4912 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
Build dynamic index for header symbols with estimated memory usage of 7444 bytes
indexed file AST for /clangd-test/foo.cc version null:
  symbol slab: 3 symbols, 4912 bytes
  ref slab: 4 symbols, 9 refs, 4320 bytes
...
