Skip to content

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Jul 18, 2025

That's enabling Clang's -gkey-instructions, cc1's -gkey-instructions remains off by default.

Key Instructions improves the optimized-code debug-stepping experience in debuggers that use DWARF's is_stmt line table register to determine stepping behaviour.

The feature can be disabled with -gno-key-instructions.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668


Questions:

LLDB - I believe LLDB ignores Clang's is_stmts for stepping, but I don't know whether moving is_stmts around in ways LLDB hasn't seen before (at least as output from Clang) is going to have any unexpected consequences. cc @adrian-prantl @JDevlieghere for opinions - what testing and checking would be needed / possible here, to be confident that this doesn't break optimized code debugging for LLDB?

Clang languages/extensions - I've implemented support for C/C++, but I've not tested or specifically implemented anything for other languages (?) or extensions (?) like CUDA. Is there a good way to filter for this in the driver code in this patch?

…enabled

Key Instructions improves the optimized-code debug-stepping experience in
debuggers that use DWARF's `is_stmt` line table register to determine stepping
behaviour.

The feature can be disabled with `-gno-key-instructions`.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jul 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

That's enabling Clang's -gkey-instructions, cc1's -gkey-instructions remains off by default.

Key Instructions improves the optimized-code debug-stepping experience in debuggers that use DWARF's is_stmt line table register to determine stepping behaviour.

The feature can be disabled with -gno-key-instructions.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668


Questions:

LLDB - I believe LLDB ignores Clang's is_stmts for stepping, but I don't know whether moving is_stmts around in ways LLDB hasn't seen before (at least as output from Clang) is going to have any unexpected consequences. cc @adrian-prantl @JDevlieghere for opinions - what testing and checking would be needed / possible here, to be confident that this doesn't break optimized code debugging for LLDB?

Clang languages/extensions - I've implemented support for C/C++, but I've not tested or specifically implemented anything for other languages (?) or extensions (?) like CUDA. Is there a good way to filter for this in the driver code in this patch?


Full diff: https://github.com/llvm/llvm-project/pull/149509.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+7-1)
  • (modified) clang/test/DebugInfo/KeyInstructions/flag.cpp (+31-2)
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 8880c9375143f..38eebbd6f9b5f 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4611,8 +4611,14 @@ renderDebugOptions(const ToolChain &TC, const Driver &D, const llvm::Triple &T,
       CmdArgs.push_back("-gembed-source");
   }
 
+  // Enable Key Instructions by default if optimisations are enabled and
+  // we're emitting DWARF.
+  Arg *OptLevel = Args.getLastArg(options::OPT_O_Group);
+  bool KeyInstructionsOnByDefault =
+      EmitDwarf && OptLevel && !OptLevel->getOption().matches(options::OPT_O0);
   if (Args.hasFlag(options::OPT_gkey_instructions,
-                   options::OPT_gno_key_instructions, false))
+                   options::OPT_gno_key_instructions,
+                   KeyInstructionsOnByDefault))
     CmdArgs.push_back("-gkey-instructions");
 
   if (EmitCodeView) {
diff --git a/clang/test/DebugInfo/KeyInstructions/flag.cpp b/clang/test/DebugInfo/KeyInstructions/flag.cpp
index e34faa6cbb347..05a34ac670feb 100644
--- a/clang/test/DebugInfo/KeyInstructions/flag.cpp
+++ b/clang/test/DebugInfo/KeyInstructions/flag.cpp
@@ -1,7 +1,5 @@
 // RUN: %clang -### -target x86_64 -c -gdwarf -gkey-instructions %s 2>&1 | FileCheck %s --check-prefixes=KEY-INSTRUCTIONS
 // RUN: %clang -### -target x86_64 -c -gdwarf -gno-key-instructions %s 2>&1 | FileCheck %s --check-prefixes=NO-KEY-INSTRUCTIONS
-//// Default: Off.
-// RUN: %clang -### -target x86_64 -c -gdwarf %s 2>&1 | FileCheck %s --check-prefixes=NO-KEY-INSTRUCTIONS
 
 //// Help.
 // RUN %clang --help | FileCheck %s --check-prefix=HELP
@@ -23,3 +21,34 @@ void f() {}
 // RUN: %clang_cc1 %s -triple x86_64-linux-gnu -gkey-instructions -debug-info-kind=line-tables-only -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-ON
 // SMOKETEST-ON: keyInstructions: true
 // SMOKETEST-ON: atomGroup: 1
+
+//// Enable Key Instructions by default if optimisations are enabled and we're
+//// emitting DWARF.
+////
+//// | opt level | -gkey-instructions | feature |
+//// |         0 |                 no |     off |
+//// |         0 |                yes |      on |
+//// |       >=1 |                 no |      on |
+//// |       >=1 |                yes |      on |
+//// |       >=1 |   no & no -g flags |     off |
+//// |       >=1 | no & emit codeview |     off |
+//
+// RUN: %clang %s     -target x86_64 -gdwarf    -gmlt                    -### 2>&1 | FileCheck %s --check-prefix=NO-KEY-INSTRUCTIONS
+// RUN: %clang %s     -target x86_64 -gdwarf    -gmlt -gkey-instructions -### 2>&1 | FileCheck %s --check-prefix=KEY-INSTRUCTIONS
+// RUN: %clang %s -O0 -target x86_64 -gdwarf    -gmlt                    -### 2>&1 | FileCheck %s --check-prefix=NO-KEY-INSTRUCTIONS
+// RUN: %clang %s -O0 -target x86_64 -gdwarf    -gmlt -gkey-instructions -### 2>&1 | FileCheck %s --check-prefix=KEY-INSTRUCTIONS
+// RUN: %clang %s -O1 -target x86_64 -gdwarf    -gmlt                    -### 2>&1 | FileCheck %s --check-prefix=KEY-INSTRUCTIONS
+// RUN: %clang %s -O1 -target x86_64 -gdwarf    -gmlt -gkey-instructions -### 2>&1 | FileCheck %s --check-prefix=KEY-INSTRUCTIONS
+// RUN: %clang %s -O1 -target x86_64                                     -### 2>&1 | FileCheck %s --check-prefixes=NO-KEY-INSTRUCTIONS
+// RUN: %clang %s -O1 -target x86_64 -gcodeview -gmlt                    -### 2>&1 | FileCheck %s --check-prefixes=NO-KEY-INSTRUCTIONS
+//
+// RUN: %clang %s     -target x86_64 -gdwarf    -gmlt                    -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-OFF
+// RUN: %clang %s     -target x86_64 -gdwarf    -gmlt -gkey-instructions -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-ON
+// RUN: %clang %s -O0 -target x86_64 -gdwarf    -gmlt                    -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-OFF
+// RUN: %clang %s -O0 -target x86_64 -gdwarf    -gmlt -gkey-instructions -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-ON
+// RUN: %clang %s -O1 -target x86_64 -gdwarf    -gmlt                    -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-ON
+// RUN: %clang %s -O1 -target x86_64 -gdwarf    -gmlt -gkey-instructions -S -emit-llvm -o - | FileCheck %s --check-prefix=SMOKETEST-ON
+// RUN: %clang %s -O1 -target x86_64                                     -S -emit-llvm -o - | FileCheck %s --check-prefixes=SMOKETEST-OFF,SMOKETEST-NO-DEBUG
+// RUN: %clang %s -O1 -target x86_64 -gcodeview -gmlt                    -S -emit-llvm -o - | FileCheck %s --check-prefixes=SMOKETEST-OFF
+// SMOKETEST-NO-DEBUG: llvm.module.flags
+// SMOKETEST-NO-DEBUG-NOT: DICompileUnit

@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 8, 2025

Ping (should I move the question to discourse?)

@dwblaikie
Copy link
Collaborator

A summary of DWARF file size/section size changes (how much change as a % of total file size in object and linked executables, as a % of aggregate DWARF section sizes in object and linked executables, and as a % of the specific DWARF section sizes (eg: {.rela,}.debug_line changes by x%) - with specific debug info flags (my preference would be for -fno-standalone-debug -gz=zstd with crel, but I understand if that's not your deployment target, etc, it's less relevant)) and which debuggers are effected in what ways (some discussion of gdb would be nice).

If this has a cost and no benefit for lldb, seems like maybe we'd want to keep it off under lldb tuning. If it has a cost and no benefit to gdb - then I'd probably say off-by-default, on-for-SCE.

@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 11, 2025

Thanks for taking a look!

A summary of DWARF file size/section size changes (how much change as a % of total file size in object and linked executables, as a % of aggregate DWARF section sizes in object and linked executables, and as a % of the specific DWARF section sizes (eg: {.rela,}.debug_line changes by x%) - with specific debug info flags (my preference would be for -fno-standalone-debug -gz=zstd with crel, but I understand if that's not your deployment target, etc, it's less relevant))

Here's bloaty figures for some CTMark builds ($ bloaty <-gkey-instructions> -- <-gno-key-instructions>)

I'm only including .debug_line as bloaty reports changes to a few other sections of <= 5 bytes in the first table and <= 50 bytes in the second, which is negligible.

-O2 -g -gkey-instructions vs baseline of -O2 -g:

executable .debug_line delta file size delta
7zip-benchmark +2.4% +15.3Ki +0.3% +15.3Ki
bullet +0.4% +1.60Ki +0.0% +1.61Ki
clamscan +2.4% +7.02Ki +0.3% +7.03Ki
consumer-typeset +12% +16.1Ki +1.2% +16.1Ki
kc +2.4% +6.28Ki +0.1% +6.29Ki
lencod +1.5% +4.64Ki +0.2% +4.65Ki
pairlocalalign +1.9% +3.92Ki +0.3% +3.93Ki
SPASS +1.2% +5.11Ki +0.2% +5.12Ki
sqlite3 +1.6% +4.75Ki +0.2% +4.76Ki
tramp3d-v4 +1.1% +5.43Ki +0.1% +5.43Ki

-fno-standalone-debug -gz=zstd -O2 -g -gkey-instructions vs baseline of -fno-standalone-debug -gz=zstd -O2 -g:

executable .debug_line delta file size delta
7zip-benchmark +0.5% +1.20Ki +0.0% +1.27Ki
bullet +0.4% +832 +0.0% +792
clamscan +1.5% +2.36Ki +0.2% +2.34Ki
consumer-typeset +5.4% +4.32Ki +0.5% +4.32Ki
kc +1.7% +1.55Ki +0.1% +1.60Ki
lencod +0.9% +1.38Ki +0.1% +1.40Ki
pairlocalalign +0.4% +352 +0.0% +376
SPASS +0.7% +1.23Ki +0.1% +1.23Ki
sqlite3 +0.9% +1.40Ki +0.1% +1.43Ki
tramp3d-v4 +1.0% +864 +0.0% +832

There's a compile time cost to enabling the feature (0.52% on compile-time-tracker) which was included in the compile-time-cost tables in this post which explains where we put work in to "pay for" the cost of the feature (and we have longer term general improvements we'd like to make).

I'm happy to get more numbers if there're any other configurations you'd like to see.

and which debuggers are effected in what ways (some discussion of gdb would be nice). If this has a cost and no benefit for lldb, seems like maybe we'd want to keep it off under lldb tuning. If it has a cost and no benefit to gdb - then I'd probably say off-by-default, on-for-SCE.

SCE (full disclosure - most tested) and GDB (*) and both work with the feature (both provide the promised improved optimisied-code stepping), but LLDB doesn't yet. I'd still like to gently suggest we enable it for LLDB too for uniformity, especially as we've tried to pay some of the costs in other areas, so the net "feature doesn't exist" -> "feature on by default" cost is < 0.3% increased instructions:u on compile-tim-tracker (and from the other bits of work, max rss has come down by 0.87%), unless that seems unreasonable or generally undesirable.

(*) Checked with 15.0.50 and 12.0.90. GCC's -gstatement-frontiers option (which is also enabled by default for optimisations) requires the same debugger support as key-instructions and the earliest mention I could find was gcc-8.1 (2018) https://gcc.gnu.org/onlinedocs/gcc-8.1.0/gcc/Debugging-Options.html - I would assume (but have not verified) that gdb versions from around then would support gcc's -gstatement-frontiers and therefore key instructions.

@dwblaikie
Copy link
Collaborator

Cool - thansks for all the details. I'm fine with making the change in general then, but should get some thougths from the apple folks, probably @JDevlieghere or @adrian-prantl

@rnk
Copy link
Collaborator

rnk commented Aug 11, 2025

I think you've all gone above and beyond to pay for the (marginal) cost increase from this feature, and we should go ahead and land this.

Thanks for all the hard work and data gathering!

@OCHyams OCHyams requested a review from labath August 19, 2025 13:00
@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 19, 2025

Thanks all for the comments!

In addition to the cc'd apple folks, @labath do you have an opinion on the topic of enabling this by default for all debuggers versus not doing so for LLDB tuning? (There's a question in the summary, and it's been discussed a little in the comments).

One less exciting question that remains, if anyone has any ideas off the top of their heads:

Clang languages/extensions - I've implemented support for C/C++, but I've not tested or specifically implemented anything for other languages (?) or extensions (?) like CUDA. Is there a good way to filter for this in the driver code in this patch?

@labath
Copy link
Collaborator

labath commented Aug 25, 2025

Ideally, I'd also go with the uniform approach of enabling the feature everywhere. It's true that this doesn't have any value for lldb right now, but it also shouldn't break anything (lldb ignores is_stmt), and it's not really true that this (disabling the flag on -glldb) will completely avoid having to deal with this input as it's fairly common to compile linux binaries with -ggdb (the default) and debug with lldb.

Enabling the flag may also provide the incentive for someone to add support for this to lldb (I wish that someone was me, but it looks like that's not going to happen). Even if that happens a year from now, it would be nice if we could say that all binaries compiled within the last year would benefit from that.

@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 26, 2025

Thanks everyone for the discussion. It sounds like the consensus so far is to enable for all debugger tunings by default, for "plain" C/C++ if optimisations are enabled (though no apple folks have chimed in).

I've added a check to filter out not-plain C/C++. Please could someone review the code and approve if they're happy?

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +4401 to +4404
bool IRInput = isLLVMIR(InputType);
bool PlainCOrCXX = isDerivedFromC(InputType) && !isCuda(InputType) &&
!isHIP(InputType) && !isObjC(InputType) &&
!isOpenCL(InputType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious question, is there a better way of selecting "just C and C++" rather than "list of extensions that this shouldn't apply to"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love it either, but I couldn't find a better way. I asked for advice in the pull request description and a comment - I'll happily take suggestions if anyone has any! Are you happy for this to go in as-is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fine IMO.

@OCHyams OCHyams merged commit 9d65f77 into llvm:main Aug 27, 2025
9 checks passed
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request Aug 29, 2025
Key Instructions (-gkey-instructions) is now enabled by default when DWARF is
being emitted, the input is plain C/C++, and optimisations are enabled.

Add release note for the change in default behaviour.
OCHyams added a commit that referenced this pull request Sep 4, 2025
Key Instructions (-gkey-instructions) is now enabled by default when DWARF is
being emitted, the input is plain C/C++, and optimisations are enabled.

Add release note for the change in default behaviour.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants