Skip to content

Conversation

DataCorrupted
Copy link
Member

@DataCorrupted DataCorrupted commented Aug 22, 2025

One-instruction functions are not considered as a Sequence of instructions in the eye of DebugInfo, this has made the debugability of an ICF'ed function really poor. Both verification #152807 and relocation #149618 of debug info are hard because of it.

We have two options moving forward:

  1. Fix debug info [NFC][DebugInfo] Allow single instruction sequence in line table to make symbolication of merged functions easier #154851
  2. Add a nop to ICF'ed functions (this PR) so thunks are not one-instruction functions anymore.

For design discussions / more context please move to #154851

Signed-off-by: Peter Rong <PeterRong@meta.com>
@DataCorrupted DataCorrupted reopened this Aug 22, 2025
Signed-off-by: Peter Rong <PeterRong@meta.com>
Signed-off-by: Peter Rong <PeterRong@meta.com>
@@ -1913,7 +1913,6 @@ void AsmPrinter::emitFunctionBody() {

// Print out code for the function.
bool HasAnyRealCode = false;
int NumInstsInFunction = 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumInstsInFunction was already calculated for debugging purposes. We just promote them as a class member to help us determine if DW_AT_LLVM_stmt_sequence should be emitted.

@DataCorrupted DataCorrupted requested review from alx32, tomershafir, dwblaikie, kyulee-com and ellishg and removed request for alx32 and tomershafir August 25, 2025 20:08
@dwblaikie
Copy link
Collaborator

So far as I know, one instruction sequences are fine - /zero/ instruction sequences are problematic.

Is this about one instruction, or zero instruction? Could you say more about what problems one instruction sequences have?

I had trouble following info on the linked PR #154851 - it says things like "LowPC == HighPC" - but that shouldn't be the case for a single instruction sequence, right? (I'd have expected: LowPC + sizeof(instruction) + 1 == HighPC)

@DataCorrupted
Copy link
Member Author

DataCorrupted commented Aug 25, 2025

@dwblaikie Thanks for taking a look, here's more explanation, hopefully it helps:

Is this about one instruction, or zero instruction.

One instruction and one instruction only.

but that shouldn't be the case for a single instruction sequence

Ideally, it shouldn't.

But this is the case right now. HighPC is the last row's address, which means for one-instruction sequences, the last row is the first row, and thus HighPC and LowPC are the same.

Sequence.HighPC = Row.Address.Address;

I'd have expected: LowPC + sizeof(instruction) + 1 == HighPC

This is what #154851 trying to achieve/fix by setting HighPC to the last row's address + one (Row.Address.Address + MinInstLength)

problems one instruction sequences have?

As mentioned above, one-instruction won't be its own sequence. But normally that's fine.

The problem is specific to safe-thunk ICF and DW_AT_LLVM_stmt_sequence:

  1. The thunk is a one-instruction function
  2. We insert an offset to Sequence to teach symbolication where in the line table it should start looking.

Combining 1 and 2 creates a new problem: when one-instruction is not its own sequence, we can't correctly insert such offset.

@dwblaikie
Copy link
Collaborator

@dwblaikie Thanks for taking a look, here's more explanation, hopefully it helps:

Is this about one instruction, or zero instruction.

One instruction and one instruction only.

but that shouldn't be the case for a single instruction sequence

Ideally, it shouldn't.

But this is the case right now. HighPC is the last row's address, which means for one-instruction sequences, the last row is the first row, and thus HighPC and LowPC are the same.

Ranges are half open - so the last row's address should be one past the last instruction, see here for example: https://godbolt.org/z/ffP17MPaP

A single instruction function (containts a single x86 ret instruction) has two entries in the line table:

0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x0000000000000001      1     12      1   0             0       0  is_stmt end_sequence

@DataCorrupted
Copy link
Member Author

DataCorrupted commented Aug 27, 2025

Thanks for pointing that out!

But if you have three consecutive one-instruction functions (which would be a common case if you enable ICF safe thunks), they will be concat into one sequence: https://godbolt.org/z/GhrYP8njv

0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x0000000000000010      2     12      1   0             0       0  is_stmt prologue_end
0x0000000000000020      3     12      1   0             0       0  is_stmt prologue_end
0x0000000000000021      3     12      1   0             0       0  is_stmt end_sequence

The problem we have here is, for ICF thunks to be probably symbolicated, each function needs to be its own sequence when we enable -mllvm -emit-func-debug-line-table-offsets: https://godbolt.org/z/qbndoGWbM

0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x0000000000000000      1     12      1   0             0       0  is_stmt end_sequence
0x0000000000000010      2     12      1   0             0       0  is_stmt prologue_end
0x0000000000000010      2     12      1   0             0       0  is_stmt end_sequence
0x0000000000000020      3     12      1   0             0       0  is_stmt prologue_end
0x0000000000000021      3     12      1   0             0       0  is_stmt end_sequence

In this case, we cannot just advance PC, or it will go into other thunks.

@dwblaikie
Copy link
Collaborator

Thanks for pointing that out!

But if you have three consecutive one-instruction functions (which would be a common case if you enable ICF safe thunks), they will be concat into one sequence: https://godbolt.org/z/GhrYP8njv

0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x0000000000000010      2     12      1   0             0       0  is_stmt prologue_end
0x0000000000000020      3     12      1   0             0       0  is_stmt prologue_end
0x0000000000000021      3     12      1   0             0       0  is_stmt end_sequence

The problem we have here is, for ICF thunks to be probably symbolicated, each function needs to be its own sequence when we enable -mllvm -emit-func-debug-line-table-offsets: https://godbolt.org/z/qbndoGWbM

0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x0000000000000000      1     12      1   0             0       0  is_stmt end_sequence
0x0000000000000010      2     12      1   0             0       0  is_stmt prologue_end
0x0000000000000010      2     12      1   0             0       0  is_stmt end_sequence
0x0000000000000020      3     12      1   0             0       0  is_stmt prologue_end
0x0000000000000021      3     12      1   0             0       0  is_stmt end_sequence

In this case, we can just advance PC, as the PC will go into other thunks.

I'm pretty sure this table is incorrect/buggy - it contains duplicate entries for the same offset.

It describes zero-length, not one-length sequences.

3 single-instruction functions with separate sequences should be emitted like this:

0x0000000000000000      1     12      0   0             0       0  is_stmt prologue_end
0x0000000000000001      1     12      0   0             0       0  is_stmt end_sequence
0x0000000000000001      2     12      0   0             0       0  is_stmt prologue_end
0x0000000000000002      2     12      0   0             0       0  is_stmt end_sequence
0x0000000000000002      3     13      0   0             0       0  is_stmt prologue_end
0x0000000000000003      3     13      0   0             0       0  is_stmt end_sequence

@dwblaikie
Copy link
Collaborator

(if the functions have padding to force 0x10 alignment, then their ranges can be [0, 1) or [0, 2) up to [0, 10) - any of those values would be OK, doesn't matter if the padding is included in the function or not)

In reality, if you build this example and actually link it (not sure how to then dwarfdump that on godbolt, so you'll have to do this locally):

void f1() { }
void f2() { }
int main() { }
clang++-tot test.cpp -g -O3 -ffunction-sections && llvm-dwarfdump-tot -debug-line a.out
Address            Line   Column File   ISA Discriminator OpIndex Flags
------------------ ------ ------ ------ --- ------------- ------- -------------
0x0000000000001130      1     12      0   0             0       0  is_stmt prologue_end
0x0000000000001131      1     12      0   0             0       0  is_stmt end_sequence
0x0000000000001140      2     12      0   0             0       0  is_stmt prologue_end
0x0000000000001141      2     12      0   0             0       0  is_stmt end_sequence
0x0000000000001150      3     13      0   0             0       0  is_stmt prologue_end
0x0000000000001153      3     13      0   0             0       0  is_stmt end_sequence

@DataCorrupted
Copy link
Member Author

any of those values would be OK, doesn't matter if the padding is included in the function or not)

I agree, if the last row's address is not considered as part of the Sequence, it shouldn't bother with symbolication.

If we agree that its -mllvm -emit-func-debug-line-table-offsets itself, not the linetable that's buggy, let me draft a patch to update how this end_seqence is emitted. We could emit an DW_LNS_advance_pc by MinInstLength before DW_LNE_end_sequence:

0x00000058: 05 DW_LNS_set_column (12)
0x0000005a: 0a DW_LNS_set_prologue_end
0x0000005b: 00 DW_LNE_set_address (0x0000000000000000)
0x00000066: 01 DW_LNS_copy
            0x0000000000000000      1     12      1   0             0       0  is_stmt prologue_end
0x00000067: 00 DW_LNE_end_sequence
            0x0000000000000000      1     12      1   0             0       0  is_stmt end_sequence
0x0000006a: 05 DW_LNS_set_column (12)
0x0000006c: 0a DW_LNS_set_prologue_end
0x0000006d: 00 DW_LNE_set_address (0x0000000000000010)
0x00000078: 13 address += 0,  line += 1,  op-index += 0
            0x0000000000000010      2     12      1   0             0       0  is_stmt prologue_end
0x00000079: 00 DW_LNE_end_sequence
            0x0000000000000010      2     12      1   0             0       0  is_stmt end_sequence
0x0000007c: 05 DW_LNS_set_column (12)
0x0000007e: 0a DW_LNS_set_prologue_end
0x0000007f: 00 DW_LNE_set_address (0x0000000000000020)
0x0000008a: 14 address += 0,  line += 2,  op-index += 0
            0x0000000000000020      3     12      1   0             0       0  is_stmt prologue_end
0x0000008b: 02 DW_LNS_advance_pc (addr += 1, op-index += 0)
0x0000008d: 00 DW_LNE_end_sequence
            0x0000000000000021      3     12      1   0             0       0  is_stmt end_sequence

@DataCorrupted
Copy link
Member Author

FYI: @dwblaikie Seems you guys had a similar discussion a year ago

@dwblaikie
Copy link
Collaborator

FYI: @dwblaikie Seems you guys had a similar discussion a year ago

Thanks for dredging up the history, perhaps it's informative to you/your work now.

any of those values would be OK, doesn't matter if the padding is included in the function or not)

I agree, if the last row's address is not considered as part of the Sequence, it shouldn't bother with symbolication.

If we agree that its -mllvm -emit-func-debug-line-table-offsets itself, not the linetable that's buggy, let me draft a patch to update how this end_seqence is emitted. We could emit an DW_LNS_advance_pc by MinInstLength before DW_LNE_end_sequence:

That still sounds like a bit of a dodgy workaround for some more fundamental problem - we know how long the instruction is, we can produce a sequence that properly describes it (as can be seen in other single-instruction examples, with -ffunction-sections, etc) - we should figure out how those work correctly and how the -emit-func-debug-line-table-offsets is missing that functionality and correct that, hopefully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants