Skip to content

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Sep 3, 2025

getTypeAllocSize() currently works by taking the type store size and aligning it to the ABI alignment. However, this ends up doing redundant work in various cases, for example arrays will unnecessarily repeat the alignment step, and structs will fetch the StructLayout multiple times.

As this code is rather hot (it is called every time we need to calculate GEP offsets for example), specialize the implementation. This repeats a small amount of logic from getAlignment(), but I think that's worthwhile.

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=38b376f1927df5c1dea1065041779b28b13b9dd9&to=848ed4571d91d1c6b2d2e484a28a0fe2b67e7995&stat=instructions%3Au

getTypeAllocSize() currently works by taking the type store size
and aligning it to the ABI alignment. However, this ends up doing
redundant work in various cases, for example arrays will
unnecessarily repeat the alignment step, and structs will fetch
the StructLayout multiple times.

As this code is rather hot (it is called every time we need to
calculate GEP offsets for example), specialize the implementation.
This repeats a small amount of logic from getAlignment(), but I
think that's worthwhile.
@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Nikita Popov (nikic)

Changes

getTypeAllocSize() currently works by taking the type store size and aligning it to the ABI alignment. However, this ends up doing redundant work in various cases, for example arrays will unnecessarily repeat the alignment step, and structs will fetch the StructLayout multiple times.

As this code is rather hot (it is called every time we need to calculate GEP offsets for example), specialize the implementation. This repeats a small amount of logic from getAlignment(), but I think that's worthwhile.

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=38b376f1927df5c1dea1065041779b28b13b9dd9&to=848ed4571d91d1c6b2d2e484a28a0fe2b67e7995&stat=instructions%3Au


Full diff: https://github.com/llvm/llvm-project/pull/156687.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/DataLayout.h (+1-4)
  • (modified) llvm/lib/IR/DataLayout.cpp (+38)
  • (modified) llvm/test/CodeGen/AArch64/alloca-oversized.ll (-3)
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 2992484c47d06..2acae246c0b1e 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -501,10 +501,7 @@ class DataLayout {
   ///
   /// This is the amount that alloca reserves for this type. For example,
   /// returns 12 or 16 for x86_fp80, depending on alignment.
-  TypeSize getTypeAllocSize(Type *Ty) const {
-    // Round up to the next alignment boundary.
-    return alignTo(getTypeStoreSize(Ty), getABITypeAlign(Ty).value());
-  }
+  TypeSize getTypeAllocSize(Type *Ty) const;
 
   /// Returns the offset in bits between successive objects of the
   /// specified type, including alignment padding; always a multiple of 8.
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index ee43ad49d0df2..2cf96f8720a05 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -844,6 +844,44 @@ Align DataLayout::getAlignment(Type *Ty, bool abi_or_pref) const {
   }
 }
 
+TypeSize DataLayout::getTypeAllocSize(Type *Ty) const {
+  switch (Ty->getTypeID()) {
+  case Type::ArrayTyID: {
+    // The alignment of the array is the alignment of the element, so there
+    // is no need for further adjustment.
+    auto *ATy = cast<ArrayType>(Ty);
+    return ATy->getNumElements() * getTypeAllocSize(ATy->getElementType());
+  }
+  case Type::StructTyID: {
+    const StructLayout *Layout = getStructLayout(cast<StructType>(Ty));
+    TypeSize Size = Layout->getSizeInBytes();
+
+    if (cast<StructType>(Ty)->isPacked())
+      return Size;
+
+    Align A = std::max(StructABIAlignment, Layout->getAlignment());
+    return alignTo(Size, A.value());
+  }
+  case Type::IntegerTyID: {
+    unsigned BitWidth = Ty->getIntegerBitWidth();
+    TypeSize Size = TypeSize::getFixed(divideCeil(BitWidth, 8));
+    Align A = getIntegerAlignment(BitWidth, /*ABI=*/true);
+    return alignTo(Size, A.value());
+  }
+  case Type::PointerTyID: {
+    unsigned AS = Ty->getPointerAddressSpace();
+    TypeSize Size = TypeSize::getFixed(getPointerSize(AS));
+    return alignTo(Size, getPointerABIAlignment(AS).value());
+  }
+  case Type::TargetExtTyID: {
+    Type *LayoutTy = cast<TargetExtType>(Ty)->getLayoutType();
+    return getTypeAllocSize(LayoutTy);
+  }
+  default:
+    return alignTo(getTypeStoreSize(Ty), getABITypeAlign(Ty).value());
+  }
+}
+
 Align DataLayout::getABITypeAlign(Type *Ty) const {
   return getAlignment(Ty, true);
 }
diff --git a/llvm/test/CodeGen/AArch64/alloca-oversized.ll b/llvm/test/CodeGen/AArch64/alloca-oversized.ll
index e57bbcdf99800..81d301019e7b9 100644
--- a/llvm/test/CodeGen/AArch64/alloca-oversized.ll
+++ b/llvm/test/CodeGen/AArch64/alloca-oversized.ll
@@ -10,10 +10,7 @@ define void @test_oversized(ptr %dst, i32 %cond) {
 ; CHECK-NEXT:    .cfi_offset w30, -8
 ; CHECK-NEXT:    .cfi_offset w29, -16
 ; CHECK-NEXT:    mov x8, sp
-; CHECK-NEXT:    mov x9, #2305843009213693952 // =0x2000000000000000
-; CHECK-NEXT:    sub x8, x8, x9
 ; CHECK-NEXT:    sub x9, x29, #32
-; CHECK-NEXT:    mov sp, x8
 ; CHECK-NEXT:    cmp w1, #0
 ; CHECK-NEXT:    csel x8, x9, x8, eq
 ; CHECK-NEXT:    str x8, [x0]

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-llvm-ir

Author: Nikita Popov (nikic)

Changes

getTypeAllocSize() currently works by taking the type store size and aligning it to the ABI alignment. However, this ends up doing redundant work in various cases, for example arrays will unnecessarily repeat the alignment step, and structs will fetch the StructLayout multiple times.

As this code is rather hot (it is called every time we need to calculate GEP offsets for example), specialize the implementation. This repeats a small amount of logic from getAlignment(), but I think that's worthwhile.

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=38b376f1927df5c1dea1065041779b28b13b9dd9&amp;to=848ed4571d91d1c6b2d2e484a28a0fe2b67e7995&amp;stat=instructions%3Au


Full diff: https://github.com/llvm/llvm-project/pull/156687.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/DataLayout.h (+1-4)
  • (modified) llvm/lib/IR/DataLayout.cpp (+38)
  • (modified) llvm/test/CodeGen/AArch64/alloca-oversized.ll (-3)
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 2992484c47d06..2acae246c0b1e 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -501,10 +501,7 @@ class DataLayout {
   ///
   /// This is the amount that alloca reserves for this type. For example,
   /// returns 12 or 16 for x86_fp80, depending on alignment.
-  TypeSize getTypeAllocSize(Type *Ty) const {
-    // Round up to the next alignment boundary.
-    return alignTo(getTypeStoreSize(Ty), getABITypeAlign(Ty).value());
-  }
+  TypeSize getTypeAllocSize(Type *Ty) const;
 
   /// Returns the offset in bits between successive objects of the
   /// specified type, including alignment padding; always a multiple of 8.
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index ee43ad49d0df2..2cf96f8720a05 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -844,6 +844,44 @@ Align DataLayout::getAlignment(Type *Ty, bool abi_or_pref) const {
   }
 }
 
+TypeSize DataLayout::getTypeAllocSize(Type *Ty) const {
+  switch (Ty->getTypeID()) {
+  case Type::ArrayTyID: {
+    // The alignment of the array is the alignment of the element, so there
+    // is no need for further adjustment.
+    auto *ATy = cast<ArrayType>(Ty);
+    return ATy->getNumElements() * getTypeAllocSize(ATy->getElementType());
+  }
+  case Type::StructTyID: {
+    const StructLayout *Layout = getStructLayout(cast<StructType>(Ty));
+    TypeSize Size = Layout->getSizeInBytes();
+
+    if (cast<StructType>(Ty)->isPacked())
+      return Size;
+
+    Align A = std::max(StructABIAlignment, Layout->getAlignment());
+    return alignTo(Size, A.value());
+  }
+  case Type::IntegerTyID: {
+    unsigned BitWidth = Ty->getIntegerBitWidth();
+    TypeSize Size = TypeSize::getFixed(divideCeil(BitWidth, 8));
+    Align A = getIntegerAlignment(BitWidth, /*ABI=*/true);
+    return alignTo(Size, A.value());
+  }
+  case Type::PointerTyID: {
+    unsigned AS = Ty->getPointerAddressSpace();
+    TypeSize Size = TypeSize::getFixed(getPointerSize(AS));
+    return alignTo(Size, getPointerABIAlignment(AS).value());
+  }
+  case Type::TargetExtTyID: {
+    Type *LayoutTy = cast<TargetExtType>(Ty)->getLayoutType();
+    return getTypeAllocSize(LayoutTy);
+  }
+  default:
+    return alignTo(getTypeStoreSize(Ty), getABITypeAlign(Ty).value());
+  }
+}
+
 Align DataLayout::getABITypeAlign(Type *Ty) const {
   return getAlignment(Ty, true);
 }
diff --git a/llvm/test/CodeGen/AArch64/alloca-oversized.ll b/llvm/test/CodeGen/AArch64/alloca-oversized.ll
index e57bbcdf99800..81d301019e7b9 100644
--- a/llvm/test/CodeGen/AArch64/alloca-oversized.ll
+++ b/llvm/test/CodeGen/AArch64/alloca-oversized.ll
@@ -10,10 +10,7 @@ define void @test_oversized(ptr %dst, i32 %cond) {
 ; CHECK-NEXT:    .cfi_offset w30, -8
 ; CHECK-NEXT:    .cfi_offset w29, -16
 ; CHECK-NEXT:    mov x8, sp
-; CHECK-NEXT:    mov x9, #2305843009213693952 // =0x2000000000000000
-; CHECK-NEXT:    sub x8, x8, x9
 ; CHECK-NEXT:    sub x9, x29, #32
-; CHECK-NEXT:    mov sp, x8
 ; CHECK-NEXT:    cmp w1, #0
 ; CHECK-NEXT:    csel x8, x9, x8, eq
 ; CHECK-NEXT:    str x8, [x0]

Copy link
Contributor Author

@nikic nikic Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can see that this is not entirely NFC. The reason is that this implementation does not always go through the size in bits, which may end up discarding high bits of the type. As such, the new implementation is technically "more correct". But really type sizes the overflow when converting bits <-> bytes are not really something we can sensibly support -- we should probably try to make the IR verifier reject these.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@nikic nikic merged commit 4d927a5 into llvm:main Sep 4, 2025
12 checks passed
@nikic nikic deleted the dl-type-align branch September 4, 2025 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants