Skip to content

Conversation

vortex73
Copy link
Contributor

@vortex73 vortex73 commented May 15, 2025

This PR details the implementation details of the proposed ABI lowering library.

  • Implementing a Shadow TypeSystem
  • Implement a bridge between Clang Qualtype to the shadow Typesystem
  • Integrate into Clang
  • Implement BPF ABIInfo
  • Implement SysV ABIInfo

Copy link

github-actions bot commented May 26, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@vortex73 vortex73 force-pushed the abi branch 2 times, most recently from 43855e6 to 114cab7 Compare May 26, 2025 21:51
@vortex73 vortex73 force-pushed the abi branch 2 times, most recently from 4410289 to 2832642 Compare June 1, 2025 16:47
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing still missing here is handling for C++ structs with base classes.

@makslevental makslevental self-requested a review June 2, 2025 17:22
@vortex73 vortex73 force-pushed the abi branch 2 times, most recently from 329556a to 0f045da Compare June 12, 2025 18:20
ABIFunctionInfoArgInfo(Type *T, ABIArgInfo A) : ABIType(T), ArgInfo(A) {}
};

class ABIFunctionInfo final
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having both FunctionABIInfo and ABIFunctionInfo is pretty confusing... do these need to be separate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

um, I've renamed it for the time being - if I feel it doesn't serve a purpose in a few days I'll put it somewhere else

@vortex73 vortex73 force-pushed the abi branch 2 times, most recently from 268c470 to 690068f Compare June 21, 2025 13:44
StructType *
ABITypeMapper::createStructFromFields(ArrayRef<abi::FieldInfo> Fields,
uint32_t NumFields, TypeSize Size,
Align Alignment, bool IsUnion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... this is tricky. I'm not entirely sure what to do here. For the purpose of mapping ABIArgInfo, we should never encounter unions. For the struct case, things are tricky. Mostly, I believe we just want LLVM structs as a means of holding multiple values. E.g. a StructType for Direct arguments will normally get unpacked into individual arguments. In this case, we really don't want any kind of padding fillers in it. I think the only case that cares about that is CoerceAndExpand, which has two struct types, one just with the elements (which will be passed as separate arguments) and one that has the actual layout (with correct alignment etc).

This kind of makes me wonder whether it would make sense to start by storing llvm types in ABIArgInfo to match what clang currently does. abi::Type is conceptually nicer, but also not quite the right representation for everything (e.g. because it can't really represent the "struct without layout" case well).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should never encounter unions

Oh yeah! My bad...

Also I'm assuming both times you referred to ABIArgInfo you were talking about abi::ABIArgInfo and not codegen::ABIArgInfo (should probably name that something slightly different sorry).

Given that assumption - I really don't think we need to bring llvm::Type into the library as the struct without layout case can be added to the ABI typesystem itself (add another type of abi::StructPacking or decide whether we need padding while converting). But agreed a lot of cleanup to be done here, I'll keep it in mind while writing the x86 target.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I'm assuming both times you referred to ABIArgInfo you were talking about abi::ABIArgInfo and not codegen::ABIArgInfo (should probably name that something slightly different sorry).

Yeah, I mean abi::ABIArgInfo. Btw, I wonder whether we should call these abi::ArgInfo etc, given that we already have abi in the namespace name...

Given that assumption - I really don't think we need to bring llvm::Type into the library as the struct without layout case can be added to the ABI typesystem itself (add another type of abi::StructPacking or decide whether we need padding while converting). But agreed a lot of cleanup to be done here, I'll keep it in mind while writing the x86 target.

Yeah, this is an option. I was mostly thinking about using llvm::Type as the path of least resistance to clang integration...

ABIFunctionInfoArgInfo(Type *T, ABIArgInfo A) : ABIType(T), ArgInfo(A) {}
};

class ABIFunctionInfo final
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

um, I've renamed it for the time being - if I feel it doesn't serve a purpose in a few days I'll put it somewhere else

StructType *
ABITypeMapper::createStructFromFields(ArrayRef<abi::FieldInfo> Fields,
uint32_t NumFields, TypeSize Size,
Align Alignment, bool IsUnion) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should never encounter unions

Oh yeah! My bad...

Also I'm assuming both times you referred to ABIArgInfo you were talking about abi::ABIArgInfo and not codegen::ABIArgInfo (should probably name that something slightly different sorry).

Given that assumption - I really don't think we need to bring llvm::Type into the library as the struct without layout case can be added to the ABI typesystem itself (add another type of abi::StructPacking or decide whether we need padding while converting). But agreed a lot of cleanup to be done here, I'll keep it in mind while writing the x86 target.

};

private:
TypeBuilder &TB;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note for the future: We probably need to do something to avoid leaking memory from these temporary types. We can probably checkpoint the type arena before computing ABIArgInfo and then restore the checkpoint once it's no longer needed.

}

if (!SeenNamedMember) {
SeenNamedMember = isNamedMember(Field);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... do we really need this named member logic? Looking at CGRecordLowering::lowerUnion(), it seems like the named members are only relevant in the !isZeroInitializable case, but you don't actually handle that and instead always override with the type of a named member.

What happens if you just drop this named member logic and only keep the alignment/size based logic below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I didn't think ABI rules would care about ZeroInit, so it prefers named members and upgrades to a better type if it has better align/size. I think without it there was some ABI mismatch from clang.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I'm asking is that I'd like to minimize the amount of "weird stuff" that the frontend has to pass in when creating types.

I think as implemented this logic is pretty weird and doesn't make a lot of sense -- I think it will only make a difference in the specific case where you first have an unnamed field with a "better" type, then a named field (with which we override) and then all fields (named or unnamed) with "worse" types. But not for example if we first have a named field and then a better type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, none of the tests fail if I drop the SeenNamedMember check, so I guess its fine :)


class ABITypeMapper {
public:
explicit ABITypeMapper(LLVMContext &Ctx, const DataLayout &DataLayout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a general note, in C++ ctors it's okay to use the same name for the ctor arg and the member. So something like DL(DL) works and is idiomatic.

if (!FieldType)
continue;

if (Field.IsBitField && Field.BitFieldWidth > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this code, I doubt it does something correct for bit fields. Bit field layout is complicated.

For what kind of structs does ABITypeMapper actually get used? Does it only ever get used on "coerced structs" in practice, or also other ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ABITypeMapper is used on everything.

Yeah was worried about Bitfield packing, but it didn't cause any blaring issues, so I assumed LLVM took care of it

@vortex73 vortex73 force-pushed the abi branch 3 times, most recently from b949795 to 565bf3b Compare August 28, 2025 21:42
@vortex73 vortex73 force-pushed the abi branch 2 times, most recently from 90500f7 to 7097478 Compare August 29, 2025 05:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants