-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[libclang/python] Add missing enum variants #143264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang Author: Jannick Kremer (DeinAlptraum) ChangesAdd tests to ensure that all C-enum variants are defined on Python side. Full diff: https://github.com/llvm/llvm-project/pull/143264.diff 1 Files Affected:
diff --git a/clang/bindings/python/tests/cindex/test_enums.py b/clang/bindings/python/tests/cindex/test_enums.py
index 9e7f44fcf7867..54c0c77e4df29 100644
--- a/clang/bindings/python/tests/cindex/test_enums.py
+++ b/clang/bindings/python/tests/cindex/test_enums.py
@@ -1,4 +1,5 @@
import unittest
+from pathlib import Path
from clang.cindex import (
AccessSpecifier,
@@ -12,6 +13,7 @@
TemplateArgumentKind,
TLSKind,
TokenKind,
+ TranslationUnit,
TypeKind,
)
@@ -44,8 +46,53 @@ def test_from_id(self):
def test_duplicate_ids(self):
"""Check that no two kinds have the same id"""
- # for enum in self.enums:
for enum in self.enums:
num_declared_variants = len(enum._member_map_.keys())
num_unique_variants = len(list(enum))
self.assertEqual(num_declared_variants, num_unique_variants)
+
+ def test_all_variants(self):
+ """Check that all libclang enum values are also defined in cindex"""
+ cenum_to_pythonenum = {
+ "CX_CXXAccessSpecifier": AccessSpecifier,
+ "CXAvailabilityKind": AvailabilityKind,
+ "CXBinaryOperatorKind": BinaryOperator,
+ "CXCursorKind": CursorKind,
+ "CXCursor_ExceptionSpecificationKind": ExceptionSpecificationKind,
+ "CXLinkageKind": LinkageKind,
+ "CXRefQualifierKind": RefQualifierKind,
+ "CX_StorageClass": StorageClass,
+ "CXTemplateArgumentKind": TemplateArgumentKind,
+ "CXTLSKind": TLSKind,
+ "CXTokenKind": TokenKind,
+ "CXTypeKind": TypeKind,
+ }
+
+ indexheader = (
+ Path(__file__).parent.parent.parent.parent.parent
+ / "include/clang-c/Index.h"
+ )
+ tu = TranslationUnit.from_source(indexheader, ["-x", "c++"])
+
+ enum_variant_map = {}
+ # For all enums in self.enums, extract all enum variants defined in Index.h
+ for cursor in tu.cursor.walk_preorder():
+ type_class = cenum_to_pythonenum.get(cursor.type.spelling)
+ if (
+ cursor.kind == CursorKind.ENUM_CONSTANT_DECL
+ and type_class in self.enums
+ ):
+ if type_class not in enum_variant_map:
+ enum_variant_map[type_class] = []
+ enum_variant_map[type_class].append(cursor.enum_value)
+
+ for enum in self.enums:
+ with self.subTest(enum):
+ python_kinds = set([kind.value for kind in enum])
+ c_kinds = set(enum_variant_map[enum])
+ missing_python_kinds = c_kinds - python_kinds
+ self.assertEqual(
+ missing_python_kinds,
+ set(),
+ f"Please ensure these variants are defined inside {enum} in cindex.py.",
+ )
|
@Endilll this adds the missing enum variants, and also adds a test using libclang, parsing This uses only the numeric enum values for comparisons since this is easier to implement (especially because some variants have aliases on the C-side) Sidenote: I've noticed that a few enum variants have inconsistent namings, e.g. all Python enum variants are fully uppercase, execpt for This is probably by far our slowest test, responsible for about 20% of the entire test suite's runtime, but since all tests together take like 500ms that's acceptable imo |
Add tests to ensure that all C-enum variants are defined on Python side.
808ea38
to
fc97cbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry it took me so long to get to this
for enum in self.enums: | ||
num_declared_variants = len(enum._member_map_.keys()) | ||
num_unique_variants = len(list(enum)) | ||
self.assertEqual(num_declared_variants, num_unique_variants) | ||
|
||
def test_all_variants(self): | ||
"""Check that all libclang enum values are also defined in cindex""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Check that all libclang enum values are also defined in cindex""" | |
"""Check that all libclang enum values are also defined in cindex.py""" |
USHORTACCUM = 36 | ||
UACCUM = 37 | ||
ULONGACCUM = 38 | ||
BFLOAT16 = 39 | ||
IBM128 = 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see constants that correspond to CXType_FirstBuiltin
and CXType_LastBuiltin
. Is this intentional?
Path(__file__).parent.parent.parent.parent.parent | ||
/ "include/clang-c/Index.h" | ||
) | ||
tu = TranslationUnit.from_source(indexheader, ["-x", "c++"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it would make any practical difference, but maybe we still should parse C header in C language mode to make it less surprising
"CXCursor_ExceptionSpecificationKind": ExceptionSpecificationKind, | ||
"CXLinkageKind": LinkageKind, | ||
"CXRefQualifierKind": RefQualifierKind, | ||
"CX_StorageClass": StorageClass, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keys in this dictionary seem to be sorted lexicographically, except for this one. I'd expect it to go second instead of being put in the middle of the list. The underscore after CX
is another example of consistent naming...
enum_variant_map = {} | ||
# For all enums in self.enums, extract all enum variants defined in Index.h | ||
for cursor in tu.cursor.walk_preorder(): | ||
type_class = cenum_to_pythonenum.get(cursor.type.spelling) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_class = cenum_to_pythonenum.get(cursor.type.spelling) | |
python_enum = cenum_to_pythonenum.get(cursor.type.spelling) |
Otherwise it's not clear why cenum_to_pythonenum
gives you type class, which for Clang contributors usually mean something else (the kind of AST Type
node).
type_class = cenum_to_pythonenum.get(cursor.type.spelling) | ||
if ( | ||
cursor.kind == CursorKind.ENUM_CONSTANT_DECL | ||
and type_class in self.enums |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that (I think) all enums inherit from BaseEnumeration
, can we replace the registry of enums that contributors and reviewers are prone to forget about, and inspect the module instead?
This code ignores enums that are present in Index.h
but not in self.enums
. I'd prefer to have a test that fails if libclang and Python wrapper are out of sync on declared enums.
with self.subTest(enum): | ||
python_kinds = set([kind.value for kind in enum]) | ||
c_kinds = set(enum_variant_map[enum]) | ||
missing_python_kinds = c_kinds - python_kinds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you resort to comparing the number of enumerators, because we can't establish mapping between individual enumerators by name? Having names of missing enumerators would make test failure even more actionable.
Ideally I'd have enumerators with both correct and incorrect spelling, mark incorrect as deprecated, and phase them out, but |
Add tests to ensure that all C-enum variants are defined on Python side.