Skip to content

gh-100926: Move ctype's pointers cache to StgInfo #131282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 75 commits into
base: main
Choose a base branch
from

Conversation

sergey-miryanov
Copy link
Contributor

@sergey-miryanov sergey-miryanov commented Mar 15, 2025

This is a bit straightfoward.
I keep the global cache for manually registered pairs of pointers and classes, particularly for comtypes (gh-124520, PyCSimpleTypeAsMetaclassTest).

@picnixz picnixz requested a review from encukou March 16, 2025 16:44
sergey-miryanov and others added 2 commits March 17, 2025 12:21
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
@encukou encukou removed the skip news label Mar 17, 2025
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this needs a NEWS blurb. I'd even add a What's New entry -- _pointer_type_cache is nominally internal API, but it's been there for a long time and people are using it.

@sergey-miryanov
Copy link
Contributor Author

IMO, this needs a NEWS blurb. I'd even add a What's New entry -- _pointer_type_cache is nominally internal API, but it's been there for a long time and people are using it.

Thanks! I'll add both entries later today.

@sergey-miryanov
Copy link
Contributor Author

I have added news and whatsnew entries, but I believe we should update them after we decided what to do with _ctypes_ptrtype_cache behaviour.

@encukou
Copy link
Member

encukou commented Mar 18, 2025

As far as I can see, this numeric ID key is only used for a check, and I think that it's OK to remove a check for obsolete functionality from a deprecated function.

The feature is "incomplete pointers" -- ones that were created with a name, but weren't assigned an underlying type yet. See test_ctypes.test_incomplete.py for the intended usage. That test references "the tutorial", but the current tutorial doesn't include this -- instead, you can create incomplete structs for this use case.
Back when ctypes was an independent project, this started working in ctypes 0.6.3 (2003) and was made obsolete in 0.9.5 (2005). ctypes was added to Python 2.5 (2006), without SetPointerType mentioned in docs.


What I'm a bit worried about is the duplication between _pointer_type_cache and info->pointer_type. Which one should have priority? When one is updated, should the other one be updated too?
What kind of relationship is expected and/or guaranteed are between info->proto (ptr_type._type_) and info->pointer_type?

Should we expose the pointer type as a Python attribute?
If we do, with a setter to allow registration, could _pointer_type_cache be turned into a deprecated proxy for that? Something like:

        class PointerTypeCache:
            def __setitem__(self, cls, pointer_type):
                cls.pointer_type = pointer_type

            def __getitem__(self, cls):
                try:
                    return cls.pointer_type
                except AttributeError:
                    raise KeyError(cls)

            def __contains__(self, cls):
                return hasattr(cls, 'pointer_type')

        _pointer_type_cache = PointerTypeCache()

Would having all the info in one place be worth that?

Or am I overthinking it, and we should fix the immediate issue and merge this PR as is?
Entirely possible, but I'd like to have solid understanding first.

@sergey-miryanov
Copy link
Contributor Author

What I'm a bit worried about is the duplication between _pointer_type_cache and info->pointer_type. Which one should have priority? When one is updated, should the other one be updated too? What kind of relationship is expected and/or guaranteed are between info->proto (ptr_type._type_) and info->pointer_type?

AFAIU, if we introduce info->pointer_type then the only use for _pointer_type_cache is a manual registration like for comtypes, where POINTER for non-ctypes type created and registered in the cache. And for incomplete types as well. So, since we create POINTER type from our factory, it doesn't matter how _pointer_type_cache and info->pointer_type are related. They store different data. Also, I expect that ctypes POINTER types will not be registered in _pointer_type_cache and I see the only way to control this is to deprecate _pointer_type_cache and add a registration function, that should check where to store the POINTER type (like register_pointer_type(type, pointer_type) and if type is a non-ctypes then register their pointer in _pointer_type_cache, also same for incomplete types). So, I expect no regular ctypes pointer types will be stored in _pointer_type_cache.

I'm not that familiar with info->proto and ptr_type._type_. I need to dig deeper to answer this.

Should we expose the pointer type as a Python attribute? If we do, with a setter to allow registration, could _pointer_type_cache be turned into a deprecated proxy for that? Something like:

If we allow to poison regular Python classes with our pointer_type attribute, then it is worth idea. But this is will not work for incomplete types (cause cls will be str for all of them), we should strictly check that cls is not a builtin type and so on. I'm not sure it will be an easy way to go :)

Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
@neonene
Copy link
Contributor

neonene commented Mar 18, 2025

Just in my gut feeling, but would the following make sense?

  • Enhance PyCPointerType.set_type() (currently used in the deprecated SetPointerType()) to store pointer_type in stginfo.

  • Make _pointer_type_cache a dict subclass, where the __setitem__() uses the set_type(). The dict would be like {None: c_void_p}.

@sergey-miryanov
Copy link
Contributor Author

Just in my gut feeling, but would the following make sense?

  • Enhance PyCPointerType.set_type() (currently used in the deprecated SetPointerType()) to store pointer_type in stginfo.
  • Make _pointer_type_cache a dict subclass, where the __setitem__() uses the set_type(). The dict would be like {None: c_void_p}.

Yeah, this is like @encukou proposed for PointerTypeCache with addition of mapping interface that would be useful for introspection for example. But please see my reply about incomplete types and non-ctypes types above.

@encukou
Copy link
Member

encukou commented Mar 19, 2025

As far as I can see, only three kinds of keys are acceptable in _pointer_type_cache:

  • a ctypes type, which do have stginfo. Note that this includes comtypes (those inherit from ctypes)
  • an integer -- the id of an incomplete pointer. AFAICS, such an entry in _pointer_type_cache is only used as a “flag” to allow SetPointerType to work; this info might as well be stored in stginfo instead. (I imagine it's equivalent to info->proto == NULL, but I haven't checked for further complications.)
  • None, a special case that can be handled elsewhere

Is there a use case for putting something else in _pointer_type_cache? If we can't find one, let's not support it. We don't want to break comtypes and such (i.e. users who don't have another way of doing what they need), but on the other hand, _pointer_type_cache is an internal implementation detail that's subject to change.

Make _pointer_type_cache a dict subclass

A dict-like object, please. Subclassing dict is too fragile, plus this proxy this can't support iteration.


There's one more aspect to think about: the registration that comtypes does isn't really the right thing for comtypes to do.

comtypes needs to add some functionality to all of its types, so it defines subclasses of the ctypes (meta)classes. But, it also wants to add some functionality to pointers to its types, and pointers are normally created on demand.
So, for every type ctypes defines, it also defines a pointer type, and registers that. It only does that for one “level” though: a pointer to a pointer to a comtype is a plain old ctypes pointer. comtypes could register another level, but not infintely many of them.
IMO, the proper solution here would be to add a make_pointer_type method to the metaclasses, which comtypes could override, so that its pointer types are created on demand using the regular mechanism.
This would make registration unnecessary for comtypes.

I can't rule out another use case for registration, but I don't see one. Perhaps we can deprecate _pointer_type_cache and plan to remove it entirely unless someone speaks up.
(Deprecation/removal would also be easier if it becomes a simple proxy for set_type & co., rather than a source of truth.)

@sergey-miryanov
Copy link
Contributor Author

I'm very new to the internals of ctypes, but I agree with three of the use cases of _pointer_type_cache that you mentioned.
One comment: I may not have checked correctly. I remember that when I analyzed PyCSimpleTypeAsMetaclassTest, I saw that a non-ctypes cls was passed to POINTER.
As far as I can tell, this is because the code ptr_bases = (self, POINTER(bases[0]) called with CtBase that hasn't StgInfo. I'm not sure that I understand how we can replace this with regular pointer type creation and avoid registration.

@encukou
Copy link
Member

encukou commented Mar 19, 2025

Ah! Looks like you're correct, thanks!

I'm not on a Windows box now to check.
I believe CtBase mirrors comtypes' IUnknown, which has an interesting comment:

        `IUnknown` behaves as a ctypes type, and `POINTER` can take it.
        This behavior is defined by some metaclasses in runtime.

So, IUnknown is a ctypes type in spirit, but not in reality :)

We could expose info->pointer_type as a reserved Python attribute __pointer_type__, and for classes without stginfo, look up the attribute. Do you think that's worth pursuing?
I don't think we need to worry builtin types (and others that disable setting attributes).

Some concerns about having _pointer_type_cache store only the manually registered types:

  • lookups in _pointer_type_cache, as a way to check if a pointer type is assigned (without creating the type if it's not), stop working.
  • iteration over _pointer_type_cache will give fewer results.

But, given that it is an internal implementation detail, perhaps that's acceptable.

@sergey-miryanov
Copy link
Contributor Author

@encukou Thanks for review. I have fixed your suggestions. Please take a look.

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sent a few more suggestions as a PR: sergey-miryanov#1

  • cache old-style incomplete types, with a "fallback cache" used when __pointer_type__ is unavailable
  • allow setting __pointer_type__ to arbitrary values (_ctypes doesn't care)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
@neonene
Copy link
Contributor

neonene commented May 2, 2025

I sent a few more suggestions as a PR

For me, test_pointers.py on the PR sometimes fails. PyCPointerType.set_type() can update a pointer type's attrdict through the _type_ key, which is not always reflected when accessing the attribute.

@encukou
Copy link
Member

encukou commented May 2, 2025

Looks like that's a 3.13 regression. I hope to fix it later today.

@sergey-miryanov
Copy link
Contributor Author

As I wrote yesterday this should fix it sergey-miryanov@aa44997

@encukou
Copy link
Member

encukou commented May 2, 2025

PyType_Modified works now, but setattr is safer; see #133292 (separate PR since it should be backported)

@sergey-miryanov
Copy link
Contributor Author

PyType_Modified works now, but setattr is safer; see #133292 (separate PR since it should be backported)

Thanks! I'll keep it in mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants