Add sys.set_object_tags() and sys.get_object_tags() APIs for debugging and experimental Use #134819

corona10 · 2025-05-28T01:52:02Z

Background

CPython currently exposes several internal implementation details via APIs such as:

sys._defer_refcount
sys._is_immortal
sys._is_interned

These APIs leak implementation-specific details and create implicit expectations of cross-version (e.g., 3.13, 3.14, 3.15) and cross-implementation (e.g., CPython, PyPy, RustPython) compatibility, even though they are not part of any formal public API contract.

For example, other Python implementations may not support or emulate these features, and yet their presence in CPython can create unintentional backward compatibility burdens when new releases are made.

Proposal

To address this, I would like to propose introducing two weak introspection APIs in the sys module:

sys.set_tags(obj, ["defer_refcount", "tag2"]) -> None
sys.get_tags(obj) -> tuple

`sys.set_tags(obj, tags: Iterable[str]) -> None`

Sets optional "tags" on an object.
Tags are hints for the Python implementation and are not guaranteed to be applied or have any effect.
The implementation may accept or ignore any or all provided tags.
These tags are advisory only, intended primarily for debugging, experimentation, and tooling used by Python implementation developers.

`sys.get_tags(obj) -> tuple[str, ...]`

Returns the tags currently associated with the object.
These reflect only the tags actually recognized and retained by the interpreter.

For example:

sys.set_tags(o, ["defer_refcount", "tag2"])
print(sys.get_tags(o))  # May return: ('defer_refcount',)

If the object is already immortal due to previous operations, you might see:
```
sys.get_tags(o)  # May return: ('defer_refcount', 'immortal')
```
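The semantics described above can be sketched in pure Python. This is only an illustration, not the implementation in the linked PR: `SUPPORTED_TAGS`, `set_tags`, and `get_tags` are hypothetical module-level stand-ins for the proposed `sys` functions, and the id-keyed dictionary is an assumption made to keep the sketch short.

```python
from typing import Iterable

# Hypothetical: the tags this toy "interpreter" recognizes.
SUPPORTED_TAGS = frozenset({"defer_refcount", "immortal"})

# Toy storage keyed by id(obj). Each entry also keeps a reference to the
# object so its id cannot be reused while tags exist. A real runtime would
# more likely read these states from the object itself.
_tags: dict[int, tuple[object, set[str]]] = {}


def set_tags(obj: object, tags: Iterable[str]) -> None:
    """Record the recognized tags for obj and silently drop the rest."""
    accepted = {tag for tag in tags if tag in SUPPORTED_TAGS}
    if accepted:
        _, current = _tags.setdefault(id(obj), (obj, set()))
        current.update(accepted)


def get_tags(obj: object) -> tuple[str, ...]:
    """Return only the tags that were recognized and retained."""
    entry = _tags.get(id(obj))
    return tuple(sorted(entry[1])) if entry else ()


o = object()
set_tags(o, ["defer_refcount", "tag2"])
print(get_tags(o))  # ('defer_refcount',)
```

Here `"tag2"` is dropped because the toy runtime does not recognize it, which mirrors the "may accept or ignore" wording of the proposal.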

Goals and Non-Goals

Goals:

Provide a mechanism to annotate or mark objects for introspection/debugging.
Allow developers of Python implementations or advanced tools to experiment with internal object states in a controlled manner.

Non-Goals:

These APIs are not intended to be stable or relied upon for program behavior.
No tag is guaranteed to have any effect or to be preserved between runs, interpreter versions, or across implementations.

Documentation and Guarantees

We will clearly document that:

These APIs are for Python implementation developers only.
The presence or absence of any particular tag does not imply any behavioral guarantees.
Tags may be implementation-specific, and unsupported tags will be silently ignored.
It may be possible to document Python-specific tags somewhere, but we should note that they are not guaranteed across versions.

cc @ZeroIntensity @vstinner @Fidget-Spinner @colesbury

Linked PRs

gh-134819: Add sys.set_object_tags and sys.get_object_tags #135073


corona10 · 2025-05-28T01:54:50Z

FYI, I am even fine with sys._set_tags and sys._get_tags, but I believe this would be better than providing a separate Python API for each implementation.

corona10 · 2025-05-28T02:16:45Z

And please let me know if there are better names.

ZeroIntensity · 2025-05-28T12:36:44Z

I like the general idea, but I have a few notes/concerns:

Should set_tags be plural? I'd think that in most cases, you'd want to set one tag only.
set_tags seems too misleading if the interpreter is allowed to ignore it. If I see anything called "set", I'd expect it to actually set something upon calling it. How about request_tags?
get_tags/set_tags only covers implementation details for objects themselves. They won't work for experimental APIs that need parameters.

An alternative could be to properly expose unstable APIs like we do with PyUnstable in the C API. Maybe something like sys.unstable_defer_refcount, or an unstable module (from unstable.sys import defer_refcount) could work.

corona10 · 2025-05-28T12:51:21Z

An alternative could be to properly expose unstable APIs like we do with PyUnstable in the C API. Maybe something like sys.unstable_defer_refcount, or an unstable module (from unstable.sys import defer_refcount) could work.

I no longer like adding more such APIs. The basic idea of this API is to stop turning CPython-specific implementation details into APIs. Such APIs break other implementations and cause compatibility issues. unstable.sys will not solve the current situation.

Should set_tags be plural? I'd think that in most cases, you'd want to set one tag only.

Well, the API will not care whether the user adds multiple attributes or a single attribute anyway.

set_tags seems too misleading if the interpreter is allowed to ignore it. If I see anything called "set", I'd expect it to actually set something upon calling it. How about request_tags?

I don't care about the naming; I thought that get/set is the conventional naming. For me, this is a matter of documentation, and I still think that people should avoid using this API as much as possible.

get_tags/set_tags only covers implementation details for objects themselves. They won't work for experimental APIs that need parameters

Would you like to provide a concrete example? Currently, we only care about defer_refcount and immortal, so I didn't think about it. Well, we could change the signature of set_tags to be set_tag and make it receive parameters.

ZeroIntensity · 2025-05-28T13:08:13Z

I no longer like adding more such APIs. The basic idea of this API is to stop turning CPython-specific implementation details into APIs. Such APIs break other implementations and cause compatibility issues. unstable.sys will not solve the current situation.

I'm worried that get_tags isn't much better. If someone were to write something like this, it would not be portable to other implementations:

def do_something_to_constant(op):
    # Use immortality as a notion of constant-ness
    if "immortal" not in sys.get_tags(op):
        raise ValueError()

Couldn't other implementations just implement _is_interned or whatever as just return False?
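From the caller's side, the "just return False" idea can be approximated with a small shim. This is a sketch under assumptions: `is_interned` is a hypothetical helper name, and it treats a missing `sys._is_interned` the same as a runtime that answers `False`.

```python
import sys


def is_interned(s: str) -> bool:
    """Report interning where the runtime exposes it, else answer False."""
    # sys._is_interned is a private CPython helper; other runtimes and older
    # CPython versions may not have it, so look it up defensively.
    probe = getattr(sys, "_is_interned", None)
    if probe is None:
        return False
    return bool(probe(s))


print(is_interned(sys.intern("some_identifier")))
```

The printed value depends on the runtime, which is the portability point being debated here.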

Would you like to provide a concrete example?

What if we wanted to provide an API for object flags someday?

It's also not totally clear to me if get_tags/set_tags is supposed to cover general object implementation details (e.g., immortality and DRC), or something specific to a type (e.g., string interning).

corona10 · 2025-05-28T13:20:43Z

I'm worried that get_tags isn't much better. If someone were to write something like this, it would not be portable to other implementations:

The key point is where the focus lies. If you care about portability, then you shouldn’t rely on unstable or implementation-specific APIs, which may not exist in other versions or implementations. However, sys.get_tags itself will be available consistently across implementations and versions. As I mentioned earlier, we don’t guarantee the specific output — and if a third-party library depends on certain tags being present, that’s their responsibility.

Consider the case where we want to remove sys._is_immortal(). With sys.get_tags, we can simply stop returning the "immortal" tag — the code using it won’t break; only the implementation detail changes, which is exactly what we want. On the other hand, sys._is_immortal() is a different story: in some cases, we might have to keep the API even if we no longer want to support it.
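The difference between the two removal scenarios can be simulated with stand-in objects. Everything below is hypothetical: `runtime_a` and `runtime_b` are `SimpleNamespace` placeholders, not real `sys` modules, and the two checker functions only exist to show what a caller would observe.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for two future runtimes. "runtime_a" has dropped the
# dedicated helper entirely, while "runtime_b" keeps the tag-based API and
# simply stops reporting "immortal".
runtime_a = SimpleNamespace()
runtime_b = SimpleNamespace(get_tags=lambda obj: ())


def check_with_dedicated_api(runtime, obj) -> bool:
    # Raises AttributeError once the helper is removed.
    return runtime._is_immortal(obj)


def check_with_tags(runtime, obj) -> bool:
    # Keeps running; the answer just becomes False.
    return "immortal" in runtime.get_tags(obj)


try:
    check_with_dedicated_api(runtime_a, None)
except AttributeError as exc:
    print("dedicated API:", type(exc).__name__)  # dedicated API: AttributeError

print("tag API:", check_with_tags(runtime_b, None))  # tag API: False
```

The tag-based caller keeps running, but its result changes without any error, so code that depends on a particular tag still carries the responsibility mentioned above.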

ZeroIntensity · 2025-05-28T13:25:10Z

Consider the case where we want to remove sys._is_immortal(). With sys.get_tags, we can simply stop returning the "immortal" tag — the code using it won’t break; only the implementation detail changes, which is exactly what we want. On the other hand, sys._is_immortal() is a different story: in some cases, we might have to keep the API even if we no longer want to support it.

Ok, that makes sense.

The place where I'm getting a little tripped up is that the whole point of the _ prefix was that we could remove it in any version--it's supposed to be a private API, we just document it and thus shift the maintenance responsibility to users. I don't see it as much different than using a private method (prefixed with _). Why doesn't that work?

corona10 · 2025-05-28T13:26:14Z

What if we wanted to provide an API for object flags someday?

I'm open to making the API design more flexible, but we should still try to avoid exposing implementation details whenever possible. So, should we plan to support object flags in the future? The reason I mention this is that we can not cover all cases :)

How about `sys.set_tag(obj, tag: str, *, options: dict[str, Any] = {}) -> None`?

corona10 · 2025-05-28T13:30:19Z

The place where I'm getting a little tripped up is that the whole point of the _ prefix was that we could remove it in any version--it's supposed to be a private API, we just document it and thus shift the maintenance responsibility to users. I don't see it as much different than using a private method (prefixed with _). Why doesn't that work?

See: #134762 (comment), this is a real-world example.
There are also several alternative Python implementations, such as PyPy, GraalPython, and RustPython, which often copy parts of the CPython implementation and adapt them to their own runtimes. Introducing this API would help reduce their catch-up burden and make the CPython runtime less tied to specific implementation details :)

ZeroIntensity · 2025-05-28T13:37:34Z

I was under the impression that it'd be totally fine to remove sys._getframe, we just won't in practice because frames are exposed in other public APIs (e.g., inspect.currentframe). I think we might just need some additional rules on when something is private (or "unstable") and not.

vstinner · 2025-05-28T14:04:26Z

Please rename your API to get/set_object_tags(). get_tags() name is too generic: tags of what?

corona10 · 2025-05-28T14:05:25Z

Please rename your API to get/set_object_tags(). get_tags() name is too generic: tags of what?

Looks good!

vstinner · 2025-05-28T15:32:15Z

I would prefer a sys.set_tags(obj, defer_refcount=True) API. That way it would be possible to clear a hypothetical future tag using sys.set_tags(obj, future_tag=False), or to set values other than just True/False: sys.set_tags(obj, future_tag=123).

And get_tags() should return a dictionary with values, like: get_object_tags(obj) -> {'defer_refcount': True, 'interned': True}.

By the way, is it possible to mark a string as interned with your API? Something like: sys.set_tags(obj, interned=True). Does it fail with non-string objects?

I suppose that sys.set_tags(obj, immortal=True) cannot be implemented, or maybe it should work on immortal objects and fail on non-immortal objects?
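The keyword-style variant could be sketched as follows. The names and behavior are assumptions for illustration only: `SUPPORTED`, the id-keyed storage, and the choice to raise `TypeError` for `interned` on non-strings are one possible answer to the questions above, not something the thread settled.

```python
from typing import Any

# Hypothetical set of tags this toy runtime understands.
SUPPORTED = {"defer_refcount", "interned"}

_state: dict[int, dict[str, Any]] = {}
_keepalive: dict[int, object] = {}  # keeps ids stable while tags exist


def set_object_tags(obj: object, **tags: Any) -> None:
    """Keyword-style request; unsupported keywords are ignored here."""
    if "interned" in tags and not isinstance(obj, str):
        # One possible answer to "does it fail with non-string objects?"
        raise TypeError("the 'interned' tag only applies to str objects")
    accepted = {name: value for name, value in tags.items() if name in SUPPORTED}
    if accepted:
        _keepalive[id(obj)] = obj
        _state.setdefault(id(obj), {}).update(accepted)


def get_object_tags(obj: object) -> dict[str, Any]:
    """Return a copy of the tag values recorded for obj."""
    return dict(_state.get(id(obj), {}))


s = "example"
set_object_tags(s, defer_refcount=True, interned=True, future_tag=123)
print(get_object_tags(s))  # {'defer_refcount': True, 'interned': True}
set_object_tags(s, defer_refcount=False)
print(get_object_tags(s))  # {'defer_refcount': False, 'interned': True}
```

Because the sketch collects keywords with `**tags`, a tag that is later removed from `SUPPORTED` is ignored rather than rejected; an explicit keyword parameter would instead raise `TypeError` after removal.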

Fidget-Spinner · 2025-05-28T15:35:04Z

I would prefer sys.set_tags(obj, defer_refcount=True) API.

What do we do if we remove the tag in the future though? E.g. if one day we remove defer_refcount. Wouldn't that break the API?

vstinner · 2025-05-28T15:40:47Z

Should we include GC "tags" (related to the PyGC_Head structure) in get_object_tags()?

gc_tracked (bool): similar to gc.is_tracked(obj)
finalized (bool): similar to gc.is_finalized(obj)

Not sure about set_object_tags(). Should it be possible to track/untrack using set_object_tags()? It sounds dangerous, maybe don't allow that.
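For reference, the two GC states listed above are already readable through the public `gc` module. The helper below is a hypothetical read-only wrapper (`gc_tags` is not an existing function) that returns them under the tag names suggested here.

```python
import gc


def gc_tags(obj: object) -> dict[str, bool]:
    """Collect the two GC states mentioned above into one read-only mapping."""
    return {
        "gc_tracked": gc.is_tracked(obj),
        "finalized": gc.is_finalized(obj),
    }


print(gc_tags([]))  # a list is a container object, so CPython tracks it
print(gc_tags(42))  # an int is atomic and is not tracked
```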

And what about other low-level object attributes?

managed_dict (bool): test type(obj).tp_flags & Py_TPFLAGS_MANAGED_DICT
managed_weakref (bool): test type(obj).tp_flags & Py_TPFLAGS_MANAGED_WEAKREF
inline_values (bool): test type(obj).tp_flags & Py_TPFLAGS_INLINE_VALUES

Would it be interesting to expose these tags? They cannot be modified by set_object_tags().
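A read-only view of these type flags could be approximated from Python today. This is a sketch under stated assumptions: the bit positions are copied from CPython 3.13's `Include/object.h` as I understand it and may differ in other versions, and `type_flag_tags` is a hypothetical helper.

```python
# Bit positions as defined in CPython 3.13's Include/object.h. They are an
# assumption of this sketch and may differ in other versions or runtimes.
PY_TPFLAGS_INLINE_VALUES = 1 << 2
PY_TPFLAGS_MANAGED_WEAKREF = 1 << 3
PY_TPFLAGS_MANAGED_DICT = 1 << 4


def type_flag_tags(obj: object) -> dict[str, bool]:
    """Derive read-only tags from the flags of obj's type."""
    # type.__flags__ is how CPython exposes tp_flags to Python code; fall
    # back to 0 on runtimes that do not provide it.
    flags = getattr(type(obj), "__flags__", 0)
    return {
        "managed_dict": bool(flags & PY_TPFLAGS_MANAGED_DICT),
        "managed_weakref": bool(flags & PY_TPFLAGS_MANAGED_WEAKREF),
        "inline_values": bool(flags & PY_TPFLAGS_INLINE_VALUES),
    }


class Plain:
    pass


print(type_flag_tags(Plain()))
print(type_flag_tags(1))
```

The results vary by interpreter version, which is consistent with treating such tags as informational only.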

corona10 · 2025-05-28T16:06:40Z

Would it be interesting to expose these tags? They cannot be modified by set_object_tags().

Even if some tags cannot be set by set_object_tags(), get_object_tags() can still return them.
It doesn't need to be 1:1. It's just for checking object status.

corona10 · 2025-05-28T16:08:10Z

I would prefer sys.set_tags(obj, defer_refcount=True) API. So it would be possible to clear a hypothetical future tag using

I still prefer to use string-based tag since those keywords would be meaningless for other implementations.

corona10 · 2025-05-28T16:49:43Z

Should we include GC "tags" (related to the PyGC_Head structure) in get_object_tags()?

I am not sure about these, given that gc.is_tracked(obj) and gc.is_finalized(obj) are already public APIs and widely used.
Let's focus on internal implementation details first. If it turns out that some tags are used frequently all over the place, maybe we can promote them to dedicated public APIs like gc.is_tracked(obj) and gc.is_finalized(obj).

colesbury · 2025-05-28T17:50:25Z

I think we'd be better off with dedicated APIs for the different pieces of functionality. I think combining lots of functionality into a single Swiss army knife style API isn't good design: it makes it harder to use, harder to discover, harder to implement correctly, and harder to add new features. To quote Robert C. Martin: "Functions should do one thing. They should do it well. They should do it only."

I am not convinced that this style of design makes it any easier for implementers of alternate Python runtimes.

It is incredibly easy to support APIs like sys._defer_refcount, sys._is_immortal, sys._is_interned even if the runtime doesn't support deferred reference counting, immortality, and interning: just do nothing (for sys._defer_refcount) or return False, which is exactly what we would do in the default build for sys._defer_refcount.

On the other hand, the semantics of sys.set_tags and sys.get_tags is more confusing for both implementers and users: are unknown tags ignored? Okay, now it's easy to introduce bugs due to minor spelling errors in strings. Are unknown tags preserved between sys.set_tags and sys.get_tags? (My guess is no.)
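The spelling concern can be shown with a toy comparison. Both halves are hypothetical: `retained_tags` models the "ignore unknown tags" rule, and `dedicated` is a `SimpleNamespace` standing in for a namespace of dedicated functions.

```python
from types import SimpleNamespace

# Hypothetical set of tags the toy runtime recognizes.
SUPPORTED_TAGS = frozenset({"defer_refcount"})


def retained_tags(requested: list[str]) -> tuple[str, ...]:
    """Return the tags that would be kept; unknown ones vanish silently."""
    return tuple(tag for tag in requested if tag in SUPPORTED_TAGS)


# Hypothetical namespace of dedicated functions, one per feature.
dedicated = SimpleNamespace(defer_refcount=lambda obj: None)

print(retained_tags(["defer_refcount"]))  # ('defer_refcount',)
print(retained_tags(["defer_refcnt"]))    # (): the typo is silently dropped

try:
    dedicated.defer_refcnt(object())
except AttributeError as exc:
    print(type(exc).__name__)  # AttributeError
```

With string tags the misspelling produces no error at all, while the misspelled function name fails immediately.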

colesbury · 2025-05-28T17:54:11Z

To expand on my previous comment: if you want the APIs to be clearly labeled as experimental or implementation details, stick them in a namespace whose name makes that clear.

vstinner · 2025-05-28T18:04:14Z

I concur with @colesbury, I also prefer multiple functions rather than a single one (two in practice).

corona10 · 2025-05-28T22:39:32Z

I think we'd be better off with dedicated APIs for the different pieces of functionality. I
On the other hand, the semantics of sys.set_tags and sys.get_tags is more confusing for both implementers and users

In most cases this is the correct approach and a reasonable concern, but the intention of this API is not to recommend that people use it. Its users would be limited, and we would prefer to use it for our internal purposes, which lets us change anything without any burden.

corona10 · 2025-05-28T22:42:02Z

On the other hand, the semantics of sys.set_tags and sys.get_tags is more confusing for both implementers and users

Once the API is added, most implementers do not need to care about it. In most cases it will set nothing or return nothing.

As for users, I believe that they should not rely on this API, even for the current usage.

corona10 · 2025-05-28T23:07:28Z

Okay, now it's easy to introduce bugs due to minor spelling errors in strings.

Yeah, that’s true. But why should we care about people who rely on private APIs like sys._xxx?

If the current stance is that these sys._xxx APIs can be removed freely without documentation, that would make things much less stressful.
But when I look at this PR (#134762), I wonder, why are we trying to expose and document them?
Why are we shooting ourselves in the foot by making implementation details public?

corona10 · 2025-05-28T23:12:39Z

My motivation for this API is that if exposing implementation details is inevitable, then why aren't we exposing a sandbox API that can be easily managed, and nothing needs to be guaranteed?

corona10 mentioned this issue May 28, 2025

gh-134761: Use deferred reference counting for threading concurrency primitives #134762

Open

corona10 changed the title ~~Add sys.set_tags() and sys.get_tags() APIs for Debugging and Experimental Use~~ Add sys.set_tags() and sys.get_tags() APIs for debugging and experimental Use May 28, 2025

emmatyping added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 28, 2025

corona10 self-assigned this May 28, 2025

vstinner changed the title ~~Add sys.set_tags() and sys.get_tags() APIs for debugging and experimental Use~~ Add sys.set_object_tags() and sys.get_object_tags() APIs for debugging and experimental Use May 28, 2025

mostafaammer added this to lavitaconnect@MOSTAFAAMMER May 31, 2025

bedevere-app bot mentioned this issue Jun 3, 2025

gh-134819: Add sys.set_object_tags and sys.get_object_tags #135073

Open

corona10 added a commit to corona10/cpython that referenced this issue Jun 3, 2025

pythongh-134819: Add sys.set_object_tags and sys.get_object_tags

2cc543c
