-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-99633: Add context manager support to contextvars.Context
#99634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks interesting :)
@threading_helper.requires_working_threading() | ||
def test_context_manager_2(self): | ||
ctx = contextvars.copy_context() | ||
thread = threading.Thread(target=ctx.__enter__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it won't hurt to add a thread-safety test, something like test_context_threads_1
. How hard would it be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand what test_context_threads_1
is meant to test. Maybe it's testing that each thread automatically gets its own Context instance? If threads didn't automatically get their own context then that test would fail with high probability, I think.
If that test is meant to test internal state data races then I don't understand how it accomplishes that goal, as sleeping for even 1ms significantly drops the probability of losing a data race.
It would be great to write a test that guarantees that concurrent Context.__enter__()
and/or Context.run()
calls on the same Context object will result in exactly one successful call, and RuntimeError
exceptions for the other calls, without any internal state corruption due to data races. I'm not sure how to write such a test.
However, I'm not sure it's worth adding such a test in this PR:
- The possibility of concurrent enters is not new to this PR (one can call
Context.run()
concurrently today). - This PR doesn't make concurrent enters more likely.
- Users are unlikely to encounter a race because Context objects are not meant to be used across threads. (There is no way to hand off ownership of an entered Context to another thread, and exiting a Context from a different thread will raise.)
- I think the test would fail because
_PyContext_Enter
does not atomically check and setctx->ctx_entered
. Fixing that feels out of scope for this PR.
It might be worth addressing thread safety in another PR, but as far as I know nobody has complained so far, so maybe it's better to just leave it alone.
45a8327
to
eb110b6
Compare
eb110b6
to
f5ef259
Compare
442d790
to
b21b76f
Compare
Before merging it, I'd like to have a look (probably tomorrow). |
@cjw296 no, I don't have any concerns :) |
@picnixz - great! I'd love more eyes on making sure this really has the same effect as |
Thanks for the review, @picnixz! I'll push a revision probably tomorrow. |
14f0277
to
248d2d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling this will not work at all with async/await. If that's the case then this API is a no-go.
Please add a series of async/await tests with multiple context switches and contexts to show that this primitive is composable.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
248d2d8
to
0678535
Compare
You are correct, @1st1. I left a comment in #99634 with more details; let's continue the discussion there. I'm going to mark this PR as draft until consensus is reached on a solution. |
…back I believe that the value of a simpler API (and defense against poorly written callbacks) outweighs the cost of backing up and restoring the thread's exception state.
The exception is ignored so change the return type from `int` to `void` to discourage callbacks from raising an exception in the first place.
0678535
to
f3a5f93
Compare
The PyContext struct is not intended to be public, and users of the API don't need anything more specific than PyObject. Also see pythongh-78943.
Users want to know when the current context switches to a different context object. Right now this happens when and only when a context is entered or exited, so the enter and exit events are synonymous with "switched". However, if the changes proposed for pythongh-99633 are implemented, the current context will also switch for reasons other than context enter or exit. Since users actually care about context switches and not enter or exit, replace the enter and exit events with a single switched event. The former exit event was emitted just before exiting the context. The new switched event is emitted after the context is exited to match the semantics users expect of an event with a past-tense name. If users need the ability to clean up before the switch takes effect, another event type can be added in the future. It is not added here because YAGNI. I skipped 0 in the enum as a matter of practice. Skipping 0 makes it easier to troubleshoot when code forgets to set zeroed memory, and it aligns with best practices for other tools (e.g., https://protobuf.dev/programming-guides/dos-donts/#unspecified-enum).
Change `PyObject *` to/from `PyContext *` to reduce the amount of casting and improve readability.
…rowed Improve readability by moving destination assignment next to source reset, and comment that the ref is stolen.
No public API provides access to the current context yet (only a new copy), however: * This is good defensive practice. * This improves code readability. * Context watchers are now notified about the initial context. * A planned future commit will make it possible for users to access the thread's initial context object. Without this change, users would be able to enter the context a second time, causing a cycle in the context stack.
This will make it easier to refactor `_PyContext_Enter` and `_PyContext_Exit` for a planned feature.
Add a new `_context` property to generator (and coroutine) objects to get/set the "current context" that is observed by (and only by) the generator and the functions it calls. When `generator._context` is set to `None` (the default), the generator is called a "dependent generator". It behaves the same as it always has: the "current context" observed by the generator is the thread's context. This means that the observed context can change arbitrarily during a `yield`; the generator *depends* on the sender to enter the appropriate context before it calls `generator.send`. When `generator._context` is set to a `contextvars.Context` object, the generator is called an "independent generator". It acts more like a separate thread with its own independent context stack. The value of `_context` is the head of that independent stack. Whenever the generator starts or resumes execution (via `generator.send`), the current context temporarily becomes the generator's associated context. When the generator yields, returns, or propagates an exception, the current context reverts back to what it was before. The generator's context is *independent* from the sender's context. If an independent generator calls `contextvars.Context.run`, then the value of the `_context` property will (temporarily) change to the newly entered context. If an independent generator sends a value to a second independent generator, the second generator's context will shadow the first generator's context until the second generator returns or yields. The `generator._context` property is private for now until experience and feedback is collected. Nothing is using this yet, but that will change in future commits. Motivations for this change: * First, this change makes it possible for a future commit to add context manager support to `contextvars.Context`. A `yield` after entering a context causes execution to leave the generator with a different context at the top of the context stack than when execution started. Swapping contexts in and out when execution suspends and resumes can only be done by the generator itself. * Second, this paves the way for a public API that will enable developers to guarantee that the context remains consistent throughout a generator's execution. Right now the context can change arbitrarily during a `yield`, which can lead to subtle bugs that are difficult to root cause. (Coroutines run by an asyncio event loop do not suffer from this same problem because asyncio manually sets the context each time it executes a step of an asynchronous function. See the call to `contextvars.Context.run` in `asyncio.Handle._run`.) * Finally, this makes it possible to move the responsibility for activating an async coroutine's context from the event loop to the coroutine, where it more naturally belongs (alongside the rest of the execution state such as local variable bindings and the instruction pointer). This ensures consistent behavior between different event loop implementations. Example: ```python import contextvars cvar = contextvars.ContextVar('cvar', default='initial') def make_generator(): yield cvar.get() yield cvar.get() yield cvar.get() yield cvar.get() cvar.set('updated by generator') yield cvar.get() gen = make_generator() print('1.', next(gen)) def callback(): cvar.set('updated by callback') print('2.', next(gen)) contextvars.copy_context().run(callback) print('3.', next(gen)) cvar.set('updated at top level') print('4.', next(gen)) print('5.', next(gen)) print('6.', cvar.get()) ``` The above prints: ``` 1. initial 2. updated by callback 3. initial 4. updated at top level 5. updated by generator 6. updated by generator ``` Now add the following line after the creation of the generator: ```python gen._context = contextvars.copy_context() ``` With that change, the script now outputs: ``` 1. initial 2. initial 3. initial 4. initial 5. updated by generator 6. updated by top level ```
The functions `_PyGen_ActivateContext` and `_PyGen_DeactivateContext` are called every time a value or exception is sent to a coroutine. These functions are no-ops for dependent coroutines (coroutines without their own independent context stack). Coroutines are dependent by default, and the vast majority of performance-sensitive coroutines are expected to be dependent, so move the check that determines whether the coroutine is dependent or independent to an inline function to speed up send calls.
f3a5f93
to
01afd3c
Compare
contextvars.Context
#99633