Skip to content

gh-99633: Add context manager support to contextvars.Context #99634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

rhansen
Copy link
Contributor

@rhansen rhansen commented Nov 21, 2022

@ghost
Copy link

ghost commented Nov 21, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks interesting :)

@threading_helper.requires_working_threading()
def test_context_manager_2(self):
ctx = contextvars.copy_context()
thread = threading.Thread(target=ctx.__enter__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it won't hurt to add a thread-safety test, something like test_context_threads_1. How hard would it be?

Copy link
Contributor Author

@rhansen rhansen Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what test_context_threads_1 is meant to test. Maybe it's testing that each thread automatically gets its own Context instance? If threads didn't automatically get their own context then that test would fail with high probability, I think.

If that test is meant to test internal state data races then I don't understand how it accomplishes that goal, as sleeping for even 1ms significantly drops the probability of losing a data race.

It would be great to write a test that guarantees that concurrent Context.__enter__() and/or Context.run() calls on the same Context object will result in exactly one successful call, and RuntimeError exceptions for the other calls, without any internal state corruption due to data races. I'm not sure how to write such a test.

However, I'm not sure it's worth adding such a test in this PR:

  • The possibility of concurrent enters is not new to this PR (one can call Context.run() concurrently today).
  • This PR doesn't make concurrent enters more likely.
  • Users are unlikely to encounter a race because Context objects are not meant to be used across threads. (There is no way to hand off ownership of an entered Context to another thread, and exiting a Context from a different thread will raise.)
  • I think the test would fail because _PyContext_Enter does not atomically check and set ctx->ctx_entered. Fixing that feels out of scope for this PR.

It might be worth addressing thread safety in another PR, but as far as I know nobody has complained so far, so maybe it's better to just leave it alone.

@rhansen rhansen force-pushed the contextvars-context-manager branch from 45a8327 to eb110b6 Compare November 21, 2022 22:00
@rhansen rhansen force-pushed the contextvars-context-manager branch from eb110b6 to f5ef259 Compare February 2, 2023 06:58
@rhansen rhansen force-pushed the contextvars-context-manager branch 2 times, most recently from 442d790 to b21b76f Compare July 10, 2024 22:23
@cjw296
Copy link
Contributor

cjw296 commented Sep 6, 2024

@sobolevn - do you still have outstanding concerns?

I think we'll have to assume @1st1 is not going to get to this PR and it would be great to have it landed.

@picnixz
Copy link
Member

picnixz commented Sep 6, 2024

Before merging it, I'd like to have a look (probably tomorrow).

@picnixz picnixz self-requested a review September 6, 2024 17:38
@sobolevn
Copy link
Member

sobolevn commented Sep 6, 2024

@cjw296 no, I don't have any concerns :)
Please, take this over!

@cjw296
Copy link
Contributor

cjw296 commented Sep 6, 2024

@picnixz - great! I'd love more eyes on making sure this really has the same effect as copy_context() and then having the body on the managed context behave equivalently to being within a ctx.run()!

@rhansen
Copy link
Contributor Author

rhansen commented Sep 8, 2024

Thanks for the review, @picnixz! I'll push a revision probably tomorrow.

@rhansen rhansen force-pushed the contextvars-context-manager branch 2 times, most recently from 14f0277 to 248d2d8 Compare September 8, 2024 22:25
@rhansen rhansen requested a review from picnixz September 8, 2024 22:30
Copy link
Member

@1st1 1st1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling this will not work at all with async/await. If that's the case then this API is a no-go.

Please add a series of async/await tests with multiple context switches and contexts to show that this primitive is composable.

@bedevere-app
Copy link

bedevere-app bot commented Sep 10, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@rhansen rhansen force-pushed the contextvars-context-manager branch from 248d2d8 to 0678535 Compare September 14, 2024 08:33
@rhansen
Copy link
Contributor Author

rhansen commented Sep 14, 2024

I have a feeling this will not work at all with async/await.

You are correct, @1st1. I left a comment in #99634 with more details; let's continue the discussion there. I'm going to mark this PR as draft until consensus is reached on a solution.

@rhansen rhansen marked this pull request as draft September 14, 2024 08:36
…back

I believe that the value of a simpler API (and defense against poorly
written callbacks) outweighs the cost of backing up and restoring the
thread's exception state.
The exception is ignored so change the return type from `int` to
`void` to discourage callbacks from raising an exception in the first
place.
@rhansen rhansen force-pushed the contextvars-context-manager branch from 0678535 to f3a5f93 Compare October 7, 2024 23:10
The PyContext struct is not intended to be public, and users of the
API don't need anything more specific than PyObject.  Also see
pythongh-78943.
Users want to know when the current context switches to a different
context object.  Right now this happens when and only when a context
is entered or exited, so the enter and exit events are synonymous with
"switched".  However, if the changes proposed for pythongh-99633 are
implemented, the current context will also switch for reasons other
than context enter or exit.  Since users actually care about context
switches and not enter or exit, replace the enter and exit events with
a single switched event.

The former exit event was emitted just before exiting the context.
The new switched event is emitted after the context is exited to match
the semantics users expect of an event with a past-tense name.  If
users need the ability to clean up before the switch takes effect,
another event type can be added in the future.  It is not added here
because YAGNI.

I skipped 0 in the enum as a matter of practice.  Skipping 0 makes it
easier to troubleshoot when code forgets to set zeroed memory, and it
aligns with best practices for other tools (e.g.,
https://protobuf.dev/programming-guides/dos-donts/#unspecified-enum).
Change `PyObject *` to/from `PyContext *` to reduce the amount of
casting and improve readability.
…rowed

Improve readability by moving destination assignment next to source
reset, and comment that the ref is stolen.
No public API provides access to the current context yet (only a new
copy), however:

  * This is good defensive practice.
  * This improves code readability.
  * Context watchers are now notified about the initial context.
  * A planned future commit will make it possible for users to access
    the thread's initial context object.  Without this change, users
    would be able to enter the context a second time, causing a cycle
    in the context stack.
This will make it easier to refactor `_PyContext_Enter` and
`_PyContext_Exit` for a planned feature.
Add a new `_context` property to generator (and coroutine) objects to
get/set the "current context" that is observed by (and only by) the
generator and the functions it calls.

When `generator._context` is set to `None` (the default), the
generator is called a "dependent generator".  It behaves the same as
it always has: the "current context" observed by the generator is the
thread's context.  This means that the observed context can change
arbitrarily during a `yield`; the generator *depends* on the sender to
enter the appropriate context before it calls `generator.send`.

When `generator._context` is set to a `contextvars.Context` object,
the generator is called an "independent generator".  It acts more like
a separate thread with its own independent context stack.  The value
of `_context` is the head of that independent stack.  Whenever the
generator starts or resumes execution (via `generator.send`), the
current context temporarily becomes the generator's associated
context.  When the generator yields, returns, or propagates an
exception, the current context reverts back to what it was before.
The generator's context is *independent* from the sender's context.

If an independent generator calls `contextvars.Context.run`, then the
value of the `_context` property will (temporarily) change to the
newly entered context.

If an independent generator sends a value to a second independent
generator, the second generator's context will shadow the first
generator's context until the second generator returns or yields.

The `generator._context` property is private for now until experience
and feedback is collected.  Nothing is using this yet, but that will
change in future commits.

Motivations for this change:

  * First, this change makes it possible for a future commit to add
    context manager support to `contextvars.Context`.  A `yield` after
    entering a context causes execution to leave the generator with a
    different context at the top of the context stack than when
    execution started.  Swapping contexts in and out when execution
    suspends and resumes can only be done by the generator itself.

  * Second, this paves the way for a public API that will enable
    developers to guarantee that the context remains consistent
    throughout a generator's execution.  Right now the context can
    change arbitrarily during a `yield`, which can lead to subtle bugs
    that are difficult to root cause.  (Coroutines run by an asyncio
    event loop do not suffer from this same problem because asyncio
    manually sets the context each time it executes a step of an
    asynchronous function.  See the call to `contextvars.Context.run`
    in `asyncio.Handle._run`.)

  * Finally, this makes it possible to move the responsibility for
    activating an async coroutine's context from the event loop to the
    coroutine, where it more naturally belongs (alongside the rest of
    the execution state such as local variable bindings and the
    instruction pointer).  This ensures consistent behavior between
    different event loop implementations.

Example:

```python
import contextvars

cvar = contextvars.ContextVar('cvar', default='initial')

def make_generator():
    yield cvar.get()
    yield cvar.get()
    yield cvar.get()
    yield cvar.get()
    cvar.set('updated by generator')
    yield cvar.get()

gen = make_generator()
print('1.', next(gen))

def callback():
    cvar.set('updated by callback')
    print('2.', next(gen))

contextvars.copy_context().run(callback)
print('3.', next(gen))
cvar.set('updated at top level')
print('4.', next(gen))
print('5.', next(gen))
print('6.', cvar.get())
```

The above prints:

```
1. initial
2. updated by callback
3. initial
4. updated at top level
5. updated by generator
6. updated by generator
```

Now add the following line after the creation of the generator:

```python
gen._context = contextvars.copy_context()
```

With that change, the script now outputs:

```
1. initial
2. initial
3. initial
4. initial
5. updated by generator
6. updated by top level
```
The functions `_PyGen_ActivateContext` and `_PyGen_DeactivateContext`
are called every time a value or exception is sent to a coroutine.
These functions are no-ops for dependent coroutines (coroutines
without their own independent context stack).  Coroutines are
dependent by default, and the vast majority of performance-sensitive
coroutines are expected to be dependent, so move the check that
determines whether the coroutine is dependent or independent to an
inline function to speed up send calls.
@rhansen rhansen force-pushed the contextvars-context-manager branch from f3a5f93 to 01afd3c Compare October 10, 2024 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants