Skip to content

gh-111375: Fix handling of exceptions within @contextmanager-decorated functions #111676

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

pR0Ps
Copy link
Contributor

@pR0Ps pR0Ps commented Nov 3, 2023

Previously, when using a try...yield...except construct within a @contextmanager function, an exception raised by the yield wouldn't be properly cleared when handled by the except block. Instead, calling sys.exception() would continue to return the exception, leading to confusing tracebacks and logs.

This was happening due to the way that the @contextmanager decorator drives its decorated generator function. When an exception occurs, it uses .throw(exc) to throw the exception into the generator. However, even if the generator handles this exception, because the exception was thrown into it in the context of the @contextmanager decorator handling it (and not being finished yet), sys.exception() was not being reset.

For an example of this and more information, see #111375

In order to fix this, the exception context as the @contextmanager decorator is __enter__'d is stored and set as the current exception context just before throwing the new exception into the generator. Doing this means that after the generator handles the thrown exception, sys.exception() reverts back to what it was when the @contextmanager function was started.

Potential reviewers based on activity in the linked issue: @iritkatriel @ericsnowcurrently @ncoghlan

Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. Mostly your solution fundamentally seems correct and specifically a good approach. I just have concerns about 2 points:

  • relying on ctypes
  • deleting self.exc_context instead of setting it back to None

@bedevere-app
Copy link

bedevere-app bot commented Nov 3, 2023

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@iritkatriel
Copy link
Member

The function _PyErr_GetTopmostException skips over Nones, and returns the first proper exception in the chain. I don't know what could get confused if we start setting the exception to None.

@pR0Ps pR0Ps force-pushed the bugfix/with-statement-exception-handling branch 2 times, most recently from 582be48 to 79766c8 Compare November 4, 2023 03:54
@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 4, 2023

@iritkatriel

The function _PyErr_GetTopmostException skips over Nones, and returns the first proper exception in the chain. I don't know what could get confused if we start setting the exception to None.

Hmm, yeah, now that you mention it, I don't know why this is currently working. Consider:

import sys
try:
    raise TypeError()
except TypeError:
    try:
        raise ValueError()
    except:
        assert isinstance(sys.exception(), ValueError)
        sys._set_exception(None)  # !!!
        assert sys.exception() is None  # !!!
    assert isinstance(sys.exception(), TypeError)

This works (using the latest commit on this branch), yet it does look like it shouldn't. It seems like sys._set_exception(None) should set tstate->exc_info->exc_value to Py_None and therefore should make PyErr_GetTopmostException skip over it and get the TypeError instead when assert sys.exception() is None runs.

Unless I'm misunderstanding something?

@pR0Ps pR0Ps force-pushed the bugfix/with-statement-exception-handling branch from 79766c8 to b115754 Compare November 4, 2023 04:34
@iritkatriel
Copy link
Member

@iritkatriel

The function _PyErr_GetTopmostException skips over Nones, and returns the first proper exception in the chain. I don't know what could get confused if we start setting the exception to None.

Hmm, yeah, now that you mention it, I don't know why this is currently working. Consider:

import sys
try:
    raise TypeError()
except TypeError:
    try:
        raise ValueError()
    except:
        assert isinstance(sys.exception(), ValueError)
        sys._set_exception(None)  # !!!
        assert sys.exception() is None  # !!!
    assert isinstance(sys.exception(), TypeError)

This works (using the latest commit on this branch), yet it does look like it shouldn't. It seems like sys._set_exception(None) should set tstate->exc_info->exc_value to Py_None and therefore should make PyErr_GetTopmostException skip over it and get the TypeError instead when assert sys.exception() is None runs.

Unless I'm misunderstanding something?

Try nesting it in yet another try-except?

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 5, 2023

Try nesting it in yet another try-except?

I have not been able to break it, no matter how much nesting above or below is used. This also runs without asserting:

import sys
try:
    raise RuntimeError()
except:
    try:
        raise TypeError()
    except:
        try:
            raise ValueError()
        except:
            assert type(sys.exception()) == ValueError
            assert type(sys.exception().__context__) == TypeError
            assert type(sys.exception().__context__.__context__) == RuntimeError
            sys._set_exception(None)  # !!!
            assert sys.exception() is None  # !!!
            try:
                raise IndexError()
            except:
                assert type(sys.exception()) == IndexError
                assert sys.exception().__context__ is None
                try:
                    raise SyntaxError()
                except:
                    assert type(sys.exception()) == SyntaxError
                    assert type(sys.exception().__context__) == IndexError
                    assert sys.exception().__context__.__context__ is None
                assert type(sys.exception()) == IndexError
                assert sys.exception().__context__ is None
            assert sys.exception() is None  # !!!
        assert type(sys.exception()) == TypeError
        assert type(sys.exception().__context__) == RuntimeError
    assert isinstance(sys.exception(), RuntimeError)
assert sys.exception() is None

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 6, 2023

The piece I was missing was that exc_info->previous_item is always NULL except if there are generators/coroutines involved.

The following shows the issue @iritkatriel raised (I've added the state of the exc_info stack as comments):

import sys

def gen():
    # exc_info: None
    yield
    # exc_info: NULL --> ZeroDivisionError('division by zero')
    assert type(sys.exception()) == ZeroDivisionError
    sys._set_exception(None)
    # exc_info: None --> ZeroDivisionError('division by zero')
    assert sys.exception() is None  # 💣💣💣
    yield

g = gen()
next(g)
try:
    1/0
except:
    next(g)

This breaks because generators push a new exc_info onto the tstate->exc_info stack. Because PyErr_GetTopmostException skips over NULL/None exc_values, sys.exception() still returns the ZeroDivisionError. In this state, sys._set_exception(None) is essentially a no-op since the top-level exc_info->exc_value already was NULL/None. The assert then fails on the next line.

This maybe brings up another "is it a bug or just a documentation issue" question (expand me)

In Doc/c-api/exceptions.rst PyErr_SetHandledException says the following:

To clear the exception state, pass NULL.

and

it can be used when code needs to save and restore the exception state temporarily. Use PyErr_GetHandledException to get the exception state.

This doesn't convey the nuance here. Passing NULL to clear the exception state only works if the current exception isn't "hidden" under a NULL/None already. Also, an uninformed implementation of saving and restoring exception state using PyErr_GetHandledException/PyErr_SetHandledException will also have issues if the top-level exception is NULL/None since it will effectively duplicate the exception. For example:

// exc_info state: None -> Exception
PyObject *obj = PyErr_GetHandledException();
// obj: Exception
/* <snip some code> */
PyErr_SetHandledException(obj)
// exc_info state: Exception -> Exception

Maybe the documentation should just be expanded to cover this edge case, but it seems to me that it's actually a bug that PyErr_SetHandledException doesn't set the same exception that PyErr_GetHandledException returns.

I've just pushed a commit that fixes this issue by making sure that PyErr_SetHandledException always sets the exception that would've been returned by PyErr_GetHandledException, which ensures that the currently-handled exception is the one that it sets. I understand that this may not be something that can be changed due to backwards compatibility concerns but thought I'd propose it anyway and see what you all think.

Open to feedback/suggestions. Also, since I think I've addressed everything at this point: I have made the requested changes; please review again.

@iritkatriel
Copy link
Member

Thanks for the analysis! Yes, I think you found the issue.

but it seems to me that it's actually a bug that PyErr_SetHandledException doesn't set the same exception that PyErr_GetHandledException returns.

I agree, and wouldn't mind seeing this fixed. Would it break anything? If you change the exception stack from [exc, None, None] to [exc, None, exc] - it look like it's semantically the same.

@pR0Ps pR0Ps requested a review from markshannon as a code owner November 6, 2023 21:32
@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 6, 2023

Would it break anything? If you change the exception stack from [exc, None, None] to [exc, None, exc] - it look like it's semantically the same.

I've messed around with this some more and I think you're right in that it's exactly the same from the perspective of everything just using _PyErr_GetTopmostException so it doesn't matter.

It doesn't seem to break anything (nothing in the test suite at least). The danger here is that it now makes it possible for PyErr_SetHandledException being used within a coroutine to change the exception in the calling function instead of being limited to only being able to affect change within the context of the coroutine.

As far as I can tell, the only place the interpreter uses PyErr_SetHandledException is to set the handled exception when processing an except* block so the context within the block only has the matched exceptions. This only happens within the context of an exception being handled so there's no danger of the top of the stack being None and it overwriting the caller's exception. That being said, if some external code relies on being in a coroutine and having the exc_info stack in a state where the top exception is None and uses this to add an exception on top instead setting an exception, the change I made will cause issues. It will cause the function to replace the exception of the calling function instead of adding the exception to the top of the stack. Ex:

// <in a coroutine with the caller handling an Exception>
// exc_info state: None --> Exception
PyObject *value_err = make_some_ValueError();
PyErr_SetHandledException(value_err);
// PREVIOUS: ValueError --> Exception
// NEW: None -> ValueError

Considering all the above, I'm reconsidering the change to PyErr_SetHandledException. Even though it seems unexpected that using it to set the handled exception to NULL/None doesn't do anything in a coroutine, I think it's much more unexpected that it can reach up out of its current context and affect the caller's exception.

While looking into this, I also realized that the change to PyErr_SetHandledException wasn't even a complete fix for the To clear the exception state, pass NULL. documentation and the @contextmanager issue. Having it set the handled exception to NULL/None doesn't clear the exception, it removes the current topmost one. If there was another exception under there, it would be returned by sys.exception() anyway. I've added a test case in the latest commit that tests this situation.

Sorry for all the back and forth on this - I'm discovering how all this works as I go.


Assuming that the exception stack having duplicates in it is fine, the remaining issue (documentation and otherwise) is that passing in NULL/None does not actually clear the exception state - it only removes the current top. If called within a coroutine, this is either a no-op or potentially exposes the caller exception state instead.

Since it's desired behavior to have sys.exception() return the exception of the caller within a coroutine that hasn't raised an exception, _PyErr_GetTopmostException's traversing up the stack of exceptions can't be removed. In order to be able to fully clear the exception state, this means that there has to be some way of signaling to _PyErr_GetTopmostException that it shouldn't traverse up the exception stack in the specific case where the exception has been explicitly cleared.

One way to do this that looks like it might work is to use the difference between NULL and None in the exception stack. If the exception is NULL (ie. unset) then it's skipped, but if it's set to None then it's a valid non-exception and is returned as-is as the top-level exception. This allows PyErr_SetHandledException(Py_None) to set None on the top of the stack and mask all the exceptions under it. Then, when the coroutine exits, the None is popped off the exception stack as normal, exposing the other exceptions under it.

I've implemented the above in my latest commit and it seems to work (all test cases pass). However, I'm definitely a bit out of my depth here so I'm not sure if this is something that can/should be changed or if I've overlooked an edge case with it.

@iritkatriel
Copy link
Member

I think it's much more unexpected that it can reach up out of its current context and affect the caller's exception.

Good point.

This is all useful analysis. Can you start by making some doc enhancements with this point you feel are missing? (If we end up making code changes we can change the docs as well, but it's useful to have the current state documented in any case).

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 14, 2023

Can you start by making some doc enhancements with this point you feel are missing?

Sure, should I do them in this PR or submit a new one?

@iritkatriel
Copy link
Member

Maybe a new one.

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Nov 15, 2023

In parallel to that, I think I've resolved all the issues that were brought up here so @bedevere-bot : I have made the requested changes; please review again.

EDIT: ok, no idea how to get the label to change back to awaiting review so I've just manually requested a re-revew. Sorry for the spam

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Dec 13, 2023

Anything else I need to do for this?

@iritkatriel
Copy link
Member

As far as I recall, the fix described in #111676 (comment) needed to be reverted so we still have a problem with the None/NULL exc_info. No?

@pR0Ps
Copy link
Contributor Author

pR0Ps commented Dec 16, 2023

I think the changes I made in the following commit of 36b6177 solved that problem. Previously, a None/NULL on the exception stack meant that _PyErr_GetTopmostException would skip it. With the new changes, _PyErr_GetTopmostException no longer skips Nones and returns them instead. This allows sys._set_exception(None) to add a None to the exception stack, effectively clearing it.

To support this change, I modified the sections of code that add exceptions to the exception stack so that they always insert NULL in cases where there was no exception instead of None as they did previously.

@iritkatriel
Copy link
Member

I think the changes I made in the following commit of 36b6177 solved that problem. Previously, a None/NULL on the exception stack meant that _PyErr_GetTopmostException would skip it. With the new changes, _PyErr_GetTopmostException no longer skips Nones and returns them instead. This allows sys._set_exception(None) to add a None to the exception stack, effectively clearing it.

To support this change, I modified the sections of code that add exceptions to the exception stack so that they always insert NULL in cases where there was no exception instead of None as they did previously.

This makes sense, but since it's a deep subtle change I think we should split it into two PRs. That would make it easier in the future to see what happened. Both PRs can be attached to this issue.

The first PR would get rid of the NULL/None ambiguity. Add an assertion in _PyErr_GetTopmostException that the exception is never None. The second PR can contain the rest.

pR0Ps added a commit to pR0Ps/cpython that referenced this pull request Dec 20, 2023
Previously, both `NULL` and `Py_None` would be used interchangeably to
indicate that an exception is no longer being handled. By ensuring that
only `NULL` is used, this opens up the possibility to use `Py_None` to
indicate a cleared exception. The difference here would be that clearing
would indicate that no exception is currently being handled vs. handling
would indicate that the next exception in the stack is currently being
handled.

This functionality will be used to patch up some edge cases in how the
exception context interacts with exceptions thrown into coroutines.

This is implemented in this commit by changing code that could add
`Py_None` to the exception stack to indicate that an exception is no
longer being handled to add `NULL` instead. An assert was also added to
ensure that `Py_None` is no longer added to the exception stack.

See pythongh-111676 for context.
pR0Ps added a commit to pR0Ps/cpython that referenced this pull request Dec 20, 2023
Previously, both `NULL` and `Py_None` would be used interchangeably to
indicate that an exception is no longer being handled. By ensuring that
only `NULL` is used, this opens up the possibility to use `Py_None` to
indicate a cleared exception. The difference here would be that clearing
would indicate that no exception is currently being handled vs. handling
would indicate that the next exception in the stack is currently being
handled.

This functionality will be used to patch up some edge cases in how the
exception context interacts with exceptions thrown into coroutines.

This is implemented in this commit by changing code that could add
`Py_None` to the exception stack to indicate that an exception is no
longer being handled to add `NULL` instead. An assert was also added to
ensure that `Py_None` is no longer added to the exception stack.

See pythongh-111676 for context.
pR0Ps added a commit to pR0Ps/cpython that referenced this pull request Dec 20, 2023
Previously, both `NULL` and `Py_None` would be used interchangeably to
indicate that an exception is no longer being handled. By ensuring that
only `NULL` is used, this opens up the possibility to use `Py_None` to
indicate a cleared exception. The difference here would be that clearing
would indicate that no exception is currently being handled vs. handling
would indicate that the next exception in the stack is currently being
handled.

This functionality will be used to patch up some edge cases in how the
exception context interacts with exceptions thrown into coroutines.

This is implemented in this commit by changing code that could add
`Py_None` to the exception stack to indicate that an exception is no
longer being handled to add `NULL` instead. An assert was also added to
ensure that `Py_None` is no longer added to the exception stack.

See pythongh-111676 for context.
@pR0Ps pR0Ps force-pushed the bugfix/with-statement-exception-handling branch from edd0f5b to a6f948b Compare December 20, 2023 04:14
@pR0Ps pR0Ps requested a review from gvanrossum as a code owner December 20, 2023 04:14
@pR0Ps
Copy link
Contributor Author

pR0Ps commented Dec 20, 2023

The first PR would get rid of the NULL/None ambiguity. Add an assertion in _PyErr_GetTopmostException that the exception is never None. The second PR can contain the rest.

I've created #113302 that fixes the NULL/None ambiguity and adds the assert and rebased this PR on top of that branch. I also took the opportunity while I was rewriting history to fix the merge conflicts with main and rewrite the commits here into something that's hopefully more understandable to reviewers. It now only includes 2 commits: one to add sys._set_exception() and the Py_None exception masking, and one to use that functionality to fix the @contextmanager behavior.

Apologies for this, I know rebasing PRs during development is generally frowned upon, but figured the alternative of having the same functionality implemented in different ways across multiple branches, having merge conflicts with main, and not being able to have it cleanly apply on #113302 was worse in this case.

@@ -0,0 +1,2 @@
Add sys._set_exception() function that can set/clear the current exception
context. Patch by Carey Metcalfe.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 3 news files here. This and the next one should be deleted.

Copy link
Contributor Author

@pR0Ps pR0Ps Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Should the sys._set_exception() be added to the other news file or just not noted at all since it's an "internal" function? (provided sys._set_exception() is how this ends up being implemented, otherwise this is a moot point)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not needed if it's private. But you can wait with the news file till we know what's going in.

# context here. In order to make the context manager behave
# like a normal function we set the current exception context
# to what it was during the context manager's __enter__
sys._set_exception(exc_context)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I'm not that comfortable adding this function to the sys module, even with an _. It is an API that shouldn't really exist, we're adding it just to enable a hack in contextmanager, which needs to change the semantics of code.

Is there a way that we can roll this functionality into the gen.throw() function instead? Such as -- add a parameter that indicates that the exception stack needs to be fudged in this way?

Copy link
Contributor Author

@pR0Ps pR0Ps Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. I don't really have strong opinions on how it's implemented, I only went with sys._set_exception because it seemed like it would work and there is precedent for a function that directly manipulates the current exception state.

With that being said, I don't know if doing it entirely within throw() is possible, at least in this case. Because the exception that needs to be set is the one being handled in the __enter__, I don't think there's enough context when throw() is called to correctly manipulate the current exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking you pass the exc_context as a new kwwarg to throw which defaults to None.

(I wasn't aware of exc_clear actually. It was added in 2.3, so quite old. Doesn't seem to be used anywhere in the stdlib at the moment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. Now that I've got sys._set_exception() working and am therefore (reasonably) sure that the None/NULL exception stuff works properly, I'll take a look at adding a kwarg to throw() that essentially just makes it call PyErr_SetHandledException behind the scenes.

pR0Ps and others added 5 commits December 24, 2023 14:14
…nction

Exposing this function allows Python code to clear/set the current
exception context as returned by `sys.exception()`. It is prefixed by an
underscore since this functionality is not intended to be accessed by
user-written code.

In order to allow `sys._set_exception(None)` to clear the current
exception `_PyErr_GetTopmostException` was changed to consider `Py_None`
a valid exception instead of an indication to keep traversing the stack
like `NULL` is. Additionally, `PyErr_SetHandledException` was updated to
add `Py_None` to the exception stack instead of `NULL`.

Put together, these changes allow `PyErr_SetHandledException(NULL)` to
mask the entire exception chain and "clear" the current exception state
as documented in `Doc/c-api/exceptions.rst`. Furthermore, since
`sys._set_exception()` uses `PyErr_SetHandledException`, this allows
`sys._set_exception(None)` to clear the current exception context
instead of just exposing the next exception in the stack.
Previously, when using a `try...yield...except` construct within a
`@contextmanager` function, an exception raised by the `yield` wouldn't
be properly cleared when handled the `except` block. Instead, calling
`sys.exception()` would continue to return the exception, leading to
confusing tracebacks and logs.

This was happening due to the way that the `@contextmanager` decorator
drives its decorated generator function. When an exception occurs, it
uses `.throw(exc)` to throw the exception into the generator. However,
even if the generator handles this exception, because the exception was
thrown into it in the context of the `@contextmanager` decorator
handling it (and not being finished yet), `sys.exception()` was not
being reset.

In order to fix this, the exception context as the `@contextmanager`
decorator is `__enter__`'d is stored and set as the current exception
context just before throwing the new exception into the generator.
Doing this means that after the generator handles the thrown exception,
`sys.exception()` reverts back to what it was when the `@contextmanager`
function was started.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
@pR0Ps pR0Ps force-pushed the bugfix/with-statement-exception-handling branch from 1bdba4d to a879a9f Compare December 24, 2023 19:35
@pR0Ps pR0Ps force-pushed the bugfix/with-statement-exception-handling branch from 5e7f10b to 887eb54 Compare December 24, 2023 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants