Skip to content

bpo-27015: Save kwargs given to exceptions constructor #11580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

remilapeyre
Copy link
Contributor

@remilapeyre remilapeyre commented Jan 16, 2019

Fix unpickling bug when exception class expect keyword arguments

https://bugs.python.org/issue27015

@remilapeyre remilapeyre changed the title Save kwargs given to exceptions constructor bpo-27015: Save kwargs given to exceptions constructor Jan 16, 2019
Fix unpickling bug when exception class expect keyword arguments
@remilapeyre remilapeyre force-pushed the BaseException-does-not-unpickle branch from ce25dc8 to 6a00f21 Compare January 16, 2019 14:53
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@remilapeyre
Copy link
Contributor Author

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

}
key = PyTuple_GET_ITEM(item, 0);
value = PyTuple_GET_ITEM(item, 1);
PyTuple_SET_ITEM(seq, i, PyUnicode_FromFormat("%S=%R", key, value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I am missing something PyUnicode_FromFormat can fail if the _PyUnicodeWriter fails to write the string. Please, add error checking around all calls to PyUnicode_FromFormat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no mention that PyUnicode_FromFormat can fail at https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_FromFormat

Should I add a note about this?

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@remilapeyre
Copy link
Contributor Author

Thansks @pablogsal, I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@gpshead, @pablogsal: please review the changes made to this pull request.

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a solid start, but the significant increase in the complexity of the __reduce__ implementation bothers me. It would be really nice if we could migrate exceptions to using the newer __get_newargs_ex__ API instead, as that natively handles exactly this problem.

@@ -87,6 +87,10 @@ The following exceptions are used mostly as base classes for other exceptions.
assign a special meaning to the elements of this tuple, while others are
usually called only with a single string giving an error message.

.. attribute:: kwargs

The dictionnary of keyword arguments given to the exception constructor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: dictionnary -> dictionary

Py_DECREF(self);
return NULL;
}
self->kwargs = kwds;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assignment should only be done if kwds is a non-empty dict, otherwise we're risking increasing the size of exception instances by sys.getsizeof({}) bytes for no compelling reason (that's an extra 288 bytes on my machine, vs the current 88 bytes for a BaseException instance).

While calls like SomeException() would get NULL for kwds, consider code like super().__new__(*some_args, **maybe_some_kwds) in exception subclasses.

If kwds is NULL, or an empty dict, then self->kwargs should be set to the None singleton.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be good to do that for args too?

if (val == NULL) {
PyErr_SetString(PyExc_TypeError, "kwargs may not be deleted");
return -1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

val needs to be checked to make sure it is a mapping (with PyMapping_Check) or None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about doing that but it seems to me that PyMapping_Check is only a loose check, for example, wouldn't we be able to do:

>>> b = BaseException()
>>> b.kwargs = ()

since tuples support slicing?

If so, is the check still worth it?

PyObject **newargs;

_Py_IDENTIFIER(partial);
functools = PyImport_ImportModule("functools");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reduce implementation concerns me, as it looks like it will make everything much slower, even for exception instances where self->kwargs isn't set.

Instead, I'd recommend migrating BaseException away from implementing __reduce__ directly, and instead have it implement __getnewargs_ex__: https://docs.python.org/3/library/pickle.html#object.__getnewargs_ex__

That way the pickle machinery will take care of calling __new__ with the correct arguments, and you wouldn't need to introduce a weird dependency from a core builtin into a standard library module.

(That would have potential backwards compatibility implications for subclasses implementing reduce based on the parent class implementation, but the same would hold true for introduce a partial object in place of a direct reference to the class - either way, there'll need to be a note in the Porting section of the What's New guide, and switching to __get_newargs_ex__ will at least have the virtue of simplifying the code rather than making it more complicated)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do that, I removed __reduce__ and added __getnewargs_ex__ to the methods as:

static PyObject *
BaseException_getnewargs_ex(PyBaseExceptionObject *self, PyObject *Py_UNUSED(ignored))
{
    PyObject *args = PyObject_GetAttrString((PyObject *) self, "args");
    PyObject *kwargs = PyObject_GetAttrString((PyObject *) self, "kwargs");

    if (args == NULL || kwargs == NULL) {
        return NULL;
    }

    return Py_BuildValue("(OO)", args, kwargs);
}

but it brocke pickling. Did I miss something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, found my mistake, using __getnewargs_ex__ broke pickling for protocols 0 and 1. Is this expected?

I don't think this happened when using a partial reference on the the constructor of the class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's ok to broke pickling support for protocols 0 and 1 since it was broken for keyword args anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining __reduce_ex__ would let you restore the old behaviour for those protocols, but I'm not sure __getnewargs_ex__ will still be called if you do that (we're reaching the limits of my own pickle knowledge).

@pitrou Do you have any recommendations here? (Context: trying to get BaseException to pickle keyword args properly, wanting to use __getnewargs_ex__ for more recent pickle protocols, but wondering how to handle the older protocols that don't use that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should I call object.__reduce_ex__?

It seems to me that calling the builtin super is not done anywhere in the source code but I don't find the right way to do it.

Do I need to call object___reduce_ex__ directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncoghlan Well, I'm not sure why you wouldn't implement the entire logic in __reduce_ex__, instead of also defining __getnewargs_ex__?

Or, rather, you could just define __getnewargs_ex__ and stop caring about protocols 0 and 1 (which are extremely obsolete by now, so we want to maintain compatibility, but fixing bugs in them is not important).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitrou I only suggested delegating to __getnewargs_ex__ because I wasn't sure how to mimic that behaviour from inside a custom __reduce_ex__ implementation.

But if __reduce__ still gets called for protocols 0 and 1 even when __getnewargs_ex__ is defined, then that's even better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pitrou @ncoghlan, thanks for you input. I pushed a new commit that implement __getnewargs_ex__ but it seems that __reduce_ex__ does not check it and call __reduce__ no matter what the protocol is:

>>> BaseException().__reduce_ex__(0)
(<class 'BaseException'>, ())
>>> BaseException().__reduce_ex__(1)
(<class 'BaseException'>, ())
>>> BaseException().__reduce_ex__(2)
(<class 'BaseException'>, ())
>>> BaseException().__reduce_ex__(3)
(<class 'BaseException'>, ())
>>> BaseException().__reduce_ex__(4)
(<class 'BaseException'>, ())
>>> BaseException().__getnewargs_ex__()
((), {})

If I remove the __reduce__, then it breaks pickling for protocols 0 and 1:

>>> BaseException().__reduce_ex__(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/remi/src/cpython/Lib/copyreg.py", line 66, in _reduce_ex
    raise TypeError(f"cannot pickle {cls.__name__!r} object")
TypeError: cannot pickle 'BaseException' object
>>> BaseException().__reduce_ex__(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/remi/src/cpython/Lib/copyreg.py", line 66, in _reduce_ex
    raise TypeError(f"cannot pickle {cls.__name__!r} object")
TypeError: cannot pickle 'BaseException' object
>>> BaseException().__reduce_ex__(2)
(<function __newobj__ at 0x105c63040>, (<class 'BaseException'>,), None, None, None)
>>> BaseException().__reduce_ex__(3)
(<function __newobj__ at 0x105c63040>, (<class 'BaseException'>,), None, None, None)
>>> BaseException().__reduce_ex__(4)
(<function __newobj__ at 0x105c63040>, (<class 'BaseException'>,), None, None, None)

Do I need to define a custom __reduce_ex__ as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug further and it seems my issue comes from https://github.com/python/cpython/blob/master/Lib/copyreg.py#L66, I will look into the details tomorrow.

@remilapeyre
Copy link
Contributor Author

remilapeyre commented Feb 2, 2019 via email

@ncoghlan
Copy link
Contributor

ncoghlan commented Feb 2, 2019

I still need to update reduce to use the partial function though otherwise support for pickling would still be broken for protocol 0 and 1 wouldn't it ?

It would, but there's no good reason for anyone to emit protocols lower than 2, since that was introduced all the way back in Python 2.3 (https://docs.python.org/3/library/pickle.html#data-stream-format).

The key part is to not break pickling & unpickling with those protocols - you don't need to enhance it.

@remilapeyre remilapeyre force-pushed the BaseException-does-not-unpickle branch from 5de8cf2 to 4657d7e Compare February 21, 2019 15:03
@remilapeyre
Copy link
Contributor Author

Hi @ncoghlan @pitroum, I made some changes, could you review again?

Seeing the increase in complexity to support slots and data-descriptors, I may have misunderstood what you were saying. If so, I will be happy to scratch that and do it again.

I tried my best to remove all refleaks but doing ./python.exe -m test -R : test_exceptions gives:

Run tests sequentially
0:00:00 load avg: 1.99 [1/1] test_exceptions
beginning 9 repetitions
123456789
.........
test_exceptions leaked [3, 3, 3, 3] references, sum=12
test_exceptions failed

== Tests result: FAILURE ==

1 test failed:
    test_exceptions

Total duration: 16 sec 535 ms
Tests result: FAILURE

This means I missed one, doesn't it?

It seems to me that argument clinic has not been used for Objects/exceptions.c, would a PR that convert it useful?

Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Apr 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting change review stale Stale PR or inactive for long period of time.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants