Skip to content

gh-108512: Add and use new replacements for PySys_GetObject() #111035

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Oct 18, 2023

Add functions PySys_GetAttr(), PySys_GetAttrString(), PySys_GetOptionalAttr() and PySys_GetOptionalAttrString().


📚 Documentation preview 📚: https://cpython-previews--111035.org.readthedocs.build/

Add functions PySys_GetAttr(), PySys_GetAttrString(),
PySys_GetOptionalAttr() and PySys_GetOptionalAttrString().
@serhiy-storchaka serhiy-storchaka requested review from a team and encukou as code owners October 18, 2023 14:10
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to leave changes to use these changes aside, in a separated PR, so we can focus only on the API, doc and tests on the new functions?

The function is called "GetAttr" but in the doc, you wrote "if the object exists". IMO "if the attribute exists" is more appropriate.

Is it required to add these new functions to the stable ABI in Python 3.13? Can we wait one Python release to see how it does, before add them to the stable ABI?

Would it be possible to add tests?

Comment on lines +251 to +256
If the object exists, set *\*result* to a new :term:`strong reference`
to the object and return ``1``.
If the object does not exist, set *\*result* to ``NULL`` and return ``0``,
without setting an exception.
If other error occurred, set an exception, set *\*result* to ``NULL`` and
return ``-1``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to use a list and start with the return value, so it's easier to see the 3 cases:

Suggested change
If the object exists, set *\*result* to a new :term:`strong reference`
to the object and return ``1``.
If the object does not exist, set *\*result* to ``NULL`` and return ``0``,
without setting an exception.
If other error occurred, set an exception, set *\*result* to ``NULL`` and
return ``-1``.
* Return ``1`` and set *\*result* to a new :term:`strong reference`
to the object if the attribute exists.
* Return ``0`` without setting an exception and set *\*result* to ``NULL``
if the attribute does not exist.
* Return ``-1``, set an exception and set *\*result* to ``NULL``
if an error occurred.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Return and then set exception and variable" looks like a wrong sequence to me. It cannot do anything after returning.

serhiy-storchaka and others added 2 commits October 19, 2023 10:57
Co-authored-by: Victor Stinner <vstinner@python.org>
with support.swap_attr(sys, '\U0001f40d', 42):
self.assertEqual(sys_getattr('\U0001f40d'), 42)

with self.assertRaisesRegex(RuntimeError, r'lost sys\.nonexisting'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is surprising. sys has no attribute "nonexisting". It's not "lost", it simply doesn't exist.

I would prefer to always raise AttributeError with a message like "module 'sys' has no attribute 'x'", similar than in Python:

>>> import sys
>>> sys.x
AttributeError: module 'sys' has no attribute 'x'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It leaves no use cases for PySys_GetAttr(). They all can be replaced with PySys_GetOptionalAttr() followed by PyErr_SetString(PyExc_RuntimeError,).

If you leave PySys_GetAttr(), you will always use it with PyErr_ExceptionMatches(PyExc_AttributeError) followed by PyErr_SetString(PyExc_RuntimeError,).

# CRASHES sys_getattr(NULL)

def test_sys_getoptionalattr(self):
sys_getattr = _testcapi.sys_getoptionalattr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprising that in all tests, the function is called "sys_getattr", whereas here you test PySys_GetOptionalAttr(), not PySys_GetAttr().

I suggest to rename the variable t o"sys_getoptionalattr", or even PySys_GetOptionalAttr() since this is a C API test. It would be more explicit to use the name of the C API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it was exactly what I wrote initially, but in last minute I replaced all names with the same name to make reading easy. But if you think that it does not help, I'll restore previous names.

self.assertEqual(sys_getattr('\U0001f40d'.encode()), 42)

self.assertIs(sys_getattr(b'nonexisting'), AttributeError)
self.assertRaises(UnicodeDecodeError, sys_getattr, b'\xff')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a UnicodeDecodeError to test_sys_getattr() as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, PySys_GetAttr() does not raise UnicodeDecodeError. The wrapper does, but we do not test PyArg_Parse() here.


switch (PySys_GetOptionalAttr(name, &value)) {
case -1:
assert(value == NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding: assert(PyErr_Occurred());. Same remark in sys_getoptionalattrstring().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a check that each function that returns NULL should also set an exception? I relied on this in all other tests.

return value;
default:
Py_FatalError("PySys_GetOptionalAttr() returned invalid code");
Py_UNREACHABLE();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be needed, Py_FatalError() is annotated with _Py_NO_RETURN. Same remark in sys_getoptionalattrstring().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, removing.

}
if (result == NULL) {
result = PyExc_AttributeError;
Py_INCREF(PyExc_AttributeError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this code path. The function must return NULL and raise an exception if the attribute does not exist. This code path must never be reached according to the API doc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it exactly to test that it never happens. But perhaps it can be removed.

@serhiy-storchaka
Copy link
Member Author

Would it be possible to leave changes to use these changes aside, in a separated PR, so we can focus only on the API, doc and tests on the new functions?

It is easy to see the effect of these changes if they are in a single PR. Also, it replaces old private functions with new public API that is impossible if they are still used. If you prefer, I will split this PR into two parts after the development of the new API is complete, immediately before merging.

The function is called "GetAttr" but in the doc, you wrote "if the object exists". IMO "if the attribute exists" is more appropriate.

These are the words used in the description of the existing function PySys_GetObject(). Module attributes are rarely referred to as "attributes". They are more often referred as module variables, module constants, module globals. Sphynx has a special role :data: for this instead of more general :attr:.

Is it required to add these new functions to the stable ABI in Python 3.13? Can we wait one Python release to see how it does, before add them to the stable ABI?

What exactly do you propose? Isn't it a point that we can suggest them as replacements for PySys_GetObject() and deprecate the latter in distant future? Should not all new public API be in the stable ABI or not be public at all?

Would it be possible to add tests?

Good point, I forgot about this. Done.

@vstinner
Copy link
Member

Module attributes are rarely referred to as "attributes".

Well, trying to get sys.x raises an AttributeError, not a NameError :-)

@vstinner
Copy link
Member

Should not all new public API be in the stable ABI or not be public at all?

I'm asking if we should only add the API to Include/cpython/ in Python 3.13, and wait for Python 3.14 to add it to the limited API. Just in case if something goes wrong, if the API changes when we notice issues. If it lands directly in the stable ABI, we can no longer change it, it's too late.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the exception, I see two options:

  • We consider that we get an object from a namespace, similar to LOAD_GLOBAL, and so NameError should be raised. IMO the function should be called PySys_GetVar() in this case. Example: https://docs.python.org/dev/c-api/frame.html#c.PyFrame_GetVar raises NameError.

  • Or we consider that we are getting an attribute, PySys_GetAttr() name is good, and AttributeError should be raised in this case.

For me, RuntimeError is just meaningless and it should be avoided. RuntimeError means everything and nothing: "something failed". Thanks Python...

In Python, usually I consider that the sys module is an object and I get sys attributes with getattr(sys, "stdout") which raises AttributeError.

In Python, I'm not even sure how to treat sys as a namespace. sys.__dict__['stdout'] raises KeyError, not NameError.

@serhiy-storchaka serhiy-storchaka marked this pull request as draft January 10, 2024 14:25
@serhiy-storchaka
Copy link
Member Author

Many of existing uses of PySys_GetObject() and _PySys_GetAttr() raise RuntimeError (if they raise an exception at all). None raises AttributeError. If you want an AttributeError, use PyObject_GetAttr(). I do not think there are many cases for this.

.. c:function:: PyObject *PySys_GetAttr(PyObject *name)

Get the attribute *name* of the :mod:`sys` module. Return a :term:`strong reference`.
Raise :exc:`RuntimeError` and return ``NULL`` if it does not exist.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to raise AttributeError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we would need a new pair of functions that raise RuntimeError to use in the CPython code. Or you can add a pair of functions that raise AttributeError later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we would need a new pair of functions that raise RuntimeError to use in the CPython code.

Would you mind to elaborate why RuntimeError is needed by CPython?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Status quo. And it looks to me it is justified by the purpose of RuntimeError: "Raised when an error is detected that doesn’t fall in any of the other categories."

If you want to change RuntimeError's in the existing code to something other, you should justify that change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my implementation #129369, PySys_SetAttr() raises RuntimeError if the sys module cannot be retrieved. But it raises AttributeError if the attribute doesn't exist.

PySys_GetObject() ignores all errors and returns NULL on error or if the attribute doesn't exist. So I'm confused, which function using PySys_GetObject() currently rely on the RuntimeError? Do you have an example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About a half of functions that use PySys_GetObject() or _PySys_GetAttr() raise RuntimeError if they return NULL. See my statistics in capi-workgroup/decisions#54. Look at usages of PySys_GetAttr() and PySys_GetAttrString() in this PR.

@serhiy-storchaka serhiy-storchaka marked this pull request as draft May 4, 2025 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants