gh-137609: Update signatures of builtins in the documentation #137610

Open: wants to merge 13 commits into base: main

Conversation

@serhiy-storchaka (Member) commented Aug 10, 2025

Show signatures that match the actual signatures or future multisignatures for all functions, classes and methods in the "builtins" module.


📚 Documentation preview 📚: https://cpython-previews--137610.org.readthedocs.build/

@serhiy-storchaka added the docs (Documentation in the Doc dir), skip news, and needs backport to 3.13 (bugs and security fixes) labels Aug 10, 2025
@serhiy-storchaka added the needs backport to 3.14 (bugs and security fixes) label Aug 10, 2025
@github-project-automation github-project-automation bot moved this to Todo in Docs PRs Aug 10, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Aug 10, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Aug 10, 2025
@serhiy-storchaka (Member, Author)

See also #137611.

@terryjreedy (Member) left a comment

I approve of the issue, the addition of /s, the renamings, and most of the details. The main question for me is relative positioning of / and *args.

Fewer versus more lines is partly a style preference and partly a trade-off between technical accuracy (the signature needed to produce an inspect.signature and to write a Python version of the same or a similar function) and ease of understanding how to call the function. Are "future multisignatures" a real possibility?
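
As a rough illustration of the `/` versus `*args` positioning question, the sketch below shows how `inspect.signature` renders a pure-Python function that combines a positional-only parameter with `*args`. The function `first_then_rest` is hypothetical and is not taken from the PR.

```python
import inspect


def first_then_rest(first, /, *args):
    """Hypothetical function: one positional-only parameter, then *args."""
    return first, args


sig = inspect.signature(first_then_rest)
# Map each parameter name to the name of its kind.
kinds = {name: param.kind.name for name, param in sig.parameters.items()}

print(sig)    # (first, /, *args)
print(kinds)  # {'first': 'POSITIONAL_ONLY', 'args': 'VAR_POSITIONAL'}

# Passing the positional-only parameter by keyword is rejected.
try:
    first_then_rest(first=1)
except TypeError:
    keyword_rejected = True
else:
    keyword_rejected = False
```

In the rendered signature the `/` separator comes immediately after the positional-only parameters and before `*args`.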

Comment on lines 185 to 188
.. class:: bytearray()
           bytearray(source)
           bytearray(source, encoding)
           bytearray(source, encoding, errors)
Member

I believe only two lines are needed.

Suggested change
-.. class:: bytearray()
-           bytearray(source)
-           bytearray(source, encoding)
-           bytearray(source, encoding, errors)
+.. class:: bytearray(source=b'')
+           bytearray(source, encoding, errors='strict')

See also #137100, which is also about the text that follows.

Member Author

Some descriptions have a separate signature for the no-argument case; others merge it with the one-argument signature. See, for example, dict, which could be written as dict(mapping_or_iterable=(), **kwargs) but is written as three semantically different signatures. I tried to be more consistent and chose the former variant, but I have no strong preference.
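
For readers unfamiliar with the dict case mentioned here, this is a minimal sketch of the call forms that the documentation lists as separate signatures. The keys and values are made up for illustration.

```python
# The no-argument form and the three documented call forms of dict().
empty = dict()                               # no arguments
from_kwargs = dict(a=1, b=2)                 # keyword arguments only
from_mapping = dict({"a": 1}, b=2)           # a mapping, plus optional keywords
from_iterable = dict([("a", 1), ("b", 2)])   # an iterable of key/value pairs

print(empty, from_kwargs, from_mapping, from_iterable)
```

All three non-empty calls build the same dictionary here, which is why a single merged signature is technically possible even though the forms read differently.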

Member

I prefer fewer lines, and approve of the changes.

Member

I'm not sure I agree here: source=b'' makes it less clear to me that source also accepts, e.g., iterables of integers, buffer-protocol objects, etc. I would suggest:

.. class:: bytearray()
           bytearray(source, /)
           bytearray(source, /, encoding, errors='strict')

Note that I have suggested annotating 'source' as positional-only. I think this makes more sense to users than writing, e.g., bytearray(source=my_numpy_array). My IDE and type checkers also indicate that source=... is an error, as it is annotated as positional-only in typeshed.
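
To make the kinds of source arguments concrete, here is a small sketch of the call forms behind the suggested signature lines. The signatures named in the comments follow the suggestion above and are not necessarily what the PR finally adopts; the sample values are invented.

```python
# Call forms corresponding to the three suggested signature lines.
empty = bytearray()                                   # bytearray()
from_bytes = bytearray(b"abc")                        # source: buffer-protocol object
from_ints = bytearray([65, 66, 67])                   # source: iterable of integers
from_size = bytearray(3)                              # source: integer size, zero-filled
from_text = bytearray("abc", "utf-8")                 # source: str with an encoding
lenient = bytearray("caf\u00e9", "ascii", "replace")  # str with an explicit errors handler

# A str source without an encoding is rejected with TypeError.
try:
    bytearray("abc")
except TypeError:
    str_needs_encoding = True
else:
    str_needs_encoding = False
```

The variety of accepted source types is the reason a default such as `source=b''` can be misleading in a one-line signature.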

bedevere-app bot commented Aug 10, 2025

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@serhiy-storchaka (Member, Author) left a comment

I have made the requested changes; please review again.

AA-Turner and others added 2 commits August 12, 2025 00:30
@@ -846,7 +844,7 @@ are always available. They are listed here in alphabetical order.


 .. _func-frozenset:
-.. class:: frozenset(iterable=set())
+.. class:: frozenset(iterable=(), /)
Member

The empty frozenset is (notionally) a singleton, similar to the empty tuple, so I think it's clearer to distinguish them here:

Suggested change
-.. class:: frozenset(iterable=(), /)
+.. class:: frozenset()
+           frozenset(iterable, /)

@@ -1144,8 +1142,7 @@ are always available. They are listed here in alphabetical order.


 .. _func-list:
-.. class:: list()
-           list(iterable)
+.. class:: list(iterable=(), /)
Member

As with frozenset, I think it is clearer to keep two lines here:

Suggested change
-.. class:: list(iterable=(), /)
+.. class:: list()
+           list(iterable, /)

Member Author

Your suggestions are opposite to @terryjreedy's. I just changed the code in the opposite direction.

Member

@terryjreedy what do you think? I would be -1 on combining here, I think it goes too far (even though it is technically accurate).

We currently have two lines for these container initialisers in the documentation, so it's not adding more bloat.
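
Whichever layout is chosen, the two spellings describe the same runtime behaviour. The sketch below checks this on a current CPython; `rejects_keyword` is a hypothetical helper written for this illustration, not part of any library.

```python
# Calling with no argument and with an empty iterable give equal results.
empty_list_matches = list() == list(()) == []
empty_frozenset_matches = frozenset() == frozenset(())


def rejects_keyword(cls):
    """Hypothetical helper: True if cls(iterable=...) raises TypeError."""
    try:
        cls(iterable=(1, 2))
    except TypeError:
        return True
    return False


# On current CPython both constructors take their argument positionally only.
list_is_positional_only = rejects_keyword(list)
frozenset_is_positional_only = rejects_keyword(frozenset)
print(list_is_positional_only, frozenset_is_positional_only)
```

So `list(iterable=(), /)` is accurate as a single signature, while the two-line form emphasises that calling with no argument is a distinct, common use.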

Comment on lines 3504 to 3505
 be removed - the name refers to the fact this method is usually used with
-ASCII characters. If omitted or ``None``, the *chars* argument defaults to
-removing ASCII whitespace. The *chars* argument is not a suffix; rather,
+ASCII characters. If omitted or ``None``, the *bytes* argument defaults to
Member

This sentence no longer makes sense after changing the argument name.

Comment on lines 3448 to 3449
 be removed - the name refers to the fact this method is usually used with
-ASCII characters. If omitted or ``None``, the *chars* argument defaults
-to removing ASCII whitespace. The *chars* argument is not a prefix;
+ASCII characters. If omitted or ``None``, the *bytes* argument defaults
Member

This sentence no longer makes sense after changing the argument name.

Comment on lines 3581 to 3582
 byte values to be removed - the name refers to the fact this method is
-usually used with ASCII characters. If omitted or ``None``, the *chars*
-argument defaults to removing ASCII whitespace. The *chars* argument is
+usually used with ASCII characters. If omitted or ``None``, the *bytes*
Member

This sentence no longer makes sense after changing the argument name.

Member Author

Could you please explain?

Member

Sorry, happy to explain. It currently says:

The chars argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters.

This PR changes *chars* to *bytes*, which means the second half of the sentence doesn't make sense / no longer applies. It should be removed or changed.
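
For context on what the argument under discussion does, here is a brief sketch of the documented behaviour of the bytes strip methods. The sample inputs are illustrative, and the keyword check at the end assumes a current CPython, where the parameter is positional-only.

```python
# With no argument (or None), ASCII whitespace is removed.
default_strip = b"  spam \n".strip()
none_strip = b"  spam \n".strip(None)

# The argument is a set of byte values, not a prefix or suffix.
both_ends = b"www.example.com".strip(b"cmowz.")
right_only = b"mississippi".rstrip(b"ipz")

# Because the parameter is positional-only, its documented name cannot
# be used as a keyword, whichever name the docs settle on.
try:
    b" spam ".strip(chars=b" ")
except TypeError:
    name_usable_as_keyword = False
else:
    name_usable_as_keyword = True

print(default_strip, both_ends, right_only, name_usable_as_keyword)
```

Since callers can never spell the name, the choice between *chars* and *bytes* only affects how the prose reads.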

Member Author

Ah, good catch. Then I'll restore chars. This is not the best name, but this should be a separate issue.

Member Author

On the other hand, bytes has already been used in the signature for a long time. So this is in the scope of this PR.

@serhiy-storchaka (Member, Author) left a comment

Thank you for your review @AA-Turner, especially for your catch of stop/end misnaming.

Your suggestions about constructors with no argument are opposite to @terryjreedy's. I have no strong preference for now and will agree to any consensus.

Could you please explain what is wrong with bytes.strip() etc.?

@@ -1562,7 +1559,7 @@ are always available. They are listed here in alphabetical order.
 .. versionchanged:: 3.11
 The ``'U'`` mode has been removed.

-.. function:: ord(c)
+.. function:: ord(c, /)
Member Author

Hmm, the following text and the docstring are incorrect, they should be changed in any case.


-Given a string representing one Unicode character, return an integer
+The argument must be a one-character string or a :class:`bytes` or
Member Author

I created a separate issue for this: #137668.
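
The mismatch between the old text and actual behaviour can be seen in a short sketch like the following, which assumes a current CPython. The sample inputs are illustrative only.

```python
# ord() accepts a one-character string...
code_a = ord("a")
code_euro = ord("\u20ac")

# ...and, on CPython, also a bytes object of length 1.
code_byte = ord(b"a")

# Inputs of any other length are rejected with TypeError.
rejected = []
for bad in ("ab", b"ab", ""):
    try:
        ord(bad)
    except TypeError:
        rejected.append(bad)

# The parameter is positional-only, matching the ord(c, /) signature.
try:
    ord(c="a")
except TypeError:
    keyword_accepted = False
else:
    keyword_accepted = True

print(code_a, code_euro, code_byte, rejected, keyword_accepted)
```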

@serhiy-storchaka serhiy-storchaka force-pushed the docs-builtins-signatures branch from 00556ee to de15f70 Compare August 13, 2025 10:39
Labels: awaiting merge, docs (Documentation in the Doc dir), needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes), skip news
Projects: Status: Todo

5 participants