builtins: Audit bytes arguments #7631

JelleZijlstra · 2022-04-16T05:20:52Z

As a followup from #7589 (comment),
I audited all occurrences of bytes in builtins.pyi by reading the corresponding C code
on CPython main.

Most use the C buffer protocol, so _typeshed.ReadableBuffer is the right type. A few
check specifically for bytes and bytearray.
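
As a rough illustration of the distinction drawn above (not code from this PR), the sketch below probes which objects export a buffer by handing them to memoryview(). The helper name exports_buffer is hypothetical and exists only for this example.

```python
from array import array


def exports_buffer(obj: object) -> bool:
    """Return True if obj supports the C buffer protocol (hypothetical helper)."""
    try:
        # memoryview() succeeds only for objects that export a buffer.
        with memoryview(obj):
            return True
    except TypeError:
        return False


# bytes, bytearray, memoryview and array.array export a buffer; str and list do not.
for candidate in (b"x", bytearray(b"x"), memoryview(b"x"), array("B", [120]), "x", [120]):
    print(type(candidate).__name__, exports_buffer(candidate))
```

Functions that go through the buffer protocol accept everything for which this probe returns True, which is what ReadableBuffer is meant to describe; functions that check for bytes and bytearray specifically accept only those two.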

JelleZijlstra · 2022-04-16T05:21:43Z

stdlib/builtins.pyi

@@ -200,7 +200,7 @@ _NegativeInteger = Literal[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -1

 class int:
    @overload
-    def __new__(cls: type[Self], __x: str | bytes | SupportsInt | SupportsIndex | SupportsTrunc = ...) -> Self: ...
+    def __new__(cls: type[Self], __x: str | ReadableBuffer | SupportsInt | SupportsIndex | SupportsTrunc = ...) -> Self: ...
    @overload
    def __new__(cls: type[Self], __x: str | bytes | bytearray, base: SupportsIndex) -> Self: ...


>>> int(memoryview(b"0xdeadbeef"), 16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() can't convert non-string with explicit base
>>> int(memoryview(b"123"))
123

This shows that the first overload accepts buffers but the second doesn't.
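
A small sketch of the same contrast, runnable as a script. The wrapper parse_int is a hypothetical name used only here; it forwards to int() with or without a base.

```python
def parse_int(value, base=None):
    """Call int() with or without an explicit base (hypothetical helper)."""
    return int(value) if base is None else int(value, base)


view = memoryview(b"123")

# Without a base, int() falls back to the buffer protocol.
print(parse_int(view))

# With a base, str, bytes and bytearray are accepted...
print(parse_int(bytearray(b"ff"), 16))

# ...but other buffers such as memoryview are rejected.
try:
    parse_int(view, 10)
except TypeError as exc:
    print("TypeError:", exc)
```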

https://github.com/bradfitz/deadbeef

JelleZijlstra · 2022-04-16T05:22:33Z

stdlib/builtins.pyi

@@ -223,7 +223,7 @@ class int:
    @classmethod
    def from_bytes(
        cls: type[Self],
-        bytes: Iterable[SupportsIndex] | SupportsBytes,  # TODO buffer object argument
+        bytes: Iterable[SupportsIndex] | SupportsBytes | ReadableBuffer,


>>> int.from_bytes([1, 2, 3])
66051
>>> int.from_bytes(memoryview(b"123"))
3224115
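
For readers on older interpreters, a sketch of the same check with byteorder passed explicitly (the default for byteorder only exists from Python 3.11 on). The dictionary of inputs is illustrative and not taken from the PR.

```python
from array import array

# Each value encodes the three bytes 0x01 0x02 0x03 in a different container.
sources = {
    "list": [1, 2, 3],
    "bytes": b"\x01\x02\x03",
    "bytearray": bytearray(b"\x01\x02\x03"),
    "memoryview": memoryview(b"\x01\x02\x03"),
    "array": array("B", [1, 2, 3]),
}

# int.from_bytes accepts iterables of ints as well as buffer objects.
results = {name: int.from_bytes(src, "big") for name, src in sources.items()}
print(results)
```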

JelleZijlstra · 2022-04-16T05:23:43Z

stdlib/builtins.pyi

    ) -> bool: ...
    if sys.version_info >= (3, 8):
        def expandtabs(self, tabsize: SupportsIndex = ...) -> bytes: ...
    else:
        def expandtabs(self, tabsize: int = ...) -> bytes: ...

    def find(
-        self, __sub: bytes | SupportsIndex, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...
+        self, __sub: ReadableBuffer | SupportsIndex, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...
    ) -> int: ...
    if sys.version_info >= (3, 8):
        def hex(self, sep: str | bytes = ..., bytes_per_sep: SupportsIndex = ...) -> str: ...


>>> b"xy".hex(memoryview(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sep must be str or bytes.
>>> b"xy".hex(bytearray(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sep must be str or bytes.
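
The sketch below puts find() and hex() side by side, since the diff widens one and leaves the other alone. It is an illustration only; the rejection of a bytearray separator is taken from the session above and is caught rather than assumed.

```python
data = b"xy"

# find() goes through the buffer protocol, so any buffer works as the needle.
found_view = data.find(memoryview(b"y"))
found_array = data.find(bytearray(b"y"))
found_int = data.find(ord("y"))  # a single byte value is also accepted
print(found_view, found_array, found_int)

# hex() checks the separator type explicitly: str and bytes are fine.
hex_str_sep = data.hex("-")
hex_bytes_sep = data.hex(b"-")
print(hex_str_sep, hex_bytes_sep)

# Per the session above, other buffers are rejected as separators.
try:
    data.hex(bytearray(b"-"))
except TypeError as exc:
    print("TypeError:", exc)
```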

JelleZijlstra · 2022-04-16T05:25:26Z

stdlib/builtins.pyi

    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...
    def __hash__(self) -> int: ...
    @overload
    def __getitem__(self, __i: SupportsIndex) -> int: ...
    @overload
    def __getitem__(self, __s: slice) -> bytes: ...
-    def __add__(self, __s: bytes) -> bytes: ...
+    def __add__(self, __s: ReadableBuffer) -> bytes: ...


>>> b"x" + memoryview(b"y")
b'xy'
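
A short sketch extending the session above to other right-hand operands. It also shows that the left operand decides the result type, which matches the -> bytes return annotation in the diff.

```python
left = b"x"

# Any buffer can appear on the right-hand side of bytes.__add__.
for right in (b"y", bytearray(b"y"), memoryview(b"y")):
    result = left + right
    print(type(right).__name__, "->", type(result).__name__, result)

# With a bytearray on the left, the result is a bytearray instead.
mutable = bytearray(b"x") + memoryview(b"y")
print(type(mutable).__name__, mutable)
```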

JelleZijlstra · 2022-04-16T05:26:23Z

stdlib/builtins.pyi

@@ -667,14 +679,14 @@ class bytearray(MutableSequence[int], ByteString):
    @overload
    def __setitem__(self, __s: slice, __x: Iterable[SupportsIndex] | bytes) -> None: ...
    def __delitem__(self, __i: SupportsIndex | slice) -> None: ...
-    def __add__(self, __s: bytes) -> bytearray: ...
-    def __iadd__(self: Self, __s: Iterable[int]) -> Self: ...


The old __iadd__ annotation was wrong; ba += [1, 2, 3] fails at runtime.
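
A minimal sketch of the behavior described above: in-place addition on a bytearray takes buffers, while an iterable of ints has to go through extend().

```python
ba = bytearray(b"ab")

ba += b"c"               # bytes: accepted
ba += memoryview(b"d")   # any buffer: accepted

try:
    ba += [1, 2, 3]      # a list of ints is not a buffer
except TypeError as exc:
    print("TypeError:", exc)

# extend() is the method that accepts an iterable of ints.
ba.extend([1, 2, 3])
print(ba)
```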

JelleZijlstra · 2022-04-16T05:26:59Z

stdlib/builtins.pyi

@@ -1352,7 +1364,7 @@ def open(
    closefd: bool = ...,
    opener: _Opener | None = ...,
 ) -> IO[Any]: ...
-def ord(__c: str | bytes) -> int: ...
+def ord(__c: str | bytes | bytearray) -> int: ...


>>> ord(memoryview(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ord() expected string of length 1, but memoryview found
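
For comparison, a sketch covering all four argument types in one place. The helper try_ord is a hypothetical name; it returns None where ord() raises TypeError.

```python
def try_ord(value):
    """Return ord(value), or None if ord() rejects the argument (hypothetical helper)."""
    try:
        return ord(value)
    except TypeError:
        return None


# str, bytes and bytearray of length 1 are accepted; memoryview is not.
print(try_ord("x"), try_ord(b"x"), try_ord(bytearray(b"x")), try_ord(memoryview(b"x")))
```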

github-actions · 2022-04-16T05:36:47Z

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

srittau

Thanks, I didn't double check, but the changes look reasonable.

srittau · 2022-04-16T12:13:21Z

stdlib/builtins.pyi

-    def join(self, __iterable_of_bytes: Iterable[ByteString | memoryview]) -> bytes: ...
-    def ljust(self, __width: SupportsIndex, __fillchar: bytes = ...) -> bytes: ...
+    def join(self, __iterable_of_bytes: Iterable[ReadableBuffer]) -> bytes: ...
+    def ljust(self, __width: SupportsIndex, __fillchar: bytes | bytearray = ...) -> bytes: ...


Unfortunately, this will also accept memoryview at the moment, but having it more explicit can't hurt.

That's a mypy bug :)

It's working as documented. In the past when reviewing I've always asked people to remove bytearray from argument types due to that.
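
To make the runtime side of this exchange concrete, here is a sketch (not from the PR) of what join() and ljust() accept when executed. The point under discussion is that a type checker which treats memoryview as compatible with bytes would not flag the last call, even though it fails at runtime.

```python
# join() accepts any mix of buffer objects.
parts = [b"a", bytearray(b"b"), memoryview(b"c")]
joined = b"-".join(parts)
print(joined)

# ljust() accepts a length-1 bytes or bytearray as the fill character.
padded = b"x".ljust(3, bytearray(b"."))
print(padded)

# A memoryview fill character is rejected at runtime.
try:
    b"x".ljust(3, memoryview(b"."))
    memoryview_fill_ok = True
except TypeError as exc:
    memoryview_fill_ok = False
    print("TypeError:", exc)
```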

Update stdlib/builtins.pyi

446691f

type ignore

a7b6ab8

srittau approved these changes Apr 16, 2022

srittau merged commit ee09d9e into python:master Apr 16, 2022

JelleZijlstra deleted the bytes branch April 16, 2022 13:57
