Skip to content

Enable UTF-8 mode by default (implement PEP 686) #133711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
AA-Turner opened this issue May 8, 2025 · 1 comment
Open

Enable UTF-8 mode by default (implement PEP 686) #133711

AA-Turner opened this issue May 8, 2025 · 1 comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@AA-Turner
Copy link
Member

AA-Turner commented May 8, 2025

@AA-Turner AA-Turner added the type-feature A feature request or enhancement label May 8, 2025
@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label May 8, 2025
@serhiy-storchaka
Copy link
Member

Please wait until we fix all errors in the tests and the stdlib when running in a non-UTF-8 locale (see #133677). Then ensure that all tests are still passed in a non-UTF-8 locale after this change.

vstinner added a commit to vstinner/cpython that referenced this issue May 28, 2025
Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
vstinner added a commit to vstinner/cpython that referenced this issue May 28, 2025
vstinner added a commit to vstinner/cpython that referenced this issue May 28, 2025
vstinner added a commit to vstinner/cpython that referenced this issue May 28, 2025
Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
vstinner added a commit that referenced this issue May 28, 2025
Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 28, 2025
Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
(cherry picked from commit 9161827)

Co-authored-by: Victor Stinner <vstinner@python.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 28, 2025
Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
(cherry picked from commit 9161827)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue May 28, 2025
vstinner added a commit that referenced this issue May 28, 2025
Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 28, 2025
…thonGH-134841)

Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
(cherry picked from commit 4635115)

Co-authored-by: Victor Stinner <vstinner@python.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 28, 2025
…thonGH-134841)

Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
(cherry picked from commit 4635115)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue May 28, 2025
…4843)

gh-133711: Fix test_regrtest for PYTHONUTF8=1 (GH-134839)

Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
(cherry picked from commit 9161827)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue May 28, 2025
…4842)

gh-133711: Fix test_regrtest for PYTHONUTF8=1 (GH-134839)

Use "backslashreplace" error handler to decode stdout and stderr.
Example:

    vstinner@WIN C:\victor\python\main\build\test_python_worker_8360\x91>
    "C:\victor\python\main\PCbuild\amd64\python_d.exe"  -m test
    --fast-ci --slow-ci --testdir
    C:\Users\vstinner\AppData\Local\Temp\tmp0t59e8da
    test_regrtest_noop1 test_regrtest_noop2 test_regrtest_noop3
    test_regrtest_noop4

Notice the "\x91" byte at the end of the first line: it's the
non-ASCII U+00E6 character encoded to the OEM cp437 code page.
(cherry picked from commit 9161827)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue May 28, 2025
…H-134841) (#134852)

gh-133711: Fix test_readline.test_nonascii() for UTF-8 Mode (GH-134841)

Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
(cherry picked from commit 4635115)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue May 28, 2025
…H-134841) (#134851)

gh-133711: Fix test_readline.test_nonascii() for UTF-8 Mode (GH-134841)

Skip the test if the Python UTF-8 Mode is enabled and the LC_CTYPE
encoding is not UTF-8.
(cherry picked from commit 4635115)

Co-authored-by: Victor Stinner <vstinner@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants