GH-133711: Enable UTF-8 mode by default (PEP 686) #133712

Open
wants to merge 11 commits into
base: main

AA-Turner
Member

@AA-Turner AA-Turner commented May 8, 2025

* Python UTF-8 mode is now enabled by default.
  It may be disabled by setting :envvar:`PYTHONUTF8=0 <PYTHONUTF8>` as
  an environment variable or by using the :option:`-X utf8=0 <-X>` flag.
  See :pep:`686` for further details.
Member

I feel like we can probably put some more explanation in here, such as that it affects TextIOWrapper and hence open(). The current description doesn't sound as scary as it needs to, in my opinion.

Along the lines of: "Python UTF-8 mode is now enabled by default. This means that (files/console/etc.) will now use UTF-8 regardless of system settings, unless specifically overridden in code (typically with an encoding= argument)."
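
To make the `encoding=` point concrete, here is a minimal sketch of how an explicit argument pins a file's encoding whatever the mode or locale says. The file name `sample.txt` and the sample string are illustrative assumptions, not something taken from this PR.

```python
import os
import sys
import tempfile

# 1 when UTF-8 Mode is active, 0 otherwise.
print("UTF-8 Mode:", sys.flags.utf8_mode)

text = "na\u00efve caf\u00e9"  # non-ASCII sample text (illustrative)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # An explicit encoding= always wins over the mode and the locale.
    with open(path, "w", encoding="latin-1") as f:
        f.write(text)
        explicit_encoding = f.encoding

    with open(path, encoding="latin-1") as f:
        roundtrip = f.read()

    # Without encoding=, the default follows UTF-8 Mode, or the locale
    # when the mode is disabled. Nothing is read here, so nothing is decoded.
    with open(path) as f:
        default_encoding = f.encoding

print("explicit:", explicit_encoding)
print("default: ", default_encoding)
```

Code that already passes `encoding=` everywhere should see no change in behaviour; only the last `open()` call depends on the new default.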

Member

To be clear, it's nothing new. But we shouldn't assume that everyone already knows what UTF-8 mode implies. There are many more people out there who haven't ever thought about it than those who are waiting for it to be the default.

Member

Another effect of the UTF-8 Mode is that Python ignores the locale encoding.
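
A rough way to observe this from Python is sketched below. The helper name `encoding_report` is hypothetical; it only calls standard library functions, and `locale.getencoding()` is guarded because it exists only on Python 3.11 and later.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python reports in the current process."""
    report = {
        "utf8_mode": sys.flags.utf8_mode,
        # Honours UTF-8 Mode: reports UTF-8 when the mode is on.
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }
    # locale.getencoding() ignores UTF-8 Mode and reports the real
    # locale encoding, so it can differ from "preferred".
    if hasattr(locale, "getencoding"):
        report["locale"] = locale.getencoding()
    return report


report = encoding_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

When the mode is on, `preferred` and `locale` may disagree on systems whose locale is not UTF-8, which is the effect described above.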

@FFY00 FFY00 removed their request for review May 10, 2025 02:28
@@ -75,7 +75,30 @@ New features
Other language changes
======================


* Python now uses UTF-8_ as the default encoding, independent of the system's
Member

You might mention the UTF-8 Mode earlier since it has other side effects documented in the UTF-8 Mode section, such as changing sys.stdout error handler and ignoring the locale encoding.
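
For readers who have not met these side effects, a small sketch that starts a child interpreter with the mode forced on and reports its standard stream settings. The helper name `stdio_settings` is hypothetical, and the sketch assumes `PYTHONIOENCODING` should be cleared so that it cannot override the result.

```python
import os
import subprocess
import sys


def stdio_settings(utf8_flag):
    """Return (stdout encoding, stdout errors, stderr errors) of a child
    interpreter started with -X utf8=<utf8_flag>."""
    # Drop variables that would otherwise override the stream settings.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONIOENCODING", "PYTHONUTF8")
    }
    code = (
        "import sys; "
        "print(sys.stdout.encoding, sys.stdout.errors, sys.stderr.errors)"
    )
    result = subprocess.run(
        [sys.executable, "-X", f"utf8={utf8_flag}", "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    encoding, stdout_errors, stderr_errors = result.stdout.split()
    return encoding, stdout_errors, stderr_errors


print("UTF-8 Mode on: ", stdio_settings(1))
print("UTF-8 Mode off:", stdio_settings(0))
```

With the mode on, `sys.stdout` uses UTF-8 with the `surrogateescape` handler, while `sys.stderr` keeps `backslashreplace`; the values with the mode off depend on the platform and locale.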

AA-Turner added 2 commits May 26, 2025 10:32
# Conflicts:
#	Lib/test/test_cmd_line.py
@AA-Turner AA-Turner added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 26, 2025
@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 26, 2025
@vstinner
Member

test_regrtest failed:

I wrote #134839 to fix test_regrtest when PYTHONUTF8=1 env var is set.

@vstinner
Member

I can also reproduce the [test_readline] issue in the main branch using the fr_FR locale and the command:
PYTHONUTF8=1 LANG=fr_FR ./python -m test -v test_readline -u all

I wrote #134841 to fix test_readline for the Python UTF-8 Mode.

@AA-Turner AA-Turner added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 28, 2025
@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 28, 2025
@vstinner
Member

buildbot/wasm32-wasi Non-Debug PR — Build done.

Configure host Python fails with:

checking for /dev/ptmx... not set
configure: error: set ac_cv_file__dev_ptmx to yes/no in your CONFIG_SITE file when cross compiling

It seems to be unrelated to this PR.

@AA-Turner
Member Author

> buildbot/wasm32-wasi Non-Debug PR — Build done.
>
> Configure host Python fails with:
>
> checking for /dev/ptmx... not set
> configure: error: set ac_cv_file__dev_ptmx to yes/no in your CONFIG_SITE file when cross compiling
>
> It seems to be unrelated to this PR.

cc @brettcannon, this builder seems to be consistently failing since ~15/05.

Member

@vstinner vstinner left a comment

LGTM.


Python UTF-8 mode is now enabled by default (:pep:`686`).
It may be disabled by setting :envvar:`PYTHONUTF8=0 <PYTHONUTF8>` as
an environment variable or by using the :option:`-X utf8=0 <-X>` flag.
Member

Suggested change
- an environment variable or by using the :option:`-X utf8=0 <-X>` flag.
+ an environment variable or by using the :option:`-X utf8=0 <-X>` command line option.
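
As a quick check of the two opt-out switches named in this entry, the sketch below starts a child interpreter and reads back `sys.flags.utf8_mode`. The helper name `utf8_mode_of_child` and its parameters are hypothetical, introduced only for illustration.

```python
import os
import subprocess
import sys


def utf8_mode_of_child(x_option=None, env_value=None):
    """Return sys.flags.utf8_mode as seen by a freshly started interpreter.

    x_option:  value passed as -X utf8=<value>, or None to omit the option.
    env_value: value for PYTHONUTF8, or None to leave the variable unset.
    """
    # Start from the current environment without any inherited PYTHONUTF8.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUTF8"}
    if env_value is not None:
        env["PYTHONUTF8"] = env_value

    cmd = [sys.executable]
    if x_option is not None:
        cmd += ["-X", f"utf8={x_option}"]
    cmd += ["-c", "import sys; print(sys.flags.utf8_mode)"]

    result = subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


print("-X utf8=0    ->", utf8_mode_of_child(x_option="0"))
print("PYTHONUTF8=0 ->", utf8_mode_of_child(env_value="0"))
print("no override  ->", utf8_mode_of_child())
```

The last line reports the interpreter's own default, so it differs between a build that includes this change and one that does not.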

Comment on lines +108 to +109
Enabled by default (equal to 1; PEP 686), or if Py_UTF8Mode=1,
or if "-X utf8=1" or PYTHONUTF8=1.
Member

Suggested change
Enabled by default (equal to 1; PEP 686), or if Py_UTF8Mode=1,
or if "-X utf8=1" or PYTHONUTF8=1.
Enabled by default (equal to 1; PEP 686), or if Py_UTF8Mode=1,
or if "-X utf8=1" or PYTHONUTF8=1.

6 participants