@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:
 
    If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
    setting for the *category*. The available categories are listed in the data
-   description below. *locale* may be a string, or an iterable of two strings
-   (language code and encoding). If it's an iterable, it's converted to a locale
-   name using the locale aliasing engine. An empty string specifies the user's
+   description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
+   language code and encoding. An empty string specifies the user's
    default settings. If the modification of the locale fails, the exception
    :exc:`Error` is raised. If successful, the new locale setting is returned.
 
+   If *locale* is a pair, it is converted to a locale name using
+   the locale aliasing engine.
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding can be ``None``.
+
    If *locale* is omitted or ``None``, the current setting for *category* is
    returned.
 
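To illustrate the behavior this hunk documents, here is a minimal sketch of querying the current setting and of passing a (language code, encoding) pair. The helper names `describe_category` and `try_set_locale` are hypothetical, and the pair `("en_US", "UTF-8")` is only an assumed example; whether that locale is installed depends on the system, so the sketch treats failure as a normal outcome.

```python
import locale


def describe_category(category=locale.LC_CTYPE):
    """Return the current setting for *category* without modifying it."""
    # With the locale argument omitted, setlocale() only queries.
    return locale.setlocale(category)


def try_set_locale(category, value):
    """Try to set *category* to *value* (a string or a pair).

    Returns the new setting, or None if the platform rejects the locale.
    """
    try:
        return locale.setlocale(category, value)
    except locale.Error:
        return None


saved = describe_category(locale.LC_CTYPE)

# Hypothetical example pair: language code and encoding.
result = try_set_locale(locale.LC_CTYPE, ("en_US", "UTF-8"))
print("pair accepted:", result is not None)

# Restore the setting that was active before the experiment.
locale.setlocale(locale.LC_CTYPE, saved)
```

Catching `locale.Error` rather than letting it propagate is a choice made for the sketch only; real code may prefer to surface the failure.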
@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
    ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
    ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
 
-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.
 
    .. deprecated-removed:: 3.11 3.15
 
 
 .. function:: getlocale(category=LC_CTYPE)
 
-   Returns the current setting for the given locale category as sequence containing
-   *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
-   except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+   Returns the current setting for the given locale category as a tuple containing
+   the language code and encoding. *category* may be one of the :const:`!LC_\*`
+   values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
 
-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.
 
 
 .. function:: getpreferredencoding(do_setlocale=True)
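The statement added in this hunk, that the "C" locale is represented as ``(None, None)``, can be checked with a short sketch. The helper name `getlocale_under_c` is hypothetical; it temporarily switches the category to "C" and restores the previous setting afterwards.

```python
import locale


def getlocale_under_c(category=locale.LC_CTYPE):
    """Return what getlocale() reports while *category* is set to "C"."""
    saved = locale.setlocale(category)  # query only, no change
    try:
        locale.setlocale(category, "C")
        return locale.getlocale(category)
    finally:
        # Put the original setting back even if something above failed.
        locale.setlocale(category, saved)


pair = getlocale_under_c()
print(pair)
```

Note that `setlocale` affects the whole process, so a helper like this is not safe to call while other threads depend on the locale.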
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
 part of a character class such as letter or whitespace.
 
 
+.. _locale_name:
+
+Locale names
+------------
+
+The format of the locale name is platform dependent, and the set of supported
+locales can depend on the system configuration.
+
+On Posix platforms, it usually has the format [1]_:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset] ["@" modifier]
+
+where *language* is a two- or three-letter language code from `ISO 639`_,
+*territory* is a two-letter country or region code from `ISO 3166`_,
+*charset* is a locale encoding, and *modifier* is a script name,
+a language subtag, a sort order identifier, or other locale modifier
+(for example, "latin", "valencia", "stroke" and "euro").
+
+On Windows, several formats are supported. [2]_ [3]_
+A subset of `IETF BCP 47`_ tags:
+
+.. productionlist:: locale_name
+   : language ["-" script] ["-" territory] ["." charset]
+   : language ["-" script] "-" territory "-" modifier
+
+where *language* and *territory* have the same meaning as in Posix,
+*script* is a four-letter script code from `ISO 15924`_,
+and *modifier* is a language subtag, a sort order identifier
+or custom modifier (for example, "valencia", "stroke" or "x-python").
+Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
+Only UTF-8 encoding is allowed for BCP 47 tags.
+
+Windows also supports locale names in the format:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset]
+
+where *language* and *territory* are full names, such as "English" and
+"United States", and *charset* is either a code page number (for example, "1252")
+or UTF-8.
+Only the underscore separator is supported in this format.
+
+The "C" locale is supported on all platforms.
+
+.. _ISO 639: https://www.iso.org/iso-639-language-code
+.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
+.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
+.. _ISO 15924: https://www.unicode.org/iso15924/
+
+.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
+.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
+.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
+
+
 .. _embedding-locale:
 
 For extension writers and programs that embed Python
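The Posix grammar introduced in the "Locale names" hunk can be sketched as a regular expression. This is an illustration only, not how the :mod:`locale` module parses names: the function `parse_posix_locale_name` and the pattern are hypothetical, and the pattern follows the "usual" shape quoted in the diff (two- or three-letter language, two-letter territory), so names outside that shape, including "C", are reported as unmatched.

```python
import re

# language ["_" territory] ["." charset] ["@" modifier]
_POSIX_LOCALE_NAME = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def parse_posix_locale_name(name):
    """Split *name* into its grammar components.

    Returns a dict with the keys language, territory, charset and
    modifier (missing parts are None), or None if *name* does not
    follow the simplified grammar above.
    """
    match = _POSIX_LOCALE_NAME.fullmatch(name)
    return match.groupdict() if match else None


print(parse_posix_locale_name("ca_ES.UTF-8@valencia"))
print(parse_posix_locale_name("sr_RS@latin"))
print(parse_posix_locale_name("C"))  # not covered by the grammar
```

The Windows formats described in the same hunk use different separators and full names, so they would need separate patterns.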