locale.strxfrm() may improperly use PyUnicode_FromWideChar() #138247

# Bug report

`wcsxfrm()` produces a sequence of `wchar_t` values that can be compared using `wcscmp()`. There is no promise that the resulting string can be interpreted as text in any way; all you can do with it is compare it with another `wcsxfrm()` result, `wchar_t` by `wchar_t`.
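As a minimal sketch of the only supported use of a transformed string, the snippet below treats the result of `locale.strxfrm()` as an opaque sort key and checks that it orders a list the same way as `locale.strcoll()`. The word list is made up for illustration, and the snippet assumes the process is still in its default collation locale.

```python
import functools
import locale

# Hypothetical sample data, not taken from the report.
words = ["pear", "Apple", "banana", "apple"]

# The transformed strings are used only as comparison keys;
# their contents are never inspected or printed as text.
sorted_by_key = sorted(words, key=locale.strxfrm)

# Reference ordering obtained by comparing the original strings pairwise.
sorted_by_coll = sorted(words, key=functools.cmp_to_key(locale.strcoll))
```

Both lists should be identical, since comparing transformed strings is meant to give the same result as `strcoll()` on the originals.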

For example, if `wchar_t` is 32-bit, the result can contain values larger than 0x10FFFF, but Python strings can only contain Unicode code points in the range 0 to 0x10FFFF. If `wchar_t` is 16-bit, a surrogate pair should not be interpreted as a single code point with a value larger than 0xFFFF, because this breaks the ordering when results are compared `wchar_t` by `wchar_t`. `PyUnicode_FromWideChar()` will fail in the former case and produce a wrong result in the latter case.
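Both failure modes can be illustrated in pure Python. This is a sketch, not CPython's actual conversion code: `combine_surrogates()` and `fits_in_str()` are hypothetical helpers, and the two unit sequences are invented values standing in for 16-bit `wcsxfrm()` output.

```python
def combine_surrogates(units):
    """Merge surrogate pairs in a list of 16-bit units into single code points."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            out.append(0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00))
            i += 2
        else:
            out.append(u)
            i += 1
    return out


def fits_in_str(value):
    """Return True if the value is a valid code point for a Python str."""
    try:
        chr(value)
    except ValueError:
        return False
    return True


# Invented 16-bit transformed strings (assumption, for illustration only).
a = [0xD800, 0xDC00]
b = [0xE000]

# Unit-by-unit comparison, as wcscmp() would do: 0xD800 < 0xE000.
a_before_b_by_unit = a < b

# After merging the pair into one code point: 0x10000 > 0xE000.
a_before_b_by_code_point = combine_surrogates(a) < combine_surrogates(b)

# 32-bit case: a value just above the Unicode range cannot be stored in a str.
too_large_fits = fits_in_str(0x110000)
```

Here `a` sorts before `b` unit by unit, but after `b` once the surrogate pair is merged, so the order is reversed. The value 0x110000 is rejected outright, which corresponds to the 32-bit failure.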

#138242 tries to solve this issue. We need to test on exotic platforms (AIX, Solaris) to check whether it helps.


### Linked PRs
* gh-138242
