Add method to detect if a string contains surrogates #69456

Open
bitdancer opened this issue Sep 29, 2015 · 1 comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode type-feature A feature request or enhancement

Comments

@bitdancer
Member

bitdancer commented Sep 29, 2015

BPO 25269
Nosy @vstinner, @ezio-melotti, @bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-09-29.12:48:03.328>
labels = ['interpreter-core', 'type-feature', 'expert-unicode']
title = 'Add method to detect if a string contains surrogates'
updated_at = <Date 2015-09-29.13:05:02.979>
user = 'https://github.com/bitdancer'

bugs.python.org fields:

activity = <Date 2015-09-29.13:05:02.979>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core', 'Unicode']
creation = <Date 2015-09-29.12:48:03.328>
creator = 'r.david.murray'
dependencies = []
files = []
hgrepos = []
issue_num = 25269
keywords = []
message_count = 1.0
messages = ['251853']
nosy_count = 3.0
nosy_names = ['vstinner', 'ezio.melotti', 'r.david.murray']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue25269'
versions = ['Python 3.6']

@bitdancer
Member Author

Because surrogates are used in several contexts to "smuggle" bytes through string APIs via the surrogateescape error handler, it is very useful to be able to determine whether a given string contains surrogates. The email package, for example, uses different logic when serializing a Message object depending on whether a string contains smuggled bytes. Currently it calls x.encode() and checks for an exception (we determined that for CPython this was the most efficient way to check). It would be better, I think, to have a dedicated method on str for this, among other reasons so that different Python implementations could optimize it appropriately.
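For illustration, here is a minimal sketch of the encode-and-catch check described above. The helper name contains_surrogates is hypothetical (it is not an existing str method or the email package's own code), and the sketch assumes a strict UTF-8 encode, which rejects only surrogate code points.

```python
def contains_surrogates(s: str) -> bool:
    """Return True if s contains a surrogate code point (U+D800..U+DFFF).

    Hypothetical helper: it relies on strict UTF-8 encoding refusing
    surrogates, which is the workaround the issue describes.
    """
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# A byte string that is not valid UTF-8, smuggled into a str
# with the surrogateescape error handler.
raw = b"caf\xe9"
smuggled = raw.decode("utf-8", errors="surrogateescape")

# An ordinary string with no smuggled bytes.
clean = "caf\u00e9"

print(contains_surrogates(smuggled))
print(contains_surrogates(clean))

# The original bytes can be recovered with the same error handler.
print(smuggled.encode("utf-8", errors="surrogateescape") == raw)
```

A dedicated method on str, as proposed here, would let callers avoid building and discarding an encoded copy just to answer a yes/no question.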

(Note that another aspect of dealing with surrogateescaped strings is discussed in bpo-18814.)

@bitdancer bitdancer added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Sep 29, 2015
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022