-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
add \z as a synonym for \Z in Python REs for standardization #133306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
+1, I'm very supportive! Currently it's easy to make a mistake using regexes when checking inputs for security; see Correctly Using Regular Expressions. Many platforms interpret Thanks! |
Hi @serhiy-storchaka, before you spend too much time implementing this I think you should be aware that there is some discussion on the glibc mailing list [1]. Because it is a proposed standardization, it is still subject to change. Paul Eggert, a glibc maintainer, wants to standardize |
As the author of a blog post about this footgun, I am in complete support of this proposal! Thank you for opening this issue. |
Even if |
The glibc discussion was also started by our committee, as a part of the rationalization effort. |
Yes, I experienced that in typing my original message. 😄 Anyways, I don't have super strong opinions. I just wanted to make sure the disagreement was known and considered. Thanks! |
…133314) \Z was an error inherited from PCRE 0.95. It was fixed in PCRE 2.0. In other engines, \Z means not “anchor at string end”, but “anchor before optional newline at string end”. \z means “anchor at string end” in most RE engines.
Feature or enhancement
Proposal:
Hello - I’m with the Austin Common Standards Revision Group - the joint technical working group established to develop and maintain the core open systems interfaces that are the POSIX™ 1003.1 (and former 1003.2) standards, ISO/IEC 9945, and the core of the Single UNIX Specification.
We have had a request to unify/rationalize the regex behaviors for “anchor at string beginning” (^ is the closest in POSIX) and “anchor at string end” ($ is the closest in POSIX). A description of this problem in depth can be found here and a table that scopes the varied solutions across varying languages can be found here.
Our working group has come to the conclusion that \A and \z are widely implemented across many ecosystems and are the most “standard” solution to the issue. We are asking if the Python community would consider adding “\z” as a synonym for “\Z” in their regex lexicon.
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
https://discuss.python.org/t/proposal-add-z-as-a-synonym-for-z-in-python-res-for-standardization/90378/1
Linked PRs
The text was updated successfully, but these errors were encountered: