problem with decode regexp #466
… is case insensitive
And where is the test to catch regressions in the future?
Just to clarify: the problem exists in the .NET implementation of URL encoding - https://stackoverflow.com/questions/918019/net-urlencode-lowercase-problem And the RFC https://tools.ietf.org/html/rfc3986 says: The uppercase hexadecimal digits 'A' through 'F' are equivalent to the lowercase digits 'a' through 'f', respectively.
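To illustrate the equivalence the RFC describes, here is a minimal sketch using the built-in `decodeURIComponent()`. The sample value (the Cyrillic letter "п", UTF-8 bytes D0 BF) is chosen only for illustration and does not come from the original report.

```javascript
// The same Cyrillic letter "п" (U+043F, UTF-8 bytes D0 BF),
// percent-encoded once with uppercase and once with lowercase hex digits.
const upperEscaped = '%D0%BF';
const lowerEscaped = '%d0%bf';

const fromUpper = decodeURIComponent(upperEscaped);
const fromLower = decodeURIComponent(lowerEscaped);

// Both spellings decode to the same character.
console.log(fromUpper === fromLower); // true
```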
@mac2000 Thank you for the research! The RFC says:
But it also says:
I can see a couple of options:
Be aware that option 1 might have other edge-case regressions. Which one should we go for?
@FagnerMartinsBrack thanks for your participation, but I don't really agree with your point about relying on Both And if someone is relying on this, they are definitely doing something wrong. The right question is what should be done about regressions, but this is a place where only you guys can help
While writing this I realized that there is a chance to solve the problem in the following way: the hex escapes come in pairs (at least for Cyrillic), so maybe we can figure out how to make the regexp something like
Lowercase escapes are legal, `decodeURIComponent()` supports them, and we should too. The most difficult part is knowing when we are dealing with hexadecimal pairs and when we are not. Supporting only uppercase characters made interpreting strings slightly more constrained, but even then we might encounter strings that look like encoded characters but aren't, for instance "%A1". The solution is to no longer permit strings that mix unencoded and encoded characters, which seems like a rather rare edge case to support. Either a cookie value is a fully URL-encoded string (which is ensured when we write the cookie via our own API) or we treat it as non-readable. In such fully encoded strings "%" would appear as "%25", e.g. "%A1" would look like "%25A1". Anyone who needs to work with such mixed-encoding cookie values would need to implement their own specialized converter. Closing #466
Fixed: 18ed0fd
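The approach described in the closing comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the referenced commit: the helper names `decodeCookieValue` and `readCookieValue` are hypothetical, and "non-readable" is assumed here to mean returning `undefined`.

```javascript
// Hypothetical helper names; an illustration, not the library's source.

// Decode runs of percent-escapes, accepting hex digits in either case (`i` flag).
function decodeCookieValue(value) {
  return value.replace(/(%[0-9A-F]{2})+/gi, decodeURIComponent);
}

// Treat a value that fails to decode as non-readable (assumed: undefined).
function readCookieValue(value) {
  try {
    return decodeCookieValue(value);
  } catch (error) {
    return undefined;
  }
}

const lower = readCookieValue('%d0%bf'); // lowercase escapes
const upper = readCookieValue('%D0%BF'); // uppercase escapes
console.log(lower === upper);            // true: both decode to the same letter
console.log(readCookieValue('%25A1'));   // "%A1": the literal percent sign was written as %25
console.log(readCookieValue('%A1'));     // undefined: looks encoded, but is not valid UTF-8
```

With this shape, a value written through a fully encoding writer round-trips, while a raw "%A1" is rejected instead of being half-decoded.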
Hi, I found a problem in the case where a cookie comes to me from the back end and part of its value is Cyrillic text, percent-encoded in lowercase.
I could not understand why the `get()` method returns `undefined`. A little debugging made the situation clear: the RegExp used with `decodeURIComponent` matches only uppercase parts, so inside the `decode()` function a `URI malformed` error is thrown (because of an incorrect string). So I am sending you my PR with a simple solution to this problem.
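The failure can be reproduced with a small sketch. The uppercase-only pattern below is an assumed reconstruction of the old behaviour, not a copy of the library's source, and the function names and sample value (the Cyrillic letter "А", UTF-8 bytes D0 90, escaped in lowercase) are hypothetical.

```javascript
// Assumed shape of an uppercase-only decoder (reconstruction for illustration).
function decodeUppercaseOnly(value) {
  return value.replace(/(%[0-9A-F]{2})+/g, decodeURIComponent);
}

// The same decoder with the case-insensitive flag added.
function decodeAnyCase(value) {
  return value.replace(/(%[0-9A-F]{2})+/gi, decodeURIComponent);
}

// "%d0" is skipped because "d" is lowercase, but "%90" consists of digits only,
// so it matches alone and decodeURIComponent receives an incomplete byte sequence.
const lowercaseEscaped = '%d0%90';

let failure = null;
try {
  decodeUppercaseOnly(lowercaseEscaped);
} catch (error) {
  failure = error; // URIError ("URI malformed" in V8)
}

console.log(failure instanceof URIError); // true
console.log(decodeAnyCase(lowercaseEscaped) === decodeURIComponent('%D0%90')); // true
```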