Skip to content

Conversation

vstinner
Copy link
Member

The current regex based splitting produces a wrong result. For example::

http://abc#@def

Web browsers parse that URL as http://abc/#@def, that is, the host
is abc, the path is /, and the fragment is #@def.
(cherry picked from commit 90e01e5)

@@ -865,7 +865,7 @@ def splithost(url):
"""splithost('//host[:port]/path') --> 'host[:port]', '/path'."""
global _hostprog
if _hostprog is None:
_hostprog = re.compile('^//([^/?]*)(.*)$')
_hostprog = re.compile('//([^/#?]*)(.*)', re.DOTALL)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the usage of the DOTALL flag, see:
http://bugs.python.org/issue30500#msg296422

@vstinner
Copy link
Member Author

@larryhastings: Larry, would you mind to review and merge this fix please?

It fixes an important security vulnerability:
http://python-security.readthedocs.io/vuln/bpo-30500_urllib_connects_to_a_wrong_host.html

cc @ned-deily.

@vstinner
Copy link
Member Author

I tested manually that "./python -m test test_urlparse" pass. Sadly, 3.4 has no pre-commit CI yet.

@larryhastings
Copy link
Contributor

I accepted a PR from Serhiy and now there's a conflict from Misc/NEWS. Do you mind changing it to NEWS.d?

@larryhastings
Copy link
Contributor

I'll accept this PR once you fix the conflicts.

The current regex based splitting produces a wrong result. For example::

  http://abc#@def

Web browsers parse that URL as ``http://abc/#@def``, that is, the host
is ``abc``, the path is ``/``, and the fragment is ``#@def``.
(cherry picked from commit 90e01e5)
@vstinner
Copy link
Member Author

Victor: "I tested manually that "./python -m test test_urlparse" pass. Sadly, 3.4 has no pre-commit CI yet."

I proposed PR #2475 to add CIs.

Larry: "I accepted a PR from Serhiy and now there's a conflict from Misc/NEWS. Do you mind changing it to NEWS.d?"

Sure, I created a NEWS.d entry and rebased my change.

@larryhastings
Copy link
Contributor

I'm willing to consider PR 2475 for 3.4, but we can discuss it over there.

@larryhastings larryhastings merged commit cc54c1c into python:3.4 Jul 12, 2017
@vstinner vstinner deleted the splithost34 branch July 12, 2017 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants