Skip to content

GH-123599: url2pathname(): don't call gethostbyname() by default #132610

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

barneygale
Copy link
Contributor

@barneygale barneygale commented Apr 16, 2025

Follow-up to 66cdb2b.

Add resolve_netloc keyword-only argument to url2pathname(), defaulting to false. When set to true, we call socket.gethostbyname() to resolve the URL authority (netloc).

…fault

Follow-up to 0879ebc.

Add *resolve_netloc* keyword-only argument to `url2pathname()`, defaulting
to false. When set to true, we call `socket.gethostbyname()` to resolve
the URL authority (netloc).
@barneygale barneygale marked this pull request as ready for review April 16, 2025 19:58
@@ -1494,6 +1494,8 @@ def _is_local_authority(authority):
if authority == hostname:
return True
# Compare IP addresses
if not resolve:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, should we check this before calling gethostname() which can fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to come up with firm info on when gethostname() fails (or whether it can perform network access). Do you have any pointers? Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know.

But I was thinking about this case: processing the "file:" URL on different computer. We cannot use gethostname(), because it returns a different result. If we want to get a path on other computer, we should ignore the host name. There will still be problem with processing Windows paths on Posix and Posix paths on Windows.

I do not know how common this case, and how common the opposite case, when we only use gethostname(), but not gethostbyname(). We should guess what would be more useful for users without having any data.

barneygale and others added 2 commits April 28, 2025 17:48
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
@barneygale
Copy link
Contributor Author

Thanks for the reviews - sorry for my slow response


.. versionchanged:: next
The *resolve_netloc* argument was added.
The *require_scheme* and *resolve_host* arguments were added.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameters

discard authority if it resolves to a local IP address; otherwise
4. On Windows, return a UNC path; otherwise
5. (New) Raise :exc:`urllib.error.URLError`.
Add *resolve_host* keyword-only argument to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameter

@@ -1494,6 +1494,8 @@ def _is_local_authority(authority):
if authority == hostname:
return True
# Compare IP addresses
if not resolve:
return False
try:
address = socket.gethostbyname(authority)
except (socket.gaierror, AttributeError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except (socket.gaierror, AttributeError):
except (socket.gaierror, AttributeError, UnicodeEncodeError):

gethostbyname tries to encode with IDNA, which can fail (e.g. on the unencodable characters we put into paths for our test suite)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants