Skip to content

gh-104306: Fix incorrect comment handling in the netrc module, minor refactor #104511

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

sleiderr
Copy link

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ghost
Copy link

ghost commented May 15, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ssbarnea
Copy link
Contributor

ssbarnea commented Jan 4, 2025

Any chance we can revive this? I already spend few hours trying to identify why I was getting 404 errors from github.com, just to discover that it was a commented line in my ~/.netrc, one that was ignored correctly by curl.

@@ -0,0 +1 @@
Fix incorrect comment parsing in the :mod:`netrc` module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on this and include context that would give an arbitrary changelog reader an idea of how the change might impact them?

@@ -309,5 +329,9 @@ def test_security(self):
('anonymous', '', 'pass'))


def test_main():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

@webknjaz
Copy link
Contributor

webknjaz commented Jan 6, 2025

@sleiderr the CI is failing with this PR. Here I've extracted the relevant part of the log from one of the jobs:

1 test failed:
    test_netrc

435 tests OK.

0:12:34 load avg: 5.15 Re-running 1 failed tests in verbose mode in subprocesses
0:12:34 load avg: 5.15 Run 1 test in parallel using 1 worker process (timeout: 10 min, worker timeout: 15 min)
0:12:34 load avg: 5.15 [1/1/1] test_netrc failed (uncaught exception)
Re-running test_netrc in verbose mode
test test_netrc crashed -- Traceback (most recent call last):
  File "D:\a\cpython\cpython\Lib\test\libregrtest\single.py", line 184, in _runtest_env_changed_exc
    _load_run_test(result, runtests)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "D:\a\cpython\cpython\Lib\test\libregrtest\single.py", line 129, in _load_run_test
    test_mod = importlib.import_module(module_name)
  File "D:\a\cpython\cpython\Lib\importlib\__init__.py", line 88, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1386, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1359, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1330, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 756, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "D:\a\cpython\cpython\Lib\test\test_netrc.py", line 6, in <module>
    from test.support import os_helper, run_unittest
ImportError: cannot import name 'run_unittest' from 'test.support' (D:\a\cpython\cpython\Lib\test\support\__init__.py)
1 test failed again:
    test_netrc

== Tests result: FAILURE then FAILURE ==

(https://github.com/python/cpython/actions/runs/12611674426/job/35147581981?pr=104511#step:6:631)

import sys
import textwrap
import unittest
from test.support import os_helper, run_unittest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes a ImportError: cannot import name 'run_unittest' from 'test.support' (#104511 (comment)).

Copy link
Contributor

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that dropping the unrelated changes may fix #104511 (comment).

sleiderr and others added 2 commits January 6, 2025 21:37
Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
@webknjaz
Copy link
Contributor

webknjaz commented Jan 6, 2025

0:00:07 load avg: 9.38 [ 59/482] test_netrc passed

Lib/netrc.py Outdated
@@ -187,6 +187,3 @@ def __repr__(self):
rep += line
rep += "\n"
return rep
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert deleting the CLI entry point logic:

Suggested change
return rep
return rep
if __name__ == '__main__':
print(netrc())

Lib/netrc.py Outdated
Comment on lines 121 to 122
raise NetrcParseError(
"missing %r name" % tt, file, lexer.lineno)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like an irrelevant formatting change

Suggested change
raise NetrcParseError(
"missing %r name" % tt, file, lexer.lineno)
raise NetrcParseError("missing %r name" % tt, file, lexer.lineno)

Lib/netrc.py Outdated
if lexer.lineno == saved_lineno and len(tt) == 1:
# For top level tokens, we skip line if the # is followed
# by a space / newline. Otherwise, we only skip the token.
if tt == '#' and not lexer.dontskip:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand why the entirety of t is being compared to a '#'. Is this semantically “the entire token consists of just #”?
This could be

Suggested change
if tt == '#' and not lexer.dontskip:
if len(tt) == 1 and not lexer.dontskip:

but then, why does it matter whether it's # thing vs #thing? Typically, comment parsers just disregard whatever's after the hash and don't interpret that in any way…

Lib/netrc.py Outdated
if ch in self.whitespace and not enquoted:
if token == "":
continue
if ch == '\n':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about \r? \r\n?

Lib/netrc.py Outdated
for ch in fiter:
if ch in self.whitespace:
enquoted = False
while ch := self._read_char():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this refactoring necessary to fix the bug? It's rather difficult to read the diff with so many lines reshuffled. If I were to guess, this might be the reason people are postponing doing reviews on this PR.

Lib/netrc.py Outdated
Comment on lines 5 to 6
import os
import stat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's undo formatting to make sure only relevant changes are visible in the diff.

Suggested change
import os
import stat
import os, stat

Lib/netrc.py Outdated
return self.hosts['default']
else:
return None
return self.hosts.get(host, self.hosts.get('default'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling that this might not be relevant to the fix and would be better in a separate refactoring PR.

@sleiderr
Copy link
Author

sleiderr commented Jan 7, 2025

Hi @webknjaz - thanks for reviewing this pull request.

I agree that most of the changes were not relevant to the actual bug fix and were more confusing than anything else (even for myself, one year later).

I've reverted to the upstream version of the module, and fixed the bug in (hopefully) a clearer manner.
Trailing new lines were messing up a check that I assume was supposed to determine whether a token was the the last one on a line, to then ensure that we do not skip line twice when parsing comment. But the way this check is implemented (checking if the current line number increased after calling the lever) is not quite correct, especially with extra blank lines that increase the current line number, but do not necessarily mean that the token was the last on its line.

Removing extra new lines before storing the line count fixes that issue, I've added a few test cases based on the original bug report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants