Skip to content

bpo-33927: Handle pyton -m json.tool --json-lines infile infile #29478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions Lib/json/tool.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,10 @@ def main():
if options.outfile is None:
out = sys.stdout
else:
if (options.outfile.exists() and
options.outfile.expanduser().samefile(infile.name)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no guarantee that samefile() works as expected for all cases.
For example,

$ ./python.exe x.py | ./python.exe -m json.tool --json-lines - out.jsonl
Traceback (most recent call last):
  File "/Users/inada-n/work/python/cpython/Lib/pathlib.py", line 1006, in samefile
    other_st = other_path.stat()
               ^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'stat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/inada-n/work/python/cpython/Lib/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/inada-n/work/python/cpython/Lib/runpy.py", line 86, in _run_code
    exec(code, run_globals)
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/inada-n/work/python/cpython/Lib/json/tool.py", line 87, in <module>
    main()
    ^^^^^^
  File "/Users/inada-n/work/python/cpython/Lib/json/tool.py", line 73, in main
    options.outfile.expanduser().samefile(infile.name)):
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/inada-n/work/python/cpython/Lib/pathlib.py", line 1008, in samefile
    other_st = self._accessor.stat(other_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '<stdin>'

There may be other cases that samefile() don't work as your expect.

Additionally, if we don't load the whole file at once, we shouldn't do that even when infile and outfile is same file.

So I think creating a temporary file and rename on success is the right way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's rude, github... I had written a response and pressed "Resolve" and it didn't post my comment. Anyway the gist was:

[...]

There may be other cases that samefile() don't work as your expect.

Thanks for the counterexample the samefile() docs clearly link to the os.stat cases, I should have followed them.

Additionally, if we don't load the whole file at once, we shouldn't do that even when infile and outfile is same file.

So I think creating a temporary file and rename on success is the right way.

And while I think the tempfile idiom/technique has valid applications, the aim I had with #29273 was to implement what sponge and sort -o do, with little added complexity. Having seen the mess tools like pip and pipenv make of my /tmp, I feel uncomfortable introducing it into a small tool like json.tool.

So like you said

json.tool is a toy/demo program. So I don't think we need to support every case.

I've closed this PR. I'm unsure what should happen with the bpo. Feels a bit bad to just scratch my own itch and then drop the ball, but this rabbit hole goes a bit deeper than I initially judged. Thanks again for your time and patience.

# exhaust the generator before opening for writing
objs = list(objs)
out = options.outfile.open('w', encoding='utf-8')
with out as outfile:
for obj in objs:
Expand Down
20 changes: 17 additions & 3 deletions Lib/test/support/script_helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
# e.g. test_cmd_line, test_cmd_line_script and test_runpy

import collections
import contextlib
import importlib
import sys
import os
Expand Down Expand Up @@ -196,16 +197,29 @@ def spawn_python(*args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, **kw):

def kill_python(p):
"""Run the given Popen process until completion and return stdout."""
p.stdin.close()
data = p.stdout.read()
p.stdout.close()
data = None
if p.stdin:
p.stdin.close()
if p.stdout:
data = p.stdout.read()
p.stdout.close()
# try to cleanup the child so we don't appear to leak when running
# with regrtest -R.
p.wait()
subprocess._cleanup()
return data


@contextlib.contextmanager
def killing(process):
"""Context manager that kills and cleans up the given Popen on exit."""
try:
yield process
finally:
process.kill()
kill_python(process)


def make_script(script_dir, script_basename, source, omit_suffix=False):
script_filename = script_basename
if not omit_suffix:
Expand Down
36 changes: 35 additions & 1 deletion Lib/test/test_json/test_tool.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@

from test import support
from test.support import os_helper
from test.support.script_helper import assert_python_ok
from test.support.script_helper import assert_python_ok, killing


class TestTool(unittest.TestCase):
Expand Down Expand Up @@ -146,6 +146,40 @@ def test_jsonlines(self):
self.assertEqual(process.stdout, self.jsonlines_expect)
self.assertEqual(process.stderr, '')

def test_writing_jsonlines_in_place(self):
infile = self._create_infile(data=self.jsonlines_raw)
rc, out, err = assert_python_ok('-m', 'json.tool', '--json-lines', infile, infile)
with open(infile, "r", encoding="utf-8") as fp:
self.assertEqual(fp.read(), self.jsonlines_expect)
self.assertEqual(rc, 0)
self.assertEqual(out, b'')
self.assertEqual(err, b'')

def test_jsonlines_from_a_stream(self):
stream_code = textwrap.dedent("""
# print a stream of JSON integers
i = 0
while True:
print(i)
i += 1
""")
stream = subprocess.Popen((sys.executable, '-u', '-c', stream_code),
stdout=subprocess.PIPE,
bufsize=0,
)
args = sys.executable, '-u', '-m', 'json.tool', '--json-lines'
with killing(stream), \
self.assertRaises(subprocess.TimeoutExpired) as timeout:
subprocess.run(args, stdin=stream.stdout,
capture_output=True, timeout=1,
bufsize=0)

self.assertIsNotNone(timeout.exception.stdout)
self.assertEqual(
timeout.exception.stdout.splitlines()[:3],
[b'0', b'1', b'2']
)

def test_help_flag(self):
rc, out, err = assert_python_ok('-m', 'json.tool', '-h')
self.assertEqual(rc, 0)
Expand Down