Skip to content

GH-120754: Add a strace helper and test set of syscalls for open().read() #121143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Aug 24, 2024

Conversation

cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Jun 29, 2024

General goal here is to be able to add tests that behavior isn't unintentionally changed around some common python code patterns, in particular Path("README.rst").read_text().

This started with the code that validated subprocess used vfork, building more general utilities from that core code. The work was motivated in #120755 for how to test that particular change.

TODO

cmaloney added 3 commits June 28, 2024 19:50
The test explicitly wants to check for `vfork` call _or_ flag passed to
clone so keep doing the regex search rather than looking at particular
events
Goal is that as changes to I/O code are made changes to the set of
system calls made need to be intentional so that common fast-paths
aren't accidentally changed.
@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Running on my dev machine, ArchLinux, get fstat while Ubuntu CI machine
finds newfstatat
@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 29, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@cmaloney cmaloney marked this pull request as ready for review June 29, 2024 09:10
@cmaloney cmaloney requested a review from gpshead as a code owner June 29, 2024 09:10
Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of comments, but please don't be discouraged. I like what this is trying to do overall.

# system using a variant, allow that
fstat = next((sc for sc in calls if 'fstat' in sc), 'fstat')

readall_calls = ['openat', fstat, 'ioctl', 'lseek', 'read',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly testing system calls against this list is too specific, this is just a recorded existing behavior of an existing implementation the insides of which are somewhat beyond our control (these come from an underlying libc kernel interface). we don't want to guarantee that these stay the same.

assertEqual against a strict list isn't what is desired, it will be a maintenance burden. Make this a less fragile, more specific, regression test for specific behaviors. Such as not having multiple fstat calls, or having at most two read calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a a potentially fragile point, and lots could lead to a test failure even though the cpython implementation is as desired (ex. cpython _io, cpython _pyio, glibc, and musl differ around Buffered I/O). There are a lot of system calls which could happen as part of the Python standard file opening process which I want to validate don't ideally. Just saying "one stat", "two reads" "no ioctl" doesn't quite feel like it could help catch these changes.

Motivating examples I would like to be able to write behavior tests around:

  1. CPython's open() builtin currently calls isatty to check if the file it just opened is "interactive" to determine buffered mode. If buffered, a BufferedIO is constructed which additionally wants to have an accurate position inside the file, and calls lseek, sometimes at __init__ (inside open), but sometimes when other I/O operations happen (read, write). The buffering behavior is visible, and documented, around stdout (https://docs.python.org/3/library/sys.html#sys.stdout) with a special environment variable to change it PYTHONUNBUFFERED. stderr has a different behavior than stdout. For this PR, I needed to do a flush=True in print() because the test python has its output redirected to a file which changes the behavior of the standard library making there be one write() system call at interpreter shutdown instead of calls when the print() statements are run. The isatty python check results in an ioctl system call (full stack at [0]).
  2. gh-121313: Limit the reading size from pipes to their default buffer size on Unix systems #121315 in the initial implementation added a fstat to every read call to test "is this a pipe" (108e65b). That would fairly substantially reduce read performance. If I just test for at most one fstat over lifetime, could catch. But what if it was an attribute which ioctl is used to get? The performance slowdown was happening because of lots large allocation + de-allocation (Limit the reading size from pipes to their default buffer size on Unix systems #121313), which isn't a FD system call.

In some ways this feels like it interacts a bit with "what is a supported platform" (https://peps.python.org/pep-0011/). I can defer for now and count specific system calls (1 fstat over life of file, no ioctls, etc.) but worry it leaves some substantial gaps. I'm really not sure what the right set of tradeoffs is for long term maintenance + platform porting and needing to update tests like this one.

[0]

  1. ioctl(TCGETS) (linux)
  2. tcgettattr (glibc)
  3. isatty (glibc)
  4. io.isatty (cpython)
  5. FileIO.isatty (cpython)
  6. open() (cpython)


It's expected as bits of the I/O implementation change, this will need
to change. The goal is to catch changes that unintentionally add
additional systemcalls (ex. additional fstat calls has been an issue).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include a link to the specific gh cpython issue where additional calls were a problem.

# Do NOT check that fork() or other clones did not happen.
# If the OS denys the vfork it'll fallback to plain fork().

# Test that each individual thing that would disable the use of vfork
# actually disables it.
for sub_name, preamble, sp_kwarg, expect_permission_error in (
("!use_vfork", "subprocess._USE_VFORK = False", "", False),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems wrong to remove this subTest as part of this PR which is focused on making the strace test helper concept generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, order of operations vs. pr builds finishing / passing. I suspect this PR will take longer than the _USE_VFORK removal, but will separate out if we get close to it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gh-121383 / _USE_VFORK removal has been merged

return unittest.skip("Linux only, requires strace.")
# Moderately expensive (spawns a subprocess), so share results when possible.
if _strace_working is None:
_can_strace()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the assignment to the global _strace_working into this function so that the is None test and code assigning it to something other than that all lives together in one scope. (have _can_strace() return a boolean to be assigned perhaps)

class StraceResult:
strace_returncode: int
python_returncode: int
_raw_events: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this gets used, so perhaps don't make it private. Add a comment describing what exactly it is for. for the most part it looks like it'll be the same as stderr.

strace_returncode: int
python_returncode: int
_raw_events: str
stdout: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use str for child process output. The process emits binary output and no codec is declared or guaranteed.

It is okay to decode names once parsed out for high level use in StraceEvent.syscall, but the process itself shouldn't be expected to produce output matching any encoding.

res.fail(cmd_line)

# Get out program returncode
decoded = res.err.decode().strip()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) Always specify a codec on decode. BUT... (2) Instead of decoding here, it'd be better to leave output as bytes. There will be processes that do not emit valid encoded data.

from dataclasses import dataclass
from test import support
from test.support.script_helper import run_python_until_end
from typing import Dict, List
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - modern typing does not need these two. use dict[] and list[] directly. (but read below, we aren't ready to accept non-trivial annotations in test.support)

match["returncode"]) for match in matches if match
]

def sections(self) -> Dict[str:List[StraceEvent]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a valid annotation. Use -> dict[str, list[StraceEvent]]... But it'd better to just not annotate this file at all. Mention what gets returned in doc strings instead.

BUT This is a good example of why we generally do not have type annotations anywhere within CPython. We are not setup to run multiple type checkers over the stdlib to validate them. You can see what little is configured in .github/workflows/mypy.yml - but do not try to use this PR to enable it for test.support. it isn't ready.

PEP 649 being implemented is the only reason this was not a Python syntax error.

)
# Match both vfork() and clone(..., flags=...|CLONE_VFORK|...)
self.assertRegex(vfork_result.err, br"(?i)vfork")
self.assertRegex(vfork_result._raw_events, r"(?i)vfork")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ._raw_events rather than .stderr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My theory was trying to encourage people to use the parsed results / .events for their test cases. It's useful to have the underlying un-parsed strace output (ex. vfork anywhere in args or function names should pass) but cases like #121315 mainly just care "these calls don't happen in this case", and for mine the overall set of calls are roughly what is expected.

@bedevere-app
Copy link

bedevere-app bot commented Jul 6, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@bedevere-app
Copy link

bedevere-app bot commented Jul 10, 2024

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from gpshead July 10, 2024 02:39
Copy link
Contributor

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the tests! I think gpshead's feedback is addressed?

(re #121143 (comment) , I think it's nice to have fairly specific tests. Maybe we could only run those if platform.libc_ver()[0] == glibc or something, personally I'm happy to risk a false positive until the test is proven fragile)

return [ev.syscall for ev in events]


def _can_strace():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i would just stick a functools.cache on this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to functools.cache, definitely simplified a bit. Initially had been worried about adding a lot more dependencies to the I/O tests, but also realized when things like read are broken, interpreter build breaks (can't load modules, etc.)

@cmaloney
Copy link
Contributor Author

Yea, I think I've addressed all of @gpshead's review comments, I haven't clicked 'resolved conversation' though to keep the review visible for context

@hauntsaninja hauntsaninja merged commit e38d0af into python:main Aug 24, 2024
34 checks passed
@hauntsaninja
Copy link
Contributor

Thank you again! (gpshead, if you come back to this and want some change, just let me know)

@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot ARM Raspbian 3.x has failed when building commit e38d0af.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/424/builds/7966) and take a look at the build logs.
  4. Check if the failure is related to this commit (e38d0af) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/424/builds/7966

Failed tests:

  • test_fileio

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/var/lib/buildbot/workers/3.x.gps-raspbian.nondebug/build/Lib/test/test_fileio.py", line 396, in check_readall
    self.assertEqual(count_similarname('fstat'), 1)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 != 1

hauntsaninja added a commit to hauntsaninja/cpython that referenced this pull request Aug 24, 2024
@hauntsaninja
Copy link
Contributor

Hmm, I'd need a few more logs to figure this out. Let's revert for now #123303 and fix in another PR? My bad, I should have thought to run the buildbots on this :-)

@cmaloney
Copy link
Contributor Author

cmaloney commented Aug 24, 2024

Sounds good to me. Think i have all the bits to try that raspbian locally, need to figure out what its syscall name for fstat looks like

hauntsaninja added a commit that referenced this pull request Aug 24, 2024
#123303)

Revert "GH-120754: Add a strace helper and test set of syscalls for open().read() (#121143)"

This reverts commit e38d0af.
@gpshead
Copy link
Member

gpshead commented Aug 24, 2024

Let me go strace on my Raspbian bot to see what it says. This type of test is also okay to just restrict to a known set of platforms if it comes down to that.

@gpshead
Copy link
Member

gpshead commented Aug 24, 2024

Okay, I changed the new assertEqual(count_similarname("fstat"), 1) lines assertEqual(count_similar("fstat"), 1, msg=f"set(syscalls)=") to get useful info in the error message. Here's a couple examples:

======================================================================
FAIL: test_syscalls_read (test.test_fileio.PyAutoFileTests.test_syscalls_read) (name='pathlib read_bytes')
Check that the set of system calls produced by the I/O stack is what
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pi/repro-pr-120754/cpython-e38d0afe3548b856ccf0b05c01ed3eefc69cb3e7/Lib/test/test_fileio.py", line 396, in check_readall
    self.assertEqual(count_similarname('fstat'), 1, msg=f"{set(syscalls)=}")
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 != 1 : set(syscalls)={'openat', 'read', 'close', 'statx'}

======================================================================
FAIL: test_syscalls_read (test.test_fileio.PyAutoFileTests.test_syscalls_read) (name='pathlib read_text')
Check that the set of system calls produced by the I/O stack is what
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pi/repro-pr-120754/cpython-e38d0afe3548b856ccf0b05c01ed3eefc69cb3e7/Lib/test/test_fileio.py", line 396, in check_readall
    self.assertEqual(count_similarname('fstat'), 1, msg=f"{set(syscalls)=}")
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 != 1 : set(syscalls)={'read', 'ioctl', 'openat', 'close', '_llseek', 'statx'}

----------------------------------------------------------------------

@gpshead
Copy link
Member

gpshead commented Aug 24, 2024

statx is indeed a valid Linux system call that could be used for the purpose.

@cmaloney cmaloney deleted the cmaloney/systrace_helper_wip branch August 25, 2024 04:18
@cmaloney cmaloney restored the cmaloney/systrace_helper_wip branch August 25, 2024 04:18
cmaloney added a commit to cmaloney/cpython that referenced this pull request Aug 28, 2024
NOTE: This needs a full buildbot test pass before merge, see: python#121143 (comment).

1. Added `statx` to set of allowed syscall forms (Should make Raspian bot pass).
2. Check that the `fd` returned from `open` is passed to all future calls. This helps ensure things like the `stat` call uses the file descriptor rather than the `filename` to avoid TOCTOU isuses.
3. Update the `Path().read_bytes()` test case to additionally validate the reduction in`isatty`/`ioctl` + `seek` calls from python#122111
4. Better diagnostic assertion messagess from @gpshead, so when the test fails have first information immediately available. Makes remote CI debugging much simpler.
@cmaloney cmaloney deleted the cmaloney/systrace_helper_wip branch November 2, 2024 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants