-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
pinky: fix off-by-one in GECOS parsing #3155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Could you please add a test to make sure we don't regress in the future? |
Sorry, I should have made my post clearer. This line is tested, in the |
yeah, please try to run the tests too :) |
74bf71a
to
d2aee2f
Compare
Okay, so I think I figured out the problem with the tests. They are actually being run in the CI, but weren't really doing much because 1. most of the tests that specify an account were specifying the root acount, which has no GECOS fields, and more importantly, 2. the environment is running the commands from an account without logging into it, so there is no utmp data. Pinky can't run most of its code if there are no users in the utmp file, and needs something in the GECOS fields to print something meaningful in that part of the output. I added the default GH Action user to one of the tests, and modified the linux portions of the CI to hack around the two above issues:
This caught a similar bug in another part of the code, I've folded the fix into the first commit. The coverage should be pretty good now, but a network blip prevented the codecov action from running. Someone with access should restart it before merging, to make sure it actually worked (or I'll rebase if I notice new commits on main first). |
@@ -595,17 +595,6 @@ jobs: | |||
*-pc-windows-msvc) STRIP="" ;; | |||
esac; | |||
outputs STRIP | |||
- name: Install/setup prerequisites |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's duplicated a few lines below, where I modify it with the additional lines. I believe 39a6e6c was supposed to move it but instead copied it. If I don't remove it, I'd need to choose which to modify, or modify both, in any case becoming a likely spot for bit rot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mistakenly left in... deletion should be fine.
Modify the second copy.
.github/workflows/CICD.yml
Outdated
@@ -624,6 +613,13 @@ jobs: | |||
case '${{ matrix.job.os }}' in | |||
macos-latest) brew install coreutils ;; # needed for testing | |||
esac | |||
case '${{ matrix.job.os }}' in | |||
*linux*) | |||
sudo sed -i 's/:,/:runner name,/' /etc/passwd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a comment explaining why you are doing that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
d2aee2f
to
8ef5838
Compare
Oh I should also mention: I have a workflow run that clearly shows what these hacks are doing, and why they're necessary. (N.b. the commands there are using the built-in, GNU pinky.) |
.github/workflows/CICD.yml
Outdated
case '${{ matrix.job.os }}' in | ||
*linux*) | ||
sudo sed -i 's/:,/:runner name,/' /etc/passwd # add a full name each account with a gecos field but no full name | ||
FAKE_UTMP='[7] [999999] [tty2] [runner] [tty2] [] [0.0.0.0] [2022-02-22T22:22:22,222222+00:00]' | ||
(utmpdump /var/run/utmp ; echo $FAKE_UTMP) | sudo utmpdump -r -o /var/run/utmp # create a fake login entry for the GH runner | ||
;; # helps pinky tests | ||
esac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting some understanding from comments and variable naming, but this is a little opaque to me.
Could you document a little more "why" for this block (locally), expanding on "helps pinky tests"? Maybe just a couple of explanatory lines?
I think it'll help future me (and maybe future you) understand why it helps and if it can be and how it should be modified in the future.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the comment, how does it look now?
Hmm, 🤔; we're testing "--features feat_os_unix" which is a direct feature alias in Cargo.toml and therefore should be synonymous with "--features unix"...
Have you seen differences between the use of the two features? |
8ef5838
to
42b7c68
Compare
@rivy Yeah I was mistaken about the tests not being run in the CI, they are, just in an environment weird enough that pinky was basically untested. When you run in an actual, full Unix-like system, at least one user is going to be logged in, and typically have at least one gecos field (indeed, some of the "users" in a The workflow run I mentioned is the clearest place to see what's going on, but the tl;dr is
to
where |
So, just to be clear, GitHub is one of the "limited" *nix environments that doesn't have enough info being used/logged to actually test If that's correct, just add a quick note to that effect in a leading comment.
|
Sorry, I didn't mean to say that the VMs are in some essential way lesser *nix environments, it's more that they're configured to run in unusual ways. I don't know how the commands are run, but it's probably something like launching from the root init with sudo; there's just nobody logged in, and the runner account (the account that runs the action commands) doesn't have any of the info we need to fully test -- info that would almost always get added when creating a new login user. How does this work for a comment: case '${{ matrix.job.os }}' in
*linux*)
# pinky is a tool to show logged-in users from utmp, and gecos fields from /etc/passwd.
# In GitHub Action *nix VMs, no accounts log in, even the "runner" account that runs the commands. The account also has empty gecos fields.
# To work around this for pinky tests, we create a fake login entry for the GH runner account...
FAKE_UTMP='[7] [999999] [tty2] [runner] [tty2] [] [0.0.0.0] [2022-02-22T22:22:22,222222+00:00]'
# ... by dumping the login records, adding our fake line, then reverse dumping ...
(utmpdump /var/run/utmp ; echo $FAKE_UTMP) | sudo utmpdump -r -o /var/run/utmp
# ... and add a full name to each account with a gecos field but no full name.
sudo sed -i 's/:,/:runner name,/' /etc/passwd
;;
esac |
42b7c68
to
0339a43
Compare
d814494
to
8485e70
Compare
b62efb6
to
98d7b86
Compare
Some more good/bad news: I found out why the extra runner steps weren't working (bad regex for matching the os), and now that it's fixed, it seems to be catching more bugs in |
98d7b86
to
c6e95c9
Compare
I fixed the additional pinky bug (the code to read files was filling a buffer with null values that were being invisibly printed), and I believe identified the cause of the bugs in users and who (#3219). For now, I've disabled those tests, since fixing them seems out of scope for this PR. |
c6e95c9
to
fea0efe
Compare
fea0efe
to
687dcae
Compare
I am not a bit fan of ignoring that many tests :/ |
Hmm, what would you like to see then? The tests are good tests, my tweaks to the environment uncovered a bug they were finding, but that bug is unrelated to the original point of this PR (fixing the off-by-one error in GECOS parsing). Ignoring the tests annotates them as "we know there's a bug here", leaving them as-is gives the impression that they're testing behavior and passing when in reality, they're just being run in an environment that doesn't exercise relevant code paths. We could:
For some additional context, the reason I'm filing this PR is because it's blocking another, larger PR I plan to file that fixes the broken Android builds. Among other problems, there is a memory safety bug that my testing on Android uncovered: a syscall android builds -> So I'm fine with any of the above three options (or others that I'm perhaps missing), but I'd like to fix the memory safety bug sooner than later, and if we go for the latter two options, the fix won't be fully tested -- perhaps a tradeoff worth making, I just wanted to let you know where this is going. |
@tertsdiepraam wdyt? |
I agree with @jtracey that we should ignore the tests for now. Hopefully, we'll be able to enable them again soon. |
This lets the test pass when running
cargo test --features unix
on my system, a Xubuntu 21.10 install. Without this, the output includes the trailing comma. I am a bit cautious though, since it seems unusual that nobody noticed this in the 6 years since this code was added -- is it possible there's something system-specific going on? Also, all the CI tests pass before and after this change, does the CI not run the Unix-specific tests?