Skip to content

rm: r4_gnu_test #6621

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 3, 2024
Merged

rm: r4_gnu_test #6621

merged 8 commits into from
Dec 3, 2024

Conversation

AnirbanHalder654322
Copy link
Contributor

Added regex detection to various combination of ".." and "." followed by slashes .

@AnirbanHalder654322 AnirbanHalder654322 changed the title rm: rm_4_gnu_test rm: r4_gnu_test Aug 7, 2024
Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

1 similar comment
Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@just-an-engineer
Copy link
Contributor

Why not make it work on Windows as well?

@just-an-engineer
Copy link
Contributor

#6620

Copy link
Contributor

@sylvestre sylvestre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs some updates

@AnirbanHalder654322
Copy link
Contributor Author

needs some updates

Sorry for the lack of communication, had some assignment deadlines to fulfill. I will push out updates today.

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@AnirbanHalder654322 AnirbanHalder654322 force-pushed the rm_4 branch 2 times, most recently from 3e4b765 to 8164ea8 Compare August 14, 2024 09:20
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/rm/r-4 is no longer failing!

@AnirbanHalder654322
Copy link
Contributor Author

@sylvestre This should be ready for a review now

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@just-an-engineer
Copy link
Contributor

just-an-engineer commented Aug 14, 2024

Is the utf-8 crate from Cargo.toml used? I could just be missing it. If not, can you remove it?

Also, for your rm.rs, lines 584 & 585, I imagine we only want to check for /. or /.. exactly. I might be wrong here, but I can do touch hi., making a file that ends in either . or .., in which case your code would currently classify it as the parent or current dir.
Also also, on lines 586 and 587, wouldn't you want it to be format!("{}.", ...) and similar to detect that?

I do like how you remove trailing slashes, but would you be able to propagate the new cleaned_path var on line 334 in rm.rs to the rest of that function (handle_dir)? I assume that would be nice to use for the rest since you've already processed it a bit. Also, would something like str.replace('//', '/') (and the related slash for Windows) work instead? (I'm just throwing out stuff, so if I'm wrong, then feel free to say so). I assume that will collapse double slashes both at the start, end, and middle. I assume the performance would be similar, but it might be more concise. But it looks like you just take a slice of the original string, so I assume yours doesn't work on multiple slashes in the middle of a path.

Lastly, for your tests, I think you could save yourself 30ish lines and just use the conditional variable assignments for *nix/Windows (/ vs \\) and just use a bunch of format all over the place. Although maybe it may look too cluttered, so perhaps what you have here is better than that method. Idk, @sylvestre, any thoughts on that? One worry would be drift if someone updates a test but forgets to do the other, but they're smallish and close to each other.

EDIT: I realize your function is for trailing slashes, not necessarily double. You could always turn it into a more general cleaning function with the replace that I mentioned, and remove trailing, but I just misread it, my apologies. But instead of a loop at line 633, wouldn't you be able to use a built-in method for the path to get the first index and use that directly in the path slice?

@AnirbanHalder654322
Copy link
Contributor Author

Is the utf-8 crate from Cargo.toml used? I could just be missing it. If not, can you remove it?

I believe my PR does remove it.

Also, for your rm.rs, lines 584 & 585, I imagine we only want to check for /. or /.. exactly. I might be wrong here, but I can do touch hi., making a file that ends in either . or .., in which case your code would currently classify it as the parent or current dir. Also also, on lines 586 and 587, wouldn't you want it to be format!("{}.", ...) and similar to detect that?

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

There is no point in doing format!("{}.") as ends_with('.') would cover the point of the condition anyway. And we would not replace the former with the latter since format!("{}.") wouldn't be true when you do rm -rf . or rm -rf ....

Lastly, for your tests, I think you could save yourself 30ish lines and just use the conditional variable assignments for *nix/Windows (/ vs \\) and just use a bunch of format all over the place. Although maybe it may look too cluttered, so perhaps what you have here is better than that method. Idk, @sylvestre, any thoughts on that? One worry would be drift if someone updates a test but forgets to do the other, but they're smallish and close to each other.

We usually just add new tests instead of changing old tests unless it asserts wrong behavior and therefore fail, the test would be caught in the CI if people forget to change anyway.

EDIT: I realize your function is for trailing slashes, not necessarily double. You could always turn it into a more general cleaning function with the replace that I mentioned, and remove trailing, but I just misread it, my apologies. But instead of a loop at line 633, wouldn't you be able to use a built-in method for the path to get the first index and use that directly in the path slice?

We only care about cleaning the end of the string from slashes to print to stderr. There is no point cleaning the middle, doing a stat() call on something like ..//../ will refer to the directory at ../../ anyway. We are only cleaning to be compatible with gnu.

Actually I was under the idea that converting windows utf-16 OsStr to rust str type might fail that's why the I was very hesitant to use strings.

@just-an-engineer
Copy link
Contributor

just-an-engineer commented Aug 14, 2024

I believe my PR does remove it.

My apologies, I just saw it an my mind went to add. But yeah, I see you're actually removing it

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

Yeeaah, I remember that now. However, on Linux, I can also make a directory that's named test. or test... Would you be able to add some tests to make sure a valid directory that ends with . or .. don't get improperly ignored? Because if it's the case that a real directory that ends with . or .. makes it more necessary to change how you check it. Because I agree if we didn't care about that, the ends_with function from your reply:

There is no point in doing format!("{}.") as ends_with('.') would cover the point of the condition anyway. And we would not replace the former with the latter since format!("{}.") wouldn't be true when you do rm -rf . or rm -rf ....

Would work just as well. But I think we need to go through the extra hassle of ensuring that's the 'entire' name of the program. There might be other ways to do that, like see if it ends with . and is of length 1 (outside of the rest of the path), but it may be simpler to do the format and keep most of the structure of what you have

We usually just add new tests instead of changing old tests unless it asserts wrong behavior and therefore fail, the test would be caught in the CI if people forget to change anyway.

Yeah, that's a good point. It may not cover the case of someone making a test stricter and forgetting the other one, but I think it's valid enough. And like I said, the test looks clean enough, doing a bunch of formats may make it look not so pretty.

Actually I was under the idea that converting windows utf-16 OsStr to rust str type might fail that's why the I was very hesitant to use strings.

I have no clue about that, so I'll just say that sounds good if so and leave it to you and Sylvestre.

@AnirbanHalder654322
Copy link
Contributor Author

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

Yeeaah, I remember that now. However, on Linux, I can also make a directory that's named test. or test... Would you be able to add some tests to make sure a valid directory that ends with . or .. don't get improperly ignored? Because if it's the case that a real directory that ends with . or .. makes it more necessary to change how you check it. Because I agree if we didn't care about that, the ends_with function from your reply:

Good catch, i will push a fix for that.

@AnirbanHalder654322
Copy link
Contributor Author

It seems like there is some niche behavior with GNU which does seem like a bug.

rm -rf ...
rm: refusing to remove '.' or '..' directory: skipping '../..'
rm -rf ....
rm: refusing to remove '.' or '..' directory: skipping '../../..' 

It seems like gnu converts the path into ../.. , seems they are processing it as overlapping instances of .. instead of disjoint instances of .. . Basically turning the path into the parent of parent directory in the case of rm -rf ... , parsing it as .. and .. which the 2nd . is overlapping. And same for .... , it essentially referring to k = (n!) / (2!*( n-2)!) , kth parent directory above the cwd in directory tree.

Linux allows both files and directories to be named ... or .... or combinations of ...... barring .. and . which refers to current dir and parent dir respectively. It feels a bit wrong to arbitrarily err on valid file names, which can be trivially created.

// Start up calls 

getrandom("\xd5\x1d\xd5\x6f\xc8\xd3\xcd\xe7", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x5d157286c000
brk(0x5d157288d000)                     = 0x5d157288d000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=15751120, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 15751120, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7abe77800000
close(3)                                = 0
ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "../..", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0 
// Seems like the path is converted in userspace and then a syscall is made to ``../..`` for ``rm -rf ...`` 

// User locale related syscalls 

 
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=613, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 613, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7abe78a82000
close(3)                                = 0
write(2, "rm: ", 4rm: )                     = 4
write(2, "refusing to remove '.' or '..' d"..., 58refusing to remove '.' or '..' directory: skipping '../..') = 58
write(2, "\n", 1
)                       = 1
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++

@sylvestre Need some advice on how to handle this . Do we just do the same and read the path as overlapping instances of .. ?

@just-an-engineer
Copy link
Contributor

What setup are you on? I just made files and directories with a bunch of '.'s, and couldn't replicate it. For GNU, I mean. I'm on Mint 20.1, x86

@AnirbanHalder654322
Copy link
Contributor Author

What setup are you on? I just made files and directories with a bunch of '.'s, and couldn't replicate it. For GNU, I mean. I'm on Mint 20.1, x86

Have you checked the hidden files to see if they exist ? Doing std::fs::File::create should work. Try doing ls -a after creating the file.

Copy link

github-actions bot commented Sep 8, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

1 similar comment
Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link
Collaborator

@BenWiederhake BenWiederhake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now touches --no-preserve-root, and I'm not yet convinced that the code is correct. Are you sure you want to handle this in the same PR?

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link
Collaborator

@BenWiederhake BenWiederhake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only one minor thing:

(The other two nitpicks are optional.)

I'm clicking on "Approve" so that you don't have to wait so long for a response again.

Cargo.toml Outdated
@@ -337,7 +337,6 @@ thiserror = "1.0.59"
time = { version = "0.3.36" }
unicode-segmentation = "1.11.0"
unicode-width = "0.1.12"
utf-8 = "0.7.6"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yet see why this PR needs to touch windows-sys and utf-8. There's probably a good reason that I just don't see. Could you fill me in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably removed it after running udeps. Looking back at it now, we need both of them for windows specific function and i definitely ignored it since the CI passed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenWiederhake I was wondering, would basically updating my master and then merging to my feature branch be okay to do ? I kinda need some help on how to rebase this, can't seem to figure it out.

@@ -611,3 +679,17 @@ fn is_symlink_dir(metadata: &Metadata) -> bool {
metadata.file_type().is_symlink()
&& ((metadata.file_attributes() & FILE_ATTRIBUTE_DIRECTORY) != 0)
}

mod tests {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use super::*;

This way, you don't have to use crate::foo_bar; everything in each test again and again.

@AnirbanHalder654322 AnirbanHalder654322 force-pushed the rm_4 branch 3 times, most recently from d254369 to 03e5316 Compare October 4, 2024 13:39
@AnirbanHalder654322 AnirbanHalder654322 force-pushed the rm_4 branch 2 times, most recently from 404f5ce to 80693c1 Compare October 4, 2024 14:25
Copy link
Collaborator

@BenWiederhake BenWiederhake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's merge this! (After rebasing and fixing conflicts.)

Sorry for waiting an entire month, I got side-tracked.

I still suggest doing use super::*; in the tests in rm.rs, to simplify imports in the submodule. However, that doesn't affect correctness.

Copy link

github-actions bot commented Dec 2, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copy link

github-actions bot commented Dec 3, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@sylvestre sylvestre merged commit a16630f into uutils:main Dec 3, 2024
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants