-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
more: constant memory initialization overhead #7765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
442ddb7
to
7649214
Compare
waiting for #7680 to be merged |
GNU testsuite comparison:
|
GNU testsuite comparison:
|
4f212a6
to
a981d4c
Compare
GNU testsuite comparison:
|
@@ -27,7 +27,10 @@ use uucore::{format_usage, help_about, help_usage}; | |||
|
|||
const ABOUT: &str = help_about!("more.md"); | |||
const USAGE: &str = help_usage!("more.md"); | |||
const BELL: &str = "\x07"; | |||
const BELL: char = '\x07'; | |||
const MULTI_FILE_TOP_PROMPT: &str = "\r::::::::::::::\n\r{}\n\r::::::::::::::\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a bit cryptic, could you please add a comment for this? thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do @sylvestre. Please note that this PR is still WIP and not ready for full review. I'm awaiting my previous PR to be merged. Once that is done, I can better compare changes to main.
I will also do a writeup on the changes made, benchmarking, and add comments in the code.
GNU testsuite comparison:
|
@tertsdiepraam @sylvestre here is how I plan to refactor
Would be great to hear your feedback! |
61ddf17
to
148d6a2
Compare
GNU testsuite comparison:
|
148d6a2
to
310d807
Compare
Benchmark < /dev/urandom base64 | fold -w 120 | head -c 100M > bigfile
< /dev/urandom base64 | fold -w 120 | head -c 1M > smallfile
cargo build --no-default-features --features more --release
# file
/usr/bin/time ./target/release/coreutils more bigfile
/usr/bin/time ./target/release/coreutils more smallfile
# pipe
/usr/bin/time bash -c 'cat bigfile | ./target/release/coreutils more'
/usr/bin/time bash -c 'cat smallfile | ./target/release/coreutils more'
# in `main`
# bigfile: 0.37user 0.11system 0:01.81elapsed 26%CPU (0avgtext+0avgdata 126796maxresident)k 0inputs+0outputs (0major+31065minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.21elapsed 0%CPU (0avgtext+0avgdata 4540maxresident)k 0inputs+0outputs (0major+452minor)pagefaults 0swaps
# bigfile(pipe): 0.37user 0.15system 0:01.97elapsed 26%CPU (0avgtext+0avgdata 126832maxresident)k 0inputs+0outputs (0major+31430minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.01system 0:01.63elapsed 1%CPU (0avgtext+0avgdata 4344maxresident)k 0inputs+0outputs (0major+812minor)pagefaults 0swaps
# in `more-mem-constant`
# bigfile: 0.00user 0.00system 0:01.04elapsed 0%CPU (0avgtext+0avgdata 3084maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.05elapsed 0%CPU (0avgtext+0avgdata 3224maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 3528maxresident)k 0inputs+0outputs (0major+488minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.13elapsed 0%CPU (0avgtext+0avgdata 3592maxresident)k 0inputs+0outputs (0major+491minor)pagefaults 0swaps
# system
# bigfile: 0.00user 0.00system 0:00.95elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.11elapsed 0%CPU (0avgtext+0avgdata 3584maxresident)k 0inputs+0outputs (0major+494minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.10elapsed 0%CPU (0avgtext+0avgdata 3524maxresident)k 0inputs+0outputs (0major+492minor)pagefaults 0swaps Greatly reduced time to first paint and memory usage (41x) for large files. Verified that the overhead is constant with respect to input size. |
GNU testsuite comparison:
|
310d807
to
686f4b3
Compare
GNU testsuite comparison:
|
GNU testsuite comparison:
|
347c086
to
022ed0f
Compare
GNU testsuite comparison:
|
022ed0f
to
f138ac5
Compare
// Disable raw mode before exiting if a panic occurs | ||
set_hook(Box::new(|panic_info| { | ||
terminal::disable_raw_mode().unwrap(); | ||
print!("\r"); | ||
println!("{panic_info}"); | ||
})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic for modifying the terminal has been moved from uumain
to more
.
GNU testsuite comparison:
|
@tertsdiepraam could you please review this PR? |
GNU testsuite comparison:
|
Also, can you please squash the multiple commits in just one or two implementation steps ? |
Could the PR be squashed into one commit during merge? |
It can be done, but if you still want to split the whole work into 2 distinct pieces of work, that should be done on your side |
dd0da28
to
81ca89a
Compare
GNU testsuite comparison:
|
src/uu/more/Cargo.toml
Outdated
@@ -23,6 +23,7 @@ uucore = { workspace = true } | |||
crossterm = { workspace = true } | |||
unicode-width = { workspace = true } | |||
unicode-segmentation = { workspace = true } | |||
tempfile = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it should be in dev-dependencies given it is only used for test, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed and rebased
81ca89a
to
3c91ece
Compare
GNU testsuite comparison:
|
3c91ece
to
37a94e1
Compare
GNU testsuite comparison:
|
I am facing some issues with running tests locally even though they pass in the test runner. Please give me some more time to fix it. |
37a94e1
to
bfcfb04
Compare
GNU testsuite comparison:
|
7bd00a5
to
a9d4a8f
Compare
a9d4a8f
to
d4bc6b3
Compare
I have "fixed" the tests by migrating them to |
enum OutputType { | ||
Tty(Stdout), | ||
Pipe(Box<dyn Write>), | ||
#[cfg(test)] | ||
Test(Vec<u8>), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enum is used to distinguish between stdout types and provide a way for tests to verify output.
GNU testsuite comparison:
|
close #6397
This PR improves
uu_more
by achieving constant memory overhead during initialization. Previously (#7680), we read the entire file once and store byte positions. We now do so incrementally using the following techniques:InputType
enum shown above.Stdin
already implementsBufRead
and we need the underlyingFile
type to get the file size for displaying the status.Pager
object, we scan the only up to the start line (options.from_line
). We store the lines read inlines
, and the byte positions incumulative_line_sizes
to display the file navigation status.from_line
takes precedence over pattern which is in line withmore
. This means that if the specified pattern only exists beforefrom_line
, it will not be found.self.lines
. This means we scan the entire file at most once.self.upper_mark + self.content_rows
have been read into memory. Then, we iterate over the relevant indexes inself.lines
to print.test_more.rs
tomore.rs
since we cannot easily extract stdout from crossterm'sAlternateScreen
.less
.