Skip to content

more: constant memory initialization overhead #7765

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 19, 2025

Conversation

aaron-ang
Copy link
Contributor

@aaron-ang aaron-ang commented Apr 16, 2025

close #6397

enum InputType {
	File(BufReader<File>),
	Stdin(Stdin),
}

This PR improves uu_more by achieving constant memory overhead during initialization. Previously (#7680), we read the entire file once and store byte positions. We now do so incrementally using the following techniques:

  1. We handle files and pipes together using the InputType enum shown above. Stdin already implements BufRead and we need the underlying File type to get the file size for displaying the status.
  2. When initializing the Pager object, we scan the only up to the start line (options.from_line). We store the lines read in lines, and the byte positions in cumulative_line_sizes to display the file navigation status.
  3. If pattern matching is required, we scan from the initialized start line to find first line containing the pattern. Note that from_line takes precedence over pattern which is in line with more. This means that if the specified pattern only exists before from_line, it will not be found.
  4. The file will be read on-demand without seeking and the newly read lines are appended toself.lines. This means we scan the entire file at most once.
  5. When drawing lines, we check that self.upper_mark + self.content_rows have been read into memory. Then, we iterate over the relevant indexes in self.lines to print.
  6. Moved some tests from test_more.rs to more.rs since we cannot easily extract stdout from crossterm's AlternateScreen.
  7. Additional UI changes to match less.

@aaron-ang
Copy link
Contributor Author

waiting for #7680 to be merged

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@@ -27,7 +27,10 @@ use uucore::{format_usage, help_about, help_usage};

const ABOUT: &str = help_about!("more.md");
const USAGE: &str = help_usage!("more.md");
const BELL: &str = "\x07";
const BELL: char = '\x07';
const MULTI_FILE_TOP_PROMPT: &str = "\r::::::::::::::\n\r{}\n\r::::::::::::::\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a bit cryptic, could you please add a comment for this? thanks

Copy link
Contributor Author

@aaron-ang aaron-ang Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do @sylvestre. Please note that this PR is still WIP and not ready for full review. I'm awaiting my previous PR to be merged. Once that is done, I can better compare changes to main.

I will also do a writeup on the changes made, benchmarking, and add comments in the code.

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@aaron-ang
Copy link
Contributor Author

aaron-ang commented Apr 20, 2025

@tertsdiepraam @sylvestre here is how I plan to refactor more in broad strokes

  1. Create a new terminal screen (similar to less). On exit, restore previous screen. This can be easily done with crossterm::terminal::{EnterAlternateScreen, LeaveAlternateScreen}.
  2. Fixing options: I realized some are labelled wrongly.
  3. Instead of storing line byte offset, we incrementally read and store file data into lines. This means reverting back to the original implementation. We replace the line_count attribute with file byte size and an eof flag in the paging logic.

Would be great to hear your feedback!

Copy link

GNU testsuite comparison:

GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?

@aaron-ang
Copy link
Contributor Author

aaron-ang commented Apr 21, 2025

Benchmark

< /dev/urandom base64 | fold -w 120 | head -c 100M > bigfile
< /dev/urandom base64 | fold -w 120 | head -c 1M > smallfile
cargo build --no-default-features --features more --release
# file
/usr/bin/time ./target/release/coreutils more bigfile 
/usr/bin/time ./target/release/coreutils more smallfile
# pipe
/usr/bin/time bash -c 'cat bigfile | ./target/release/coreutils more'
/usr/bin/time bash -c 'cat smallfile | ./target/release/coreutils more'

# in `main`
# bigfile: 0.37user 0.11system 0:01.81elapsed 26%CPU (0avgtext+0avgdata 126796maxresident)k 0inputs+0outputs (0major+31065minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.21elapsed 0%CPU (0avgtext+0avgdata 4540maxresident)k 0inputs+0outputs (0major+452minor)pagefaults 0swaps
# bigfile(pipe): 0.37user 0.15system 0:01.97elapsed 26%CPU (0avgtext+0avgdata 126832maxresident)k 0inputs+0outputs (0major+31430minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.01system 0:01.63elapsed 1%CPU (0avgtext+0avgdata 4344maxresident)k 0inputs+0outputs (0major+812minor)pagefaults 0swaps

# in `more-mem-constant`
# bigfile: 0.00user 0.00system 0:01.04elapsed 0%CPU (0avgtext+0avgdata 3084maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.05elapsed 0%CPU (0avgtext+0avgdata 3224maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 3528maxresident)k 0inputs+0outputs (0major+488minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.13elapsed 0%CPU (0avgtext+0avgdata 3592maxresident)k 0inputs+0outputs (0major+491minor)pagefaults 0swaps

# system
# bigfile: 0.00user 0.00system 0:00.95elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.11elapsed 0%CPU (0avgtext+0avgdata 3584maxresident)k 0inputs+0outputs (0major+494minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.10elapsed 0%CPU (0avgtext+0avgdata 3524maxresident)k 0inputs+0outputs (0major+492minor)pagefaults 0swaps

Greatly reduced time to first paint and memory usage (41x) for large files. Verified that the overhead is constant with respect to input size.

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@aaron-ang aaron-ang marked this pull request as ready for review April 22, 2025 00:40
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/usage_vs_getopt (fails in this run but passes in the 'main' branch)

Comment on lines -108 to 103
// Disable raw mode before exiting if a panic occurs
set_hook(Box::new(|panic_info| {
terminal::disable_raw_mode().unwrap();
print!("\r");
println!("{panic_info}");
}));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic for modifying the terminal has been moved from uumain to more.

Copy link

GNU testsuite comparison:

GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@aaron-ang aaron-ang requested a review from sylvestre April 22, 2025 03:44
@aaron-ang
Copy link
Contributor Author

@tertsdiepraam could you please review this PR?

Copy link

github-actions bot commented May 3, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@RenjiSann
Copy link
Collaborator

Also, can you please squash the multiple commits in just one or two implementation steps ?

@aaron-ang
Copy link
Contributor Author

Also, can you please squash the multiple commits in just one or two implementation steps ?

Could the PR be squashed into one commit during merge?

@aaron-ang aaron-ang force-pushed the more-mem-constant branch from dd0da28 to 81ca89a Compare May 3, 2025 20:19
Copy link

github-actions bot commented May 3, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@aaron-ang aaron-ang force-pushed the more-mem-constant branch from 81ca89a to 3c91ece Compare May 4, 2025 09:12
Copy link

github-actions bot commented May 4, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@aaron-ang aaron-ang force-pushed the more-mem-constant branch from 3c91ece to 37a94e1 Compare May 4, 2025 19:27
Copy link

github-actions bot commented May 4, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@aaron-ang
Copy link
Contributor Author

I am facing some issues with running tests locally even though they pass in the test runner. Please give me some more time to fix it.

@aaron-ang aaron-ang force-pushed the more-mem-constant branch from 37a94e1 to bfcfb04 Compare May 5, 2025 00:11
Copy link

github-actions bot commented May 5, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@aaron-ang aaron-ang force-pushed the more-mem-constant branch 3 times, most recently from a9d4a8f to d4bc6b3 Compare May 6, 2025 00:40
@aaron-ang
Copy link
Contributor Author

aaron-ang commented May 6, 2025

I have "fixed" the tests by migrating them to more.rs since we cannot easily extract stdout from crossterm's AlternateScreen. I also reran the benchmark with additional flags in the build: --no-default-features --features more, and we observe an astonishing 41x improvement in memory usage for the 100MB file 😂.

Comment on lines +286 to +289
enum OutputType {
Tty(Stdout),
Pipe(Box<dyn Write>),
#[cfg(test)]
Test(Vec<u8>),
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum is used to distinguish between stdout types and provide a way for tests to verify output.

Copy link

github-actions bot commented May 6, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/usage_vs_getopt (passes in this run but fails in the 'main' branch)

@hz2
Copy link
Contributor

hz2 commented May 6, 2025

I also reran the benchmark with additional flags in the build: --no-default-features --features more, and we observe an astonishing 41x improvement in memory usage for the 100MB file 😂.

Do you have the results of re-running the benchmark with your latest changes?

@aaron-ang
Copy link
Contributor Author

I also reran the benchmark with additional flags in the build: --no-default-features --features more, and we observe an astonishing 41x improvement in memory usage for the 100MB file 😂.

Do you have the results of re-running the benchmark with your latest changes?

yes, I've updated it in the previous comment here: #7765 (comment)

@aaron-ang aaron-ang requested a review from RenjiSann May 8, 2025 19:40
@RenjiSann
Copy link
Collaborator

I've just finished reviewing. I did not see any code smell, it's very readable imo and

However, this is a big refactor, and I have a pretty low confidence level in my own competence to review this.
@sylvestre can you take a look at it (again) ?

Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this looks really good. Thanks! You've crammed a lot into this PR (and basically 1 commit), which makes it hard to review. I've got some comments, but we can probably merge this after those have been fixed. Next time, please split it up over multiple PRs.

Comment on lines +77 to +76
(Some(n), _) | (None, Some(n)) if n > 0 => Some(n + 1),
_ => None, // Use terminal height
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next time, please separate cleanup from actual changes. That'll make it much easier to review.

print!("\r");
stdout.flush().unwrap();
#[cfg(test)]
impl Deref for OutputType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to involved for deref, I'd rather have a unwrap_test method or something more explicit like that. Only implementing this for cfg(test) also makes me a bit worried that the behaviour might change inside and outside of tests.

Copy link
Contributor Author

@aaron-ang aaron-ang May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this into mod tests so it should not change the original implementation. Besides, the unreachable macro should catch errors if not used appropriately.

}

#[cfg(target_os = "fuchsia")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you remove this intentionally? Shouldn't we keep it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I've added it back.

Comment on lines 416 to 421
lines: Vec<String>, // Lines read from the input
cumulative_line_sizes: Vec<u64>, // Cumulative byte sizes of lines
upper_mark: usize, // Current line at the top of the screen
content_rows: usize, // Number of rows that fit on the screen
eof_reached: bool,
lines_squeezed: usize, // Number of lines squeezed out in the current view
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments should probably be docstrings.

Comment on lines +615 to +605
reset_term()?;
std::process::exit(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for another PR, but it might be nice to never exit explicitly, so that we always reset via the drop guard.

/// Process user input events until exit
fn process_events(&mut self, options: &Options) -> UResult<()> {
loop {
if !event::poll(Duration::from_millis(100))? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change the number of millis?

Copy link
Contributor Author

@aaron-ang aaron-ang May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original 10ms polling rate seems excessive for manual navigation. Wdyt?

@@ -342,7 +342,6 @@ thiserror = "2.0.3"
time = { version = "0.3.36" }
unicode-segmentation = "1.11.0"
unicode-width = "0.2.0"
utf-8 = "0.7.6"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this already unused? I can't find where you removed the usage of it.

Copy link
Contributor Author

@aaron-ang aaron-ang May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I added the tempfile crate into uumore's Cargo file using cargo add, it automatically removed the utf-8 crate (along with selinux-sys) from the main Cargo file. I assumed it was cleaning up unused crates, but I can add it back if needed.

Comment on lines 735 to 736
// Ensure we have enough lines loaded for display
self.read_until_line(self.upper_mark + self.content_rows)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this take the squeezed lines into account?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it does not. That is a good point. I will move this into the loop

// Display lines until we've filled the screen
let mut lines_printed = 0;
let mut index = self.upper_mark;
while lines_printed < self.content_rows && index < self.lines.len() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could be a for loop over the index.

Copy link
Contributor Author

@aaron-ang aaron-ang May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main loop condition is lines_printed < self.content_rows. We may not always read until self.lines.len(). The index variable is used after the loop to check for EOF.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored the loop; hope it is clearer now.

@aaron-ang aaron-ang force-pushed the more-mem-constant branch from d4bc6b3 to 7f5e2b9 Compare May 13, 2025 00:09
@sylvestre sylvestre force-pushed the more-mem-constant branch from 7f5e2b9 to c8c4f52 Compare May 18, 2025 19:09
@sylvestre sylvestre requested a review from tertsdiepraam May 18, 2025 19:35
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@sylvestre sylvestre merged commit 9fbb6ac into uutils:main May 19, 2025
65 of 70 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

excessive memory usage in more
5 participants