seq: Add a print_seq fast path function for integer and positive increments #7564

drinkcat · 2025-03-24T21:42:38Z

A lot of custom logic, we basically do arithmetic on character arrays, but this comes with huge performance gains.

Unlike coreutils seq, we do this for all positive increments (because why not), and we don't fall back to slow path if the last parameter is in scientific notation.

Fixes #7482

I'm not too sure what I think about this ,-) This is a lot of added code, but this comes with huge performance gains, around 17x in some cases (and also, it was a fun puzzle, and I learnt quite a few new Rust things ,-P), and we are now competitive with GNU seq:

$ cargo build -r -p uu_seq && taskset -c 0 hyperfine --warmup 3 -L seq target/release/seq,./seq-main,seq "{seq} 1000000"
Benchmark 1: target/release/seq 1000000
  Time (mean ± σ):      10.6 ms ±   1.1 ms    [User: 9.9 ms, System: 0.7 ms]
  Range (min … max):    10.0 ms …  22.4 ms    249 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: ./seq-main 1000000
  Time (mean ± σ):     165.7 ms ±   5.4 ms    [User: 163.5 ms, System: 0.8 ms]
  Range (min … max):   160.8 ms … 182.3 ms    18 runs
 
Benchmark 3: seq 1000000
  Time (mean ± σ):       9.3 ms ±   0.1 ms    [User: 9.0 ms, System: 0.5 ms]
  Range (min … max):     8.8 ms …   9.5 ms    275 runs
 
Summary
  seq 1000000 ran
    1.14 ± 0.12 times faster than target/release/seq 1000000
   17.83 ± 0.60 times faster than ./seq-main 1000000

We're less conservative than GNU, so this path also activates for ranges like 1 123 100000000 (the GNU manual says they do something special for increments below 200, but I suspect that value is lower) and 1e11, as long as the desired precision is still zero (i.e. integers).

Still a draft, probably want to add something to BENCHMARKING.md, a few more tests, and see if I can extract a bit more performance...

github-actions · 2025-03-24T22:17:13Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-04-03T16:45:55Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-04-03T17:31:55Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

drinkcat · 2025-04-03T17:32:41Z

Scraped off a few more percent:

cargo build -r -p uu_seq && taskset -c 0 hyperfine --warmup 3 -L seq target/release/seq,./seq-main,seq "{seq} 1000000"
Benchmark 1: target/release/seq 1000000
  Time (mean ± σ):       5.8 ms ±   0.0 ms    [User: 5.4 ms, System: 0.5 ms]
  Range (min … max):     5.7 ms …   6.1 ms    442 runs
  
Benchmark 2: ./seq-main 1000000
  Time (mean ± σ):      96.7 ms ±   2.5 ms    [User: 95.6 ms, System: 0.5 ms]
  Range (min … max):    95.6 ms … 110.0 ms    30 runs
  
Benchmark 3: seq 1000000
  Time (mean ± σ):       5.4 ms ±   0.0 ms    [User: 5.1 ms, System: 0.4 ms]
  Range (min … max):     5.3 ms …   5.6 ms    473 runs
  
Summary
  seq 1000000 ran
    1.08 ± 0.01 times faster than target/release/seq 1000000
   18.04 ± 0.48 times faster than ./seq-main 1000000

github-actions · 2025-04-03T18:08:32Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

drinkcat · 2025-04-03T19:01:22Z

(Also added support for constant width printing, it's easy to add)
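
For readers following along, the constant width idea (start from a zero-padded number, then increment as usual) can be sketched roughly as below. This is not the code from this PR: `add_fixed_width` and `seq_equal_width` are hypothetical names, the sketch only handles non-negative integers with a positive increment, and it assumes the padded width is the digit count of the longer of first and last.

```rust
/// Adds the ASCII decimal `inc` to the fixed-width ASCII decimal `digits`, in place.
/// Returns `true` if the sum no longer fits in `digits.len()` digits.
/// (Hypothetical helper, not the PR's function.)
fn add_fixed_width(digits: &mut [u8], inc: &[u8]) -> bool {
    let mut carry = 0u8;
    let mut inc_pos = inc.len();
    // Walk from the least significant digit to the most significant one.
    for d in digits.iter_mut().rev() {
        let mut sum = (*d - b'0') + carry;
        if inc_pos > 0 {
            inc_pos -= 1;
            sum += inc[inc_pos] - b'0';
        }
        carry = sum / 10;
        *d = b'0' + sum % 10;
    }
    // Overflow if a carry is left over, or if `inc` has unused non-zero digits.
    carry > 0 || inc[..inc_pos].iter().any(|&c| c != b'0')
}

/// Builds the zero-padded sequence first, first+inc, ... up to last.
/// Assumption: the width is the digit count of the longer of `first` and `last`.
fn seq_equal_width(first: u64, inc: u64, last: u64) -> Vec<String> {
    assert!(inc > 0, "this sketch only handles positive increments");
    let width = first.to_string().len().max(last.to_string().len());
    let mut value = format!("{first:0width$}").into_bytes();
    let last = format!("{last:0width$}").into_bytes();
    let inc = inc.to_string().into_bytes();
    let mut out = Vec::new();
    // Both buffers have the same width, so comparing bytes lexicographically
    // is the same as comparing the numbers.
    while value <= last {
        out.push(String::from_utf8(value.clone()).expect("ASCII digits"));
        if add_fixed_width(&mut value, &inc) {
            break; // the next value would need more digits than `last` has
        }
    }
    out
}

fn main() {
    for line in seq_equal_width(8, 1, 11) {
        println!("{line}");
    }
}
```

Because the starting value is already padded to the final width, the increment never has to grow the buffer, which is presumably why this was cheap to add on top of the fast path.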

github-actions · 2025-04-03T19:32:02Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

tertsdiepraam

Excellent! Just some small comments

tertsdiepraam · 2025-04-04T14:35:58Z

src/uu/seq/src/seq.rs

+    // Clippy wants to use `if let Some(...) = ...` to avoid is_some/unwrap combination, but that's
+    // not possible within an "if" test with multiple sub-expressions.
+    #[allow(clippy::unnecessary_unwrap)]
+    if fast_allowed && first_bui.is_some() && increment_u64.is_some() && last_bui.is_some() {


Maybe you could do

if fast_allowed {
    // Test if we can use fast code path.
    // First try to convert the range to BigUint (u64 for the increment).
    let (first_bui, increment_u64, last_bui) = (
        first.to_biguint(),
        increment.to_biguint().and_then(|x| x.to_u64()),
        last.to_biguint(),
    );
    if let (Some(first_bui), Some(increment_u64), Some(last_bui)) =
        (first_bui, increment_u64, last_bui)
    {
        ...
    }
}

Oh! That's better indeed. Thanks!
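
As a standalone illustration of the pattern suggested above (matching a tuple of `Option`s instead of combining `is_some()` with `unwrap()`), here is a small sketch that uses only the standard library. It is an assumption-laden stand-in, not the PR's code: `fast_path_args` is a hypothetical name, and plain `u128`/`u64` parsing replaces the `BigUint` conversions.

```rust
/// Returns the parsed range if every argument is a non-negative integer and
/// the fast path is allowed, otherwise `None`.
/// (Hypothetical helper; the PR converts to BigUint instead of parsing strings.)
fn fast_path_args(
    fast_allowed: bool,
    first: &str,
    increment: &str,
    last: &str,
) -> Option<(u128, u64, u128)> {
    if fast_allowed {
        // Try every conversion first, keeping the results as Options.
        let (first, increment, last) = (
            first.parse::<u128>().ok(),
            increment.parse::<u64>().ok(),
            last.parse::<u128>().ok(),
        );
        // A single tuple pattern binds all three values, so no unwrap is needed.
        if let (Some(first), Some(increment), Some(last)) = (first, increment, last) {
            return Some((first, increment, last));
        }
    }
    None
}

fn main() {
    match fast_path_args(true, "1", "2", "10") {
        Some((first, increment, last)) => println!("fast path: {first} {increment} {last}"),
        None => println!("slow path"),
    }
}
```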

tertsdiepraam · 2025-04-04T14:44:07Z

src/uu/seq/src/seq.rs

+// Fast code path increment function.
+// Add inc to the string val[start..end]. This operates on ASCII digits, assuming
+// val and inc are well formed.
+// Returns the new value for start.


You can make this a docstring, which makes it also show up in language servers and things like that.

Suggested change

-// Fast code path increment function.
-// Add inc to the string val[start..end]. This operates on ASCII digits, assuming
-// val and inc are well formed.
-// Returns the new value for start.
+/// Fast code path increment function.
+///
+/// Add inc to the string val[start..end]. This operates on ASCII digits, assuming
+/// val and inc are well formed.
+/// Returns the new value for start.

Done, thanks!
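
To make the doc comment above more concrete, here is a minimal sketch of an increment function with that contract: it adds `inc` to the digits in `val[start..end]` and returns the new start. It is an illustration written for this thread, not the function from the PR; the name `add_ascii` is hypothetical, and it assumes the caller has reserved enough bytes before `start` for the number to grow.

```rust
/// Adds the ASCII decimal `inc` to the ASCII decimal stored in `val[start..end]`,
/// in place, and returns the new start index (the number may grow to the left).
/// Assumes both inputs contain only ASCII digits and that there is spare room
/// before `start`; otherwise this sketch panics.
fn add_ascii(val: &mut [u8], mut start: usize, end: usize, inc: &[u8]) -> usize {
    let mut pos = end;
    let mut inc_pos = inc.len();
    let mut carry = 0u8;

    // Walk from the least significant digit towards the most significant one,
    // until both the increment and the carry are used up.
    while inc_pos > 0 || carry > 0 {
        if pos == start {
            // Ran out of digits: grow the number with a leading zero.
            start -= 1;
            val[start] = b'0';
        }
        pos -= 1;
        let mut digit = (val[pos] - b'0') + carry;
        if inc_pos > 0 {
            inc_pos -= 1;
            digit += inc[inc_pos] - b'0';
        }
        carry = digit / 10;
        val[pos] = b'0' + digit % 10;
    }
    start
}

fn main() {
    // Reserve a few bytes on the left so the number can grow.
    let mut buf = *b"    8";
    let end = buf.len();
    let mut start = end - 1;
    // Prints 8, 9, 10 and 11, one per line.
    for _ in 0..4 {
        println!("{}", std::str::from_utf8(&buf[start..end]).unwrap());
        start = add_ascii(&mut buf, start, end, b"1");
    }
}
```

For a small increment most calls touch only the last digit or two, so very little work is done per printed number compared with formatting each value from scratch.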

github-actions · 2025-04-04T16:47:19Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

…ements

A lot of custom logic, we basically do arithmetic on character arrays, but this comes with huge performance gains.

Unlike coreutils `seq`, we do this for all positive increments (because why not), and we do not fall back to slow path if the last parameter is in scientific notation.

Also, add some tests for empty separator, as that may catch some corner cases.

drinkcat · 2025-04-18T19:40:01Z

I feel a bit better about this, seeing that some of the extra code can be reused in #7782. We could directly merge that PR and drop this one, either way is fine.

github-actions · 2025-04-18T20:03:41Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

drinkcat · 2025-04-22T16:11:31Z

Merged as part of #7782!

drinkcat force-pushed the seq-perf branch from 0f07c21 to 0e3a2b0 Compare March 25, 2025 08:08

drinkcat force-pushed the seq-perf branch from 0e3a2b0 to 344630d Compare April 3, 2025 16:01

drinkcat force-pushed the seq-perf branch 2 times, most recently from 29bafc5 to 01816c7 Compare April 3, 2025 17:31

drinkcat force-pushed the seq-perf branch from 01816c7 to a4740ae Compare April 3, 2025 17:34

drinkcat force-pushed the seq-perf branch from df0fbc8 to 5095101 Compare April 3, 2025 18:49

drinkcat marked this pull request as ready for review April 3, 2025 18:50

drinkcat force-pushed the seq-perf branch from 5095101 to 9e05203 Compare April 3, 2025 18:57

drinkcat mentioned this pull request Apr 3, 2025

seq performance is very poor, compared with GNU seq, when passed positive integer values #7482

Open

drinkcat mentioned this pull request Apr 4, 2025

RFC: doc: extensions: Explain how printf/seq handle precision #7641

Open

tertsdiepraam reviewed Apr 4, 2025

drinkcat force-pushed the seq-perf branch from 9e05203 to 8700e73 Compare April 4, 2025 16:11

drinkcat added 3 commits April 18, 2025 20:20

seq: Add constant width support in fast path

0a17c7a

It is actually quite easy to implement, we just start with a padded number and increment as usual.

seq: Update doc for fast_inc

77ca582

drinkcat force-pushed the seq-perf branch from 8700e73 to 77ca582 Compare April 18, 2025 19:26

drinkcat mentioned this pull request Apr 18, 2025

Move seq's fast_inc to uucore, use it in cat #7782

Merged

drinkcat closed this Apr 22, 2025

drinkcat deleted the seq-perf branch April 22, 2025 16:11
