doc: extensions: Explain how printf/seq handle precision #7641

drinkcat · 2025-04-03T14:28:14Z

There are some differences in behaviour vs GNU coreutils; explain what those are.

RFC, I'd like to get opinions on this. This is basically how uutils will behave after #7631. I think it's better to just say we support arbitrary precision, rather than trying to downgrade accuracy to emulate 64/80/128-bit float behavior.

In particular, for seq, the GNU coreutils manual intentionally gives no guarantees about precision (https://www.gnu.org/software/coreutils/manual/coreutils.html#seq-invocation), so I feel it's OK to do better. The manual says:

> Be careful when using seq with outlandish values: otherwise you may see surprising results, as seq uses floating point internally. For example, on the x86 platform, where the internal representation uses a 64-bit fraction, the command:
>
> `seq 1 0.0000000000000000001 1.0000000000000000009`
>
> outputs 1.0000000000000000007 twice and skips 1.0000000000000000008.
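
To make the quoted example concrete, here is a minimal Python sketch (not the uutils implementation). It assumes Python's `float`, a 64-bit double, as a stand-in for hardware floating point; that is narrower than the x86 format the manual describes, so the terms collapse onto a single value instead of merely skipping one. `decimal.Decimal` stands in for an arbitrary-precision representation, and the function names are made up for illustration.

```python
from decimal import Decimal

FIRST = "1"
STEP = "0.0000000000000000001"


def terms_float(first, step, count):
    """First `count` terms of the sequence, computed with binary floats."""
    return [float(first) + k * float(step) for k in range(count)]


def terms_decimal(first, step, count):
    """Same terms, computed exactly with decimal arithmetic."""
    return [Decimal(first) + k * Decimal(step) for k in range(count)]


if __name__ == "__main__":
    # With 64-bit doubles the step is far below the spacing of floats near 1,
    # so every term compares equal; the decimal terms stay distinct.
    print(len(set(terms_float(FIRST, STEP, 10))))
    print(len(set(terms_decimal(FIRST, STEP, 10))))
```

The decimal version keeps every term distinct because the step is represented exactly rather than rounded to the nearest binary fraction.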

The printf documentation says nothing about precision, so I think providing arbitrary precision is also fair. Hexadecimal floating point is a bit more complicated, so it's worth mentioning which architecture-specific behavior we picked.
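
Part of what makes hexadecimal floating point complicated is that one value has several valid spellings, depending on how the leading digit is normalized. A small Python sketch of that ambiguity, using the standard `float.fromhex` and `float.hex` (Python's choice of spelling is only an illustration here, not a statement of what uutils or GNU print):

```python
# Three hexadecimal spellings of the same value, 0.5.
spellings = ["0x1p-1", "0x8p-4", "0x0.8p+0"]
values = [float.fromhex(s) for s in spellings]

# Python itself normalizes to a leading 1 and pads the fraction to the
# 13 hex digits of a 64-bit double.
canonical = (0.5).hex()

if __name__ == "__main__":
    print(values)
    print(canonical)
```

An implementation has to pick one normalization and one digit count, which is why the choice is worth documenting.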

(There are quite a few issues related to this: #7186, #5759, and #6244 at least.)

github-actions · 2025-04-03T15:05:45Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

drinkcat · 2025-04-03T19:08:35Z

@tertsdiepraam @RenjiSann @jfinkels @sylvestre FYI (trying to figure out who touched floating point code and might be interested ,-))

tertsdiepraam

I think it's a great idea to document these differences!

I think my position would be that these differences are acceptable, since they are correct and also any script that parses these numbers would need to support the formats that we print.

Additionally, I think that our behaviour is more portable, which is a nice bonus.

However, the arbitrary precision might be a performance problem? Have you checked that? If that is negligible, I feel this is acceptable.

tertsdiepraam · 2025-04-04T10:02:31Z

docs/src/extensions.md

@@ -97,3 +176,5 @@ Similar to the proc-ps implementation and unlike GNU/Coreutils, `uptime` provide
 ## `base32/base64/basenc`

 Just like on macOS, `base32/base64/basenc` provides `-D` to decode data.
+
+[^1] https://en.cppreference.com/w/c/io/fprintf


I don't think this is the best source for that. Generally, we should refer to GNU docs or the POSIX specification.

You might find the info on one of these pages:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html

https://pubs.opengroup.org/onlinepubs/9699919799/functions/strtod.html

Thanks! I think this has what we need: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05

(GNU docs reword the same thing, and I think C99 is also very similar in wording)

tertsdiepraam · 2025-04-04T10:02:54Z

docs/src/extensions.md

+format specified, etc.), so its output will be more correct than GNU coreutils for
+some inputs (e.g. small fractional increments where GNU coreutils uses `long double`).
+
+The only limitation is that the position of the decimal point is stored in a i64,


Suggested change

-The only limitation is that the position of the decimal point is stored in a i64,
+The only limitation is that the position of the decimal point is stored in a `i64`,

Fixed thanks.
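
The limitation under discussion can be illustrated with a rough Python sketch. It assumes a number is held as an arbitrary-size integer plus a decimal point position that must fit in a signed 64-bit integer; the class name `ScaledDecimal` and its fields are hypothetical and are not taken from the uutils code.

```python
from dataclasses import dataclass

I64_MIN, I64_MAX = -(2**63), 2**63 - 1


@dataclass(frozen=True)
class ScaledDecimal:
    """value = digits * 10**(-scale); a hypothetical layout for illustration."""

    digits: int  # arbitrary-size integer, so the digit count is unbounded
    scale: int   # position of the decimal point, limited to the i64 range

    def __post_init__(self):
        if not I64_MIN <= self.scale <= I64_MAX:
            raise OverflowError("decimal point position does not fit in an i64")

    def __add__(self, other):
        # Align both operands to the larger scale, then add the integers exactly.
        scale = max(self.scale, other.scale)
        a = self.digits * 10 ** (scale - self.scale)
        b = other.digits * 10 ** (scale - other.scale)
        return ScaledDecimal(a + b, scale)


if __name__ == "__main__":
    # 1 + 0.0000000000000000007, with no rounding
    print(ScaledDecimal(1, 0) + ScaledDecimal(7, 19))
```

Under this assumption the digits are never rounded; only numbers whose decimal point would sit outside the `i64` range cannot be represented.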

drinkcat · 2025-04-04T13:10:48Z

> I think it's a great idea to document these differences!
>
> I think my position would be that these differences are acceptable, since they are correct and also any script that parses these numbers would need to support the formats that we print.

Thanks! Yeah, I also suppose that anybody who uses seq with such small increments would not really rely on the exact values. Not that I'm too clear why you'd actually call GNU seq with such values in the first place, given how imprecise it is...

It's perhaps a bit more arguable with printf, e.g. if somebody relied on printf to understand precision losses with some not-really-well-specified floating point format... But that seems a bit... unlikely as well.

> Additionally, I think that our behaviour is more portable, which is a nice bonus.

Right. And in a way, it's almost acceptable that a coreutils->uutils update would cause subtle changes somewhat similar to an x86->arm update?

> However, the arbitrary precision might be a performance problem? Have you checked that? If that is negligible, I feel this is acceptable.

It's fine. For printf it obviously doesn't matter as we only deal with a few numbers at most.

For seq, our performance is in the same ballpark for floating-point values (maybe 20% faster). The GNU implementation has a fast path for positive integers; I could get within ~10% of it with this PR: #7564 (otherwise we're 15-20 times slower, but that's not only because of the added precision).

And now that I'm looking into #7475, it's interesting to see similar precision questions (GNU timeout rounds very small duration values to 0, which has a totally different meaning...)

github-actions · 2025-04-04T13:45:11Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

jfinkels · 2025-04-08T01:23:00Z

I agree, it seems sensible to me to maintain the arbitrary precision and document the differences. You've documented them clearly here!

drinkcat · 2025-05-25T19:41:07Z

Rebased. I know this is a wall of text, but... feedback still welcome ,-)

github-actions · 2025-05-25T20:23:34Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

sylvestre · 2025-05-25T20:53:16Z

thanks!

drinkcat mentioned this pull request Apr 3, 2025

seq: numerical precision issue causing difference from GNU output #7186

Open

tertsdiepraam reviewed Apr 4, 2025

drinkcat mentioned this pull request Apr 4, 2025

timeout: Duration parsing does not support all float formats (e.g. hex) #7475

Closed

drinkcat force-pushed the doc-seq-printf branch from 9acdfc3 to be37d32 Compare April 4, 2025 13:08

drinkcat mentioned this pull request Apr 7, 2025

seq: format_float_shortest panic when printing absurdly large exponents #7632

Open

drinkcat marked this pull request as ready for review April 22, 2025 16:14

drinkcat mentioned this pull request May 3, 2025

printf: %a not supported (but partially implemented) #2776

Open

doc: extensions: Explain how printf/seq handle precision

3f12ed9

There are some differences in behaviour vs GNU coreutils; explain what those are.

drinkcat force-pushed the doc-seq-printf branch from be37d32 to 3f12ed9 Compare May 25, 2025 19:39

drinkcat changed the title ~~RFC: doc: extensions: Explain how printf/seq handle precision~~ doc: extensions: Explain how printf/seq handle precision May 25, 2025

sylvestre merged commit 142e4cc into uutils:main May 25, 2025
74 checks passed
