Skip to content

dd: should terminate with error if skip argument is too large #7275

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ic3man5
Copy link
Contributor

@ic3man5 ic3man5 commented Feb 6, 2025

Fixes #7216

 cargo run dd bs=1 skip=9223372036854775808 count=0
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
     Running `target/debug/coreutils dd bs=1 skip=9223372036854775808 count=0`
dd: invalid number: ‘'9223372036854775808': Value too large for defined data type’

@ic3man5
Copy link
Contributor Author

ic3man5 commented Feb 6, 2025

Notes:

  • slight output difference:
    dd: invalid number: ‘'9223372036854775808': Value too large for defined data type’
    instead of:
    dd: invalid number: '9223372036854775808': Value too large for defined data type
  • seek also had this issue. For some reason GNU coreutils uses a signed integer and checks to make sure its in range of 0 and max i64.

@ic3man5
Copy link
Contributor Author

ic3man5 commented Feb 6, 2025

reference to the check in GNU coreutils:
https://github.com/coreutils/coreutils/blob/master/src/dd.c#L1581C23-L1581C39

Copy link

github-actions bot commented Feb 6, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)

@cakebaker
Copy link
Contributor

Can you please add a test to ensure we don't regress in the future?

@ic3man5
Copy link
Contributor Author

ic3man5 commented Feb 6, 2025

Can you please add a test to ensure we don't regress in the future?

yes, do we care that the GnuTests CI step failed? I'm not sure how to tell what test exactly failed from the logs.

I ran this gnu test but I'm unsure of how to compare it to see what test actually failed.

$ bash util/run-gnu-test.sh

============================================================================
Testsuite summary for GNU coreutils 9.6.8-fbfd88-dirty
============================================================================
# TOTAL: 617
# PASS:  469
# SKIP:  85
# XFAIL: 0
# FAIL:  63
# XPASS: 0
# ERROR: 0
============================================================================
See ./tests/test-suite.log for debugging.
Some test(s) failed.  Please report this to bug-coreutils@gnu.org,
together with the test-suite.log file (gzipped) and your system
information.  Thanks.
============================================================================

@cakebaker
Copy link
Contributor

do we care that the GnuTests CI step failed?

It fails because of an intermittent test and you can ignore it.

Copy link

github-actions bot commented Feb 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)

@ic3man5
Copy link
Contributor Author

ic3man5 commented Feb 7, 2025

@cakebaker this should be ready to go now.

Comment on lines 227 to 229
return Err(ParseError::InvalidNumber(format!(
"'{skip}': Value too large for defined data type"
)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a similar error in uucore::parse_size here:

Self::SizeTooBig(format!(
"{}: Value too large for defined data type",
s.quote()
))
Could that be reused in some way? I guess GNU probably determines that the skip and seek are too large at the time of parsing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfinkels yes, they determine it when its parsed. Honestly I feel like its a "bug" upstream. I can reuse uucore::parse_size if you'd like instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ic3man5 Please do not read or get inspiration from GNU's coreutils source code, as it would violate the terms of the copyleft GPL license attached to their software. We distribute uutils' coreutils under the MIT license, which is not compatible with copyleft licenses.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevertheless, yes, please reuse parse_size.rs to reduce code duplication

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfinkels @RenjiSann I refactored to use ParseSizeError instead. I'll rebase once this is accepted.

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/rm/rm2 is no longer failing!

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@RenjiSann
Copy link
Collaborator

The CI is unhappy :(

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@RenjiSann
Copy link
Collaborator

Can you handle this ?

  ──── TRY 1 STDERR:       coreutils::tests test_dd::test_invalid_number_arg_gnu_compatibility
  
  thread 'test_dd::test_invalid_number_arg_gnu_compatibility' panicked at tests/by-util/test_dd.rs:1274:14:
  assertion failed: `(left == right)`
  
  Diff < left / right > :
  <dd: invalid number: ‘''’
  >dd: invalid number: ‘’

@ic3man5
Copy link
Contributor Author

ic3man5 commented Mar 26, 2025

sorry I didn't see notification on this one, I'll look into it.

@ic3man5
Copy link
Contributor Author

ic3man5 commented Mar 26, 2025

@RenjiSann Okay I actually need some direction here. I noticed a lot of tests are quoting with (0x2018 Left Single Quotation Mark) and (0x2019 Right Single Quotation Mark). Running coreutils on my system (arch linux) dd returns single quote character ' (0x27 Single Quotation).

I see a lot of compatibility issues here in other places besides this. What am I missing here?

coreutils installed on my system:

$ dd --version
dd (coreutils) 9.6

edit: The original issue report shows single quotations also as expected.

edit2: locale information

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@RenjiSann
Copy link
Collaborator

@RenjiSann Okay I actually need some direction here. I noticed a lot of tests are quoting with (0x2018 Left Single Quotation Mark) and (0x2019 Right Single Quotation Mark). Running coreutils on my system (arch linux) dd returns single quote character ' (0x27 Single Quotation).

I see a lot of compatibility issues here in other places besides this. What am I missing here?

coreutils installed on my system:

$ dd --version
dd (coreutils) 9.6

edit: The original issue report shows single quotations also as expected.

edit2: locale information

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

The reason is that for now, we don't handle locales at all, and by default we replicate the behavior of LC_ALL=C.

@RenjiSann
Copy link
Collaborator

Consequently, the easiest thing to do right now is to expect our code to format output as expected with the LC_ALL=C locale.

Someday we'll look into localization, but there is no easy way to handle it for now.

@RenjiSann
Copy link
Collaborator

RenjiSann commented Apr 24, 2025

ping @ic3man5
there are some conflicts

@ic3man5
Copy link
Contributor Author

ic3man5 commented Apr 28, 2025

sorry, going to keep working on this, just been busy. If someone else wants to take it over I won't mind. I don't want to block progress.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dd: should terminate with error if skip argument is too large
4 participants