-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
printf: accept non-UTF-8 input in FORMAT and ARGUMENT arguments #8329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
GNU testsuite comparison:
|
7a7f02b
to
f31f360
Compare
GNU testsuite comparison:
|
f31f360
to
045c269
Compare
@jtracey if you have time to review this, both the rebase and my added changes, it'd be absolutely awesome... I ended up modifying |
It looks great to me! |
GNU testsuite comparison:
|
GNU testsuite comparison:
|
Lgtm! Thanks for picking this up. FWIW, the first commit should probably have multiple authors, but if this gets squashed GitHub might handle it on its own. If not, it's not the end of the world if the commit history is a little messy. |
4a472b8
to
d2c7ec4
Compare
Oh, didn't know the logic (I think native git only allows a single author?), I did keep a "Author:" line in the first squashed commit, but github didn't pick it up. Looks like adding a Also I'll... rebase to fix the conflict. |
d2c7ec4
to
8437dc7
Compare
Rebased... Hopefully all tests still pass. |
GNU testsuite comparison:
|
Interesting, github does not allow me to merge it |
Weird... Not sure if I can do anything to help... Looks like it's waiting for a review by you? But... "It is not required to merge." it says... |
Other implementations of `printf` permit arbitrary data to be passed to `printf`. The only restriction is that a null byte terminates FORMAT and ARGUMENT argument strings (since they are C strings). The current implementation only accepts FORMAT and ARGUMENT arguments that are valid UTF-8 (this is being enforced by clap). This commit removes the UTF-8 validation by switching to OsStr and OsString. This allows users to use `printf` to transmit or reformat null-safe but not UTF-8-safe data, such as text encoded in an 8-bit text encoding. See the `non_utf_8_input` test for an example (ISO-8859-1 text). [drinkcat: also squashed in this commit to ease rebase] Author: Justin Tracey <jdt.dev@unsuspicious.click> uucore, printf: improve non-UTF-8 format arguments This fixes handling of format arguments, in part by eliminating duplicate implementations. Utilities with format arguments other than printf will no longer accept things like "'a" as numbers, etc. Co-authored-by: Justin Tracey <jdt.dev@unsuspicious.click>
And fix all the callers, e.g. all the next_ functions.
In one case, we'll need an actual owned String in PartialMatch, so it's easier to just use that. Removes a bunch of lifetime things in the code too.
After this, we can use a common extract_value function.
Suggested by Gemini and I think that's not a bad idea.
8437dc7
to
ccad817
Compare
GNU testsuite comparison:
|
well done folks |
Rebased #7209 yet again (I "had" to squash 2 commits together to make the rebase slightly easier..), I'll also attempt to apply the changes I suggested.
--
Rebases #6812 on #7208.
EDIT: Now also includes a commit from me to pass the printf-mb GNU test, as well as a few other odds and ends. Fixes #6804.