Skip to content

Some minor fixes for the printf implementation #28177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

esyr
Copy link
Contributor

@esyr esyr commented Aug 5, 2025

Per @Sashan's suggestion in [1], moving fixes for the printf routine into a separate PR.

[1] #28059

Checklist
  • tests are added or updated

esyr added 3 commits August 5, 2025 01:31
Per [1]:

    hh
        Specifies that a following d, i, o, u, x, or X conversion specifier
        applies to a signed char or unsigned char argument

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Per [1]:

   For x or X conversion specifiers, a non-zero result shall have 0x (or 0X)
   prefixed to it.

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
…cision

Both width and precision are "decimal digit strings" of unspecified size,
but we can realistically cap it at INT_MAX.

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
@esyr esyr force-pushed the esyr/printf-fixes branch from 29ebfad to 2a937b3 Compare August 5, 2025 13:54
@esyr esyr requested a review from Sashan August 5, 2025 14:42
@vavroch2010 vavroch2010 moved this to Waiting Review in Development Board Aug 5, 2025
@vavroch2010 vavroch2010 requested review from nhorman, andrewkdinh and a team August 5, 2025 15:07
@esyr esyr mentioned this pull request Aug 5, 2025
2 tasks
nhorman
nhorman previously approved these changes Aug 5, 2025
Comment on lines +199 to +203
if (*format == 'h') {
cflags = DP_C_CHAR;
format++;
} else {
cflags = DP_C_SHORT;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support the n Conversion Specifier. With %hn the specifier is a pointer to a short rather than an int, but we don't appear to track this. From the BSD/MacOS printf(3) manpage:

     •   An optional length modifier, that specifies the size of the argument.  The following length modifiers are valid for the d, i, n, o, u, x, or X conversion:

         Modifier                 d, i               o, u, x, X                n
         hh                       signed char        unsigned char             signed char *
         h                        short              unsigned short            short *
         l (ell)                  long               unsigned long             long *
         ll (ell ell)             long long          unsigned long long        long long *
         j                        intmax_t           uintmax_t                 intmax_t *
         t                        ptrdiff_t          (see note)                ptrdiff_t *
         z                        (see note)         size_t                    (see note)
         q (deprecated)           quad_t             u_quad_t                  quad_t *

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[1] says that argument for the "%n" should be an "integer", but doesn't specify its size, however, from the description of the length modifiers it's possible to conjecture that they are not applied to "%n", as those do not mention it. [2] explicitly says that "The behavior is undefined if the conversion specification includes any flags, a field width, or a precision", but I'm not sure what is the source of that, any standard (or standards), interpretation of it (them), or specifically glibc's way of handling it.

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html
[2] https://www.man7.org/linux/man-pages/man3/printf.3.html

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [1] the description covers the h and hh modifiers for n:

  • hh
    Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing); or that a following n conversion specifier applies to a pointer to a signed char argument.
  • h
    Specifies that a following d, i, o, u, x, or X conversion specifier applies to a short or unsigned short argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to short or unsigned short before printing); or that a following n conversion specifier applies to a pointer to a short argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be an exercise in being vague. The POSIX specification here Says this about the n conversion:

n
The argument shall be a pointer to an integer into which is written the number of bytes written to the output so far by this call to one of the fprintf() functions. No argument is converted.

But the man page says:

n      The number of characters written so far is stored into the
              integer pointed to by the corresponding argument.  That
              argument shall be an int *, or variant whose size matches
              the (optionally) supplied integer length modifier.  No
              argument is converted.  (This specifier is not supported by
              the bionic C library.)  The behavior is undefined if the
              conversion specification includes any flags, a field width,
              or a precision.

The man page language seems a bit more implementation specific, but fairly pragmatic, as you don't want to write more data than its being provided in your conversion pointer.

Theres another question here. Nothing in openssl currently uses the hh modifier at all (save for the test that this PR adds). That is not to say of course that some application using openssl won't try to use it with BIO_printf or friends, but given that I don't think we've seen any bug reports about hh not working properly (though someone correct me if I'm wrong here), is this really worth supporting? We could just ignore the hh modifer and assume an n conversion pairs with an int *. If someone tries to use a short or char, an asan test will flag that pretty quickly, and we can address it when/if it ever occurs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theres another question here. Nothing in openssl currently uses the hh modifier at all (save for the test that this PR adds). That is not to say of course that some application using openssl won't try to use it with BIO_printf or friends, but given that I don't think we've seen any bug reports about hh not working properly (though someone correct me if I'm wrong here), is this really worth supporting? We could just ignore the hh modifer and assume an n conversion pairs with an int *. If someone tries to use a short or char, an asan test will flag that pretty quickly, and we can address it when/if it ever occurs.

We should EITHER:

  1. Reject the length modifiers with the n conversion.
  2. Accept the modifiers and handle them as specified by open group, ...

Allowing them in the format template, but not acting on them is a bug,, though not one that an applications likely to run into. So while we're here, we can note the bug, and fix it later, if not trivial to do at this time.

Copy link
Contributor Author

@esyr esyr Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [1] the description covers the h and hh modifiers for n

You're right; apparently, I've failed at reading comprehension. Will update the patch set to include a fix and check for it.

Copy link
Contributor

@Sashan Sashan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good in general. need just clarification on is_eob() and fmtint(). thanks.

{ { .i = 0x1337 }, ISZ_INT, "|%2147483639.x|",
"| " },
/*
* glibc just bails out on the following three, treating everythin greated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/everythin greated/everything greater

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

esyr added 8 commits August 6, 2025 12:58
Per [1]:

    A negative field width is taken as a '-' flag followed by a positive field
    width.

So, printf("%-*d", -12, 34) should lead to a 123-wide left-aligned output,
"34          ".

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Per [1] (emphasis is added):

    - For o conversion, it shall increase the precision,
      **if and only if necessary**, to force the first digit of the result
      to be a zero (**if the value and precision are both 0,
      a single 0 is printed**).
    - For x or X conversion specifiers, a **non-zero** result shall have
      0x (or 0X) prefixed to it.

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Zero prefix in the alternative octal form count tovards precision,
per [1]:

    For o conversion, it **shall increase the precision**...

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Sign, prefix, and zero padding should count towards precision.

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Per [1]:

    If the '0' and '-' flags both appear, the '0' flag is ignored.

[1] https://pubs.opengroup.org/onlinepubs/9799919799//functions/printf.html

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
…onversion

On some architectures, char defaults to unsigned char, so the signed integer
conversions have to explicitly cast to the signed char type.

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
Sashan
Sashan previously approved these changes Aug 6, 2025
Copy link
Contributor

@Sashan Sashan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. thanks.

Signed-off-by: Eugene Syromiatnikov <esyr@openssl.org>
@esyr esyr force-pushed the esyr/printf-fixes branch from a6558c2 to 38c9481 Compare August 7, 2025 12:17
@esyr esyr requested a review from Sashan August 7, 2025 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Waiting Review
Development

Successfully merging this pull request may close these issues.

5 participants