Commit f9310b2

Jessica Yu authored and torvalds committed
sscanf: implement basic character sets
Implement basic character sets for the '%[' conversion specifier.

The '%[' conversion specifier matches a nonempty sequence of characters from the specified set of accepted (or with '^', rejected) characters between the brackets. The substring matched is to be made up of characters in (or not in) the set. This is useful for matching substrings that are delimited by something other than spaces.

This implementation differs from its glibc counterpart in the following ways:
(1) No support for character ranges (e.g., 'a-z' or '0-9')
(2) The hyphen '-' is not a special character
(3) The closing bracket ']' cannot be matched
(4) No support (yet) for discarding matching input ('%*[')

The bitmap code is largely based upon sample code which was provided by Rasmus.

The motivation for adding character set support to sscanf originally stemmed from the kernel livepatching project. An ongoing patchset utilizes new livepatch Elf symbol and section names to store important metadata livepatch needs to properly apply its patches. Such metadata is stored in these section and symbol names as substrings delimited by periods '.' and commas ','. For example, a livepatch symbol name might look like this:

  .klp.sym.vmlinux.printk,0

However, sscanf currently can only extract "substrings" delimited by whitespace using the "%s" specifier. Thus for the above symbol name, one cannot use sscanf() to extract substrings "vmlinux" or "printk", for example.

A number of discussions on the livepatch mailing list dealing with string parsing code for extracting these '.' and ',' delimited substrings eventually led to the conclusion that such code would be completely unnecessary if the kernel sscanf() supported character sets. Thus only a single sscanf() call would be necessary to extract these substrings. In addition, such an addition to sscanf() could benefit other areas of the kernel that might have a similar need in the future.

[akpm@linux-foundation.org: 80-col tweaks]
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
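
The single-call extraction described in the commit message can be sketched in userspace C, where the standard sscanf() already supports '%['. This is only an illustration: parse_klp_sym() and the buffer sizes are hypothetical names chosen for this example, and reading the trailing number as an unsigned long is an assumption about the field, not something the commit states. The field widths are always given, since this kernel implementation requires them.

```c
#include <stdio.h>

/* Hypothetical buffer sizes for this sketch; each width in the format
 * string below is one less than the matching buffer size. */
#define OBJ_LEN 56
#define SYM_LEN 128

/*
 * Split a name of the form ".klp.sym.<object>.<symbol>,<number>" into
 * its parts with a single sscanf() call.
 * Returns the number of fields converted (3 on success).
 */
int parse_klp_sym(const char *name, char objname[OBJ_LEN],
                  char symname[SYM_LEN], unsigned long *pos)
{
    /* %55[^.]  : up to 55 characters that are not '.'
     * %127[^,] : up to 127 characters that are not ','
     * %lu      : the trailing number (assumed unsigned) */
    return sscanf(name, ".klp.sym.%55[^.].%127[^,],%lu",
                  objname, symname, pos);
}
```

For the symbol name quoted above, the call fills objname with "vmlinux", symname with "printk", and the number with 0. Checking the return value against 3 catches names that do not follow the expected layout.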
1 parent 2553b67 commit f9310b2

1 file changed, +58 -1 lines changed
lib/vsprintf.c

Lines changed: 58 additions & 1 deletion
@@ -2640,8 +2640,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
 		if (*fmt == '*') {
 			if (!*str)
 				break;
-			while (!isspace(*fmt) && *fmt != '%' && *fmt)
+			while (!isspace(*fmt) && *fmt != '%' && *fmt) {
+				/* '%*[' not yet supported, invalid format */
+				if (*fmt == '[')
+					return num;
 				fmt++;
+			}
 			while (!isspace(*str) && *str)
 				str++;
 			continue;
@@ -2714,6 +2718,59 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
 				num++;
 			}
 			continue;
+		/*
+		 * Warning: This implementation of the '[' conversion specifier
+		 * deviates from its glibc counterpart in the following ways:
+		 * (1) It does NOT support ranges i.e. '-' is NOT a special
+		 *     character
+		 * (2) It cannot match the closing bracket ']' itself
+		 * (3) A field width is required
+		 * (4) '%*[' (discard matching input) is currently not supported
+		 *
+		 * Example usage:
+		 * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]",
+		 *		buf1, buf2, buf3);
+		 * if (ret < 3)
+		 *    // etc..
+		 */
+		case '[':
+		{
+			char *s = (char *)va_arg(args, char *);
+			DECLARE_BITMAP(set, 256) = {0};
+			unsigned int len = 0;
+			bool negate = (*fmt == '^');
+
+			/* field width is required */
+			if (field_width == -1)
+				return num;
+
+			if (negate)
+				++fmt;
+
+			for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
+				set_bit((u8)*fmt, set);
+
+			/* no ']' or no character set found */
+			if (!*fmt || !len)
+				return num;
+			++fmt;
+
+			if (negate) {
+				bitmap_complement(set, set, 256);
+				/* exclude null '\0' byte */
+				clear_bit(0, set);
+			}
+
+			/* match must be non-empty */
+			if (!test_bit((u8)*str, set))
+				return num;
+
+			while (test_bit((u8)*str, set) && field_width--)
+				*s++ = *str++;
+			*s = '\0';
+			++num;
+		}
+		continue;
 		case 'o':
 			base = 8;
 			break;
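
To make the set-matching step in the second hunk easier to follow outside the kernel, here is a minimal userspace sketch of the same logic. It is not the kernel code: scan_set() is a hypothetical helper, and a plain bool array stands in for DECLARE_BITMAP(), set_bit(), test_bit() and bitmap_complement(). The caller is assumed to pass a pointer just past the '[' and an output buffer of at least width + 1 bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the "case '['" block.
 * spec  : format text just past the '[' (e.g. "^:]")
 * str   : input to scan
 * width : mandatory field width, must be > 0
 * out   : destination, at least width + 1 bytes
 * Returns the number of bytes copied; 0 means an invalid set
 * or an empty match.
 */
size_t scan_set(const char *spec, const char *str, int width, char *out)
{
    bool member[256] = { false };  /* one flag per byte value */
    bool negate = (*spec == '^');
    size_t len = 0, copied = 0;

    if (width <= 0)                /* field width is required */
        return 0;
    if (negate)
        ++spec;

    /* Every byte up to ']' is a literal set member; '-' is not special. */
    for (; *spec && *spec != ']'; ++spec, ++len)
        member[(unsigned char)*spec] = true;

    if (!*spec || !len)            /* no ']' or empty set */
        return 0;

    if (negate) {
        for (int i = 0; i < 256; i++)
            member[i] = !member[i];
        member[0] = false;         /* never match the terminating NUL */
    }

    /* Copy while the input byte is in the set and width remains. */
    while (member[(unsigned char)*str] && width--)
        out[copied++] = *str++;
    out[copied] = '\0';
    return copied;
}
```

With the set "^:]" and a width of 2, scanning "00:0a:95" copies "00" and stops at the first ':', which mirrors the example in the patch comment. A set written as "a-z]" matches only the three literal characters 'a', '-' and 'z', which is the deviation from glibc noted in the commit message.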
