The only difference between noncompliant and C99-compliant scanf is
that the former accepts the archaic GNU extension '%as' (also %aS and
%a[...]) meaning to allocate space for the input string with malloc.
This extension conflicts with C99's use of %a as a format _type_
meaning to read a floating-point number; POSIX.1-2008 standardized
equivalent functionality using the modifier letter 'm' instead (%ms,
%mS, %m[...]).
The extension was already disabled in most conformance modes:
specifically, any mode that doesn't involve _GNU_SOURCE and _does_
involve either strict conformance to C99 or loose conformance to both
C99 and POSIX.1-2001 would get the C99-compliant scanf. With
compilers new enough to use -std=gnu11 instead of -std=gnu89, or
equivalent, that includes the default mode.
With this patch, we now provide C99-compliant scanf in all
configurations except when _GNU_SOURCE is defined *and*
__STDC_VERSION__ or __cplusplus (whichever is relevant) indicates
C89/C++98. This leaves the old scanf available under e.g. -std=c89
-D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it
was already not present under -std=gnu11 without -D_GNU_SOURCE) and
from -std=gnu89 without -D_GNU_SOURCE.
There needs to be an internal override so we can compile the
noncompliant scanf itself. This is the same problem we had when we
removed 'gets' from _GNU_SOURCE and it's dealt with the same way:
there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to
off under the appropriate conditions for external code, but can be
overridden by individual files within stdio.
We also run into problems with PLT bypass for internal uses of sscanf,
because libc_hidden_proto uses __REDIRECT and so does the logic in
stdio.h for choosing which implementation of scanf to use; __REDIRECT
isn't transitive, so include/stdio.h needs to bridge the gap with a
macro. As far as I can tell, sscanf is the only function in this
family that's internally called by unrelated code.
Finally, there are several tests in stdio-common that use the
extension. bug21.c is a regression test for a crash; it still
exercises the relevant code when changed to use %ms instead of %as.
scanf14.c through scanf17.c are more complicated since they are
actually testing the subtleties of the extension - under what
circumstances is 'a' treated as a modifier letter, etc. I changed all
of them to use %ms instead of %as as well, but duplicated scanf14.c
and scanf16.c as scanf14a.c and scanf16a.c. These still use %as and
are compiled with -std=gnu89 to access the old extension. A bunch of
diagnostic overrides and manual workarounds for the old stdio.h
behavior become unnecessary. Yay!
* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE
parameter. Only use deprecated scanf when __USE_GNU is defined
and __STDC_VERSION__ is less than 199901L or __cplusplus is less
than 201103L, whichever is relevant for the language being compiled.
* libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect
scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their
__isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF).
* wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for
wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf.
* libio/iovsscanf.c
* libio/fwscanf.c
* libio/iovswscanf.c
* libio/swscanf.c
* libio/vscanf.c
* libio/vwscanf.c
* libio/wscanf.c
* stdio-common/fscanf.c
* stdio-common/scanf.c
* stdio-common/vfscanf.c
* stdio-common/vfwscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-scanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c
* sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c:
Override __GLIBC_USE_DEPRECATED_SCANF to 1.
* stdio-common/sscanf.c: Likewise. Remove ldbl_hidden_def for __sscanf.
* stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf.
* include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf,
not sscanf.
[!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_scanf
with a preprocessor macro.
* stdio-common/bug21.c, stdio-common/scanf14.c:
Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[];
remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
* stdio-common/scanf16.c: Likewise. Add __attribute__ ((format (scanf)))
to xscanf, xfscanf, xsscanf.
* stdio-common/scanf14a.c: New copy of scanf14.c which still uses
%as, %aS, %a[]. Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
* stdio-common/scanf16a.c: New copy of scanf16.c which still uses
%as, %aS, %a[]. Add __attribute__ ((format (scanf))) to xscanf,
xfscanf, xsscanf.
* stdio-common/scanf15.c, stdio-common/scanf17.c: No need to
override feature selection macros or provide definitions of u_char etc.
* stdio-common/Makefile (tests): Add scanf14a and scanf16a.
(CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove.
(CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New. Compile these files
with -std=gnu89.
There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode
used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used
by __isoc99_ scanf variants. In this patch, the new functions honor
these flag bits if they're set, but they still also look at the
corresponding bits of environmental state, and callers all pass zero.
The new functions do *not* have the "errp" argument possessed by
_IO_vfscanf and _IO_vfwscanf. All internal callers passed NULL for
that argument. External callers could theoretically exist, so I
preserved wrappers, but they are flagged as compat symbols and they
don't preserve the three-way distinction among types of errors that
was formerly exposed. These functions probably should have been in
the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just
aliases for vfscanf and vfwscanf.
(It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf.
Please check that part of the patch very carefully, I am still not
confident I understand all of the details of ldbl-opt.)
This patch also introduces helper inlines in libio/strfile.h that
encapsulate the process of initializing an _IO_strfile object for
reading. This allows us to call __vfscanf_internal directly from
sscanf, and __vfwscanf_internal directly from swscanf, without
duplicating the initialization code. (Previously, they called their
v-counterparts, but that won't work if we want to control *both* C99
mode and ldbl-is-dbl mode using the flags argument to__vfscanf_internal.)
It's still a little awkward, especially for wide strfiles, but it's
much better than what we had.
Tested for powerpc and powerpc64le.
As reported in bug 23280, scanf functions produce incorrectly rounded
result for floating-point formats in FE_UPWARD and FE_DOWNWARD modes,
because they pass the input with sign removed to strtod functions, and
then negate the result if there was a '-' at the start of the input.
This patch fixes this by arranging for the sign to be passed to strtod
rather than scanf doing the negation itself. In turn, keeping the
sign around in the buffer being built up for strtod requires updating
places that examine char_buffer_size (&charbuf) to allow for the sign
being there as an extra character.
Tested for x86_64.
[BZ #23280]
* stdio-common/vfscanf.c (_IO_vfscanf_internal): Pass sign of
floating-point number to strtod functions rather than possibly
negating result of those functions.
* stdio-common/tst-scanf-round.c: New file.
* stdio-common/Makefile (tests): Add tst-scanf-round.
($(objpfx)tst-scanf-round): Depend on $(libm).
This patch eliminates a number of #if 0 and #ifdef TODO blocks, macros
that are never used, macros that provide portability to substrates that
lack basic things like EINVAL and off_t, and other such debris.
I preserved IO_DEBUG and CHECK_FILE, even though as far as I can tell
IO_DEBUG is never defined and therefore CHECK_FILE never does
anything, because it seems like we might actually want to turn it _on_.
Installed stripped libraries and executables are unchanged, except,
again, that the line number of an assertion changes (this time it's
somewhere in fileops.c).
* libio/libio.h (_IO_pos_BAD, _IO_pos_0, _IO_pos_adjust):
Define here, unconditionally.
* libio/iolibio.h (_IO_pos_BAD): Don't define here.
* libio/libioP.h: Remove #if 0 blocks.
(_IO_pos_BAD, _IO_pos_0, _IO_pos_adjust): Don't define here.
(_IO_va_start, COERCE_FILE, MAYBE_SET_EINVAL): Don't define.
(CHECK_FILE): Don't use MAYBE_SET_EINVAL or COERCE_FILE. Fix style.
* libio/clearerr.c, libio/fputc.c, libio/getchar.c:
Assume weak_alias is always defined.
* libio/fileops.c, libio/genops.c, libio/oldfileops.c
* libio/oldpclose.c, libio/pclose.c, libio/wfileops.c:
Remove #if 0 and #ifdef TODO blocks.
Assume text_set_element is always defined.
* libio/iofdopen.c, libio/iogetdelim.c, libio/oldiofdopen.c
Use __set_errno (EINVAL) instead of MAYBE_SET_EINVAL.
* libio/tst-mmap-eofsync.c: Make #if 1 block unconditional.
This patch mechanically removes all remaining uses, and the
definitions, of the following libio name aliases:
name replaced with
---- -------------
_IO_FILE FILE
_IO_fpos_t __fpos_t
_IO_fpos64_t __fpos64_t
_IO_size_t size_t
_IO_ssize_t ssize_t or __ssize_t
_IO_off_t off_t
_IO_off64_t off64_t
_IO_pid_t pid_t
_IO_uid_t uid_t
_IO_wint_t wint_t
_IO_va_list va_list or __gnuc_va_list
_IO_BUFSIZ BUFSIZ
_IO_cookie_io_functions_t cookie_io_functions_t
__io_read_fn cookie_read_function_t
__io_write_fn cookie_write_function_t
__io_seek_fn cookie_seek_function_t
__io_close_fn cookie_close_function_t
I used __fpos_t and __fpos64_t instead of fpos_t and fpos64_t because
the definitions of fpos_t and fpos64_t depend on the largefile mode.
I used __ssize_t and __gnuc_va_list in a handful of headers where
namespace cleanliness might be relevant even though they're
internal-use-only. In all other cases, I used the public-namespace
name.
There are a tiny handful of places where I left a use of 'struct _IO_FILE'
alone, because it was being used together with 'struct _IO_FILE_plus'
or 'struct _IO_FILE_complete' in the same arithmetic expression.
Because this patch was almost entirely done with search and replace, I
may have introduced indentation botches. I did proofread the diff,
but I may have missed something.
The ChangeLog below calls out all of the places where this was not a
pure search-and-replace change.
Installed stripped libraries and executables are unchanged by this patch,
except that some assertions in vfscanf.c change line numbers.
* libio/libio.h (_IO_FILE): Delete; all uses changed to FILE.
(_IO_fpos_t): Delete; all uses changed to __fpos_t.
(_IO_fpos64_t): Delete; all uses changed to __fpos64_t.
(_IO_size_t): Delete; all uses changed to size_t.
(_IO_ssize_t): Delete; all uses changed to ssize_t or __ssize_t.
(_IO_off_t): Delete; all uses changed to off_t.
(_IO_off64_t): Delete; all uses changed to off64_t.
(_IO_pid_t): Delete; all uses changed to pid_t.
(_IO_uid_t): Delete; all uses changed to uid_t.
(_IO_wint_t): Delete; all uses changed to wint_t.
(_IO_va_list): Delete; all uses changed to va_list or __gnuc_va_list.
(_IO_BUFSIZ): Delete; all uses changed to BUFSIZ.
(_IO_cookie_io_functions_t): Delete; all uses changed to
cookie_io_functions_t.
(__io_read_fn): Delete; all uses changed to cookie_read_function_t.
(__io_write_fn): Delete; all uses changed to cookie_write_function_t.
(__io_seek_fn): Delete; all uses changed to cookie_seek_function_t.
(__io_close_fn): Delete: all uses changed to cookie_close_function_t.
* libio/iofopncook.c: Remove unnecessary forward declarations.
* libio/iolibio.h: Correct outdated commentary.
* malloc/malloc.c (__malloc_stats): Remove unnecessary casts.
* stdio-common/fxprintf.c (__fxprintf_nocancel):
Remove unnecessary casts.
* stdio-common/getline.c: Use _IO_getdelim directly.
Don't redefine ssize_t.
* stdio-common/printf_fp.c, stdio_common/printf_fphex.c
* stdio-common/printf_size.c: Don't redefine size_t or FILE.
Remove outdated comments.
* stdio-common/vfscanf.c: Don't redefine va_list.
libio.h was originally the header for a set of supported GNU
extensions, but they have not been maintained as such in many years,
they are now standing in the way of improvements to stdio, and we
don't think there are any remaining external users. _G_config.h was
never intended for public use, but predates the bits convention.
Move both of these headers into the bits directory and provide stubs
at top level which issue deprecation warnings.
The contents of (bits/)libio.h and (bits/)_G_config.h are still
exposed to external software via stdio.h; changing that requires more
complex surgery than I have time to attempt right now.
* libio/libio.h, libio/_G_config.h: New stub headers which issue a
deprecation warning and then include <bits/libio.h>, <bits/_G_config.h>
respectively.
* libio/libio.h: Rename the original version of this file to
libio/bits/libio.h. Error out if not included by stdio.h or the
stub libio.h.
* include/libio.h: Move to include/bits. Forward to libio/bits/libio.h.
* sysdeps/generic/_G_config.h: Move to top-level bits/. Error out
if not included by bits/libio.h or the stub _G_config.h.
* sysdeps/unix/sysv/linux/_G_config.h: Move to
sysdeps/unix/sysv/linux/bits. Error out if not included by
bits/libio.h or the stub _G_config.h.
* libio/stdio.h: Include bits/libio.h, not libio.h.
* libio/Makefile: Install bits/libio.h and bits/_G_config.h as
well as libio.h and _G_config.h.
* csu/init.c, libio/fmemopen.c, libio/iolibio.h, libio/oldfmemopen.c
* libio/strfile.h, stdio-common/vfscanf.c
* sysdeps/pthread/flockfile.c, sysdeps/pthread/funlockfile.c
Include stdio.h, not _G_config.h nor libio.h.
* libio/iofgetpos.c: Also rename fgetpos64 out of the way.
* libio/iofsetpos.c: Also rename fsetpos64 out of the way.
* scripts/check-installed-headers.sh: Skip libio.h and _G_config.h.
<locale.h> is specified to define locale_t in POSIX.1-2008, and so are
all of the headers that define functions that take locale_t arguments.
Under _GNU_SOURCE, the additional headers that define such functions
have also always defined locale_t. Therefore, there is no need to use
__locale_t in public function prototypes, nor in any internal code.
* ctype/ctype-c99_l.c, ctype/ctype.h, ctype/ctype_l.c
* include/monetary.h, include/stdlib.h, include/time.h
* include/wchar.h, locale/duplocale.c, locale/freelocale.c
* locale/global-locale.c, locale/langinfo.h, locale/locale.h
* locale/localeinfo.h, locale/newlocale.c
* locale/nl_langinfo_l.c, locale/uselocale.c
* localedata/bug-usesetlocale.c, localedata/tst-xlocale2.c
* stdio-common/vfscanf.c, stdlib/monetary.h, stdlib/stdlib.h
* stdlib/strfmon_l.c, stdlib/strtod_l.c, stdlib/strtof_l.c
* stdlib/strtol.c, stdlib/strtol_l.c, stdlib/strtold_l.c
* stdlib/strtoll_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* string/strcasecmp.c, string/strcoll_l.c, string/string.h
* string/strings.h, string/strncase.c, string/strxfrm_l.c
* sysdeps/ieee754/float128/strtof128_l.c
* sysdeps/ieee754/float128/wcstof128.c
* sysdeps/ieee754/float128/wcstof128_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-strfmon_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-wcstold_l.c
* sysdeps/powerpc/powerpc32/power7/strcasecmp.S
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S
* sysdeps/x86_64/strcasecmp_l-nonascii.c
* sysdeps/x86_64/strncase_l-nonascii.c, time/strftime_l.c
* time/strptime_l.c, time/time.h, wcsmbs/mbsrtowcs_l.c
* wcsmbs/wchar.h, wcsmbs/wcscasecmp.c, wcsmbs/wcsncase.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstol_l.c, wcsmbs/wcstold.c
* wcsmbs/wcstold_l.c, wcsmbs/wcstoll_l.c, wcsmbs/wcstoul_l.c
* wcsmbs/wcstoull_l.c, wctype/iswctype_l.c
* wctype/towctrans_l.c, wctype/wcfuncs_l.c
* wctype/wctrans_l.c, wctype/wctype.h, wctype/wctype_l.c:
Change all uses of __locale_t to locale_t.
posix/wordexp-test.c used libc-internal.h for PTR_ALIGN_DOWN; similar
to what was done with libc-diag.h, I have split the definitions of
cast_to_integer, ALIGN_UP, ALIGN_DOWN, PTR_ALIGN_UP, and PTR_ALIGN_DOWN
to a new header, libc-pointer-arith.h.
It then occurred to me that the remaining declarations in libc-internal.h
are mostly to do with early initialization, and probably most of the
files including it, even in the core code, don't need it anymore. Indeed,
only 19 files actually need what remains of libc-internal.h. 23 others
need libc-diag.h instead, and 12 need libc-pointer-arith.h instead.
No file needs more than one of them, and 16 don't need any of them!
So, with this patch, libc-internal.h stops including libc-diag.h as
well as losing the pointer arithmetic macros, and all including files
are adjusted.
* include/libc-pointer-arith.h: New file. Define
cast_to_integer, ALIGN_UP, ALIGN_DOWN, PTR_ALIGN_UP, and
PTR_ALIGN_DOWN here.
* include/libc-internal.h: Definitions of above macros
moved from here. Don't include libc-diag.h anymore either.
* posix/wordexp-test.c: Include stdint.h and libc-pointer-arith.h.
Don't include libc-internal.h.
* debug/pcprofile.c, elf/dl-tunables.c, elf/soinit.c, io/openat.c
* io/openat64.c, misc/ptrace.c, nptl/pthread_clock_gettime.c
* nptl/pthread_clock_settime.c, nptl/pthread_cond_common.c
* string/strcoll_l.c, sysdeps/nacl/brk.c
* sysdeps/unix/clock_settime.c
* sysdeps/unix/sysv/linux/i386/get_clockfreq.c
* sysdeps/unix/sysv/linux/ia64/get_clockfreq.c
* sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c
* sysdeps/unix/sysv/linux/sparc/sparc64/get_clockfreq.c:
Don't include libc-internal.h.
* elf/get-dynamic-info.h, iconv/loop.c
* iconvdata/iso-2022-cn-ext.c, locale/weight.h, locale/weightwc.h
* misc/reboot.c, nis/nis_table.c, nptl_db/thread_dbP.h
* nscd/connections.c, resolv/res_send.c, soft-fp/fmadf4.c
* soft-fp/fmasf4.c, soft-fp/fmatf4.c, stdio-common/vfscanf.c
* sysdeps/ieee754/dbl-64/e_lgamma_r.c
* sysdeps/ieee754/dbl-64/k_rem_pio2.c
* sysdeps/ieee754/flt-32/e_lgammaf_r.c
* sysdeps/ieee754/flt-32/k_rem_pio2f.c
* sysdeps/ieee754/ldbl-128/k_tanl.c
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c
* sysdeps/ieee754/ldbl-96/k_tanl.c, sysdeps/nptl/futex-internal.h:
Include libc-diag.h instead of libc-internal.h.
* elf/dl-load.c, elf/dl-reloc.c, locale/programs/locarchive.c
* nptl/nptl-init.c, string/strcspn.c, string/strspn.c
* malloc/malloc.c, sysdeps/i386/nptl/tls.h
* sysdeps/nacl/dl-map-segments.h, sysdeps/x86_64/atomic-machine.h
* sysdeps/unix/sysv/linux/spawni.c
* sysdeps/x86_64/nptl/tls.h:
Include libc-pointer-arith.h instead of libc-internal.h.
* elf/get-dynamic-info.h, sysdeps/nacl/dl-map-segments.h
* sysdeps/x86_64/atomic-machine.h:
Add multiple include guard.
The function read_int, from printf-parse.h, parses an integer from a string
while avoiding overflows. It is used by other functions, such as vfprintf,
to avoid undefined behavior.
The function vfscanf (_IO_vfwscanf) parses an integer from the format
string, and can use read_int.
This avoids a race condition if the process-global locale is changed
while vfscanf is running. MB_LEN_MAX is always larger than MB_CUR_MAX,
so we might realloc earlier than necessary (but even MB_CUR_MAX could
be larger than the minimum required space).
The existing length was a bit questionable because str + MB_LEN_MAX
might point past the end of the buffer.
A custom character buffer is added in this commit, in the form of
struct char_buffer. The char_buffer_add function replaces the
ADDW macro (which has grown with each successive security fix).
The char_buffer_add slow path is moved out-of-line, reducing
code size.
* stdio-common/vfscanf.c (MEMCPY): Remove macro.
(struct char_buffer): New type.
(char_buffer_start, char_buffer_size, char_buffer_error)
(char_buffer_rewind, char_buffer_add): New functions.
(ADDW): Remove macro, replaced by the char_buffer_add function.
(_IO_vfscanf_internal): Rewrite using struct char_buffer instead
of extend_alloca. Make control flow more explicit.
BZ #16618
Under certain conditions wscanf can allocate too little memory for the
to-be-scanned arguments and overflow the allocated buffer. The
implementation now correctly computes the required buffer size when
using malloc.
A regression test was added to tst-sscanf.
in loop to look for conversion specifier to avoid testing of
wrong errno value.
* stdio-common/Makefile (tests): Add bug18, bug18a, bug19, bug19a.
* stdio-common/bug18a.c: New file.
* stdio-common/bug19.c: New file.
* stdio-common/bug19a.c: New file.