Commit Graph

416 Commits

Author SHA1 Message Date
Joseph Myers
9e2ff880f3 Declare wcstofN, wcstofNx for C2x
WG14 accepted the changes in N3105 to define wcstofN and wcstofNx
functions for C2x.  Thus enable those for C2x (given also __GLIBC_USE
(IEC_60559_TYPES_EXT) and support for the relevant _FloatN / _FloatNx
type) rather than only for __USE_GNU.

Tested for x86_64.
2023-03-14 18:11:27 +00:00
Joseph Myers
dee2bea048 C2x scanf binary constant handling
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants for the %i scanf format (in addition to the %b format,
which isn't yet implemented for scanf in glibc).  Implement that scanf
support for glibc.

As with the strtol support, this is incompatible with previous C
standard versions, in that such an input string starting with 0b or 0B
was previously required to be parsed as 0 (with the rest of the input
potentially matching subsequent parts of the scanf format string).
Thus this patch adds 12 new __isoc23_* functions per long double
format (12, 24 or 36 depending on how many long double formats the
glibc configuration supports), with appropriate header redirection
support (generally very closely following that for the __isoc99_*
scanf functions - note that __GLIBC_USE (DEPRECATED_SCANF) takes
precedence over __GLIBC_USE (C2X_STRTOL), so the case of GNU
extensions to C89 continues to get old-style GNU %a and does not get
this new feature).  The function names would remain as __isoc23_* even
if C2x ends up published in 2024 rather than 2023.

When scanf %b support is added, I think it will be appropriate for all
versions of scanf to follow C2x rules for inputs to the %b format
(given that there are no compatibility concerns for a new format).

Tested for x86_64 (full glibc testsuite).  The first version was also
tested for powerpc (32-bit) and powerpc64le (stdio-common/ and wcsmbs/
tests), and with build-many-glibcs.py.
2023-03-02 19:10:37 +00:00
Joseph Myers
64924422a9 C2x strtol binary constant handling
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2.  Implement that strtol support for glibc.

As discussed at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed).  Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support.  This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch.  The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.

Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests.  The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.

Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses.  Uses with the base fixed to 10, 16 or any other value
other than 0 or 2 can be ignored.  I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):

benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c

I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case.  In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.

Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.

As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points).  For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all.  An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed.  (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)

strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.

I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.
2023-02-16 23:02:40 +00:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Adhemerval Zanella
8d98c7c00f configure: Use -Wno-ignored-attributes if compiler warns about multiple aliases
clang emits an warning when a double alias redirection is used, to warn
the the original symbol will be used even when weak definition is
overridden.  However, this is a common pattern for weak_alias, where
multiple alias are set to same symbol.

Reviewed-by: Fangrui Song <maskray@google.com>
2022-11-01 09:51:06 -03:00
Florian Weimer
58548b9d68 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources
In the future, this will result in a compilation failure if the
macros are unexpectedly undefined (due to header inclusion ordering
or header inclusion missing altogether).

Assembler sources are more difficult to convert.  In many cases,
they are hand-optimized for the mangling and no-mangling variants,
which is why they are not converted.

sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c
are special: These are C sources, but most of the implementation is
in assembler, so the PTR_DEMANGLE macro has to be undefined in some
cases, to match the assembler style.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-18 17:04:10 +02:00
Florian Weimer
88f4b6929c Introduce <pointer_guard.h>, extracted from <sysdep.h>
This allows us to define a generic no-op version of PTR_MANGLE and
PTR_DEMANGLE.  In the future, we can use PTR_MANGLE and PTR_DEMANGLE
unconditionally in C sources, avoiding an unintended loss of hardening
due to missing include files or unlucky header inclusion ordering.

In i386 and x86_64, we can avoid a <tls.h> dependency in the C
code by using the computed constant from <tcb-offsets.h>.  <sysdep.h>
no longer includes these definitions, so there is no cyclic dependency
anymore when computing the <tcb-offsets.h> constants.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-18 17:03:55 +02:00
Adhemerval Zanella Netto
de477abcaa Use '%z' instead of '%Z' on printf functions
The Z modifier is a nonstandard synonymn for z (that predates z
itself) and compiler might issue an warning for in invalid
conversion specifier.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-22 08:48:04 -03:00
Raphael Moreira Zinsly
c7509d49c4 Apply asm redirections in wchar.h before first use
Similar to d0fa09a770, but for wchar.h.  Fixes [BZ #27087] by applying
all long double related asm redirections before using functions in
bits/wchar2.h.
Moves the function declarations from wcsmbs/bits/wchar2.h to a new file
wcsmbs/bits/wchar2-decl.h that will be included first in wcsmbs/wchar.h.

Tested with build-many-glibcs.py.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-08-30 12:50:16 -03:00
H.J. Lu
e03f5ccd6c wcsmbs: Add missing test-c8rtomb/test-mbrtoc8 dependency
Make test-c8rtomb.out and test-mbrtoc8.out depend on $(gen-locales) for

  xsetlocale (LC_ALL, "de_DE.UTF-8");
  xsetlocale (LC_ALL, "zh_HK.BIG5-HKSCS");

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-08-01 09:48:27 -03:00
Tom Honermann
825f84f133 stdlib: Suppress gcc diagnostic that char8_t is a keyword in C++20 in uchar.h.
gcc 13 issues the following diagnostic for the uchar.h header when the
-Wc++20-compat option is enabled in C++ modes that do not enable char8_t
as a builtin type (C++17 and earlier by default; subject to _GNU_SOURCE
and the gcc -f[no-]char8_t option).
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
This change modifies the uchar.h header to suppress the diagnostic through
the use of '#pragma GCC diagnostic' directives for gcc 10 and later (the
-Wc++20-compat option was added in gcc version 10).  Unfortunately, a bug
in gcc currently prevents those directives from having the intended effect
as reported at https://gcc.gnu.org/PR106423.  A patch for that issue has
been submitted and is available in the email thread archive linked below.
  https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598736.html
2022-08-01 09:39:07 -03:00
Tom Honermann
f4fe72a4f7 stdlib: Tests for mbrtoc8, c8rtomb, and the char8_t typedef.
This change adds tests for the mbrtoc8 and c8rtomb functions adopted for
C++20 via WG21 P0482R6 and for C2X via WG14 N2653, and for the char8_t
typedef adopted for C2X from WG14 N2653.

The tests for mbrtoc8 and c8rtomb specifically exercise conversion to
and from Big5-HKSCS because of special cases that arise with that encoding.
Big5-HKSCS defines some double byte sequences that convert to more than
one Unicode code point.  In order to test this, the locale dependencies
for running tests under wcsmbs is expanded to include zh_HK.BIG5-HKSCS.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-07-06 09:29:45 -03:00
Tom Honermann
8bcca1db3d stdlib: Implement mbrtoc8, c8rtomb, and the char8_t typedef.
This change provides implementations for the mbrtoc8 and c8rtomb
functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14
N2653.  It also provides the char8_t typedef from WG14 N2653.

The mbrtoc8 and c8rtomb functions are declared in uchar.h in C2X
mode or when the _GNU_SOURCE macro or C++20 __cpp_char8_t feature
test macro is defined.

The char8_t typedef is declared in uchar.h in C2X mode or when the
_GNU_SOURCE macro is defined and the C++20 __cpp_char8_t feature
test macro is not defined (if __cpp_char8_t is defined, then char8_t
is a builtin type).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-07-06 09:29:42 -03:00
Florian Weimer
93ec1cf0fe locale: Add more cached data to LC_CTYPE
This data will be used in number formatting.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Florian Weimer
7ee41feba6 locale: Remove private union from struct __locale_data
This avoids an alias violation later.  This commit also fixes
an incorrect double-checked locking idiom in _nl_init_era_entries.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Florian Weimer
bbebe83a28 locale: Remove cleanup function pointer from struct __localedata
We can call the cleanup functions directly from _nl_unload_locale
if we pass the category to it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Siddhesh Poyarekar
9bcd12d223 wcrtomb: Make behavior POSIX compliant
The GNU implementation of wcrtomb assumes that there are at least
MB_CUR_MAX bytes available in the destination buffer passed to wcrtomb
as the first argument.  This is not compatible with the POSIX
definition, which only requires enough space for the input wide
character.

This does not break much in practice because when users supply buffers
smaller than MB_CUR_MAX (e.g. in ncurses), they compute and dynamically
allocate the buffer, which results in enough spare space (thanks to
usable_size in malloc and padding in alloca) that no actual buffer
overflow occurs.  However when the code is built with _FORTIFY_SOURCE,
it runs into the hard check against MB_CUR_MAX in __wcrtomb_chk and
hence fails.  It wasn't evident until now since dynamic allocations
would result in wcrtomb not being fortified but since _FORTIFY_SOURCE=3,
that limitation is gone, resulting in such code failing.

To fix this problem, introduce an internal buffer that is MB_LEN_MAX
long and use that to perform the conversion and then copy the resultant
bytes into the destination buffer.  Also move the fortification check
into the main implementation, which checks the result after conversion
and aborts if the resultant byte count is greater than the destination
buffer size.

One complication is that applications that assume the MB_CUR_MAX
limitation to be gone may not be able to run safely on older glibcs if
they use static destination buffers smaller than MB_CUR_MAX; dynamic
allocations will always have enough spare space that no actual overruns
will occur.  One alternative to fixing this is to bump symbol version to
prevent them from running on older glibcs but that seems too strict a
constraint.  Instead, since these users will only have made this
decision on reading the manual, I have put a note in the manual warning
them about the pitfalls of having static buffers smaller than
MB_CUR_MAX and running them on older glibc.

Benchmarking:

The wcrtomb microbenchmark shows significant increases in maximum
execution time for all locales, ranging from 10x for ar_SA.UTF-8 to
1.5x-2x for nearly everything else.  The mean execution time however saw
practically no impact, with some results even being quicker, indicating
that cache locality has a much bigger role in the overhead.

Given that the additional copy uses a temporary buffer inside wcrtomb,
it's likely that a hot path will end up putting that buffer (which is
responsible for the additional overhead) in a similar place on stack,
giving the necessary cache locality to negate the overhead.  However in
situations where wcrtomb ends up getting called at wildly different
spots on the call stack (or is on different call stacks, e.g. with
threads or different execution contexts) and is still a hotspot, the
performance lag will be visible.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-13 19:15:46 +05:30
Siddhesh Poyarekar
fcfc908681 debug: Synchronize feature guards in fortified functions [BZ #28746]
Some functions (e.g. stpcpy, pread64, etc.) had moved to POSIX in the
main headers as they got incorporated into the standard, but their
fortified variants remained under __USE_GNU.  As a result, these
functions did not get fortified when _GNU_SOURCE was not defined.

Add test wrappers that check all functions tested in tst-chk0 at all
levels with _GNU_SOURCE undefined and then use the failures to (1)
exclude checks for _GNU_SOURCE functions in these tests and (2) Fix
feature macro guards in the fortified function headers so that they're
the same as the ones in the main headers.

This fixes BZ #28746.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-01-12 23:34:48 +05:30
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Joseph Myers
309548bec3 Support C2X printf %b, %B
C2X adds a printf %b format (see
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted
for C2X), for outputting integers in binary.  It also has recommended
practice for a corresponding %B format (like %b, but %#B starts the
output with 0B instead of 0b).  Add support for these formats to
glibc.

One existing test uses %b as an example of an unknown format, to test
how glibc printf handles unknown formats; change that to %v.  Use of
%b and %B as user-registered format specifiers continues to work (and
we already have a test that covers that, tst-printfsz.c).

Note that C2X also has scanf %b support, plus support for binary
constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol
base 0 and scanf %i coming from a previous paper that added binary
integer literals).  I intend to implement those features in a separate
patch or patches; as discussed in the thread starting at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
they will be more complicated because they involve adding extra public
symbols to ensure compatibility with existing code that might not
expect 0b constants to be handled by strtol base 0 and 2 and scanf %i,
whereas simply adding a new format specifier poses no such
compatibility concerns.

Note that the actual conversion from integer to string uses existing
code in _itoa.c.  That code has special cases for bases 8, 10 and 16,
probably so that the compiler can optimize division by an integer
constant in the code for those bases.  If desired such special cases
could easily be added for base 2 as well, but that would be an
optimization, not actually needed for these printf formats to work.

Tested for x86_64 and x86.  Also tested with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline to make sure that the test does
indeed build with GCC 12 (where format checking warnings are enabled
for most of the test).
2021-11-10 15:52:21 +00:00
Siddhesh Poyarekar
a643f60c53 Make sure that the fortified function conditionals are constant
In _FORTIFY_SOURCE=3, the size expression may be non-constant,
resulting in branches in the inline functions remaining intact and
causing a tiny overhead.  Clang (and in future, gcc) make sure that
the -1 case is always safe, i.e. any comparison of the generated
expression with (size_t)-1 is always false so that bit is taken care
of.  The rest is avoidable since we want the _chk variant whenever we
have a size expression and it's not -1.

Rework the conditionals in a uniform way to clearly indicate two
conditions at compile time:

- Either the size is unknown (-1) or we know at compile time that the
  operation length is less than the object size.  We can call the
  original function in this case.  It could be that either the length,
  object size or both are non-constant, but the compiler, through
  range analysis, is able to fold the *comparison* to a constant.

- The size and length are known and the compiler can see at compile
  time that operation length > object size.  This is valid grounds for
  a warning at compile time, followed by emitting the _chk variant.

For everything else, emit the _chk variant.

This simplifies most of the fortified function implementations and at
the same time, ensures that only one call from _chk or the regular
function is emitted.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-20 18:12:41 +05:30
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Martin Sebor
c1760eaf3b Enable support for GCC 11 -Wmismatched-dealloc.
To help detect common kinds of memory (and other resource) management
bugs, GCC 11 adds support for the detection of mismatched calls to
allocation and deallocation functions.  At each call site to a known
deallocation function GCC checks the set of allocation functions
the former can be paired with and, if the two don't match, issues
a -Wmismatched-dealloc warning (something similar happens in C++
for mismatched calls to new and delete).  GCC also uses the same
mechanism to detect attempts to deallocate objects not allocated
by any allocation function (or pointers past the first byte into
allocated objects) by -Wfree-nonheap-object.

This support is enabled for built-in functions like malloc and free.
To extend it beyond those, GCC extends attribute malloc to designate
a deallocation function to which pointers returned from the allocation
function may be passed to deallocate the allocated objects.  Another,
optional argument designates the positional argument to which
the pointer must be passed.

This change is the first step in enabling this extended support for
Glibc.
2021-05-16 15:21:18 -06:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
f9de8bfe1a nonstring: Enable __FORTIFY_LEVEL=3
Use __builtin_dynamic_object_size in the remaining functions that
don't have compiler builtins as is the case for string functions.
2020-12-31 16:55:21 +05:30
Joseph Myers
224b419d1e Make strtoimax, strtoumax, wcstoimax, wcstoumax into aliases
The functions strtoimax, strtoumax, wcstoimax, wcstoumax currently
have three implementations each (wordsize-32, wordsize-64 and dummy
implementation in stdlib/ using #error), defining the functions as
thin wrappers round corresponding *_internal functions.  Simplify the
code by changing them into aliases of functions such as strtol and
wcstoull.  This is more consistent with how e.g. imaxdiv is handled.

Tested for x86_64 and x86.
2020-12-08 18:15:27 +00:00
Carlos O'Donell
61af4bbb2a mbstowcs: Document, test, and fix null pointer dst semantics (Bug 25219)
The function mbstowcs, by an XSI extension to POSIX, accepts a null
pointer for the destination wchar_t array.  This API behaviour allows
you to use the function to compute the length of the required wchar_t
array i.e. does the conversion without storing it and returns the
number of wide characters required.

We remove the __write_only__ markup for the first argument because it
is not true since the destination may be a null pointer, and so the
length argument may not apply.  We remove the markup otherwise the new
test case cannot be compiled with -Werror=nonnull.

We add a new test case for mbstowcs which exercises the destination is
a null pointer behaviour which we have now explicitly documented.

The mbsrtowcs and mbsnrtowcs behave similarly, and mbsrtowcs is
documented as doing this in C11, even if the standard doesn't come out
and call out this specific use case.  We add one note to each of
mbsrtowcs and mbsnrtowcs to call out that they support a null pointer
for the destination.

The wcsrtombs function behaves similarly but in the other way around
and allows you to use a null destination pointer to compute how many
bytes you would need to convert the wide character input.  We document
this particular case also, but leave wcsnrtombs as a references to
wcsrtombs, so the reader must still read the details of the semantics
for wcsrtombs.
2020-06-01 12:26:32 -04:00
Paul E. Murphy
e2239af353 Rename __LONG_DOUBLE_USES_FLOAT128 to __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI
Improve the commentary to aid future developers who will stumble
upon this novel, yet not always perfect, mechanism to support
alternative formats for long double.

Likewise, rename __LONG_DOUBLE_USES_FLOAT128 to
__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI now that development work
has settled down.  The command used was

git grep -l __LONG_DOUBLE_USES_FLOAT128 ':!./ChangeLog*' | \
  xargs sed -i 's/__LONG_DOUBLE_USES_FLOAT128/__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI/g'

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Gabriel F. T. Gomes
e4a3999213 Prepare redirections for IEEE long double on powerpc64le
All functions that have a format string, which can consume a long double
argument, must have one version for each long double format supported on
a platform.  On powerpc64le, these functions currently have two versions
(i.e.: long double with the same format as double, and long double with
IBM Extended Precision format).  Support for a third long double format
option (i.e. long double with IEEE long double format) is being prepared
and all the aforementioned functions now have a third version (not yet
exported on the master branch, but the code is in).

For these functions to get selected (during build time), references to
them in user programs (or dependent libraries) must get redirected to
the aforementioned new versions of the functions.  This patch installs
the header magic required to perform such redirections.

Notice, however, that since the redirections only happen when
__LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including
powerpc64le) currently does it, no redirections actually happen.
Redirections and the exporting of the new functions will happen at the
same time (when powerpc64le adds ldbl-128ibm-compat to their Implies.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
2020-02-17 15:28:29 -06:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Gabriel F. T. Gomes
93486ba583 Use DEPRECATED_SCANF macro for remaining C99-compliant scanf functions
When the commit

commit 03992356e6
Author: Zack Weinberg <zackw@panix.com>
Date:   Sat Feb 10 11:58:35 2018 -0500

    Use C99-compliant scanf under _GNU_SOURCE with modern compilers.

added the DEPRECATED_SCANF macro to select when redirections of *scanf
functions to their ISO C99 compliant versions should happen, it
accidentally missed doing it for vfwscanf, vwscanf, and vswscanf.

Tested for powerpc64le and with build-many-glibcs (i686-linux-gnu and
nios2-linux-gnu are failing with current master, and with this patch,
but I didn't see a regression).

Change-Id: I706b344a3fb50be017cdab9251d9da18a3ba8c60
2019-11-22 15:29:21 -03:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Florian Weimer
0bfddfc944 iconv: Revert steps array reference counting changes
The changes introduce a memory leak for gconv steps arrays whose
first element is an internal conversion, which has a fixed
reference count which is not decremented.  As a result, after the
change in commit 50ce3eae5b, the steps
array is never freed, resulting in an unbounded memory leak.

This reverts commit 50ce3eae5b
("gconv: Check reference count in __gconv_release_cache
[BZ #24677]") and commit 7e740ab2e7
("libio: Fix gconv-related memory leak [BZ #24583]").  It
reintroduces bug 24583.  (Bug 24677 was just a regression caused by
the second commit.)
2019-07-31 11:43:59 +02:00
Florian Weimer
c9c15ac316 wcsmbs: Fix data race in __wcsmbs_clone_conv [BZ #24584]
This also adds an overflow check and documents the synchronization
requirement in <gconv.h>.
2019-05-21 12:04:55 +02:00
Florian Weimer
7e740ab2e7 libio: Fix gconv-related memory leak [BZ #24583]
struct gconv_fcts for the C locale is statically allocated,
and __gconv_close_transform deallocates the steps object.
Therefore this commit introduces __wcsmbs_close_conv to avoid
freeing the statically allocated steps objects.
2019-05-21 12:03:54 +02:00
Adhemerval Zanella
662c2cc4e9 wcsmbs: Use loop_unroll on wcsrchr
This allows an architecture to set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* wcsmbs/wcsrchr.c (WCSRCHR): Use loop_unroll.h to parametrize
	the loop unroll.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
7ba0100c6a wcsmbs: Use loop_unroll on wcschr
This allows an architecture to set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* wcsmbs/wcschr.c (WCSCHR): Use loop_unroll.h to parametrize
	the loop unroll.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
e3fd0b0e93 wcsmbs: Add wcscpy loop unroll option
This allows an architecture to use the old generic implementation
and also set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* include/loop_unroll.h: New file.
	* wcsmbs/wcscpy (__wcscpy): Add option to use loop unrolling
	besides generic implementation.
2019-04-04 16:01:10 +07:00
Adhemerval Zanella
457208b1e9 wcsmbs: optimize wcsnlen
This patch rewrites wcsnlen using wmemchr.  The generic wmemchr
already uses the strategy (loop unrolling and tail handling) and
by using it it allows architectures that have optimized wmemchr
(s390 and x86_64) to optimize wcsnlen as well.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsnlen.c (__wcsnlen): Rewrite using wmemchr.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
30a7e2081c wcsmbs: optimize wcsncpy
This patch rewrites wcsncpy using wcsnlen, wmemset, and wmemcpy.  This is
similar to the optimization done on strncpy by f6482cf29d and 6423d4754c.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsncpy.c (__wcsncpy): Rewrite using wcsnlen, wmemset, and
	wmemcpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
ddf21ec79f wcsmbs: optimize wcsncat
This patch rewrites wcsncat using wcslen, wcsnlen, and wmemcpy.  This is
similar to the optimization done on strncat by 3eb38795db and e80514b5a8.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsncat.c (wcsncat): Rewrite using wcslen, wcsnlen, and
	wmemcpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
4d8015639a wcsmbs: optimize wcscpy
This patch rewrites wcscpy using wcslen and wmemcpy.  This is similar
to the optimization done on strcpy by b863d2bc4d.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcscpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
2019-02-27 10:00:42 -03:00
Adhemerval Zanella
81a1443941 wcsmbs: optimize wcscat
This patch rewrites wcscat using wcslen and wcscpy.  This is similar to
the optimization done on strcat by 6e46de42fe.

The strcpy changes are mainly to add the internal alias to avoid PLT
calls.

Checked on x86_64-linux-gnu and a build against the affected
architectures.

	* include/wchar.h (__wcscpy): New prototype.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
	(__wcscpy): Route internal symbol to generic implementation.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
	Add internal __wcscpy alias.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
	* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
	* wcsmbs/wcscpy.c (wcscpy): Add
	* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
	use generic implementation.
	* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
39ef074419 wcsmbs: optimize wcpncpy
This patch rewrites wcpncpy using wcslen, wmemcpy, and wmemset.  This is
similar to the optimization done on stpncpy by 48497aba8e.

Checked on x86_64-linux-gnu.

        * wcsmbs/wcpncpy.c (__wcpcpy): Rewrite using wcslen, wmemcpy, and
	wmemset.
2019-02-27 10:00:38 -03:00
Adhemerval Zanella
7b3fb62051 wcsmbs: optimize wcpcpy
This patch rewrites wcpcpy using wcslen and wmemcpy.  This is
similar to the optimizatio done on stpcpy by f559d8cf29.

Checked on x86_64-linux-gnu and string tests on a simulated
m68k-linux-gnu.

	* sysdeps/m68k/wcpcpy.c: Remove file.
	* wcsmbs/wcpcpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
2019-02-27 10:00:34 -03:00
Andreas Schwab
65f7767a91 Fix handling of collating elements in fnmatch (bug 17396, bug 16976)
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for
regexp matching.  As a side effect it also removes the use of an unbound
VLA.
2019-02-04 15:45:02 +01:00
Zack Weinberg
03992356e6
Use C99-compliant scanf under _GNU_SOURCE with modern compilers.
The only difference between noncompliant and C99-compliant scanf is
that the former accepts the archaic GNU extension '%as' (also %aS and
%a[...]) meaning to allocate space for the input string with malloc.
This extension conflicts with C99's use of %a as a format _type_
meaning to read a floating-point number; POSIX.1-2008 standardized
equivalent functionality using the modifier letter 'm' instead (%ms,
%mS, %m[...]).

The extension was already disabled in most conformance modes:
specifically, any mode that doesn't involve _GNU_SOURCE and _does_
involve either strict conformance to C99 or loose conformance to both
C99 and POSIX.1-2001 would get the C99-compliant scanf.  With
compilers new enough to use -std=gnu11 instead of -std=gnu89, or
equivalent, that includes the default mode.

With this patch, we now provide C99-compliant scanf in all
configurations except when _GNU_SOURCE is defined *and*
__STDC_VERSION__ or __cplusplus (whichever is relevant) indicates
C89/C++98.  This leaves the old scanf available under e.g. -std=c89
-D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it
was already not present under -std=gnu11 without -D_GNU_SOURCE) and
from -std=gnu89 without -D_GNU_SOURCE.

There needs to be an internal override so we can compile the
noncompliant scanf itself.  This is the same problem we had when we
removed 'gets' from _GNU_SOURCE and it's dealt with the same way:
there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to
off under the appropriate conditions for external code, but can be
overridden by individual files within stdio.

We also run into problems with PLT bypass for internal uses of sscanf,
because libc_hidden_proto uses __REDIRECT and so does the logic in
stdio.h for choosing which implementation of scanf to use; __REDIRECT
isn't transitive, so include/stdio.h needs to bridge the gap with a
macro.  As far as I can tell, sscanf is the only function in this
family that's internally called by unrelated code.

Finally, there are several tests in stdio-common that use the
extension.  bug21.c is a regression test for a crash; it still
exercises the relevant code when changed to use %ms instead of %as.
scanf14.c through scanf17.c are more complicated since they are
actually testing the subtleties of the extension - under what
circumstances is 'a' treated as a modifier letter, etc.  I changed all
of them to use %ms instead of %as as well, but duplicated scanf14.c
and scanf16.c as scanf14a.c and scanf16a.c.  These still use %as and
are compiled with -std=gnu89 to access the old extension.  A bunch of
diagnostic overrides and manual workarounds for the old stdio.h
behavior become unnecessary.  Yay!

	* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE
	parameter.  Only use deprecated scanf when __USE_GNU is defined
	and __STDC_VERSION__ is less than 199901L or __cplusplus is less
	than 201103L, whichever is relevant for the language being compiled.

	* libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect
	scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their
	__isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF).
	* wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for
	wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf.

	* libio/iovsscanf.c
	* libio/fwscanf.c
	* libio/iovswscanf.c
	* libio/swscanf.c
	* libio/vscanf.c
	* libio/vwscanf.c
	* libio/wscanf.c
	* stdio-common/fscanf.c
	* stdio-common/scanf.c
	* stdio-common/vfscanf.c
	* stdio-common/vfwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
	* sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-scanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c:
	Override __GLIBC_USE_DEPRECATED_SCANF to 1.

	* stdio-common/sscanf.c: Likewise.  Remove ldbl_hidden_def for __sscanf.
	* stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf.
	* include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf,
	not sscanf.
	[!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_scanf
	with a preprocessor macro.

	* stdio-common/bug21.c, stdio-common/scanf14.c:
	Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[];
	remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
	* stdio-common/scanf16.c: Likewise.  Add __attribute__ ((format (scanf)))
	to xscanf, xfscanf, xsscanf.

	* stdio-common/scanf14a.c: New copy of scanf14.c which still uses
	%as, %aS, %a[].  Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
	* stdio-common/scanf16a.c: New copy of scanf16.c which still uses
	%as, %aS, %a[].  Add __attribute__ ((format (scanf))) to xscanf,
	xfscanf, xsscanf.
	* stdio-common/scanf15.c, stdio-common/scanf17.c: No need to
	override feature selection macros or provide definitions of u_char etc.
	* stdio-common/Makefile (tests): Add scanf14a and scanf16a.
	(CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove.
	(CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New.  Compile these files
	with -std=gnu89.
2019-01-03 11:12:39 -05:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Zack Weinberg
b87eb3f8fe Use SCANF_ISOC99_A instead of _IO_FLAGS2_SCANF_STD.
Change the callers of __vfscanf_internal and __vfwscanf_internal that
want C99-compliant behavior to communicate this via the new flags
argument, rather than setting bits on the FILE object.  This also
means these functions do not need to do their own locking.

Tested for powerpc and powerpc64le.
2018-12-05 18:15:42 -02:00
Zack Weinberg
349718d4d7 Add __vfscanf_internal and __vfwscanf_internal with flags arguments.
There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode
used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used
by __isoc99_ scanf variants.  In this patch, the new functions honor
these flag bits if they're set, but they still also look at the
corresponding bits of environmental state, and callers all pass zero.

The new functions do *not* have the "errp" argument possessed by
_IO_vfscanf and _IO_vfwscanf.  All internal callers passed NULL for
that argument.  External callers could theoretically exist, so I
preserved wrappers, but they are flagged as compat symbols and they
don't preserve the three-way distinction among types of errors that
was formerly exposed.  These functions probably should have been in
the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just
aliases for vfscanf and vfwscanf.

(It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf.
Please check that part of the patch very carefully, I am still not
confident I understand all of the details of ldbl-opt.)

This patch also introduces helper inlines in libio/strfile.h that
encapsulate the process of initializing an _IO_strfile object for
reading.  This allows us to call __vfscanf_internal directly from
sscanf, and __vfwscanf_internal directly from swscanf, without
duplicating the initialization code.  (Previously, they called their
v-counterparts, but that won't work if we want to control *both* C99
mode and ldbl-is-dbl mode using the flags argument to__vfscanf_internal.)
It's still a little awkward, especially for wide strfiles, but it's
much better than what we had.

Tested for powerpc and powerpc64le.
2018-12-05 18:15:42 -02:00