Commit Graph

414 Commits

Author SHA1 Message Date
Andreas Schwab
6e642a47fa Fix name space violation in fortify wrappers (bug 32052)
Rename the identifier sz to __sz everywhere.

Fixes: a643f60c53 ("Make sure that the fortified function conditionals are constant")
(cherry picked from commit 39ca997ab3)
(redone from scratch because of many conflicts)
2024-08-06 08:46:39 +02:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Adhemerval Zanella
8d98c7c00f configure: Use -Wno-ignored-attributes if compiler warns about multiple aliases
clang emits an warning when a double alias redirection is used, to warn
the the original symbol will be used even when weak definition is
overridden.  However, this is a common pattern for weak_alias, where
multiple alias are set to same symbol.

Reviewed-by: Fangrui Song <maskray@google.com>
2022-11-01 09:51:06 -03:00
Florian Weimer
58548b9d68 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources
In the future, this will result in a compilation failure if the
macros are unexpectedly undefined (due to header inclusion ordering
or header inclusion missing altogether).

Assembler sources are more difficult to convert.  In many cases,
they are hand-optimized for the mangling and no-mangling variants,
which is why they are not converted.

sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c
are special: These are C sources, but most of the implementation is
in assembler, so the PTR_DEMANGLE macro has to be undefined in some
cases, to match the assembler style.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-18 17:04:10 +02:00
Florian Weimer
88f4b6929c Introduce <pointer_guard.h>, extracted from <sysdep.h>
This allows us to define a generic no-op version of PTR_MANGLE and
PTR_DEMANGLE.  In the future, we can use PTR_MANGLE and PTR_DEMANGLE
unconditionally in C sources, avoiding an unintended loss of hardening
due to missing include files or unlucky header inclusion ordering.

In i386 and x86_64, we can avoid a <tls.h> dependency in the C
code by using the computed constant from <tcb-offsets.h>.  <sysdep.h>
no longer includes these definitions, so there is no cyclic dependency
anymore when computing the <tcb-offsets.h> constants.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-18 17:03:55 +02:00
Adhemerval Zanella Netto
de477abcaa Use '%z' instead of '%Z' on printf functions
The Z modifier is a nonstandard synonymn for z (that predates z
itself) and compiler might issue an warning for in invalid
conversion specifier.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-22 08:48:04 -03:00
Raphael Moreira Zinsly
c7509d49c4 Apply asm redirections in wchar.h before first use
Similar to d0fa09a770, but for wchar.h.  Fixes [BZ #27087] by applying
all long double related asm redirections before using functions in
bits/wchar2.h.
Moves the function declarations from wcsmbs/bits/wchar2.h to a new file
wcsmbs/bits/wchar2-decl.h that will be included first in wcsmbs/wchar.h.

Tested with build-many-glibcs.py.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-08-30 12:50:16 -03:00
H.J. Lu
e03f5ccd6c wcsmbs: Add missing test-c8rtomb/test-mbrtoc8 dependency
Make test-c8rtomb.out and test-mbrtoc8.out depend on $(gen-locales) for

  xsetlocale (LC_ALL, "de_DE.UTF-8");
  xsetlocale (LC_ALL, "zh_HK.BIG5-HKSCS");

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-08-01 09:48:27 -03:00
Tom Honermann
825f84f133 stdlib: Suppress gcc diagnostic that char8_t is a keyword in C++20 in uchar.h.
gcc 13 issues the following diagnostic for the uchar.h header when the
-Wc++20-compat option is enabled in C++ modes that do not enable char8_t
as a builtin type (C++17 and earlier by default; subject to _GNU_SOURCE
and the gcc -f[no-]char8_t option).
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
This change modifies the uchar.h header to suppress the diagnostic through
the use of '#pragma GCC diagnostic' directives for gcc 10 and later (the
-Wc++20-compat option was added in gcc version 10).  Unfortunately, a bug
in gcc currently prevents those directives from having the intended effect
as reported at https://gcc.gnu.org/PR106423.  A patch for that issue has
been submitted and is available in the email thread archive linked below.
  https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598736.html
2022-08-01 09:39:07 -03:00
Tom Honermann
f4fe72a4f7 stdlib: Tests for mbrtoc8, c8rtomb, and the char8_t typedef.
This change adds tests for the mbrtoc8 and c8rtomb functions adopted for
C++20 via WG21 P0482R6 and for C2X via WG14 N2653, and for the char8_t
typedef adopted for C2X from WG14 N2653.

The tests for mbrtoc8 and c8rtomb specifically exercise conversion to
and from Big5-HKSCS because of special cases that arise with that encoding.
Big5-HKSCS defines some double byte sequences that convert to more than
one Unicode code point.  In order to test this, the locale dependencies
for running tests under wcsmbs is expanded to include zh_HK.BIG5-HKSCS.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-07-06 09:29:45 -03:00
Tom Honermann
8bcca1db3d stdlib: Implement mbrtoc8, c8rtomb, and the char8_t typedef.
This change provides implementations for the mbrtoc8 and c8rtomb
functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14
N2653.  It also provides the char8_t typedef from WG14 N2653.

The mbrtoc8 and c8rtomb functions are declared in uchar.h in C2X
mode or when the _GNU_SOURCE macro or C++20 __cpp_char8_t feature
test macro is defined.

The char8_t typedef is declared in uchar.h in C2X mode or when the
_GNU_SOURCE macro is defined and the C++20 __cpp_char8_t feature
test macro is not defined (if __cpp_char8_t is defined, then char8_t
is a builtin type).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-07-06 09:29:42 -03:00
Florian Weimer
93ec1cf0fe locale: Add more cached data to LC_CTYPE
This data will be used in number formatting.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Florian Weimer
7ee41feba6 locale: Remove private union from struct __locale_data
This avoids an alias violation later.  This commit also fixes
an incorrect double-checked locking idiom in _nl_init_era_entries.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Florian Weimer
bbebe83a28 locale: Remove cleanup function pointer from struct __localedata
We can call the cleanup functions directly from _nl_unload_locale
if we pass the category to it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-05-23 11:06:31 +02:00
Siddhesh Poyarekar
9bcd12d223 wcrtomb: Make behavior POSIX compliant
The GNU implementation of wcrtomb assumes that there are at least
MB_CUR_MAX bytes available in the destination buffer passed to wcrtomb
as the first argument.  This is not compatible with the POSIX
definition, which only requires enough space for the input wide
character.

This does not break much in practice because when users supply buffers
smaller than MB_CUR_MAX (e.g. in ncurses), they compute and dynamically
allocate the buffer, which results in enough spare space (thanks to
usable_size in malloc and padding in alloca) that no actual buffer
overflow occurs.  However when the code is built with _FORTIFY_SOURCE,
it runs into the hard check against MB_CUR_MAX in __wcrtomb_chk and
hence fails.  It wasn't evident until now since dynamic allocations
would result in wcrtomb not being fortified but since _FORTIFY_SOURCE=3,
that limitation is gone, resulting in such code failing.

To fix this problem, introduce an internal buffer that is MB_LEN_MAX
long and use that to perform the conversion and then copy the resultant
bytes into the destination buffer.  Also move the fortification check
into the main implementation, which checks the result after conversion
and aborts if the resultant byte count is greater than the destination
buffer size.

One complication is that applications that assume the MB_CUR_MAX
limitation to be gone may not be able to run safely on older glibcs if
they use static destination buffers smaller than MB_CUR_MAX; dynamic
allocations will always have enough spare space that no actual overruns
will occur.  One alternative to fixing this is to bump symbol version to
prevent them from running on older glibcs but that seems too strict a
constraint.  Instead, since these users will only have made this
decision on reading the manual, I have put a note in the manual warning
them about the pitfalls of having static buffers smaller than
MB_CUR_MAX and running them on older glibc.

Benchmarking:

The wcrtomb microbenchmark shows significant increases in maximum
execution time for all locales, ranging from 10x for ar_SA.UTF-8 to
1.5x-2x for nearly everything else.  The mean execution time however saw
practically no impact, with some results even being quicker, indicating
that cache locality has a much bigger role in the overhead.

Given that the additional copy uses a temporary buffer inside wcrtomb,
it's likely that a hot path will end up putting that buffer (which is
responsible for the additional overhead) in a similar place on stack,
giving the necessary cache locality to negate the overhead.  However in
situations where wcrtomb ends up getting called at wildly different
spots on the call stack (or is on different call stacks, e.g. with
threads or different execution contexts) and is still a hotspot, the
performance lag will be visible.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-13 19:15:46 +05:30
Siddhesh Poyarekar
fcfc908681 debug: Synchronize feature guards in fortified functions [BZ #28746]
Some functions (e.g. stpcpy, pread64, etc.) had moved to POSIX in the
main headers as they got incorporated into the standard, but their
fortified variants remained under __USE_GNU.  As a result, these
functions did not get fortified when _GNU_SOURCE was not defined.

Add test wrappers that check all functions tested in tst-chk0 at all
levels with _GNU_SOURCE undefined and then use the failures to (1)
exclude checks for _GNU_SOURCE functions in these tests and (2) Fix
feature macro guards in the fortified function headers so that they're
the same as the ones in the main headers.

This fixes BZ #28746.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-01-12 23:34:48 +05:30
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Joseph Myers
309548bec3 Support C2X printf %b, %B
C2X adds a printf %b format (see
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted
for C2X), for outputting integers in binary.  It also has recommended
practice for a corresponding %B format (like %b, but %#B starts the
output with 0B instead of 0b).  Add support for these formats to
glibc.

One existing test uses %b as an example of an unknown format, to test
how glibc printf handles unknown formats; change that to %v.  Use of
%b and %B as user-registered format specifiers continues to work (and
we already have a test that covers that, tst-printfsz.c).

Note that C2X also has scanf %b support, plus support for binary
constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol
base 0 and scanf %i coming from a previous paper that added binary
integer literals).  I intend to implement those features in a separate
patch or patches; as discussed in the thread starting at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
they will be more complicated because they involve adding extra public
symbols to ensure compatibility with existing code that might not
expect 0b constants to be handled by strtol base 0 and 2 and scanf %i,
whereas simply adding a new format specifier poses no such
compatibility concerns.

Note that the actual conversion from integer to string uses existing
code in _itoa.c.  That code has special cases for bases 8, 10 and 16,
probably so that the compiler can optimize division by an integer
constant in the code for those bases.  If desired such special cases
could easily be added for base 2 as well, but that would be an
optimization, not actually needed for these printf formats to work.

Tested for x86_64 and x86.  Also tested with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline to make sure that the test does
indeed build with GCC 12 (where format checking warnings are enabled
for most of the test).
2021-11-10 15:52:21 +00:00
Siddhesh Poyarekar
a643f60c53 Make sure that the fortified function conditionals are constant
In _FORTIFY_SOURCE=3, the size expression may be non-constant,
resulting in branches in the inline functions remaining intact and
causing a tiny overhead.  Clang (and in future, gcc) make sure that
the -1 case is always safe, i.e. any comparison of the generated
expression with (size_t)-1 is always false so that bit is taken care
of.  The rest is avoidable since we want the _chk variant whenever we
have a size expression and it's not -1.

Rework the conditionals in a uniform way to clearly indicate two
conditions at compile time:

- Either the size is unknown (-1) or we know at compile time that the
  operation length is less than the object size.  We can call the
  original function in this case.  It could be that either the length,
  object size or both are non-constant, but the compiler, through
  range analysis, is able to fold the *comparison* to a constant.

- The size and length are known and the compiler can see at compile
  time that operation length > object size.  This is valid grounds for
  a warning at compile time, followed by emitting the _chk variant.

For everything else, emit the _chk variant.

This simplifies most of the fortified function implementations and at
the same time, ensures that only one call from _chk or the regular
function is emitted.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-20 18:12:41 +05:30
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Martin Sebor
c1760eaf3b Enable support for GCC 11 -Wmismatched-dealloc.
To help detect common kinds of memory (and other resource) management
bugs, GCC 11 adds support for the detection of mismatched calls to
allocation and deallocation functions.  At each call site to a known
deallocation function GCC checks the set of allocation functions
the former can be paired with and, if the two don't match, issues
a -Wmismatched-dealloc warning (something similar happens in C++
for mismatched calls to new and delete).  GCC also uses the same
mechanism to detect attempts to deallocate objects not allocated
by any allocation function (or pointers past the first byte into
allocated objects) by -Wfree-nonheap-object.

This support is enabled for built-in functions like malloc and free.
To extend it beyond those, GCC extends attribute malloc to designate
a deallocation function to which pointers returned from the allocation
function may be passed to deallocate the allocated objects.  Another,
optional argument designates the positional argument to which
the pointer must be passed.

This change is the first step in enabling this extended support for
Glibc.
2021-05-16 15:21:18 -06:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
f9de8bfe1a nonstring: Enable __FORTIFY_LEVEL=3
Use __builtin_dynamic_object_size in the remaining functions that
don't have compiler builtins as is the case for string functions.
2020-12-31 16:55:21 +05:30
Joseph Myers
224b419d1e Make strtoimax, strtoumax, wcstoimax, wcstoumax into aliases
The functions strtoimax, strtoumax, wcstoimax, wcstoumax currently
have three implementations each (wordsize-32, wordsize-64 and dummy
implementation in stdlib/ using #error), defining the functions as
thin wrappers round corresponding *_internal functions.  Simplify the
code by changing them into aliases of functions such as strtol and
wcstoull.  This is more consistent with how e.g. imaxdiv is handled.

Tested for x86_64 and x86.
2020-12-08 18:15:27 +00:00
Carlos O'Donell
61af4bbb2a mbstowcs: Document, test, and fix null pointer dst semantics (Bug 25219)
The function mbstowcs, by an XSI extension to POSIX, accepts a null
pointer for the destination wchar_t array.  This API behaviour allows
you to use the function to compute the length of the required wchar_t
array i.e. does the conversion without storing it and returns the
number of wide characters required.

We remove the __write_only__ markup for the first argument because it
is not true since the destination may be a null pointer, and so the
length argument may not apply.  We remove the markup otherwise the new
test case cannot be compiled with -Werror=nonnull.

We add a new test case for mbstowcs which exercises the destination is
a null pointer behaviour which we have now explicitly documented.

The mbsrtowcs and mbsnrtowcs behave similarly, and mbsrtowcs is
documented as doing this in C11, even if the standard doesn't come out
and call out this specific use case.  We add one note to each of
mbsrtowcs and mbsnrtowcs to call out that they support a null pointer
for the destination.

The wcsrtombs function behaves similarly but in the other way around
and allows you to use a null destination pointer to compute how many
bytes you would need to convert the wide character input.  We document
this particular case also, but leave wcsnrtombs as a references to
wcsrtombs, so the reader must still read the details of the semantics
for wcsrtombs.
2020-06-01 12:26:32 -04:00
Paul E. Murphy
e2239af353 Rename __LONG_DOUBLE_USES_FLOAT128 to __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI
Improve the commentary to aid future developers who will stumble
upon this novel, yet not always perfect, mechanism to support
alternative formats for long double.

Likewise, rename __LONG_DOUBLE_USES_FLOAT128 to
__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI now that development work
has settled down.  The command used was

git grep -l __LONG_DOUBLE_USES_FLOAT128 ':!./ChangeLog*' | \
  xargs sed -i 's/__LONG_DOUBLE_USES_FLOAT128/__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI/g'

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Gabriel F. T. Gomes
e4a3999213 Prepare redirections for IEEE long double on powerpc64le
All functions that have a format string, which can consume a long double
argument, must have one version for each long double format supported on
a platform.  On powerpc64le, these functions currently have two versions
(i.e.: long double with the same format as double, and long double with
IBM Extended Precision format).  Support for a third long double format
option (i.e. long double with IEEE long double format) is being prepared
and all the aforementioned functions now have a third version (not yet
exported on the master branch, but the code is in).

For these functions to get selected (during build time), references to
them in user programs (or dependent libraries) must get redirected to
the aforementioned new versions of the functions.  This patch installs
the header magic required to perform such redirections.

Notice, however, that since the redirections only happen when
__LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including
powerpc64le) currently does it, no redirections actually happen.
Redirections and the exporting of the new functions will happen at the
same time (when powerpc64le adds ldbl-128ibm-compat to their Implies.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
2020-02-17 15:28:29 -06:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Gabriel F. T. Gomes
93486ba583 Use DEPRECATED_SCANF macro for remaining C99-compliant scanf functions
When the commit

commit 03992356e6
Author: Zack Weinberg <zackw@panix.com>
Date:   Sat Feb 10 11:58:35 2018 -0500

    Use C99-compliant scanf under _GNU_SOURCE with modern compilers.

added the DEPRECATED_SCANF macro to select when redirections of *scanf
functions to their ISO C99 compliant versions should happen, it
accidentally missed doing it for vfwscanf, vwscanf, and vswscanf.

Tested for powerpc64le and with build-many-glibcs (i686-linux-gnu and
nios2-linux-gnu are failing with current master, and with this patch,
but I didn't see a regression).

Change-Id: I706b344a3fb50be017cdab9251d9da18a3ba8c60
2019-11-22 15:29:21 -03:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Florian Weimer
0bfddfc944 iconv: Revert steps array reference counting changes
The changes introduce a memory leak for gconv steps arrays whose
first element is an internal conversion, which has a fixed
reference count which is not decremented.  As a result, after the
change in commit 50ce3eae5b, the steps
array is never freed, resulting in an unbounded memory leak.

This reverts commit 50ce3eae5b
("gconv: Check reference count in __gconv_release_cache
[BZ #24677]") and commit 7e740ab2e7
("libio: Fix gconv-related memory leak [BZ #24583]").  It
reintroduces bug 24583.  (Bug 24677 was just a regression caused by
the second commit.)
2019-07-31 11:43:59 +02:00
Florian Weimer
c9c15ac316 wcsmbs: Fix data race in __wcsmbs_clone_conv [BZ #24584]
This also adds an overflow check and documents the synchronization
requirement in <gconv.h>.
2019-05-21 12:04:55 +02:00
Florian Weimer
7e740ab2e7 libio: Fix gconv-related memory leak [BZ #24583]
struct gconv_fcts for the C locale is statically allocated,
and __gconv_close_transform deallocates the steps object.
Therefore this commit introduces __wcsmbs_close_conv to avoid
freeing the statically allocated steps objects.
2019-05-21 12:03:54 +02:00
Adhemerval Zanella
662c2cc4e9 wcsmbs: Use loop_unroll on wcsrchr
This allows an architecture to set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* wcsmbs/wcsrchr.c (WCSRCHR): Use loop_unroll.h to parametrize
	the loop unroll.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
7ba0100c6a wcsmbs: Use loop_unroll on wcschr
This allows an architecture to set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* wcsmbs/wcschr.c (WCSCHR): Use loop_unroll.h to parametrize
	the loop unroll.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
e3fd0b0e93 wcsmbs: Add wcscpy loop unroll option
This allows an architecture to use the old generic implementation
and also set explicit loop unrolling.

Checked on aarch64-linux-gnu.

	* include/loop_unroll.h: New file.
	* wcsmbs/wcscpy (__wcscpy): Add option to use loop unrolling
	besides generic implementation.
2019-04-04 16:01:10 +07:00
Adhemerval Zanella
457208b1e9 wcsmbs: optimize wcsnlen
This patch rewrites wcsnlen using wmemchr.  The generic wmemchr
already uses the strategy (loop unrolling and tail handling) and
by using it it allows architectures that have optimized wmemchr
(s390 and x86_64) to optimize wcsnlen as well.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsnlen.c (__wcsnlen): Rewrite using wmemchr.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
30a7e2081c wcsmbs: optimize wcsncpy
This patch rewrites wcsncpy using wcsnlen, wmemset, and wmemcpy.  This is
similar to the optimization done on strncpy by f6482cf29d and 6423d4754c.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsncpy.c (__wcsncpy): Rewrite using wcsnlen, wmemset, and
	wmemcpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
ddf21ec79f wcsmbs: optimize wcsncat
This patch rewrites wcsncat using wcslen, wcsnlen, and wmemcpy.  This is
similar to the optimization done on strncat by 3eb38795db and e80514b5a8.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcsncat.c (wcsncat): Rewrite using wcslen, wcsnlen, and
	wmemcpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
4d8015639a wcsmbs: optimize wcscpy
This patch rewrites wcscpy using wcslen and wmemcpy.  This is similar
to the optimization done on strcpy by b863d2bc4d.

Checked on x86_64-linux-gnu.

	* wcsmbs/wcscpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
2019-02-27 10:00:42 -03:00
Adhemerval Zanella
81a1443941 wcsmbs: optimize wcscat
This patch rewrites wcscat using wcslen and wcscpy.  This is similar to
the optimization done on strcat by 6e46de42fe.

The strcpy changes are mainly to add the internal alias to avoid PLT
calls.

Checked on x86_64-linux-gnu and a build against the affected
architectures.

	* include/wchar.h (__wcscpy): New prototype.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
	(__wcscpy): Route internal symbol to generic implementation.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
	Add internal __wcscpy alias.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
	* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
	* wcsmbs/wcscpy.c (wcscpy): Add
	* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
	use generic implementation.
	* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
2019-02-27 10:00:37 -03:00
Adhemerval Zanella
39ef074419 wcsmbs: optimize wcpncpy
This patch rewrites wcpncpy using wcslen, wmemcpy, and wmemset.  This is
similar to the optimization done on stpncpy by 48497aba8e.

Checked on x86_64-linux-gnu.

        * wcsmbs/wcpncpy.c (__wcpcpy): Rewrite using wcslen, wmemcpy, and
	wmemset.
2019-02-27 10:00:38 -03:00
Adhemerval Zanella
7b3fb62051 wcsmbs: optimize wcpcpy
This patch rewrites wcpcpy using wcslen and wmemcpy.  This is
similar to the optimizatio done on stpcpy by f559d8cf29.

Checked on x86_64-linux-gnu and string tests on a simulated
m68k-linux-gnu.

	* sysdeps/m68k/wcpcpy.c: Remove file.
	* wcsmbs/wcpcpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
2019-02-27 10:00:34 -03:00
Andreas Schwab
65f7767a91 Fix handling of collating elements in fnmatch (bug 17396, bug 16976)
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for
regexp matching.  As a side effect it also removes the use of an unbound
VLA.
2019-02-04 15:45:02 +01:00
Zack Weinberg
03992356e6
Use C99-compliant scanf under _GNU_SOURCE with modern compilers.
The only difference between noncompliant and C99-compliant scanf is
that the former accepts the archaic GNU extension '%as' (also %aS and
%a[...]) meaning to allocate space for the input string with malloc.
This extension conflicts with C99's use of %a as a format _type_
meaning to read a floating-point number; POSIX.1-2008 standardized
equivalent functionality using the modifier letter 'm' instead (%ms,
%mS, %m[...]).

The extension was already disabled in most conformance modes:
specifically, any mode that doesn't involve _GNU_SOURCE and _does_
involve either strict conformance to C99 or loose conformance to both
C99 and POSIX.1-2001 would get the C99-compliant scanf.  With
compilers new enough to use -std=gnu11 instead of -std=gnu89, or
equivalent, that includes the default mode.

With this patch, we now provide C99-compliant scanf in all
configurations except when _GNU_SOURCE is defined *and*
__STDC_VERSION__ or __cplusplus (whichever is relevant) indicates
C89/C++98.  This leaves the old scanf available under e.g. -std=c89
-D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it
was already not present under -std=gnu11 without -D_GNU_SOURCE) and
from -std=gnu89 without -D_GNU_SOURCE.

There needs to be an internal override so we can compile the
noncompliant scanf itself.  This is the same problem we had when we
removed 'gets' from _GNU_SOURCE and it's dealt with the same way:
there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to
off under the appropriate conditions for external code, but can be
overridden by individual files within stdio.

We also run into problems with PLT bypass for internal uses of sscanf,
because libc_hidden_proto uses __REDIRECT and so does the logic in
stdio.h for choosing which implementation of scanf to use; __REDIRECT
isn't transitive, so include/stdio.h needs to bridge the gap with a
macro.  As far as I can tell, sscanf is the only function in this
family that's internally called by unrelated code.

Finally, there are several tests in stdio-common that use the
extension.  bug21.c is a regression test for a crash; it still
exercises the relevant code when changed to use %ms instead of %as.
scanf14.c through scanf17.c are more complicated since they are
actually testing the subtleties of the extension - under what
circumstances is 'a' treated as a modifier letter, etc.  I changed all
of them to use %ms instead of %as as well, but duplicated scanf14.c
and scanf16.c as scanf14a.c and scanf16a.c.  These still use %as and
are compiled with -std=gnu89 to access the old extension.  A bunch of
diagnostic overrides and manual workarounds for the old stdio.h
behavior become unnecessary.  Yay!

	* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE
	parameter.  Only use deprecated scanf when __USE_GNU is defined
	and __STDC_VERSION__ is less than 199901L or __cplusplus is less
	than 201103L, whichever is relevant for the language being compiled.

	* libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect
	scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their
	__isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF).
	* wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for
	wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf.

	* libio/iovsscanf.c
	* libio/fwscanf.c
	* libio/iovswscanf.c
	* libio/swscanf.c
	* libio/vscanf.c
	* libio/vwscanf.c
	* libio/wscanf.c
	* stdio-common/fscanf.c
	* stdio-common/scanf.c
	* stdio-common/vfscanf.c
	* stdio-common/vfwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
	* sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-scanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c
	* sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c:
	Override __GLIBC_USE_DEPRECATED_SCANF to 1.

	* stdio-common/sscanf.c: Likewise.  Remove ldbl_hidden_def for __sscanf.
	* stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf.
	* include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf,
	not sscanf.
	[!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_scanf
	with a preprocessor macro.

	* stdio-common/bug21.c, stdio-common/scanf14.c:
	Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[];
	remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
	* stdio-common/scanf16.c: Likewise.  Add __attribute__ ((format (scanf)))
	to xscanf, xfscanf, xsscanf.

	* stdio-common/scanf14a.c: New copy of scanf14.c which still uses
	%as, %aS, %a[].  Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat.
	* stdio-common/scanf16a.c: New copy of scanf16.c which still uses
	%as, %aS, %a[].  Add __attribute__ ((format (scanf))) to xscanf,
	xfscanf, xsscanf.
	* stdio-common/scanf15.c, stdio-common/scanf17.c: No need to
	override feature selection macros or provide definitions of u_char etc.
	* stdio-common/Makefile (tests): Add scanf14a and scanf16a.
	(CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove.
	(CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New.  Compile these files
	with -std=gnu89.
2019-01-03 11:12:39 -05:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Zack Weinberg
b87eb3f8fe Use SCANF_ISOC99_A instead of _IO_FLAGS2_SCANF_STD.
Change the callers of __vfscanf_internal and __vfwscanf_internal that
want C99-compliant behavior to communicate this via the new flags
argument, rather than setting bits on the FILE object.  This also
means these functions do not need to do their own locking.

Tested for powerpc and powerpc64le.
2018-12-05 18:15:42 -02:00
Zack Weinberg
349718d4d7 Add __vfscanf_internal and __vfwscanf_internal with flags arguments.
There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode
used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used
by __isoc99_ scanf variants.  In this patch, the new functions honor
these flag bits if they're set, but they still also look at the
corresponding bits of environmental state, and callers all pass zero.

The new functions do *not* have the "errp" argument possessed by
_IO_vfscanf and _IO_vfwscanf.  All internal callers passed NULL for
that argument.  External callers could theoretically exist, so I
preserved wrappers, but they are flagged as compat symbols and they
don't preserve the three-way distinction among types of errors that
was formerly exposed.  These functions probably should have been in
the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just
aliases for vfscanf and vfwscanf.

(It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf.
Please check that part of the patch very carefully, I am still not
confident I understand all of the details of ldbl-opt.)

This patch also introduces helper inlines in libio/strfile.h that
encapsulate the process of initializing an _IO_strfile object for
reading.  This allows us to call __vfscanf_internal directly from
sscanf, and __vfwscanf_internal directly from swscanf, without
duplicating the initialization code.  (Previously, they called their
v-counterparts, but that won't work if we want to control *both* C99
mode and ldbl-is-dbl mode using the flags argument to__vfscanf_internal.)
It's still a little awkward, especially for wide strfiles, but it's
much better than what we had.

Tested for powerpc and powerpc64le.
2018-12-05 18:15:42 -02:00
Joseph Myers
be8ff03f92 Stop c32rtomb and mbrtoc32 aliasing wcrtomb and mbrtowc (bug 23793).
glibc does:

/* There should be no difference between the UTF-32 handling required
   by c32rtomb and the wchar_t handling which has long since been
   implemented in wcrtomb.  */
weak_alias (__wcrtomb, c32rtomb)

/* There should be no difference between the UTF-32 handling required
   by mbrtoc32 and the wchar_t handling which has long since been
   implemented in mbrtowc.  */
weak_alias (__mbrtowc, mbrtoc32)

The reasoning in those comments to justify those aliases is incorrect:
ISO C requires that, for the case of a NULL mbstate_t* being passed,
each function has its *own* internal static mbstate_t.  Thus a program
must be able to use both wcrtomb and c32rtomb at the same time with
each keeping its own separate state, and likewise for mbrtowc and
mbrtoc32.

This patch duly sets up separarate char32_t function that wrap the
wchar_t ones.  Note that the included test only covers the mbrtoc32 /
mbrtowc pair.  While I think the change made is logically correct for
c32rtomb / wcrtomb as well, I'm not sure we have a locale with a
suitable state-dependent multibyte encoding for testing that part of
the change.

Tested for x86_64.

	[BZ #23793]
	* wcsmbs/c32rtomb.c: New file.
	* wcsmbs/mbrtoc32.c: Likewise.
	* wcsmbs/tst-c32-state.c: Likewise.
	* wcsmbs/mbrtowc.c (mbrtoc32): Do not define as alias.
	* wcsmbs/wcrtomb.c (c32rtomb): Likewise.
	* wcsmbs/Makefile (routines): Add mbrtoc32 and c32rtomb.
	(tests): Add tst-c32-state.
	[$(run-built-tests) = yes] ($(objpfx)tst-c32-state.out): Depend on
	$(gen-locales).
2018-10-22 14:52:14 +00:00
Joseph Myers
d0a7415979 Handle surrogate pairs in c16rtomb (bug 23794, DR#488, C2X).
The c16rtomb implementation has:

  // XXX The ISO C 11 spec I have does not say anything about handling
  // XXX surrogates in this interface.

The DR#488 resolution, as applied to C2X, requires surrogate pairs to
be handled here (so the first call returns 0 and stores the high
surrogate in the mbstate_t, while the second call combines the
surrogates, produces a multibyte character and returns the number of
bytes written).  This patch implements that.  (mbrtoc16 already
handled producing surrogates as output.)

Tested for x86_64.

	[BZ #23794]
	* wcsmbs/c16rtomb.c (c16rtomb): Save first character of surrogate
	pair and return 0 in that case, and use saved character to
	interpret following character.
	* wcsmbs/tst-c16-surrogate.c: New file.
	* wcsmbs/Makefile (tests): Add tst-c16-surrogate.c.
	[$(run-built-tests) = yes] ($(objpfx)tst-c16-surrogate.out):
	Depend on $(gen-locales)
2018-10-19 16:31:29 +00:00