Commit Graph

1290 Commits

Author SHA1 Message Date
Florian Weimer
1daccf403b nptl: Move stack list variables into _rtld_global
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-16 19:33:30 +01:00
Adhemerval Zanella
01bd62517c Remove tls.h inclusion from internal errno.h
The tls.h inclusion is not really required and limits possible
definition on more arch specific headers.

This is a cleanup to allow inline functions on sysdep.h, more
specifically on i386 and ia64 which requires to access some tls
definitions its own.

No semantic changes expected, checked with a build against all
affected ABIs.
2020-11-13 12:59:19 -03:00
Florian Weimer
0f34d426ac x86: Remove UP macro. Define LOCK_PREFIX unconditionally.
The UP macro is never defined.  Also define LOCK_PREFIX
unconditionally, to the same string.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-13 15:20:03 +01:00
Samuel Thibault
3d3316b1de hurd: keep only required PLTs in ld.so
We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so.
See Roland's comment in
https://sourceware.org/bugzilla/show_bug.cgi?id=15605
"in the Hurd it's crucial that calls like __mmap be the libc ones
instead of the rtld-local ones after the bootstrap phase, when the
dynamic linker is being used for dlopen and the like."

We used to just avoid all hidden use in the rtld ; this commit switches to
keeping only those that should use PLT calls, i.e. essentially those defined in
sysdeps/mach/hurd/dl-sysdep.c:

__assert_fail
__assert_perror_fail
__*stat64
_exit

This fixes a few startup issues, notably the call to __tunable_get_val that is
made before PLTs are set up.
2020-11-11 02:36:22 +01:00
H.J. Lu
0f09154c64 x86: Initialize CPU info via IFUNC relocation [BZ 26203]
X86 CPU features in ld.so are initialized by init_cpu_features, which is
invoked by DL_PLATFORM_INIT from _dl_sysdep_start.  But when ld.so is
loaded by static executable, DL_PLATFORM_INIT is never called.  Also
x86 cache info in libc.o and libc.a is initialized by a constructor
which may be called too late.  Since some fields in _rtld_global_ro
in ld.so are initialized by dynamic relocation, we can also initialize
x86 CPU features in _rtld_global_ro in ld.so and cache info in libc.so
by initializing dummy function pointers in ld.so and libc.so via IFUNC
relocation.

Key points:

1. IFUNC is always supported, independent of --enable-multi-arch or
--disable-multi-arch.  Linker generates IFUNC relocations from input
IFUNC objects and ld.so performs IFUNC relocations.
2. There are no IFUNC dependencies in ld.so before dynamic relocation
have been performed,
3. The x86 CPU features in ld.so is initialized by DL_PLATFORM_INIT
in dynamic executable and by IFUNC relocation in dlopen in static
executable.
4. The x86 cache info in libc.o is initialized by IFUNC relocation.
5. In libc.a, both x86 CPU features and cache info are initialized from
ARCH_INIT_CPU_FEATURES, not by IFUNC relocation, before __libc_early_init
is called.

Note: _dl_x86_init_cpu_features can be called more than once from
DL_PLATFORM_INIT and during relocation in ld.so.
2020-10-16 16:17:53 -07:00
Szabolcs Nagy
238032ead6 aarch64: enforce >=64K guard size [BZ #26691]
There are several compiler implementations that allow large stack
allocations to jump over the guard page at the end of the stack and
corrupt memory beyond that. See CVE-2017-1000364.

Compilers can emit code to probe the stack such that the guard page
cannot be skipped, but on aarch64 the probe interval is 64K by default
instead of the minimum supported page size (4K).

This patch enforces at least 64K guard on aarch64 unless the guard
is disabled by setting its size to 0.  For backward compatibility
reasons the increased guard is not reported, so it is only observable
by exhausting the address space or parsing /proc/self/maps on linux.

On other targets the patch has no effect. If the stack probe interval
is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can
be defined to get large enough stack guard on libc allocated stacks.

The patch does not affect threads with user allocated stacks.

Fixes bug 26691.
2020-10-02 09:57:44 +01:00
Florian Weimer
90ccfdf176 x86: Use one ldbl2mpn.c file for both i386 and x86_64 2020-09-22 17:58:39 +02:00
H.J. Lu
9620398097 x86: Install <sys/platform/x86.h> [BZ #26124]
Install <sys/platform/x86.h> so that programmers can do

 #if __has_include(<sys/platform/x86.h>)
 #include <sys/platform/x86.h>
 #endif
 ...

   if (CPU_FEATURE_USABLE (SSE2))
 ...
   if (CPU_FEATURE_USABLE (AVX2))
 ...

<sys/platform/x86.h> exports only:

enum
{
  COMMON_CPUID_INDEX_1 = 0,
  COMMON_CPUID_INDEX_7,
  COMMON_CPUID_INDEX_80000001,
  COMMON_CPUID_INDEX_D_ECX_1,
  COMMON_CPUID_INDEX_80000007,
  COMMON_CPUID_INDEX_80000008,
  COMMON_CPUID_INDEX_7_ECX_1,
  /* Keep the following line at the end.  */
  COMMON_CPUID_INDEX_MAX
};

struct cpuid_features
{
  struct cpuid_registers cpuid;
  struct cpuid_registers usable;
};

struct cpu_features
{
  struct cpu_features_basic basic;
  struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
};

/* Get a pointer to the CPU features structure.  */
extern const struct cpu_features *__x86_get_cpu_features
  (unsigned int max) __attribute__ ((const));

Since all feature checks are done through macros, programs compiled with
a newer <sys/platform/x86.h> are compatible with the older glibc binaries
as long as the layout of struct cpu_features is identical.  The features
array can be expanded with backward binary compatibility for both .o and
.so files.  When COMMON_CPUID_INDEX_MAX is increased to support new
processor features, __x86_get_cpu_features in the older glibc binaries
returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false on the
new processor feature.  No new symbol version is neeeded.

Both CPU_FEATURE_USABLE and HAS_CPU_FEATURE are provided.  HAS_CPU_FEATURE
can be used to identify processor features.

Note: Although GCC has __builtin_cpu_supports, it only supports a subset
of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE.  It
doesn't support HAS_CPU_FEATURE.
2020-09-11 17:20:52 -07:00
Ondřej Hošek
23af890b3f x86-64: Fix FMA4 detection in ifunc [BZ #26534]
A typo in commit 107e6a3c22 causes the
FMA4 code path to be taken on systems that support FMA, even if they do
not support FMA4. Fix this to detect FMA4.
2020-09-02 05:07:37 -07:00
Adhemerval Zanella
5ff35e9544 math: Update x86_64 ulps
From new j0 test.
2020-08-08 16:43:11 -03:00
Andreas K. Hüttel
180d5a045f Update x86-64 libm-test-ulps
x86_64 Intel(R) Core(TM) i5-8265U
gcc (Gentoo 10.1.0-r2 p3) 10.1.0
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-25 17:10:53 -04:00
H.J. Lu
107e6a3c22 x86: Support usable check for all CPU features
Support usable check for all CPU features with the following changes:

1. Change struct cpu_features to

struct cpuid_features
{
  struct cpuid_registers cpuid;
  struct cpuid_registers usable;
};

struct cpu_features
{
  struct cpu_features_basic basic;
  struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
...
};

so that there is a usable bit for each cpuid bit.
2. After the cpuid bits have been initialized, copy the known bits to the
usable bits.  EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for
CPU feature detection.
3. Clear the usable bits which require OS support.
4. If the feature is supported by OS, copy its cpuid bit to its usable
bit.
5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE
and CPU_FEATURE_USABLE_P to check if a feature is usable.
6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13.
7. Unset MPX feature since it has been deprecated.

The results are

1. If the feature is known and doesn't requre OS support, its usable bit
is copied from the cpuid bit.
2. Otherwise, its usable bit is copied from the cpuid bit only if the
feature is known to supported by OS.
3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the
feature can be used.
4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports
the feature.
2020-07-13 06:05:16 -07:00
H.J. Lu
9016b6f389 x86: Remove the unused __x86_prefetchw
Since

commit c867597bff
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Jun 8 13:57:50 2016 -0700

    X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove

removed the only usage of __x86_prefetchw, we can remove the unused
__x86_prefetchw.
2020-07-11 09:34:03 -07:00
H.J. Lu
3f4b61a0b8 x86: Add thresholds for "rep movsb/stosb" to tunables
Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables
to update thresholds for "rep movsb" and "rep stosb" at run-time.

Note that the user specified threshold for "rep movsb" smaller than
the minimum threshold will be ignored.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-06 11:48:42 -07:00
Adhemerval Zanella
b24381e50f i386: Use builtin sqrtl
Checked on i686-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
d19d25dd06 x86_64: Use builtin sqrt{f,l}
Checked on x86_64-linux-gnu.
2020-06-22 11:09:49 -03:00
Sunil K Pandey
75870237ff Fix avx2 strncmp offset compare condition check [BZ #25933]
strcmp-avx2.S: In avx2 strncmp function, strings are compared in
chunks of 4 vector size(i.e. 32x4=128 byte for avx2). After first 4
vector size comparison, code must check whether it already passed
the given offset. This patch implement avx2 offset check condition
for strncmp function, if both string compare same for first 4 vector
size.
2020-06-17 07:07:38 -07:00
H.J. Lu
a35a59036e x86_64: Use %xmmN with vpxor to clear a vector register
Since "vpxor %xmmN, %xmmN, %xmmN" clears the whole vector register, use
%xmmN, instead of %ymmN, with vpxor to clear a vector register.
2020-06-17 05:44:02 -07:00
Vineet Gupta
8dbb7a08ec dl-runtime: reloc_{offset,index} now functions arch overide'able
The existing macros are fragile and expect local variables with a
certain name. Fix this by defining them as functions with default
implementation in a new header dl-runtime.h which arches can override
if need be.

This came up during ARC port review, hence the need for argument pltgot
in reloc_index() which is not needed by existing ports.

This patch potentially only affects hppa/x86 ports,
build tested for both those configs and a few more.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-06-05 13:45:46 -07:00
Florian Weimer
501bdb5dd6 Linux: Remove remnants of the getcpu cache
The getcpu cache was removed from the kernel in Linux 2.6.24.  glibc
support from the sched_getcpu implementation was removed in commit
dd26c44403 ("Consolidate sched_getcpu").
2020-05-16 15:47:51 +02:00
H.J. Lu
55c7bcc71b x86-64: Use RDX_LP on __x86_shared_non_temporal_threshold [BZ #25966]
Since __x86_shared_non_temporal_threshold is defined as

long int __x86_shared_non_temporal_threshold;

and long int is 4 bytes for x32, use RDX_LP to compare against
__x86_shared_non_temporal_threshold in assembly code.
2020-05-09 12:28:15 -07:00
Joseph Myers
dbb188dd87 Remove unused floating-point configuration from gmp-impl.h.
This patch removes the IEEE_DOUBLE_BIG_ENDIAN and
IEEE_DOUBLE_MIXED_ENDIAN macros from gmp-impl.h and gmp-mparam.h, and
the ieee_double_extract union from gmp-impl.h.  The macros were used
only in defining the union, which was used nowhere in glibc.  As GMP's
gmp-impl.h is over 5000 lines, the file in glibc is so far from the
GMP version that it doesn't seem to make sense to keep things there
that are not relevant in glibc.  (I expect there is plenty more in the
header after this patch that is also not relevant in glibc and can be
cleaned up later.)

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
2020-04-28 15:05:09 +00:00
Adhemerval Zanella
f721171632 Revert "x86_64: Add SSE sfp-exceptions"
The __sfp_handle_exceptions is not fully correct regarding raising
exceptions, since there is no direct way to raise only FP_EX_OVERFLOW
nor FP_EX_UNDERFLOW for SSE mode.  Both libgcc and feraiseexcept rely
on x87 mode to accomplish it.

This reverts commit 460ee50de0.

Checked on x86_64.
2020-04-20 14:56:05 -03:00
Adhemerval Zanella
460ee50de0 x86_64: Add SSE sfp-exceptions
The exported x86_64 fenv.h functions operate on both i387 and SSE (since
they should work on both float, double, and long double) while the
internal libc_fe* set either SSE (float, double, and float128) or
i387 (long double).

The libgcc __sfp_handle_exceptions (used on float128 implementation),
however, will set either SEE or i387 exception depending of the
exception to raise.  This broke the internal assumption of float128
where only SSE operations will be used.

This patch reimplements the libgcc __sfp_handle_exceptions to use only
SSE operations and sets libgcc to use it instead of its own
implementation.

And I think we should fix libgcc in a similar manner, since checking on
config/i386/64/sfp-machine.h it already only supports SSE rounding mode
and x86_64 ABI also expectes float128 to use SSE registers [1]
(although it is not clear on how future implementation might implement
it).

Checked on x86_64-linux-gnu.

[1] https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
2020-04-17 11:42:29 -03:00
Adhemerval Zanella
17fd707f88 nptl: Remove x86_64 cancellation assembly implementations [BZ #25765]
All cancellable syscalls are done by C implementations, so there is no
no need to use a specialized implementation to optimize register usage.

It fixes BZ #25765.

Checked on x86_64-linux-gnu.
2020-04-03 10:47:59 -03:00
Paul Zimmermann
a9d42c09a3 math: Add inputs that yield larger errors for float type (x86_64)
The corner cases included were generated using exhaustive search
for all float/binary32 values on x86_64 (comparing to MPFR for
correct rounding to nearest).

For the j0/j1/y0 functions, only cases with ulp error <= 9 were
included.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-03-31 21:48:54 -04:00
Adhemerval Zanella
1c15464ca0 math: Remove inline math tests
With mathinline removal there is no need to keep building and testing
inline math tests.

The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries.  The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-03-19 11:45:44 -03:00
Florian Weimer
fe49a73316 x86: Avoid single-argument _Static_assert in <tls.h>
Older GCC versions do not support this extension.  Fixes commit f1bdee6179
("x86 tls: Use _Static_assert for TLS access size assertion").
2020-02-17 11:12:03 +01:00
Samuel Thibault
f1bdee6179 x86 tls: Use _Static_assert for TLS access size assertion 2020-02-17 00:40:39 +01:00
Florian Weimer
3a0ecccb59 ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486]
Exporting functions and relying on symbol interposition from libc.so
makes the choice of implementation dependent on DT_NEEDED order, which
is not what some compiler drivers expect.

This commit replaces one magic mechanism (symbol interposition) with
another one (preprocessor-/compiler-based redirection).  This makes
the hand-over from the minimal malloc to the full malloc more
explicit.

Removing the ABI symbols is backwards-compatible because libc.so is
always in scope, and the dynamic loader will find the malloc-related
symbols there since commit f0b2132b35
("ld.so: Support moving versioned symbols between sonames
[BZ #24741]").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-02-15 11:01:23 +01:00
Adhemerval Zanella
fcb78a5505 linux: Consolidate INLINE_SYSCALL
With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, there is no need to replicate the INLINE_SYSCALL.

The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__,
which is ok now and it allows cleanup some archaic code that assume
otherwise.

Checked with a build against all affected ABIs.
2020-02-14 21:09:12 -03:00
Wilco Dijkstra
220622dde5 Add libm_alias_finite for _finite symbols
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol.  It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).

The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.

Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).

Passes buildmanyglibc.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:04 -03:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Stefan Liebler
1c94bf0f0a Always use wordsize-64 version of s_trunc.c.
This patch replaces s_trunc.c in sysdeps/dbl-64 with the one in
sysdeps/dbl-64/wordsize-64 and removes the latter one.
The code is not changed except changes in code style.

Also adjusted the include path in x86_64 and sparc64 files.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-12-11 15:12:14 +01:00
Stefan Liebler
9f234eafe8 Always use wordsize-64 version of s_ceil.c.
This patch replaces s_ceil.c in sysdeps/dbl-64 with the one in
sysdeps/dbl-64/wordsize-64 and removes the latter one.
The code is not changed except changes in code style.

Also adjusted the include path in x86_64 and sparc64 files.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-12-11 15:12:13 +01:00
Stefan Liebler
95b0c2c431 Always use wordsize-64 version of s_floor.c.
This patch replaces s_floor.c in sysdeps/dbl-64 with the one in
sysdeps/dbl-64/wordsize-64 and removes the latter one.
The code is not changed except changes in code style.

Also adjusted the include path in x86_64 and sparc64 files.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-12-11 15:12:12 +01:00
Stefan Liebler
ab48bdd098 Always use wordsize-64 version of s_rint.c.
This patch replaces s_rint.c in sysdeps/dbl-64 with the one in
sysdeps/dbl-64/wordsize-64 and removes the latter one.
The code is not changed except changes in code style.

Also adjusted the include path in x86_64 file.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-12-11 15:12:12 +01:00
Stefan Liebler
af123aa950 Always use wordsize-64 version of s_nearbyint.c.
This patch replaces s_nearbyint.c in sysdeps/dbl-64 with the one in
sysdeps/dbl-64/wordsize-64 and removes the latter one.
The code is not changed except changes in code style.

Also adjusted the include path in x86_64 file.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-12-11 15:12:11 +01:00
Florian Weimer
4db71d2f98 elf: Do not run IFUNC resolvers for LD_DEBUG=unused [BZ #24214]
This commit adds missing skip_ifunc checks to aarch64, arm, i386,
sparc, and x86_64.  A new test case ensures that IRELATIVE IFUNC
resolvers do not run in various diagnostic modes of the dynamic
loader.

Reviewed-By: Szabolcs Nagy <szabolcs.nagy@arm.com>
2019-12-02 14:55:22 +01:00
Adhemerval Zanella
48dbce60cf nptl: Add tests for internal pthread_rwlock_t offsets
This patch new build tests to check for internal fields offsets for
internal pthread_rwlock_t definition.  Althoug the '__data.__flags'
field layout should be preserved due static initializators, the patch
also adds tests for the futexes that may be used in a shared memory
(although using different libc version in such scenario is not really
supported).

Checked with a build against all affected ABIs.

Change-Id: Iccc103d557de13d17e4a3f59a0cad2f4a640c148
2019-11-26 13:53:36 +00:00
Adhemerval Zanella
71d260c107 nptl: Cleanup mutex internal offset tests
The offsets of pthread_mutex_t __data.__nusers, __data.__spins,
__data.elision, __data.list are not required to be constant over
the releases.  Only the __data.__kind is used for static
initializers.

This patch also adds an additional size check for __data.__kind.

Checked with a build against affected ABIs.

Change-Id: I7a4e48cc91b4c4ada57e9a5d1b151fb702bfaa9f
2019-11-26 13:53:36 +00:00
Wilco Dijkstra
d0007dc53c Remove x64 _finite tests and references
Remove _finite tests and references from x86_64.  Rather than calling
__exp_finite, use exp directly (since it's the same entry point).

x86_64 builds and passes testsuite.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-10-21 14:29:12 -03:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Wilco Dijkstra
3c05dd79d0 Use generic memset/memcpy/memmove in benchtests
Use the generic C memset/memcpy/memmove in benchtests since comparing
against a slow byte-oriented implementation makes no sense.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

2019-08-29  Wilco Dijkstra  <wdijkstr@arm.com>

	* benchtests/bench-memcpy.c (simple_memcpy): Remove.
	(generic_memcpy): Include generic C memcpy.
	* benchtests/bench-memmove.c (simple_memmove): Remove.
	(generic_memmove): Include generic C memmove.
	* benchtests/bench-memset.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* benchtests/bench-memset-large.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* benchtests/bench-memset-walk.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* string/memcpy.c (MEMCPY): Add defines to enable redirection.
	* string/memset.c (MEMSET): Likewise.
	* sysdeps/x86_64/memcopy.h: Remove empty file.
2019-08-30 17:21:35 +01:00
H.J. Lu
7e681561a3 x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603]
When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in
sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions,
which leads to store forward stall:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

There is no easy fix in compiler.  This patch limits vector width to
128 bits to work around this issue.  It improves performance of sin
and cos by more than 40% on Skylake compiled with -O3 -march=skylake.

Tested with GCC 7/8/9 on x86-64.

	[BZ #24603]
	* sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128
	works.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New.  Set
	to -mprefer-vector-width=128 if supported.
2019-07-24 14:48:43 -07:00
H.J. Lu
d039da1c00 x86: Add sysdeps/x86/dl-lookupcfg.h
Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are
identical, we can replace them with sysdeps/x86/dl-lookupcfg.h.

	* sysdeps/i386/dl-lookupcfg.h: Moved to ...
	* sysdeps/x86/dl-lookupcfg.h: Here.
	* sysdeps/x86_64/dl-lookupcfg.h: Removed.
2019-06-26 15:07:28 -07:00
Adhemerval Zanella
81a1443941 wcsmbs: optimize wcscat
This patch rewrites wcscat using wcslen and wcscpy.  This is similar to
the optimization done on strcat by 6e46de42fe.

The strcpy changes are mainly to add the internal alias to avoid PLT
calls.

Checked on x86_64-linux-gnu and a build against the affected
architectures.

	* include/wchar.h (__wcscpy): New prototype.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
	(__wcscpy): Route internal symbol to generic implementation.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
	Add internal __wcscpy alias.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
	* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
	* wcsmbs/wcscpy.c (wcscpy): Add
	* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
	use generic implementation.
	* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
2019-02-27 10:00:37 -03:00
Joseph Myers
32db86d558 Add fall-through comments.
This patch adds fall-through comments in some cases where -Wextra
produces implicit-fallthrough warnings.

The patch is non-exhaustive.  Apart from architecture-specific code
for non-x86_64 architectures, it does not change sunrpc/xdr.c (legacy
code, probably should have such changes, but left to be dealt with
separately), or places that already had comments about the
fall-through but not matching the form expected by
-Wimplicit-fallthrough=3 (the default level with -Wextra; my
inclination is to adjust those comments to match rather than
downgrading to -Wimplicit-fallthrough=1 to allow any comment), or one
place where I thought the implicit fallthrough was not correct and so
should be handled separately as a bug fix.  I think the key thing to
consider in review of this patch is whether the fall-through is indeed
intended and correct in each place where such a comment is added.

Tested for x86_64.

	* elf/dl-exception.c (_dl_exception_create_format): Add
	fall-through comments.
	* elf/ldconfig.c (parse_conf_include): Likewise.
	* elf/rtld.c (print_statistics): Likewise.
	* locale/programs/charmap.c (parse_charmap): Likewise.
	* misc/mntent_r.c (__getmntent_r): Likewise.
	* posix/wordexp.c (parse_arith): Likewise.
	(parse_backtick): Likewise.
	* resolv/ns_ttl.c (ns_parse_ttl): Likewise.
	* sysdeps/x86/cpu-features.c (init_cpu_features): Likewise.
	* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
2019-02-12 10:30:34 +00:00
Andreas Schwab
65f7767a91 Fix handling of collating elements in fnmatch (bug 17396, bug 16976)
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for
regexp matching.  As a side effect it also removes the use of an unbound
VLA.
2019-02-04 15:45:02 +01:00
H.J. Lu
3f635fb433 x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155]
Since the size argument is unsigned. we should use unsigned Jcc
instructions, instead of signed, to check size.

Tested on x86-64 and x32, with and without --disable-multi-arch.

	[BZ #24155]
	CVE-2019-7309
	* NEWS: Updated for CVE-2019-7309.
	* sysdeps/x86_64/memcmp.S: Use RDX_LP for size.  Clear the
	upper 32 bits of RDX register for x32.  Use unsigned Jcc
	instructions, instead of signed.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
	* sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.
2019-02-04 06:31:13 -08:00