Commit Graph

13293 Commits

Author SHA1 Message Date
Alex Butler
df06b0d90f aarch64: MTE compatible memrchr
Add support for MTE to memrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
7ff899969f aarch64: MTE compatible memchr
Add support for MTE to memchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Gabor Kertesz <gabor.kertesz@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
bb2c12aecb aarch64: MTE compatible strcpy
Add support for MTE to strcpy. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Joseph Myers
8ec13b4639 Add MREMAP_DONTUNMAP from Linux 5.7
Add the new constant MREMAP_DONTUNMAP from Linux 5.7 to
bits/mman-shared.h.

Tested with build-many-glibcs.py.
2020-06-23 14:42:45 +00:00
H.J. Lu
ecbbadbf10 x86: Update CPU feature detection [BZ #26149]
1. Divide architecture features into the usable features and the preferred
features.  The usable features are for correctness and can be exported in
a stable ABI.  The preferred features are for performance and only for
glibc internal use.
2. Change struct cpu_features to

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
  unsigned int usable[USABLE_FEATURE_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
  ...
};

and initialize usable_p to pointer to the usable arary so that

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
};

can be exported via a stable ABI.  The cpuid and usable arrays can be
expanded with backward binary compatibility for both .o and .so files.
3. Add COMMON_CPUID_INDEX_7_ECX_1 for AVX512_BF16.
4. Detect ENQCMD, PKS, AVX512_VP2INTERSECT, MD_CLEAR, SERIALIZE, HYBRID,
TSXLDTRK, L1D_FLUSH, CORE_CAPABILITIES and AVX512_BF16.
5. Rename CAPABILITIES to ARCH_CAPABILITIES.
6. Check if AVX512_VP2INTERSECT, AVX512_BF16 and PKU are usable.
7. Update CPU feature detection test.
2020-06-22 13:09:33 -07:00
Adhemerval Zanella
ea04f02131 aarch64: Remove fpu Makefile
The -fno-math-errno is already added by default and the minimum
required GCC to build glibc (6.2) make the -ffinite-math-only
superflous.

Checked on aarch64-linux-gnu.
2020-06-22 11:09:50 -03:00
Adhemerval Zanella
9f21672b89 m68k: Use sqrt{f} builtin for coldfire
Checked with a build for m68k-linux-gnu-coldfire.
2020-06-22 11:09:50 -03:00
Adhemerval Zanella
cbf3571f49 arm: Use sqrt{f} builtin
Checked on arm-linux-gnueabi and armv7-linux-gnueabihf
2020-06-22 11:09:50 -03:00
Adhemerval Zanella
9dbb3fdfb7 riscv: Use sqrt{f} builtin
Checked with a build for riscv64-linux-gnu-rv64imac-lp64 (no
builtin support), riscv64-linux-gnu-rv64imafdc-lp64, and
riscv64-linux-gnu-rv64imafdc-lp64d.
2020-06-22 11:09:50 -03:00
Adhemerval Zanella
3ca05a8e9e s390: Use sqrt{f} builtin
Checked on s390x-linux-gnu.
2020-06-22 11:09:50 -03:00
Adhemerval Zanella
c9a30f08e1 sparc: Use sqrt{f} builtin
It also enabled to use fsqrtd on sparc64.

Checked on sparcv9-linux-gnu and sparc64-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
32c65b28f3 mips: Use sqrt{f} builtin
Checked with a build against mips-linux-gnu and mips64-linux-gnu
and comparing the resulting binaries.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
8a7923b57e alpha: Use builtin sqrt{f}
The generic implementation is simplified by removing the
'optimization' for !_IEEE_FP_INEXACT (which does not handle
inexact neither some values).

Checked on alpha-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
b24381e50f i386: Use builtin sqrtl
Checked on i686-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
d19d25dd06 x86_64: Use builtin sqrt{f,l}
Checked on x86_64-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
169ea8f928 powerpc: Use sqrt{f} builtin
The powerpc sqrt implementation is also simplified:

  - the static constants are open coded within the implementation.
  - for !USE_SQRT_BUILTIN the function is implemented directly on
    __ieee754_sqrt (it avoid an superflous extra jump).

Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
a2e833667d s390x: Use fma{f} builtin
Checked on s390x-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
271afad8f4 aarch64: Use math-use-builtins for ceil{f}
The define is already set on the math-use-builtins-ceil.h, the patch
just removes the implementations (it was missed on c9feb1be93).

Checked on aarch64-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
e80501a5c9 math: Decompose math-use-builtins.h
Each symbol definitions are moved on a separated file and it
cover all symbol type definitions (float, double, long double,
and float128).

It allows to set support for architectures without the boiler
place of copying default values.

Checked with a build on the affected ABIs.
2020-06-22 11:09:45 -03:00
Samuel Thibault
c013d5d3aa hurd: Add mremap
* sysdeps/mach/hurd/mremap.c: New file.
* sysdeps/mach/hurd/Makefile [misc] (sysdep_routines): Add mremap.
* sysdeps/mach/hurd/Versions (libc.GLIBC_2.32): Add mremap.
* sysdeps/mach/hurd/i386/libc.abilist: Add mremap.
2020-06-20 13:49:57 +00:00
Adhemerval Zanella
3297d019e1 ia64: Use generic exp10f
The generic implementation is slight worse (Itanium(R) Processor 9020):

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 3.61582e+08,
    "iterations": 2.384e+07,
    "reciprocal-throughput": 14.8334,
    "latency": 15.5006,
    "max-throughput": 6.74153e+07,
    "min-throughput": 6.45136e+07
   }
  }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 3.85549e+08,
    "iterations": 2.384e+07,
    "reciprocal-throughput": 15.8391,
    "latency": 16.5056,
    "max-throughput": 6.31348e+07,
    "min-throughput": 6.05857e+07
   }
  }

However it fixes all the issues on both:

  math/test-float-exp10
  math/test-float32-exp10

(all the issues wrong results for non default rounding modes).

The existing ia64 libm interface uses matherrf and matherrl in addition
to matherr for SVID error handling. However, there is no such error
handling support for exp10f in ia64 libm.  So replacing it with the
generic implementation should be fine.

Checked on ia64-linux-gnu.
2020-06-19 12:08:52 -03:00
Adhemerval Zanella
be668a8d78 New exp10f version without SVID compat wrapper
This patch changes the exp10f error handling semantics to only set
errno according to POSIX rules.  New symbol version is introduced at
GLIBC_2.32.  The old wrappers are kept for compat symbols.

There are some outliers that need special handling:

  - ia64 provides an optimized implementation of exp10f that uses ia64
    specific routines to set SVID compatibility.  The new symbol version
    is aliased to the exp10f one.

  - m68k also provides an optimized implementation, and the new version
    uses it instead of the sysdeps/ieee754/flt32 one.

  - riscv and csky uses the generic template implementation that
    does not provide SVID support.  For both cases a new exp10f
    version is not added, but rather the symbols version of the
    generic sysdeps/ieee754/flt32 is adjusted instead.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
powerpc64le-linux-gnu.
2020-06-19 12:08:47 -03:00
Adhemerval Zanella
4b2d8e4442 i386: Use generic exp10f
The generic implementation is twice as fast.  Using the exp10f
benchmark:

 * master:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.02967e+09,
    "iterations": 4.768e+07,
    "reciprocal-throughput": 18.3579,
    "latency": 24.8331,
    "max-throughput": 5.44725e+07,
    "min-throughput": 4.02688e+07
   }
  }

 * patched:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.01821e+09,
    "iterations": 6.1984e+07,
    "reciprocal-throughput": 13.1975,
    "latency": 19.6563,
    "max-throughput": 7.57719e+07,
    "min-throughput": 5.08743e+07
   }
  }

Checked on i686-linux-gnu.
2020-06-19 10:48:15 -03:00
Paul Zimmermann
6e98983c09 math: Optimized generic exp10f with wrappers
It is inspired by expf and reuses its tables and internal functions.
The error checks are inlined and errno setting is in separate tail
called functions, but the wrappers are kept in this patch to handle
the _LIB_VERSION==_SVID_ case.

Double precision arithmetics is used which is expected to be faster on
most targets (including soft-float) than using single precision and it
is easier to get good precision result with it.

Result for x86_64 (i7-4790K CPU @ 4.00GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.0414e+09,
    "iterations": 1.00128e+08,
    "reciprocal-throughput": 26.6818,
    "latency": 54.043,
    "max-throughput": 3.74787e+07,
    "min-throughput": 1.85038e+07
   }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.11951e+09,
    "iterations": 1.23968e+08,
    "reciprocal-throughput": 21.0581,
    "latency": 45.4028,
    "max-throughput": 4.74876e+07,
    "min-throughput": 2.20251e+07
   }

Result for aarch64 (A72 @ 2GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.62362e+09,
    "iterations": 3.3376e+07,
    "reciprocal-throughput": 127.698,
    "latency": 149.365,
    "max-throughput": 7.831e+06,
    "min-throughput": 6.69501e+06
   }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.29108e+09,
    "iterations": 6.6752e+07,
    "reciprocal-throughput": 51.2111,
    "latency": 77.3568,
    "max-throughput": 1.9527e+07,
    "min-throughput": 1.29271e+07
   }

Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu,
and sparc64-linux-gnu.
2020-06-19 10:48:15 -03:00
H.J. Lu
27f8864bd4 x86: Update F16C detection [BZ #26133]
Since F16C requires AVX, set F16C usable only when AVX is usable.
2020-06-18 07:01:58 -07:00
Sunil K Pandey
75870237ff Fix avx2 strncmp offset compare condition check [BZ #25933]
strcmp-avx2.S: In avx2 strncmp function, strings are compared in
chunks of 4 vector size(i.e. 32x4=128 byte for avx2). After first 4
vector size comparison, code must check whether it already passed
the given offset. This patch implement avx2 offset check condition
for strncmp function, if both string compare same for first 4 vector
size.
2020-06-17 07:07:38 -07:00
H.J. Lu
a35a59036e x86_64: Use %xmmN with vpxor to clear a vector register
Since "vpxor %xmmN, %xmmN, %xmmN" clears the whole vector register, use
%xmmN, instead of %ymmN, with vpxor to clear a vector register.
2020-06-17 05:44:02 -07:00
H.J. Lu
b7c9bb183b x86: Correct bit_cpu_CLFLUSHOPT [BZ #26128]
bit_cpu_CLFLUSHOPT should be (1u << 23), not (1u << 22).
2020-06-17 05:32:37 -07:00
Paul E. Murphy
b637306d3e powerpc64le: refactor e_sqrtf128.c
Combine both implementations into a single file to allow
building twice with appropriate multiarch support when possible.
2020-06-16 13:50:44 -05:00
Joseph Myers
b67339d0bb Update syscall-names.list for Linux 5.7.
Linux 5.7 has no new syscalls.  Update the version number in
syscall-names.list to reflect that it is still current for 5.7.

Tested with build-many-glibcs.py.
2020-06-15 22:58:22 +00:00
Vineet Gupta
e93c264336 ieee754/dbl-64: Reduce the scope of temporary storage variables
This came to light when adding hard-flaot support to ARC glibc port
without hardware sqrt support causing glibc build to fail:

| ../sysdeps/ieee754/dbl-64/e_sqrt.c: In function '__ieee754_sqrt':
| ../sysdeps/ieee754/dbl-64/e_sqrt.c:58:54: error: unused variable 'ty' [-Werror=unused-variable]
|   double y, t, del, res, res1, hy, z, zz, p, hx, tx, ty, s;

The reason being EMULV() macro uses the hardware provided
__builtin_fma() variant, leaving temporary variables 'p, hx, tx, hy, ty'
unused hence compiler warning and ensuing error.

The intent of the patch was to fix that error, but EMULV is pervasive
and used fair bit indirectly via othe rmacros, hence this patch.
Functionally it should not result in code gen changes and if at all
those would be better since the scope of those temporaries is greatly
reduced now

Built tested with aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf hppa-linux-gnu x86_64-linux-gnu arm-linux-gnueabihf riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64 powerpc-linux-gnu microblaze-linux-gnu nios2-linux-gnu hppa-linux-gnu

Also as suggested by Joseph [1] used --strip and compared the libs with
and w/o patch and they are byte-for-byte unchanged (with gcc 9).

| for i in `find . -name libm-2.31.9000.so`;
| do
|    echo $i; diff $i /SCRATCH/vgupta/gnu2/install/glibcs/$i ; echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so

[1] https://sourceware.org/pipermail/libc-alpha/2019-November/108267.html
2020-06-15 13:09:21 -07:00
Samuel Thibault
c1dcc54113 hurd: Fix __writev_nocancel_nostatus
* sysdeps/mach/hurd/Makefile [subdir=misc] (sysdep_routines): Add
writev_nocancel writev_nocancel_nostatus.
* sysdeps/mach/hurd/not-cancel.h (__writev_nocancel_nostatus): Replace
macro with function declaration (with hidden prototype in libc).
(__writev_nocancel): New function declaration (with hidden prototype in libc).
* sysdeps/mach/hurd/writev_nocancel_nostatus.c: New file.
* sysdeps/posix/writev_nocancel.c: New file, includes writev.c to make a
nocancel variant that calls __write_nocancel.
* sysdeps/posix/writev.c (writev): Do not define alias if __writev is
renamed.
2020-06-14 17:45:04 +00:00
Samuel Thibault
0c46891442 hurd: Make send* cancellation points
* sysdeps/mach/hurd/send.c (__send): Make the __socket_send call
a cancellation point.
* sysdeps/mach/hurd/sendto.c (__sendto): Likewise.
* sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.
2020-06-14 17:11:22 +00:00
Samuel Thibault
45fce058fe htl: Enable more cancellation tests
* nptl/tst-cancel-self-cancelstate.c, tst-cancel-self.c, tst-cancel9.c,
tst-cancelx9.c: Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
2020-06-14 16:16:59 +00:00
Samuel Thibault
662de0889a hurd: Make write and pwrite64 cancellation points
and add _nocancel variants.

* sysdeps/mach/hurd/write.c (__libc_write): Call __write_nocancel
surrounded by enabling async cancel, to replace implementation moved
to...
* sysdeps/mach/hurd/write_nocancel.c (__write_nocancel): ... here.
* sysdeps/mach/hurd/pwrite64.c (__libc_pwrite64): Call
__pwrite64_nocancel surrounded by enabling async cancel, to replace
implementation moved to...
* sysdeps/mach/hurd/pwrite64_nocancel.c (__pwrite64_nocancel): ... here.
* sysdeps/mach/hurd/Makefile (sysdep_routines): Add write_nocancel and
pwrite64_nocancel.
* sysdeps/mach/hurd/not-cancel.h (__write_nocancel,
__pwrite64_nocancel): Replace macros with prototypes with a hidden proto on
libc.

* sysdeps/mach/hurd/dl-sysdep.c (__write_nocancel): New alias, check
that it is not hidden.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __write_nocancel.
(ld.GLIBC_PRIVATE): Add __write_nocancel.
* sysdeps/mach/hurd/i386/localplt.data (__write_nocancel): Add
reference.
2020-06-14 15:53:21 +00:00
Samuel Thibault
76fe4ef4be htl: Fix cleanup support for IO locking
* sysdeps/htl/stdio-lock.h: New file, registers locking cleanup to htl.
* sysdeps/htl/libc-lockP.h: Include <libc-lock.h>.
(__libc_cleanup_region_start, __libc_cleanup_end,
__libc_cleanup_region_end): Override macros from <libc-lock.h> with
versions which register cleanup to htl.
(__pthread_get_cleanup_stack): Make reference weak for skipping
registration on in the static non-libpthread case.
2020-06-14 15:53:04 +00:00
Samuel Thibault
ea5cad3e37 htl: Add noreturn attribute on __pthread_exit forward
* sysdeps/htl/pthread-functions.h (__pthread_exit): Add noreturn
attribute.
(struct pthread_functions): Add noreturn attribute on ptr___pthread_exit
field.
2020-06-14 12:53:38 +00:00
Samuel Thibault
89edef7b39 hurd: Make recv* cancellation points
* sysdeps/mach/hurd/recv.c (__recv): Make the __socket_recv call
cancellable.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Make the __socket_recv and
__socket_whatis_address calls cancellable.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Make the __socket_recv,
__socket_whatis_address, __io_reauthenticate, and __auth_user_authenticate calls
cancellable.
2020-06-14 00:19:35 +00:00
Paul E. Murphy
146fea0764 powerpc: Automatic CPU detection in preconfigure
Added a check to detect the CPU value in preconfigure, so that glibc is
built with the correct --with-cpu value.  And move existing checks into
preconfigure.ac.

Co-Authored-By: Carlos Eduardo Seo  <cseo@linux.vnet.ibm.com>
Co-Authored-By: Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
2020-06-11 17:15:49 -05:00
Samuel Thibault
62d97c3432 htl: Enable more cancel tests
* nptl/tst-cancel11.c, tst-cancel21-static.c, tst-cancel21.c, tst-cancel6.c, tst-cancelx11.c, tst-cancelx21.c, tst-cancelx6.c: Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
2020-06-10 21:34:19 +00:00
Andrea Corallo
a365ac45b7 aarch64: MTE compatible strlen
Introduce an Arm MTE compatible strlen implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance on modern cores. On cores with less
efficient Advanced SIMD implementation such as Cortex-A53 it can
be slower.

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-09 09:21:11 +01:00
Andrea Corallo
49beaaec1b aarch64: MTE compatible strchr
Introduce an Arm MTE compatible strchr implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-09 09:20:27 +01:00
Andrea Corallo
f7de454f20 aarch64: MTE compatible strchrnul
Introduce an Arm MTE compatible strchrnul implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-09 09:20:27 +01:00
Krzysztof Koch
d1f75e9644 AArch64: Merge Falkor memcpy and memmove implementations
Falkor's memcpy and memmove share some implementation details,
therefore, the two routines are moved to a single source file
for code reuse.

The two routines now share code for small and medium copies
(up to and including 128 bytes). Large copies in memcpy do not
handle overlap correctly, consequently, the loops for
moving/copying more than 128 bytes stay separate for memcpy
and memmove.

To increase code reuse a number of small modifications were made:

1. The old implementation of memcpy copied the first 16-bytes as
   soon as the size of data was determined to be greater than 32 bytes.
   For memcpy code to also work when copying small/medium overlapping
   data, the first load and store was moved to the large copy case.
2. Medium memcpy case no longer assumes that 16 bytes were already
   copied and uses 8 registers to copy up to 128 bytes.
3. Small case for memmove was enlarged to that of memcpy, which is
   less than or equal to 32 bytes.
4. Medium case for memmove was enlarged to that of memcpy, which is
   less than or equal to 128 bytes.

Other changes include:

1. Improve alignment of existing loop bodies.
2. 'Delouse' memmove and memcpy input arguments. Make sure that
   upper 32-bits of input registers are zeroed if unused.
3. Do one more iteration in memmove loops and reduce the number of
   copies made from the start/end of the buffer, depending on
   the direction of the memmove loop.

Benchmarking:

Looking at the results from bench-memcpy-random.out, we can see that
now memmove_falkor is about 5% faster than memcpy_falkor_old, while
memmove_falkor_old was more than 15% slower. The memcpy implementation
remained largely unmodified, so there is no significant performance
change.

The reason for such a significant memmove performance gain is the
increase of the upper bound on the small copy case to 32 bytes and
the increase of the upper bound on the medium copy case to 128 bytes.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-08 14:13:05 +01:00
Samuel Thibault
f112dcc506 hurd: document that gcc&gdb look at the trampoline code
* sysdeps/mach/hurd/i386/trampoline.c (rpc_wait_trampoline): Document
which gcc and gdb files look at the code of the trampoline.
2020-06-08 14:41:57 +02:00
Samuel Thibault
dd7a8ad7ba pthread: Move back linking rules to nptl and htl
d6d74ec16 ('htl: Enable more tests') moved the linking rules from
nptl/Makefile and htl/Makefile to the shared sysdeps/pthread/Makefile.  But
e.g. on powerpc some tests are added in sysdeps/powerpc/Makefile, which is
included *after* sysdeps/pthread/Makefile, and thus the tests don't get
affected by the rules and fail to link.  For now let's just copy over the
set of rules in both nptl/Makefile and htl/Makefile.

* sysdeps/pthread/Makefile: Move libpthread linking rules to...
* htl/Makefile: ... here and...
* nptl/Makefile: ... there.
2020-06-08 14:34:22 +02:00
Samuel Thibault
314a431d37 htl: Enable more tests
* nptl/tst-_res1.c, tst-_res1mod1.c, tst-_res1mod2.c, tst-atfork2.c,
tst-atfork2mod.c, tst-fini1.c, tst-fini1mod.c, tst-tls4.c, tst-tls4moda.c,
tst-tls4modb.c: Move to...
* sysdeps/pthread: ... here.  Rename tst-tls4.c to tst-pt-tls4.c to avoid
conflicting with elf/tst-tls4.c.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
2020-06-07 23:45:25 +00:00
Samuel Thibault
15e995a8fb htl: Fix registration of atfork handlers in modules
We really need modules to use their own pthread_atfork so that
__dso_handle properly identifies them.

* sysdeps/htl/pt-atfork.c (__pthread_atfork): Hide function.
(pthread_atfork): Hide alias.
* sysdeps/htl/old_pt-atfork.c (pthread_atfork): Rename macro to
__pthread_atfork to fix building the compatibility alias.
2020-06-07 23:36:42 +00:00
Samuel Thibault
af27fabe40 htl: Fix tls initialization for already-created threads
* sysdeps/htl/pthreadP.h: Include <link.h>
(__pthread_init_static_tls): New prototype.
* htl/pt-alloc.c (__pthread_init_static_tls): New function.
* sysdeps/mach/hurd/htl/pt-sysdep.c (_init_routine): Initialize tcb
field of initial thread. Set GL(dl_init_static_tls) to
&__pthread_init_static_tls.
2020-06-07 23:36:40 +00:00
Samuel Thibault
3944c61bdf hurd: Make read and pread64 cancellable
and add _nocancel variants.

* sysdeps/mach/hurd/pread64.c (__libc_pread64):  Call __pread64_nocancel
surrounded by enabling async cancel, to replace implementation moved to...
* sysdeps/mach/hurd/pread64_nocancel.c (__pread64_nocancel): ... here.
* sysdeps/mach/hurd/read.c (__libc_read): Call __read_nocancel surrounded by
enabling async cancel, to replace implementation moved to...
* sysdeps/mach/hurd/read_nocancel.c (__read_nocancel): ... here.
* sysdeps/mach/hurd/Makefile (sysdep_routines): Add read_nocancel and
pread64_nocancel.
* sysdeps/mach/hurd/not-cancel.h (__read_nocancel, __pread64_nocancel):
Replace macros with prototypes with a hidden proto on libc.

* sysdeps/mach/hurd/dl-sysdep.c: Include <not-cancel.h>.
(__pread64_nocancel): New alias, check that it is not hidden.
(__read_nocancel): New alias, check that it is not hidden.

* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __read_nocancel and
__pread64_nocancel.
(ld.GLIBC_2.1): Add __pread64.
(ld.GLIBC_PRIVATE): Add __read_nocancel and __pread64_nocancel.
* sysdeps/mach/hurd/i386/ld.abilist (__pread64): Add symbol.
* sysdeps/mach/hurd/i386/localplt.data (__read_nocancel, __pread64,
__pread64_nocancel): Add references.
2020-06-07 23:36:10 +00:00
Samuel Thibault
337a7b74fa hurd: Fix unwinding over interruptible RPC
* sysdeps/mach/hurd/i386/intr-msg.h (INTR_MSG_TRAP): Set CFA register to
%ecx while %esp is altered.
2020-06-07 23:36:10 +00:00
Samuel Thibault
4bab9ad854 htl: Enable but XFAIL tst-flock2, tst-signal1, tst-signal2
They need setpshared support.

* nptl/tst-flock2.c, tst-signal1.c, tst-signal2.c: Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding tests references to...
* sysdeps/pthread/Makefile: ... here.
* sysdeps/mach/hurd/i386/Makefile (test-xfail-tst-flock2,
test-xfail-tst-signal1, test-xfail-tst-signal2): Add.
2020-06-07 16:14:23 +02:00
Samuel Thibault
7b6b18319e hurd: XFAIL more tests that require setpshared support
* sysdeps/mach/hurd/i386/Makefile (test-xfail-tst-pututxline-cache,
test-xfail-tst-pututxline-lockfail, test-xfail-tst-mallocfork2): Add.
2020-06-07 15:37:33 +02:00
Samuel Thibault
e797c57f93 hurd: Briefly document in xfails the topics of the bugzilla entries
* sysdeps/mach/hurd/i386/Makefile: Add comments.
2020-06-07 15:35:12 +02:00
Samuel Thibault
d6d74ec16c htl: Enable more tests
* htl/Makefile: Remove rules adding libpthread.so and libpthread.a to link
lines.
* nptl/Makefile: Move rules adding libpthread.so and libpthread.a to
link lines to...
* sysdeps/pthread/Makefile: ... here.

* nptl/eintr.c, tst-align.c tst-align3.c tst-atfork1.c tst-backtrace1.c
tst-bad-schedattr.c tst-cancel-self-canceltype.c tst-cancel-self-cleanup.c
tst-cancel-self-testcancel.c tst-cancel1.c tst-cancel10.c tst-cancel12.c
tst-cancel14.c tst-cancel15.c tst-cancel18.c tst-cancel19.c tst-cancel2.c
tst-cancel22.c tst-cancel23.c tst-cancel26.c tst-cancel27.c tst-cancel28.c
tst-cancel3.c tst-cancel8.c tst-cancelx1.c tst-cancelx10.c tst-cancelx12.c
tst-cancelx14.c tst-cancelx15.c tst-cancelx18.c tst-cancelx2.c tst-cancelx3.c
tst-cancelx8.c tst-cleanup0.c tst-cleanup0.expect tst-cleanup1.c tst-cleanup2.c
tst-cleanup3.c tst-cleanupx0.c tst-cleanupx0.expect tst-cleanupx1.c
tst-cleanupx2.c tst-cleanupx3.c tst-clock1.c tst-create-detached.c tst-detach1.c
tst-eintr2.c tst-eintr3.c tst-eintr4.c tst-eintr5.c tst-exec1.c tst-exec2.c
tst-exec3.c tst-exit1.c tst-exit2.c tst-exit3.c tst-flock1.c tst-fork1.c
tst-fork2.c tst-fork3.c tst-fork4.c tst-getpid3.c tst-kill1.c tst-kill2.c
tst-kill3.c tst-kill4.c tst-kill5.c tst-kill6.c tst-locale1.c tst-locale2.c
tst-memstream.c tst-popen1.c tst-raise1.c tst-sem5.c tst-setuid3.c tst-signal4.c
tst-signal5.c tst-signal6.c tst-signal8.c tst-stack1.c tst-stdio1.c tst-stdio2.c
tst-sysconf.c tst-tls1.c tst-tls2.c tst-tsd1.c tst-tsd2.c tst-tsd5.c tst-tsd6.c
tst-umask1.c tst-unload.c tst-unwind-thread.c tst-vfork1.c tst-vfork1x.c
tst-vfork2.c tst-vfork2x.c: Move tests to...
* sysdeps/pthread: ... here.
Rename
tst-popen1.c -> tst-pt-popen1.c
tst-align.c -> tst-pt-align.c
tst-align3.c -> tst-pt-align3.c
tst-sysconf.c -> tst-pt-sysconf.c
tst-tls1.c -> tst-pt-tls1.c
tst-tls2.c -> tst-pt-tls2.c
tst-vfork1.c -> tst-pt-vfork1.c
tst-vfork2.c -> tst-pt-vfork2.c
to avoid conflicting with libio/tst-popen1.c, elf/tst-align.c,
posix/tst-sysconf.c, elf/tst-tls1.c, elf/tst-tls2.c, posix/tst-vfork1.c,
posix/tst-vfork2.c.

* nptl/Makefile: Move corresponding tests references and special rules to...
* sysdeps/pthread/Makefile: ... here.

* sysdeps/pthread/tst-stack1.c (do_test): Do not clamp stack size to
PTHREAD_STACK_MIN if not defined.

Tested on linux-x86_64 and hurd-i386
2020-06-07 13:35:54 +02:00
Samuel Thibault
be22a151f3 htl: Add sem_clockwait support
* sysdeps/htl/sem-timedwait.c (__sem_timedwait_internal): Add clock_id
parameter instead of hardcoding CLOCK_REALTIME.
(__sem_clockwait): New function.
(sem_clockwait): New weak alias.
* sysdeps/htl/sem-wait.c (__sem_timedwait_internal): Update declaration.
(__sem_wait): Update call to __sem_timedwait_internal.
* htl/Versions (GLIBC_2.32): Add sem_clockwait.
* sysdeps/mach/hurd/i386/libpthread.abilist (sem_clockwait): Add symbol.
* nptl/Makefile (tests): Move tst-sem5 to...
* sysdeps/pthread/Makefile (tests): ... here.
2020-06-07 03:14:49 +02:00
Samuel Thibault
02937d825a hurd: fix clearing SS_ONSTACK when longjmp-ing from sighandler
* sysdeps/i386/htl/Makefile: New file.
* sysdeps/i386/htl/tcb-offsets.sym: New file.
* sysdeps/mach/hurd/i386/Makefile [setjmp] (gen-as-const-headers): Add
signal-defines.sym.
* sysdeps/mach/hurd/i386/____longjmp_chk.S: Include tcb-offsets.h.
(____longjmp_chk): Harmonize with i386's __longjmp. Clear SS_ONSTACK
when jumping off the alternate stack.
* sysdeps/mach/hurd/i386/__longjmp.S: New file.
2020-06-06 20:24:30 +02:00
Samuel Thibault
8fcc772da8 hurd: Add pointer guard support
* sysdeps/mach/hurd/i386/tls.h (THREAD_SET_POINTER_GUARD,
THREAD_COPY_POINTER_GUARD): New macros.
2020-06-06 03:29:44 +02:00
Samuel Thibault
ecfa912f42 hurd: Add stack guard support
* sysdeps/mach/hurd/i386/tls.h (THREAD_SET_STACK_GUARD,
THREAD_COPY_STACK_GUARD): New macros
* sysdeps/mach/hurd/i386/ld.abilist (__stack_chk_guard): Remove symbol.
2020-06-06 02:04:32 +02:00
Vineet Gupta
8dbb7a08ec dl-runtime: reloc_{offset,index} now functions arch overide'able
The existing macros are fragile and expect local variables with a
certain name. Fix this by defining them as functions with default
implementation in a new header dl-runtime.h which arches can override
if need be.

This came up during ARC port review, hence the need for argument pltgot
in reloc_index() which is not needed by existing ports.

This patch potentially only affects hppa/x86 ports,
build tested for both those configs and a few more.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-06-05 13:45:46 -07:00
Paul E. Murphy
a23bd00f9d powerpc64le: add optimized strlen for P9
This started as a trivial change to Anton's rawmemchr.  I got
carried away.  This is a hybrid between P8's asympotically
faster 64B checks with extremely efficient small string checks
e.g <64B (and sometimes a little bit more depending on alignment).

The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This allieviates the need to check page boundaries.

Finally, explicly use the P7 strlen with the runtime loader when building
P9.  We need to be cautious about vector/vsx extensions here on P9 only
builds.
2020-06-05 15:30:00 -05:00
Paul E. Murphy
6ef4227509 powerpc64le: use common fmaf128 implementation
This defines the macro such that it should behave best on all
supported powerpc targets.  Likewise, this allows us to remove the
ppc64le specific s_fmaf128.c.

I have verified powerpc64le multiarch and powerpc64le power9
no-multiarch builds continue to generate optimize fmaf128.
2020-06-05 15:29:44 -05:00
H.J. Lu
f607047668 Update HP_TIMING_NOW for _ISOMAC in sysdeps/generic/hp-timing.h
commit e9698175b0
Author: Lukasz Majewski <lukma@denx.de>
Date:   Mon Mar 16 08:31:41 2020 +0100

    y2038: Replace __clock_gettime with __clock_gettime64

breaks benchtests with sysdeps/generic/hp-timing.h:

In file included from ./bench-timing.h:23,
                 from ./bench-skeleton.c:25,
                 from
/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/benchtests/bench-rint.c:45:
./bench-skeleton.c: In function ‘main’:
../sysdeps/generic/hp-timing.h:37:23: error: storage size of ‘tv’ isn’t known
   37 |   struct __timespec64 tv;      \
      |                       ^~

Define HP_TIMING_NOW with clock_gettime in sysdeps/generic/hp-timing.h
if _ISOMAC is defined.  Don't define __clock_gettime in bench-timing.h
since it is no longer needed.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-05 09:44:06 -07:00
Adhemerval Zanella
6f10ff02cb powerpc: Fix powerpc64le due a7a3435c9a
The build uses an undefined macro evaluation for fmaf128 build.
For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0.

Checked with a build for:

  powerpc64le-linux-gnu-power9-disable-multi-arch
  powerpc64le-linux-gnu-power9
  powerpc64le-linux-gnu
  powerpc64-linux-gnu-power8
  powerpc64-linux-gnu
  powerpc-linux-gnu-power4
  powerpc-linux-gnu
2020-06-04 09:05:41 -03:00
Vineet Gupta
a7a3435c9a powerpc/fpu: use generic fma functions
Tested with build-many-glibcs for powerpc-linux-gnu

This is a non functional change and powerpc libm before/after was
byte invariant as compared below:

| cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline
| for i in `find . -name libm-2.31.9000.so`; do
|   echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ;
|   echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so
| 0

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:33 -07:00
Vineet Gupta
c9feb1be93 aarch/fpu: use generic builtins based math functions
introduce sysdep header math-use-builtins.h to replace aarch64
implementations with corresponding generic ones.

 - newly inroduced generic sqrt{,f}, fma{,f}
 - existing floor{,f}, nearbyint{,f}, rint{,f}, round{,f}, trunc{,f}
 - Note that generic copysign was already enabled (via generic
   math-use-builtins.h) now thru sysdep header

Tested with build-many-glibcs for aarch64-linux-gnu

This is a non functional change and aarch64 libm before/after was
byte invariant as compared below:

| cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline
| for i in `find . -name libm-2.31.9000.so`; do
|   echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ;
|   echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so
| 0

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-06-03 10:23:33 -07:00
Vineet Gupta
628d90c5f9 ieee754: provide gcc builtins based generic fma functions
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:28 -07:00
Vineet Gupta
3374868668 ieee754: provide gcc builtins based generic sqrt functions
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:22 -07:00
Florian Weimer
ba9f6ee9bb Linux: Use __pthread_attr_setsigmask_internal for timer helper thread
timer_create needs to create threads with all signals blocked,
including SIGTIMER (which happens to equal SIGCANCEL).

Fixes commit b3cae39dcb ("nptl: Start
new threads with all signals blocked [BZ #25098]").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-02 11:59:26 +02:00
Florian Weimer
ec41af45a6 nptl: Add pthread_attr_setsigmask_np, pthread_attr_getsigmask_np
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2020-06-02 11:59:18 +02:00
Florian Weimer
7538d46113 nptl: Make pthread_attr_t dynamically extensible
This introduces the function __pthread_attr_extension to allocate the
extension space, which is freed by pthread_attr_destroy.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-02 11:54:58 +02:00
Samuel Thibault
8c64cc78bc htl: Fix gsync_wait symbol exposition
* sysdeps/htl/pt-cond-destroy.c (__pthread_cond_destroy): Call
__gsync_wait instead of gsync_wait.
2020-06-01 22:22:03 +02:00
Samuel Thibault
8081702460 htl: Make pthread_cond_destroy wait for threads to be woken
This allows to reuse the storage after calling pthread_cond_destroy.

* sysdeps/htl/bits/types/struct___pthread_cond.h (__pthread_cond):
Replace unused struct __pthread_condimpl *__impl field with unsigned int
__wrefs.
(__PTHREAD_COND_INITIALIZER): Update accordingly.
* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Register as waiter in __wrefs field. On unregistering, wake any pending
pthread_cond_destroy.
* sysdeps/htl/pt-cond-destroy.c (__pthread_cond_destroy): Register wake
request in __wrefs.
* nptl/Makefile (tests): Move tst-cond20 tst-cond21 to...
* sysdeps/pthread/Makefile (tests): ... here.
* nptl/tst-cond20.c nptl/tst-cond21.c: Move to...
* sysdeps/pthread/tst-cond20.c sysdeps/pthread/tst-cond21.c: ... here.
2020-06-01 17:38:31 +00:00
Samuel Thibault
a3e589d1f6 htl: Enable more cond tests
* nptl/Makefile (tests): Move tst-cond11 and tst-cond27 to...
* sysdeps/pthread/Makefile (tests): ... here.
2020-06-01 17:38:31 +00:00
Samuel Thibault
3478859281 tst-cond11: Fix build with _SC_MONOTONIC_CLOCK > 0
* sysdeps/pthread/tst-cond11.c (do_test): Fix misplaced brace.
2020-06-01 17:38:31 +00:00
Samuel Thibault
6544999083 hurd: Fix fexecve
* sysdeps/mach/hurd/fexecve.c (fexecve): Re-lookup fd with O_EXEC before
calling _hurd_exec_paths.
2020-05-28 23:30:57 +00:00
Florian Weimer
cc0118983a i386: Remove unused file sysdeps/unix/i386/sysdep.S
Linux overrides this file via sysdeps/unix/sysv/linux/i386/sysdep.c.
Hurd does not have sysdeps/unix/i386 on its search path, so it uses
csu/sysdep.c instead.
2020-05-28 13:44:29 +02:00
Samuel Thibault
c318f663bd hurd: fix ptsname error when called on a non-tty
* sysdeps/mach/hurd/ptsname.c (__ptsname_internal): Replace
not-supported errors from __term_get_peername with ENOTTY.
2020-05-28 10:22:36 +00:00
Samuel Thibault
94884ff506 hurd: Fix fdopendir checking for directory type
* sysdeps/mach/hurd/fdopendir.c (__fdopendir): Lookup "./" instead of
"/" that would designate the root of the filesystem.
2020-05-28 10:15:33 +00:00
Florian Weimer
fff30716a7 i386: Remove NO_TLS_DIRECT_SEG_REFS handling
This was needed for 32-bit PV Xen, which has been superseded by this
point according to Xen developers.
2020-05-28 11:53:08 +02:00
Florian Weimer
6321f9e5e8 Hurd: Move <hurd/sigpreempt.h> internals into wrapper header
_hurdsig_preemptors and _hurdsig_preempted_set are not ABI symbols,
so do not declare them.  HURD_PREEMPT_SIGNAL_P is an implementation
detail, so move it as well.

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2020-05-28 11:40:13 +02:00
Florian Weimer
a9175662f8 Hurd: Use __sigmask in favor of deprecated sigmask
This fixes various build errors due to deprecation warnings.

Fixes commit 02802fafcf
("signal: Deprecate additional legacy signal handling functions").

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2020-05-28 11:40:13 +02:00
Adhemerval Zanella
ef3330fde4 linux: Use internal DIR locks when accessing filepos on telldir
Since it might change during a readdir call.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-05-27 11:55:00 -03:00
Samuel Thibault
415d0b0b3f Update i386 libm-test-ulps 2020-05-26 13:21:57 +02:00
Samuel Thibault
28cada0418 htl: Add clock variants
* htl/pt-join.c (__pthread_join): Move implementation to...
(__pthread_join_common): ... new function. Add try, timed and clock support.
(__pthread_join): Reimplement on top of __pthread_join_common.
(__pthread_tryjoin_np, __pthread_timedjoin_np, __pthread_clockjoin_np):
Implement on top of __pthread_join_common.
(pthread_tryjoin_np, pthread_timedjoin_np, pthread_clockjoin_np): New
aliases.

* hurd/hurdlock.c (__lll_abstimed_wait, __lll_abstimed_xwait,
__lll_abstimed_lock): Check for supported clock.

* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Add clockid parameter and support it.
(__pthread_cond_timedwait): Pass -1 as clockid.
(__pthread_cond_clockwait): New function.
(pthread_cond_clockwait): New alias.
* sysdeps/htl/pt-cond-wait.c (__pthread_cond_timedwait_internal): Update
prototype.
(__pthread_cond_wait): Pass -1 as clockid.

* sysdeps/htl/pt-rwlock-timedrdlock.c
(__pthread_rwlock_timedrdlock_internal): Add clockid parameter, and
support id.
(__pthread_rwlock_clockrdlock): New function.
(pthread_rwlock_clockrdlock): New alias.
* sysdeps/htl/pt-rwlock-rdlock.c (__pthread_rwlock_timedrdlock_internal): Update
prototype.
(__pthread_rwlock_rdlock): Pass -1 as clockid.

* sysdeps/htl/pt-rwlock-timedwrlock.c
(__pthread_rwlock_timedwrlock_internal): Add clockid parameter, and
support id.
(__pthread_rwlock_clockwrlock): New function.
(pthread_rwlock_clockwrlock): New alias.
* sysdeps/htl/pt-rwlock-wrlock.c (__pthread_rwlock_timedwrlock_internal): Update
prototype.
(__pthread_rwlock_wrlock): Pass -1 as clockid.

* sysdeps/mach/hurd/htl/pt-mutex-timedlock.c (__pthread_mutex_timedlock): Move implementation to
(__pthread_mutex_clocklock): New function with additional clockid
parameter and support it.
(pthread_mutex_clocklock): New alias.
(__pthread_mutex_timedlock): Reimplement on top of __pthread_mutex_clocklock.

* sysdeps/htl/pthread.h (pthread_tryjoin_np, pthread_timedjoin_np,
pthread_clockjoin_np, pthread_mutex_clocklock, pthread_cond_clockwait,
pthread_rwlock_clockrdlock, pthread_rwlock_clockwrlock): New prototypes.
* sysdeps/htl/pthreadP.h (__pthread_cond_clockwait): New prototype.

* htl/Versions (GLIBC_2.32): Add pthread_cond_clockwait,
pthread_mutex_clocklock, pthread_rwlock_clockrdlock, pthread_rwlock_clockwrlock,
pthread_tryjoin_np, pthread_timedjoin_np, pthread_clockjoin_np.
* sysdeps/mach/hurd/i386/libpthread.abilist (pthread_clockjoin_np,
pthread_cond_clockwait, pthread_mutex_clocklock, pthread_rwlock_clockrdlock,
pthread_rwlock_clockwrlock, pthread_timedjoin_np, pthread_tryjoin_np):
New functions.

* nptl/tst-abstime.c, nptl/tst-join10.c, nptl/tst-join11.c, nptl/tst-join12.c,
nptl/tst-join13.c, nptl/tst-join14.c, nptl/tst-join2.c, nptl/tst-join3.c,
nptl/tst-join8.c, nptl/tst-join9.c, nptl/tst-mutex-errorcheck.c,
nptl/tst-pthread-mutexattr.c, nptl/tst-mutex11.c, nptl/tst-mutex5.c,
nptl/tst-mutex7.c, nptl/tst-mutex7robus.c, nptl/tst-mutex9.c,
nptl/tst-rwlock12.c, nptl/tst-rwlock14.c: Move to sysdeps/pthread.
* sysdeps/pthread/tst-mutex8.c: Move back to nptl.
* nptl/Makefile (tests): Move tst-mutex5, tst-mutex7, tst-mutex7robust,
tst-mutex9, tst-mutex11, tst-rwlock12, tst-rwlock14, tst-join2, tst-join3,
tst-join8, tst-join9 tst-join10, tst-join11, tst-join12, tst-join13, tst-join14,
tst-abstime, tst-mutex-errorcheck, tst-pthread-mutexattr to ...
* sysdeps/pthread/Makefile (tests): ... here.
2020-05-26 00:46:07 +00:00
Florian Weimer
de42613540 elf: Turn _dl_printf, _dl_error_printf, _dl_fatal_printf into functions
This change makes it easier to set a breakpoint on these calls.

This also addresses the issue that including <ldsodefs.h> without
<unistd.h> does not result usable _dl_*printf macros because of the
use of the STD*_FILENO macros there.

(The private symbol for _dl_fatal_printf will go away again
once the exception handling implementation is unified between
libc and ld.so.)

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-25 18:17:27 +02:00
H.J. Lu
76d5b2f002 x86: Update Intel Atom processor family optimization
Enable Intel Silvermont optimization for Intel Goldmont Plus.  Detect more
Intel Airmont processors.  Optimize Intel Tremont like Intel Silvermont
with rep string instructions.
2020-05-21 13:36:54 -07:00
Florian Weimer
331c6e8a18 nptl: Add __pthread_attr_copy for copying pthread_attr_t objects
Also add the private type union pthread_attr_transparent, to reduce
the amount of casting that is required.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2020-05-20 20:28:44 +02:00
Florian Weimer
07a73d5219 nptl: Move pthread_gettattr_np into libc
This is part of the libpthread removal project:

    <https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>

Use __getline instead of __getdelim to avoid a localplt failure.
Likewise for __getrlimit/getrlimit.

The abilist updates were performed by:

git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
  | while read x ; do
    echo "GLIBC_2.32 pthread_getattr_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py --only-linux pthread_getattr_np

The private export of __pthread_getaffinity_np is no longer needed, but
the hidden alias still necessary so that the symbol can be exported with
versioned_symbol.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2020-05-20 20:27:49 +02:00
Florian Weimer
52302bc298 nptl: Move pthread_getaffinity_np into libc
This is part of the libpthread removal project:

    <https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>

The abilist updates were performed by:

git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
  | while read x ; do
    echo "GLIBC_2.32 pthread_getaffinity_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py pthread_getaffinity_np

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2020-05-20 20:23:20 +02:00
Florian Weimer
1979819d68 nptl: Move pthread_attr_setaffinity_np into libc
This is part of the libpthread removal project:

    <https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>

The symbol did not previously exist in libc, so a new GLIBC_2.32
symbol is needed, to get correct dependency for binaries which
use the symbol but no longer link against libpthread.

The abilist updates were performed by:

git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
  | while read x ; do
    echo "GLIBC_2.32 pthread_attr_setaffinity_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py pthread_attr_setaffinity_np

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2020-05-20 20:22:59 +02:00
Florian Weimer
714da1d4ea nptl: Replace some stubs with the Linux implementation
The stubs for pthread_getaffinity_np, pthread_getname_np,
pthread_setaffinity_np, pthread_setname_np are replaced, and corresponding
tests are moved.

After the removal of the NaCl port, nptl is Linux-specific, and the stubs
are no longer needed.  This effectively reverts commit
c76d1ff514 ("NPTL: Add stubs for Linux-only
extension functions.").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-05-20 20:22:31 +02:00
Florian Weimer
b6ad64b907 Linux: Add missing handling of tai field to __ntp_gettime64
This fixes a build error:

../sysdeps/unix/sysv/linux/ntp_gettime.c: In function ‘__ntp_gettime’:
../sysdeps/unix/sysv/linux/ntp_gettime.c:56:10: error: ‘ntv64.tai’ is used uninitialized in this function [-Werror=uninitialized]
   56 |   *ntv = valid_ntptimeval64_to_ntptimeval (ntv64);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-05-20 17:58:56 +02:00
Lukasz Majewski
e9698175b0 y2038: Replace __clock_gettime with __clock_gettime64
The __clock_gettime internal function is not supporting 64 bit time on
architectures with __WORDSIZE == 32 and __TIMESIZE != 64 (like e.g. ARM 32
bit).

The __clock_gettime64 function shall be used instead in the glibc itself as
it supports 64 bit time on those systems.
This patch does not bring any changes to systems with __WORDSIZE == 64 as
for them the __clock_gettime64 is aliased to __clock_gettime (in
./include/time.h).
2020-05-20 16:45:16 +02:00
Lukasz Majewski
4c4fc04826 y2038: linux: Provide __ntp_gettimex64 implementation
This patch provides new __ntp_gettimex64 explicit 64 bit function for getting
time parameters via NTP interface.

The call to __adjtimex in __ntp_gettime64 function has been replaced with
direct call to __clock_adjtime64 syscall, to simplify the code.

Moreover, a 32 bit version - __ntp_gettimex has been refactored to internally
use __ntp_gettimex64.

The __ntp_gettimex is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
ntptimeval and 64 bit struct __ntptimeval64.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __ntp_gettimex64 and __ntp_gettimex.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-20 01:09:16 +02:00
Lukasz Majewski
5613afe9e3 y2038: linux: Provide __ntp_gettime64 implementation
This patch provides new __ntp_gettime64 explicit 64 bit function for getting
time parameters via NTP interface.

Internally, the __clock_adjtime64 syscall is used instead of __adjtimex. This
patch is necessary for having architectures with __WORDSIZE == 32 Y2038 safe.

Moreover, a 32 bit version - __ntp_gettime has been refactored to internally
use __ntp_gettime64.

The __ntp_gettime is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
ntptimeval and 64 bit struct __ntptimeval64.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __ntp_gettime64 and __ntp_gettime.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-20 01:09:09 +02:00
Lukasz Majewski
10ae49d2ce y2038: Provide conversion helpers for struct __ntptimeval64
Those functions allow easy conversion between Y2038 safe, glibc internal
struct __ntptimeval64 and struct ntptimeval.

The reserved fields (i.e. __glibc_reserved{1234}) during conversion are
zeroed as well, to provide behavior similar to one in ntp_gettimex function
(where those are cleared before the struct ntptimeval is returned).

Those functions are put in Linux specific sys/timex.h file, as putting
them into glibc's local include/time.h would cause build break on HURD as
it doesn't support struct timex related syscalls.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-20 01:03:27 +02:00
Lukasz Majewski
df4289508a y2038: Introduce struct __ntptimeval64 - new internal glibc type
This type is a glibc's "internal" type to get time parameters data from
Linux kernel (NTP daemon interface). It stores time in struct __timeval64
rather than struct timeval, which makes it Y2038-proof.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-05-20 01:03:27 +02:00
Lukasz Majewski
0308077e3a y2038: linux: Provide __adjtime64 implementation
This patch provides new __adjtime64 explicit 64 bit function for adjusting
Linux kernel clock.

Internally, the __clock_adjtime64 syscall is used instead of __adjtimex. This
patch is necessary for having architectures with __WORDSIZE == 32 Y2038 safe.

Moreover, a 32 bit version - __adjtime has been refactored to internally use
__adjtime64.

The __adjtime is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
timeval and 64 bit struct __timeval64.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __adjtime64 and __adjtime.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-20 01:03:26 +02:00
Lukasz Majewski
8f8a6cae48 y2038: linux: Provide ___adjtimex64 implementation
This patch provides new ___adjtimex64 explicit 64 bit function for adjusting
Linux kernel clock.

Internally, the __clock_adjtime64 syscall is used. This patch is necessary
for having architectures with __WORDSIZE == 32 Y2038 safe.

Moreover, a 32 bit version - ___adjtimex has been refactored to internally
use ___adjtimex64.

The ___adjtimex is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
timex and 64 bit struct __timex64.

Last but not least, in ___adjtimex64 function the __clock_adjtime syscall has
been replaced with __clock_adjtime64 to support 64 bit time on architectures
with __WORDSIZE == 32 and __TIMESIZE != 64.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both ___adjtimex64 and ___adjtimex.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-20 01:03:26 +02:00