glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-27 05:00:15 +00:00

Author	SHA1	Message	Date
Zong Li	30b963c143	RISC-V: Add rv32 path to RTLDLIST in ldd Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:43 -07:00
Alistair Francis	7a55dd3fb6	riscv32: Specify the arch_minimum_kernel as 5.4 Specify the minimum kernel version for RISC-V 32-bit as the 5.4 kernel. We require this commit: "waitid: Add support for waiting for the current process group" for the kernel as it adds support for the P_PGID id for the waitid syscall. Without this patch we can't replace the wait4 syscall on 64-bit time_t only systems. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:43 -07:00
Zong Li	2ed993ada6	RISC-V: Fix llrint and llround missing exceptions on RV32 Conversions from a float to a long long on 32-bit RISC-V (RV32) may not raise the correct exceptions on overflow, it also may raise spurious "inexact" exceptions on non overflow cases. This patch fixes the problem, similarly to the fix for MIPS, ARM and S390. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:43 -07:00
Alistair Francis	b2d175cdb7	RISC-V: Add the RV32 libm-test-ulps Add a libm-test-ulps for RV32; this is the same as the RV64 one. This doesn't match what is generated by running `make regen-ulps` on RV32 QEMU, but the current in-tree RV64 one doesn't match that either. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:43 -07:00
Alistair Francis	5820c3731e	RISC-V: Add 32-bit ABI lists Use the update-abi Make target to generate the abilist for RV32. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:43 -07:00
Zong Li	941a55cf59	RISC-V: Add hard float support for 32-bit CPUs This patch adds support for hardware floating-point support for the RV32IF and RV32IFD platforms. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	5b6113d62e	RISC-V: Support the 32-bit ABI implementation This patch adds the ABI implementation for 32-bit RISC-V. It contains the Linux-specific and RISC-V architecture code. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	07598d7600	RISC-V: Add arch-syscall.h for RV32 Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	708b92e878	RISC-V: Add path of library directories for the 32-bit With RV32 support the list of possible RISC-V system directories increases to: - /lib64/lp64d - /lib64/lp64 - /lib32/ilp32d - /lib32/ilp32 - /lib (only ld.so) This patch changes the add_system_dir () macro to support the new ilp32d and ilp32 directories for RV32. While refactoring this code let's split out the confusing if statements into a loop to make it easier to understand and extend. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Zong Li	8041759aef	RISC-V: Support dynamic loader for the 32-bit Add the LD_SO_ABI definition for RISC-V 32-bit. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	68efae739a	RISC-V: Add support for 32-bit vDSO calls Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	7ed05adc82	RISC-V: Use 64-bit-time syscall numbers with the 32-bit port sysdep.h redefines only the syscall where the generic implementation still does not have actual 64-bit time_t support: /* Workarounds for generic code needing to handle 64-bit time_t. */ /* Fix sysdeps/unix/sysv/linux/clock_getcpuclockid.c. */ #define __NR_clock_getres __NR_clock_getres_time64 /* Fix sysdeps/nptl/lowlevellock-futex.h. */ #define __NR_futex __NR_futex_time64 [...] This patch also adds a comment that it is a workaround to handle 64-bit time_t and on each #define comment for which implementation it intends to. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:42 -07:00
Alistair Francis	4875afe552	RISC-V: Cleanup some of the sysdep.h code Remove a duplicate inclusion of <sysdeps/unix/sysdep.h> which is already pulled via <sysdeps/unix/sysv/linux/generic/sysdep.h>, and the inclusion of <errno.h> whose definition of `__set_errno' is not needed here. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:41 -07:00
Alistair Francis	2b09ebeee7	RISC-V: Use 64-bit time_t and off_t for RV32 and RV64 Using the original glibc headers under bits/ let's make small modifications to use 64-bit time_t and off_t for both RV32 and RV64. For the typesizes.h, here are justifications for the changes from the generic version (based on Arnd's very helpful feedback): - All the !__USE_FILE_OFFSET64 types (__off_t, __ino_t, __rlim_t, ...) are changed to match the 64-bit replacements. - __time_t is defined to 64 bit, but no __time64_t is added. This makes sense as we don't have the time64 support for other 32-bit architectures yet, and it will be easy to change when that happens. - __suseconds_t is 64-bit. This matches what we use the kernel ABI for the few drivers that are relying on 'struct timeval' input arguments in ioctl, as well as the adjtimex system call. It means that timeval has to be defined without the padding, unlike timespec, which needs padding. Reviewed-by: Maciej W. Rozycki <macro@wdc.com>	2020-08-27 08:17:41 -07:00
Samuel Thibault	cd41ffeb0b	hurd: define BSD 4.3 ioctls only under __USE_MISC	2020-08-27 13:36:32 +02:00
Adhemerval Zanella	f032f3af2c	linux: Simplify utimensat With arch-syscall.h it can now assume the existence of either __NR_utimensat or __NR_utimensat_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	278498a1c0	linux: Simplify timerfd_settime With arch-syscall.h it can now assume the existence of either __NR_timer_settime or __NR_time_settime_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	70746a06c2	linux: Simplify timer_gettime With arch-syscall.h it can now assume the existence of either __NR_timer_gettime or __NR_time_gettime_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	fd31691c67	linux: Simplify sched_rr_get_interval With arch-syscall.h it can now assume the existence of either __NR_sched_rr_get_interval or __NR_sched_rr_get_interval_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	3feb53bab0	linux: Simplify ppoll With arch-syscall.h it can now assume the existence of either __NR_ppoll or __NR_ppoll_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	85077eaa54	linux: Simplify mq_timedsend With arch-syscall.h it can now assume the existence of either __NR_mq_timedsend or __NR_mq_timedsend_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	1e03b6d828	linux: Simplify mq_timedreceive With arch-syscall.h it can now assume the existence of either __NR_mq_timedreceive or __NR_mq_timedreceive_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	ff6228d5c6	linux: Simplify clock_settime With arch-syscall.h it can now assume the existence of either __NR_clock_settime or __NR_clock_settime_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:31 -03:00
Adhemerval Zanella	55399535c1	linux: Simplify clock_nanosleep With arch-syscall.h it can now assume the existence of either __NR_clock_nanosleep or __NR_clock_nanosleep_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 15:04:29 -03:00
Adhemerval Zanella	d9310f33fc	linux: Simplify clock_gettime With arch-syscall.h it can now assume the existence of either __NR_clock_gettime or __NR_clock_gettime_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. It also uses the time64-support functions to simplify it further. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel).	2020-08-24 14:28:21 -03:00
Adhemerval Zanella	4f7092348d	linux: Simplify clock_adjtime With arch-syscall.h it can now assume the existence of either __NR_clock_adjtime or __NR_clock_adjtime_time64. The 32-bit time_t support is now only built for !__ASSUME_TIME64_SYSCALLS. Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15 kernel). Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-08-24 14:27:19 -03:00
Adhemerval Zanella	02c91eb611	linux: Add helper function to optimize 64-bit time_t fallback support These helper functions are used to optimize the 64-bit time_t support on configurations that require support for 32-bit time_t fallback (!__ASSUME_TIME64_SYSCALLS). The idea is that once the kernel advertises that it does not have 64-bit time_t support, glibc will stop trying to issue the 64-bit time_t syscall altogether. For instance: #ifndef __NR_symbol_time64 # define __NR_symbol_time64 __NR_symbol #endif int r; if (supports_time64 ()) { r = INLINE_SYSCALL_CALL (symbol, ...); if (r == 0 || errno != ENOSYS) return r; mark_time64_unsupported (); } #ifndef __ASSUME_TIME64_SYSCALLS <32-bit fallback syscall> #endif return r; On configurations with default 64-bit time_t this optimization should be optimized away by the compiler, resulting in no overhead.	2020-08-24 14:27:15 -03:00
Stefan Liebler	756c306502	S390: Sync HWCAP names with kernel by adding aliases [BZ #25971] Unfortunately some HWCAP names like HWCAP_S390_VX differ between kernel (see <kernel>/arch/s390/include/asm/elf.h) and glibc. Therefore, those HWCAP names from kernel are now introduced as alias	2020-08-21 11:23:17 +02:00
Joseph Myers	b3aa7976d0	Update kernel version to 5.8 in tst-mman-consts.py. This patch updates the kernel version in the test tst-mman-consts.py to 5.8. (There are no new MAP_* constants covered by this test in 5.8 that need any other header changes.) Tested with build-many-glibcs.py.	2020-08-13 18:50:24 +00:00
Lukasz Majewski	4a14cb87ca	y2038: nptl: Convert pthread_{clock|timed}join_np to support 64 bit time The pthread_clockjoin_np and pthread_timedjoin_np have been converted to support 64 bit time. This change introduces new futex_timed_wait_cancel64 function in ./sysdeps/nptl/futex-internal.h, which uses futex_time64 where possible and tries to replace low-level preprocessor macros from lowlevellock-futex.h The pthread_{timed|clock}join_np only accept absolute time. Moreover, there is no need to check for NULL passed as *abstime pointer as clockwait_tid() always passes struct __timespec64. For systems with __TIMESIZE != 64 && __WORDSIZE == 32: - Conversions between 64 bit time to 32 bit are necessary - Redirection to __pthread_{clock|timed}join_np64 will provide support for 64 bit time Build tests: ./src/scripts/build-many-glibcs.py glibcs Run-time tests: - Run specific tests on ARM/x86 32bit systems (qemu): https://github.com/lmajewski/meta-y2038 and run tests: https://github.com/lmajewski/y2038-tests/commits/master Above tests were performed with Y2038 redirection applied as well as without to test the proper usage of both __pthread_{timed|clock}join_np64 and __pthread_{timed|clock}join_np. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>	2020-08-13 14:16:34 +02:00
Szabolcs Nagy	12b2fd0ef9	aarch64: update ulps. For new j0 test.	2020-08-13 13:02:35 +01:00
Stefan Liebler	0be0845b7a	S390: Regenerate ULPs. Updates needed after new j0 test: commit `9bfc225078` math: Regenerate auto-libm-test-out-j0	2020-08-12 16:23:12 +02:00
Adhemerval Zanella	5ff35e9544	math: Update x86_64 ulps From new j0 test.	2020-08-08 16:43:11 -03:00
Florian Weimer	3d3ab573a5	Linux: Use faccessat2 to implement faccessat (bug 18683) This provides correct AT_EACCESS handling and also takes Linux security modules into account. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-08-07 22:06:59 +02:00
Paul Zimmermann	b7dd366dbe	math: Fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tiny Checked on x86_64-linux-gnu and i686-linux-gnu.	2020-08-07 16:33:13 -03:00
Joseph Myers	1cfb471528	Update syscall lists for Linux 5.8. Linux 5.8 has one new syscall, faccessat2. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.	2020-08-07 14:38:43 +00:00
Samuel Thibault	ac54c335e9	htl: Enable tst-cancelx?[45] * nptl/{tst-cancel4-common.c, tst-cancel4-common.h, tst-cancel4.c, tst-cancel5.c, tst-cancelx4.c, tst-cancelx5.c}: Move to sysdeps/pthread/ * nptl/Makefile: Move corresponding rules to... * sysdeps/pthread/Makefile: ... here.	2020-08-06 23:38:35 +00:00
Samuel Thibault	4ebd73d43f	hurd: Add missing hidden def * sysdeps/mach/hurd/sched_gets.c (__sched_getscheduler): Add hidden def.	2020-08-06 20:14:01 +02:00
Samuel Thibault	8c6beab4e1	hurd: Rework sbrk Making the brk start exactly at the end of the main application binary was requiring to get it through the _end symbol, which does not work any more with recent toolchains, and actually produces in libc.so a confusing external _end symbol that produces odd results, see https://sourceware.org/bugzilla/show_bug.cgi?id=23499 Trying to do so is quite outdated anyway with the tendency for address randomization. Using _end was also allowing to include the main binary data within the RLIMIT_DATA, but this also seems outdated with dynamic library loading, and nowadays' memory consumption via malloc and mmap rather than statically-allocated data. This adds a BRK_START macro in <vm_param.h> that just tells where we want to start the brk, and thus removes the _end symbol. * sysdeps/mach/hurd/i386/vm_param.h: New file. * sysdeps/mach/hurd/brk.c: Use BRK_START as brk start instead of _end. Also ignore __data_start. * hurd/Versions: Remove _end symbol. * sysdeps/mach/hurd/i386/libc.abilist: Remove _end symbol.	2020-08-05 23:52:04 +02:00
Samuel Thibault	ce62504488	hurd: Implement basic sched_get/setscheduler * sysdeps/mach/hurd/sched_gets.c: New file. * sysdeps/mach/hurd/sched_sets.c: New file.	2020-08-05 23:46:14 +02:00
H.J. Lu	ac3bda9a25	x86: Rename Intel CPU feature names Intel64 and IA-32 Architectures Software Developer’s Manual has changed the following CPU feature names: 1. The CPU feature of Enhanced Intel SpeedStep Technology is renamed from EST to EIST. 2. The CPU feature which supports Platform Quality of Service Monitoring (PQM) capability is changed to Intel Resource Director Technology (Intel RDT) Monitoring capability, i.e. PQM is renamed to RDT_M. 3. The CPU feature which supports Platform Quality of Service Enforcement (PQE) capability is changed to Intel Resource Director Technology (Intel RDT) Allocation capability, i.e. PQE is renamed to RDT_A.	2020-08-05 11:48:46 -07:00
Carlos O'Donell	6d403f2e1b	Regenerate configure scripts.	2020-08-04 21:36:19 -04:00
Maciej W. Rozycki	45069ac2a9	RISC-V: Update lp64d libm-test-ulps according to HiFive Unleashed Produced with HiFive Unleashed hardware using Linux 5.8-rc5 exactly and GCC 10.0.1 20200426. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-08-04 13:00:17 +01:00
Florian Weimer	7650321ce0	powerpc: Fix incorrect cache line size load in memset (bug 26332) __GLRO loaded the word after the requested variable on big-endian PowerPC, where LOWORD is 4. This can cause the memset implementation to go wrong because the masking with the cache line size produces wrong results, particularly if the loaded value happens to be 1. The __GLRO macro is not used in any place where loading the lower 32-bit word of a 64-bit value is desired, so the +4 offset is always wrong. Fixes commit `18363b4f01` ("powerpc: Move cache line size to rtld_global_ro") and bug 26332. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-08-03 18:07:19 +02:00
Chung-Lin Tang	783fdd969f	Update Nios II libm-test-ulps file.	2020-08-03 01:42:48 -07:00
Szabolcs Nagy	2dc33b928b	aarch64: Use future HWCAP2_MTE in ifunc resolver Make glibc MTE-safe on systems where MTE is available. This allows using heap tagging with an LD_PRELOADed malloc implementation that enables MTE. We don't document this as guaranteed contract yet, so glibc may not be MTE safe when HWCAP2_MTE is set (older glibcs certainly aren't). This is mainly for testing and debugging. The HWCAP flag is not exposed in public headers until Linux adds it to its uapi. The HWCAP value reservation will be in Linux 5.9.	2020-07-27 12:54:22 +01:00
Andreas K. Hüttel	180d5a045f	Update x86-64 libm-test-ulps x86_64 Intel(R) Core(TM) i5-8265U gcc (Gentoo 10.1.0-r2 p3) 10.1.0 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-25 17:10:53 -04:00
Szabolcs Nagy	7ebd114211	aarch64: Respect p_flags when protecting code with PROT_BTI Use PROT_READ and PROT_WRITE according to the load segment p_flags when adding PROT_BTI. This is before processing relocations which may drop PROT_BTI in case of textrels. Executable stacks are not protected via PROT_BTI either. PROT_BTI is hardening in case memory corruption happened; its value is reduced if there is writable and executable memory available, so missing it on such memory is fine, but we should respect the p_flags and should not drop PROT_WRITE.	2020-07-24 08:52:22 +01:00
Tulio Magno Quites Machado Filho	f6add169c8	powerpc: Fix POWER10 selection Add a line that was missing from a previous commit. Without increasing str, the null-byte is not validated, and _dl_string_platform returns -1. Fixes: `d2ba3677da` ("powerpc: Add support for POWER10") Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-21 18:01:39 -03:00
Paul E. Murphy	c79607a474	powerpc64le: guarantee a .gnu.attributes section [BZ #26220] Upstream GCC 11 development is now building the ibm128 runtime support (in libgcc) without a .gnu.attributes section on ppc64le. Ensure we have one to replace by building one ibm128 file in libc and libm with attributes. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2020-07-21 09:03:01 -05:00
Joseph Myers	469c03907b	Update powerpc-nofpu libm-test-ulps.	2020-07-20 20:16:25 +00:00
Samuel Thibault	5baad9a6f9	hurd: Fix longjmp check for sigstate * sysdeps/mach/hurd/i386/____longjmp_chk.S,__longjmp.S: Properly check for sigstate being NULL.	2020-07-18 15:12:56 +02:00
Samuel Thibault	115bcf921a	hurd: Fix longjmp early in initialization When e.g. an LD_PRELOAD fails, _dl_signal_exception/error longjmps, but TLS is not initialized yet, let alone signal state. We thus mustn't look at them within __longjmp. * sysdeps/mach/hurd/i386/____longjmp_chk.S,__longjmp.S: Check for initialized value of %gs, and that sigstate is non-NULL.	2020-07-18 15:08:03 +02:00
Wilco Dijkstra	f46ef33ad1	AArch64: Improve strlen_asimd performance (bug 25824) Optimize strlen using a mix of scalar and SIMD code. On modern micro architectures large strings are 2.6 times faster than existing strlen_asimd and 35% faster than the new MTE version of strlen. On a random strlen benchmark using small sizes the speedup is 7% vs strlen_asimd and 40% vs the MTE strlen. This fixes the main strlen regressions on Cortex-A53 and other cores with a simple Neon unit. Rename __strlen_generic to __strlen_mte, and select strlen_asimd when MTE is not enabled (this is waiting on support for a HWCAP_MTE bit). This fixes big-endian bug 25824. Passes GLIBC regression tests. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2020-07-17 15:07:23 +01:00
Florian Weimer	efedd1ed3d	Linux: Remove rseq support The kernel ABI is not finalized, and there are now various proposals to change the size of struct rseq, which would make the glibc ABI dependent on the version of the kernels used for building glibc. This is of course not acceptable. This reverts commit `48699da1c4` ("elf: Support at least 32-byte alignment in static dlopen"), commit `8f4632deb3` ("Linux: rseq registration tests"), commit `6e29cb3f61` ("Linux: Use rseq in sched_getcpu if available"), and commit `0c76fc3c2b` ("Linux: Perform rseq registration at C startup and thread creation"), resolving the conflicts introduced by the ARC port and the TLS static surplus changes. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-16 17:55:35 +02:00
Aurelien Jarno	7b5f02dc2a	arm: remove string/tst-memmove-overflow XFAIL The arm string/tst-memmove-overflow XFAIL has been added in commit `eca1b23332` ("arm: XFAIL string/tst-memmove-overflow due to bug 25620") as a way to reproduce the reported bug. Now that this bug has been fixed in commits `79a4fa341b` ("arm: CVE-2020-6096: fix memcpy and memmove for negative length [BZ #25620]") and `beea361050` ("arm: CVE-2020-6096: Fix multiarch memcpy for negative length [BZ #25620]"), let's remove the XFAIL. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-16 06:56:52 +02:00
Wilco Dijkstra	0f6278a879	AArch64: Rename IS_ARES to IS_NEOVERSE_N1 Rename IS_ARES to IS_NEOVERSE_N1 since that is a bit clearer. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-15 16:58:07 +01:00
Wilco Dijkstra	4a733bf375	AArch64: Add optimized Q-register memcpy Add a new memcpy using 128-bit Q registers - this is faster on modern cores and reduces codesize. Similar to the generic memcpy, small cases include copies up to 32 bytes. 64-128 byte copies are split into two cases to improve performance of 64-96 byte copies. Large copies align the source rather than the destination. bench-memcpy-random is ~9% faster than memcpy_falkor on Neoverse N1, so make this memcpy the default on N1 (on Centriq it is 15% faster than memcpy_falkor). Passes GLIBC regression tests. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2020-07-15 16:55:07 +01:00
Wilco Dijkstra	34f0d01d5e	AArch64: Align ENTRY to a cacheline Given almost all uses of ENTRY are for string/memory functions, align ENTRY to a cacheline to simplify things. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-15 16:50:02 +01:00
Petr Vorel	5500cdba40	Remove --enable-obsolete-rpc configure flag Sun RPC was removed from glibc. This includes rpcgen program, librpcsvc, and Sun RPC headers. Also test for bug #20790 was removed (test for rpcgen). Backward compatibility for old programs is kept only for architectures and ABIs that have been added in or before version 2.28. libtirpc is mature enough, librpcsvc and rpcgen are provided in rpcsvc-proto project. NOTE: libnsl code depends on Sun RPC (installed libnsl headers use installed Sun RPC headers), thus --enable-obsolete-rpc was a dependency for --enable-obsolete-nsl (removed in a previous commit). The arc ABI list file has to be updated because the port was added with the sunrpc symbols Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-13 19:36:35 +02:00
Adhemerval Zanella	3486924dc7	hurd: Fix build-many-glibcs.py It fixes the issue report by Joseph [1]. Checked with a build-many-glibcs.py build for i686-gnu. [1] https://sourceware.org/pipermail/libc-alpha/2020-July/116134.html	2020-07-13 14:25:03 -03:00
H.J. Lu	107e6a3c22	x86: Support usable check for all CPU features Support usable check for all CPU features with the following changes: 1. Change struct cpu_features to struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; so that there is a usable bit for each cpuid bit. 2. After the cpuid bits have been initialized, copy the known bits to the usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for CPU feature detection. 3. Clear the usable bits which require OS support. 4. If the feature is supported by OS, copy its cpuid bit to its usable bit. 5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE and CPU_FEATURE_USABLE_P to check if a feature is usable. 6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13. 7. Unset MPX feature since it has been deprecated. The results are 1. If the feature is known and doesn't require OS support, its usable bit is copied from the cpuid bit. 2. Otherwise, its usable bit is copied from the cpuid bit only if the feature is known to be supported by the OS. 3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the feature can be used. 4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports the feature.	2020-07-13 06:05:16 -07:00
H.J. Lu	43530ba1dc	x86: Remove __ASSEMBLER__ check in init-arch.h Since commit `430388d5dc` Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Aug 3 08:04:49 2018 -0700 x86: Don't include <init-arch.h> in assembly codes removed all usages of <init-arch.h> from assembly codes, we can remove __ASSEMBLER__ check in init-arch.h.	2020-07-11 10:03:05 -07:00
H.J. Lu	9016b6f389	x86: Remove the unused __x86_prefetchw Since commit `c867597bff` Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 8 13:57:50 2016 -0700 X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove removed the only usage of __x86_prefetchw, we can remove the unused __x86_prefetchw.	2020-07-11 09:34:03 -07:00
Vineet Gupta	0be8ae3679	ARC: Build Infrastructure Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:45 -07:00
Vineet Gupta	33ff7b3988	ARC: ABI lists Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	c86a9483f4	ARC: Linux Startup and Dynamic Loading A big shoutout to Cupertino Miranda <cmiranda@synopsys.com> for his valuable contribution in initial bringup and debugging on Linux and later in solving pesky unwinding/cancelation failures in testsuite. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	e5ccf113cd	ARC: Linux ABI Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	add5071a5c	ARC: Linux Syscall Interface Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	3ab8611a22	ARC: hardware floating point support Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	fd9dec20c8	ARC: math soft float support Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	9679dd5ecd	ARC: Atomics and Locking primitives Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	0261315289	ARC: Thread Local Storage support This includes all 4 TLS addressing models Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	dd2e6ef179	ARC: startup and dynamic linking code Code for C runtime startup and dynamic loading including PLT layout. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Vineet Gupta	0e7d930c4c	ARC: ABI Implementation This code deals with the ARC ABI. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-10 16:08:44 -07:00
Tulio Magno Quites Machado Filho	7c7bcf3634	powerpc64: Fix calls when r2 is not used [BZ #26173] Teach the linker that __mcount_internal, __sigjmp_save_symbol, __syscall_error and __GI_exit do not use r2, so that it does not need to recover r2 after the call. Test at configure time if the assembler supports @notoc and define USE_PPC64_NOTOC.	2020-07-10 19:41:06 -03:00
Patsy Franklin	b21c2c24ed	Update i686 libm-test-ulps Without my ULP patch these 18 tests fail on i686: https://koji.fedoraproject.org/koji/taskinfo?taskID=46467301 + cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel Xeon Processor (Cascadelake) FAIL: math/test-double-j0 FAIL: math/test-double-y0 FAIL: math/test-float-erfc FAIL: math/test-float-j0 FAIL: math/test-float-j1 FAIL: math/test-float-lgamma FAIL: math/test-float-tgamma FAIL: math/test-float-y0 FAIL: math/test-float32-erfc FAIL: math/test-float32-j0 FAIL: math/test-float32-j1 FAIL: math/test-float32-lgamma FAIL: math/test-float32-tgamma FAIL: math/test-float32-y0 FAIL: math/test-float32x-j0 FAIL: math/test-float32x-y0 FAIL: math/test-float64-j0 FAIL: math/test-float64-y0 With my ULP patch applied these tests now pass: https://koji.fedoraproject.org/koji/taskinfo?taskID=46436310	2020-07-09 23:43:25 -04:00
Maciej W. Rozycki	c363f834cf	linux: Fix syscall list generation instructions Make the instructions for syscall list generation match Makefile and refer to `update-syscall-lists'; there has been no `update-arch-syscall' target. Also use single quotes around the command to stick to the ASCII character set. Fixes `4cf0d22305` ("Linux: Add tables with system call numbers"). Reviewed-by: Alistair Francis <alistair.francis@wdc.com>	2020-07-09 17:43:57 +01:00
Adhemerval Zanella	ffd178c651	sysv: linux: Add 64-bit time_t variant for shmctl To provide a y2038 safe interface a new symbol __shmctl64 is added and __shmctl is changed to call it instead (it adds some extra buffer copying for the 32 bit time_t implementation). Two new structures are added: 1. kernel_shmid64_ds: used internally only on 32-bit architectures to issue the syscall. A handful of architectures (hppa, i386, mips, powerpc32, and sparc32) require specific implementations due to their kernel ABI. 2. shmid_ds64: this is only for __TIMESIZE != 64 to use along with the 64-bit shmctl. It is different than the kernel struct because the exported 64-bit time_t might require different alignment depending on the architecture ABI. So the resulting implementation does: 1. For 64-bit architectures it assumes shmid_ds already contains 64-bit time_t fields and will result in just the __shmctl symbol using the __shmctl64 code. The shmid_ds argument is passed as-is to the syscall. 2. For 32-bit architectures with default 64-bit time_t (newer ABIs such as riscv32 or arc), it will also result in only one exported symbol but with the required high/low time handling. 3. Finally for 32-bit architectures with both 32-bit and 64-bit time_t support we follow the already set way to provide one symbol with 64-bit time_t support and implement the 32-bit time_t support using the 64-bit one. The default 32-bit symbol will allocate and copy the shmid_ds over multiple buffers, but this should be deprecated in favor of the __shmctl64 anyway. Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and sparc64. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-09 12:05:47 -03:00
Adhemerval Zanella	7929d77985	sysvipc: Remove the linux shm-pad.h file Each architecture overrides the struct msqid_ds with its required kernel ABI one. Checked on x86_64-linux-gnu and some basic sysvipc tests on hppa, mips, mipsle, mips64, mips64le, sparc64, sparcv9, powerpc64le, powerpc64, and powerpc. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-09 12:05:46 -03:00
Adhemerval Zanella	380b7ced6a	sysvipc: Split out linux struct shmid_ds This will allow us to have architectures specify their own version. No semantic changes expected. Checked with a build against all the affected ABIs. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-09 12:05:46 -03:00
Adhemerval Zanella	3283f71113	sysv: linux: Add 64-bit time_t variant for msgctl To provide a y2038 safe interface a new symbol __msgctl64 is added and __msgctl is changed to call it instead (it adds some extra buffer copying for the 32 bit time_t implementation). Two new structures are added: 1. kernel_msqid64_ds: used internally only on 32-bit architectures to issue the syscall. A handful of architectures (hppa, i386, mips, powerpc32, and sparc32) require specific implementations due to their kernel ABI. 2. msqid_ds64: this is only for __TIMESIZE != 64 to use along with the 64-bit msgctl. It is different than the kernel struct because the exported 64-bit time_t might require different alignment depending on the architecture ABI. So the resulting implementation does: 1. For 64-bit architectures it assumes msqid_ds already contains 64-bit time_t fields and will result in just the __msgctl symbol using the __msgctl64 code. The msqid_ds argument is passed as-is to the syscall. 2. For 32-bit architectures with default 64-bit time_t (newer ABIs such as riscv32 or arc), it will also result in only one exported symbol but with the required high/low time handling. 3. Finally for 32-bit architectures with both 32-bit and 64-bit time_t support we follow the already set way to provide one symbol with 64-bit time_t support and implement the 32-bit time_t support using the 64-bit time_t. The default 32-bit symbol will allocate and copy the msqid_ds over multiple buffers, but this should be deprecated in favor of the __msgctl64 anyway. Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and sparc64. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com>	2020-07-09 12:05:40 -03:00
Adhemerval Zanella	915b9fe312	sysvipc: Remove the linux msq-pad.h file Each architecture overrides the struct msqid_ds with its required kernel ABI one. Checked on x86_64-linux-gnu and some basic sysvipc tests on hppa, mips, mipsle, mips64, mips64le, sparc64, sparcv9, powerpc64le, powerpc64, and powerpc. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com>	2020-07-09 12:05:40 -03:00
Adhemerval Zanella	078a892085	sysvipc: Split out linux struct semid_ds This will allow us to have architectures specify their own version. No semantic changes expected. Checked with a build against all the affected ABIs. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2020-07-09 12:05:40 -03:00
Adhemerval Zanella	dba950e317	sysv: linux: Add 64-bit time_t variant for semctl Unlike other 64-bit time_t syscalls, the SysIPC interface does not provide a new set of syscalls for y2038 safeness. Instead it uses unused fields in the semid_ds structure to return the high bits for the timestamps. To provide a y2038 safe interface a new symbol __semctl64 is added and __semctl is changed to call it instead (it adds some extra buffer copying for the 32 bit time_t implementation). Two new structures are added: 1. kernel_semid64_ds: used internally only on 32-bit architectures to issue the syscall. A handful of architectures (hppa, i386, mips, powerpc32, sparc32) require specific implementations due to their kernel ABI. 2. semid_ds64: this is only for __TIMESIZE != 64 to use along with the 64-bit semctl. It is different than the kernel struct because the exported 64-bit time_t might require different alignment depending on the architecture ABI. So the resulting implementation does: 1. For 64-bit architectures it assumes semid_ds already contains 64-bit time_t fields and will result in just the __semctl symbol using the __semctl64 code. The semid_ds argument is passed as-is to the syscall. 2. For 32-bit architectures with default 64-bit time_t (newer ABIs such as riscv32 or arc), it will also result in only one exported symbol but with the required high/low handling. It might be possible to optimize it further to avoid the kernel_semid64_ds to semun transformation if the exported ABI for the architectures matches the expected kernel ABI, but the implementation is already complex enough and I don't think this should be a hotspot in any case. 3. Finally for 32-bit architectures with both 32-bit and 64-bit time_t support we follow the already set way to provide one symbol with 64-bit time_t support and implement the 32-bit time_t support using the 64-bit one. The default 32-bit symbol will allocate and copy the semid_ds over multiple buffers, but this should be deprecated in favor of the __semctl64 anyway. Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and sparc64. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Vineet Gupta <vgupta@synopsys.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2020-07-09 12:05:35 -03:00
Szabolcs Nagy	ffb17e7ba3	rtld: Avoid using up static TLS surplus for optimizations [BZ #25051] On some targets static TLS surplus area can be used opportunistically for dynamically loaded modules such that the TLS access then becomes faster (TLSDESC and powerpc TLS optimization). However we don't want all surplus TLS to be used for this optimization because dynamically loaded modules with initial-exec model TLS can only use surplus TLS. The new contract for surplus static TLS use is: - libc.so can have up to 192 bytes of IE TLS, - other system libraries together can have up to 144 bytes of IE TLS. - Some "optional" static TLS is available for opportunistic use. The optional TLS is now tunable: rtld.optional_static_tls, so users can directly affect the allocated static TLS size. (Note that module unloading with dlclose does not reclaim static TLS. After the optional TLS runs out, TLS access is no longer optimized to use static TLS.) The default setting of rtld.optional_static_tls is 512 so the surplus TLS is 3*192 + 4*144 + 512 = 1664 by default, the same as before. Fixes BZ #25051. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-08 17:32:56 +01:00
Szabolcs Nagy	17796419b5	rtld: Account static TLS surplus for audit modules The new static TLS surplus size computation is surplus_tls = 192 * (nns-1) + 144 * nns + 512 where nns is controlled via the rtld.nns tunable. This commit accounts audit modules too so nns = rtld.nns + audit modules. rtld.nns should only include the namespaces required by the application, namespaces for audit modules are accounted on top of that so audit modules don't use up the static TLS that is reserved for the application. This allows loading many audit modules without tuning rtld.nns or using up static TLS, and it fixes FAIL: elf/tst-auditmany Note that DL_NNS is currently a hard upper limit for nns, and if rtld.nns + audit modules go over the limit that's a fatal error. By default rtld.nns is 4 which allows 12 audit modules. Counting the audit modules is based on existing audit string parsing code, we cannot use GLRO(dl_naudit) before the modules are actually loaded.	2020-07-08 17:32:56 +01:00
Szabolcs Nagy	0c7b002fac	rtld: Add rtld.nns tunable for the number of supported namespaces TLS_STATIC_SURPLUS is 1664 bytes currently which is not enough to support DL_NNS (== 16) number of dynamic link namespaces, if we assume 192 bytes of TLS are reserved for libc use and 144 bytes are reserved for other system libraries that use IE TLS. A new tunable is introduced to control the number of supported namespaces and to adjust the surplus static TLS size as follows: surplus_tls = 192 * (rtld.nns-1) + 144 * rtld.nns + 512 The default is rtld.nns == 4 and then the surplus TLS size is the same as before, so the behaviour is unchanged by default. If an application creates more namespaces than the rtld.nns setting allows, then it is not guaranteed to work, but the limit is not checked. So existing usage will continue to work, but in the future if an application creates more than 4 dynamic link namespaces then the tunable will need to be set. In this patch DL_NNS is a fixed value and provides a maximum to the rtld.nns setting. Static linking used fixed 2048 bytes surplus TLS, this is changed so the same contract is used as for dynamic linking. With static linking DL_NNS == 1 so rtld.nns tunable is forced to 1, so by default the surplus TLS is reduced to 144 + 512 = 656 bytes. This change is not expected to cause problems. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-08 17:32:56 +01:00
Petr Vorel	ae7a94e5e3	Remove --enable-obsolete-nsl configure flag This means that libnsl is always built only as a shared library for backward compatibility, the NSS modules libnss_nis and libnss_nisplus are not built at all, and libnsl's headers aren't installed. This compatibility is kept only for architectures and ABIs that have been added in or before version 2.28. Replacement implementations based on TIRPC, which additionally support IPv6, are available from <https://github.com/thkukuk/>. This change does not affect libnss_compat, which has not depended on libnsl since 2.27 and thus can be used without NIS. libnsl code depends on Sun RPC, e.g. on --enable-obsolete-rpc (installed libnsl headers use installed Sun RPC headers), which will be removed in the following commit.	2020-07-08 17:25:57 +02:00
Szabolcs Nagy	d174ec248d	aarch64: redefine RETURN_ADDRESS to strip PAC RETURN_ADDRESS is used at several places in glibc to mean a valid code address of the call site, but with pac-ret it may contain a pointer authentication code (PAC), so its definition is adjusted. This is gcc PR target/94891: __builtin_return_address should not expose signed pointers to user code where it can cause ABI issues. In glibc RETURN_ADDRESS is only changed if it is built with pac-ret. There is no detection for the specific gcc issue because it is hard to test and the additional xpac does not cause problems. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:38 +01:00
Szabolcs Nagy	c94767712b	aarch64: fix pac-ret support in _mcount Currently gcc -pg -mbranch-protection=pac-ret passes signed return address to _mcount, so _mcount now has to always strip pac from the frompc since that's from user code that may be built with pac-ret. This is gcc PR target/94791: signed pointers should not escape and get passed across extern call boundaries, since that's an ABI break, but because existing gcc has this issue we work around it in glibc until that is resolved. This is compatible with a fixed gcc and it is a nop on systems without PAuth support. The bug was introduced in gcc-7 with -msign-return-address=non-leaf|all support which in gcc-9 got renamed to -mbranch-protection=pac-ret|pac-ret+leaf|standard. strip_pac uses inline asm instead of __builtin_aarch64_xpaclri since that is not a documented api and not available in all supported gccs. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:38 +01:00
Szabolcs Nagy	1be3d6eb82	aarch64: Add pac-ret support to assembly files Use return address signing in assembly files for functions that save LR when pac-ret is enabled in the compiler. The GNU property note for PAC-RET is not meaningful to the dynamic linker so it is not strictly required, but it may be used to track the security property of binaries. (The PAC-RET property is only set if BTI is set too because BTI implies working GNU property support.) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:38 +01:00
Szabolcs Nagy	9e1751e6d6	aarch64: configure check for pac-ret code generation Return address signing requires unwinder support, which is present in libgcc since >=gcc-7, however due to bugs the support may be broken in <gcc-10 (and similarly there may be issues in custom unwinders), so pac-ret is not always safe to use. So in assembly code glibc should only use pac-ret if the compiler uses it too. Unfortunately there is no predefined feature macro for it set by the compiler so pac-ret is inferred from the code generation. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:38 +01:00
Szabolcs Nagy	de9301c02e	aarch64: ensure objects are BTI compatible When glibc is built with branch protection (i.e. with a gcc configured with --enable-standard-branch-protection), all glibc binaries should be BTI compatible and marked as such. It is easy to link BTI incompatible objects by accident and this is silent currently which is usually not the expectation, so this is changed into a link error. (There is no linker flag for failing on BTI incompatible inputs so all warnings are turned into fatal errors outside the test system when building glibc with branch protection.) Unfortunately, outlined atomic functions are not BTI compatible in libgcc (PR libgcc/96001), so to build glibc with current gcc use 'CC=gcc -mno-outline-atomics', this should be fixed in libgcc soon and then glibc can be built and tested without such workarounds. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:38 +01:00
Sudakshina Das	605338745b	aarch64: enable BTI at runtime Binaries can opt-in to using BTI via an ELF object file marking. The dynamic linker has to then mprotect the executable segments with PROT_BTI. In case of static linked executables or in case of the dynamic linker itself, PROT_BTI protection is done by the operating system. On AArch64 glibc uses PT_GNU_PROPERTY instead of PT_NOTE to check the properties of a binary because PT_NOTE can be unreliable with old linkers (old linkers just append the notes of input objects together and add them to the output without checking them for consistency which means multiple incompatible GNU property notes can be present in PT_NOTE). BTI property is handled in the loader even if glibc is not built with BTI support, so in theory user code can be BTI protected independently of glibc. In practice though user binaries are not marked with the BTI property if glibc has no support because the static linked libc objects (crt files, libc_nonshared.a) are unmarked. This patch relies on Linux userspace API that is not yet in a linux release but in v5.8-rc1 so scheduled to be in Linux 5.8. Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	5f846c8b0d	aarch64: fix RTLD_START for BTI Tailcalls must use x16 or x17 for the indirect branch instruction to be compatible with code that uses BTI c at function entries. (Other forms of indirect branches can only land on BTI j.) Also added a BTI c at the ELF entry point of rtld, this is not strictly necessary since the kernel does not use indirect branch to get there, but it seems safest once building glibc itself with BTI is supported. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	fddbd7c0ef	aarch64: fix swapcontext for BTI setcontext returns to the specified context via an indirect jump, so there should be a BTI j. In case of getcontext (and all other returns_twice functions) the compiler adds BTI j at the call site, but swapcontext is a normal c call that is currently not handled specially by the compiler. So we change swapcontext such that the saved context returns to a local address that has BTI j and then swapcontext returns to the caller via a normal RET. For this we save the original return address in the slot for x1 of the context because x1 need not be preserved by swapcontext but it is restored when the context saved by swapcontext is resumed. The alternative fix (which is done on x86) would make swapcontext special in the compiler so BTI j is emitted at call sites, on x86 there is an indirect_return attribute for this, on AArch64 we would have to use returns_twice. It was decided against because such fix may need user code updates: the attribute has to be added when swapcontext is called via a function pointer and it breaks always_inline functions with swapcontext. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Sudakshina Das	91181954f9	aarch64: Add BTI support to assembly files To enable building glibc with branch protection, assembly code needs BTI landing pads and ELF object file markings in the form of a GNU property note. The landing pads are unconditionally added to all functions that may be indirectly called. When the code segment is not mapped with PROT_BTI these instructions are nops. They are kept in the code when BTI is not supported so that the layout of performance critical code is unchanged across configurations. The GNU property notes are only added when there is support for BTI in the toolchain, because old binutils does not handle the notes right. (Does not know how to merge them nor to put them in PT_GNU_PROPERTY segment instead of PT_NOTE, and some versions of binutils emit warnings about the unknown GNU property. In such cases the produced libc binaries would not have valid ELF marking so BTI would not be enabled.) Note: functions using ENTRY or ENTRY_ALIGN now start with an additional BTI c, so alignment of the following code changes, but ENTRY_ALIGN_AND_PAD was fixed so there is no change to the existing code layout. Some string functions may need to be tuned for optimal performance after this commit. Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	2a4c2dde49	aarch64: Rename place holder .S files to .c The compiler can add required elf markings based on CFLAGS but the assembler cannot, so using C code for empty files creates less of a maintenance problem. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	1b0a4f58f5	aarch64: configure test for BTI support Check BTI support in the compiler and linker. The check also requires READELF that understands the BTI GNU property note. It is expected to succeed with gcc >=gcc-9 configured with --enable-standard-branch-protection and binutils >=binutils-2.33. Note: passing -mbranch-protection=bti in CFLAGS when building glibc may not be enough to get a glibc that supports BTI because crtbegin* and crtend* provided by the compiler needs to be BTI compatible too. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	dbfefbdc3a	Rewrite abi-note.S in C. Using C code allows the compiler to add target specific object file markings based on CFLAGS. The arm specific abi-note.S is removed and similar object file fix up will be avoided on AArch64 with standard branch protection.	2020-07-08 15:02:37 +01:00
Szabolcs Nagy	c7aa8596de	rtld: Clean up PT_NOTE and add PT_GNU_PROPERTY handling Add generic code to handle PT_GNU_PROPERTY notes. Invalid content is ignored, _dl_process_pt_gnu_property is always called after PT_LOAD segments are mapped and it has no failure modes. Currently only one NT_GNU_PROPERTY_TYPE_0 note is handled, which contains target specific properties: the _dl_process_gnu_property hook is called for each property. The old _dl_process_pt_note and _rtld_process_pt_note differ in how the program header is read. The old _dl_process_pt_note is called before PT_LOAD segments are mapped and _rtld_process_pt_note is called after PT_LOAD segments are mapped. The old _rtld_process_pt_note is removed and _dl_process_pt_note is always called after PT_LOAD segments are mapped and now it has no failure modes. The program headers are scanned backwards so that PT_NOTE can be skipped if PT_GNU_PROPERTY exists. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-07-08 14:28:53 +01:00
Alexander Anisimov	beea361050	arm: CVE-2020-6096: Fix multiarch memcpy for negative length [BZ #25620 ] Unsigned branch instructions could be used for r2 to fix the wrong behavior when a negative length is passed to memcpy. This commit fixes the armv7 version.	2020-07-08 14:18:31 +02:00
Evgeny Eremin	79a4fa341b	arm: CVE-2020-6096: fix memcpy and memmove for negative length [BZ #25620] Unsigned branch instructions could be used for r2 to fix the wrong behavior when a negative length is passed to memcpy and memmove. This commit fixes the generic arm implementation of memcpy and memmove.	2020-07-08 14:18:19 +02:00
Samuel Thibault	01ac385ca8	hurd: Fix strerror not setting errno * sysdeps/mach/strerror_l.c: Include <errno.h>. (__strerror_l): Save errno on entry and restore it on exit.	2020-07-07 21:46:53 +00:00
Samuel Thibault	d63387d81d	hurd: Evaluate fd before entering the critical section * sysdeps/hurd/include/hurd/fd.h (HURD_FD_PORT_USE_CANCEL): Evaluate fd before calling _hurd_critical_section_lock.	2020-07-07 22:10:24 +02:00
Adhemerval Zanella	325081b9eb	string: Add strerrorname_np and strerrordesc_np The strerrorname_np returns the error number name (e.g. "EINVAL" for EINVAL) while strerrordesc_np returns a string describing the error number (e.g. "Invalid argument" for EINVAL). Unlike strerror, strerrordesc_np does not attempt to translate the returned description, and both functions return NULL for an invalid error number. They should be used instead of sys_errlist and sys_nerr, and both are thread and async-signal safe. These functions are GNU extensions. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 15:02:57 -03:00
Adhemerval Zanella	bfe05aa289	string: Add sigabbrev_np and sigdescr_np The sigabbrev_np returns the abbreviated signal name (e.g. "HUP" for SIGHUP) while sigdescr_np returns the string describing the signal number (e.g. "Hangup" for SIGHUP). Unlike strsignal, sigdescr_np does not attempt to translate the returned description, and both functions return NULL for an invalid signal number. They should be used instead of sys_siglist or sys_sigabbrev and they are both thread and async-signal safe. They are added as GNU extensions in the string.h header (same as strsignal). Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:57:14 -03:00
Adhemerval Zanella	08d2024b41	string: Simplify strerror_r Use snprintf instead of mempcpy plus itoa_word and remove unused definitions. There is no potential for infinite recursion because snprintf only uses strerror_r for the %m specifier. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	725eeb4af1	string: Use tls-internal on strerror_l The buffer allocation uses the same strategy of strsignal. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	28aff04781	string: Implement strerror in terms of strerror_l If the thread is terminated then __libc_thread_freeres will free the storage via __glibc_tls_internal_free. It is only within the calling thread that this matters. It makes strerror MT-safe. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	9deec7c8ba	string: Remove old TLS usage on strsignal The per-thread state is refactored to use two strategies: 1. The default one uses a TLS structure, which will be placed in the static TLS space (using __thread keyword). 2. Linux allocates via struct pthread and accesses it through THREAD_* macros. The default strategy has the disadvantage of increasing libc.so static TLS consumption and thus decreasing the possible surplus used in some scenarios (which might be mitigated by BZ#25051 fix). It is used only on Hurd, where accessing the thread storage in the single-thread case is not straightforward (afaiu, Hurd developers could correct me here). The fallback static allocation used for allocation failure is also removed: defining its size is problematic without synchronizing with translated messages (to avoid partial translation) and the resulting usage is not thread-safe. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	f26d456b98	linux: Fix __NSIG_WORDS and add __NSIG_BYTES The __NSIG_WORDS value is based on minimum number of words to hold the maximum number of signals supported by the architecture. This patch also adds __NSIG_BYTES, which is the number of bytes required to represent the supported number of signals. It is used in syscalls which takes a sigset_t. Checked on x86_64-linux-gnu and i686-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	f13d260190	signal: Move sys_errlist to a compat symbol The symbol is deprecated by strerror since its usage imposes some issues such as copy relocations. Its internal name is also changed to _sys_errlist_internal to avoid static linking usage. The compat code is also refactored by removing the over-engineered errlist-compat.c generation from manual entries and the extra comment token in the linker script file. It disentangles the code generation from the manual and simplifies both Linux and Hurd compat code. The definitions from errlist.c are moved to errlist.h and a new test is added to avoid a new errno entry without an associated one in the manual. Checked on x86_64-linux-gnu and i686-linux-gnu. I also ran a check-abi on all affected platforms. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	b1ccfc061f	signal: Move sys_siglist to a compat symbol The symbol was deprecated by strsignal and its usage imposes issues such as copy relocations. Its internal name is changed to __sys_siglist and __sys_sigabbrev to avoid static linking usage. The compat code is also refactored, since both Linux and Hurd use the same strategy: export the same array with different object sizes. The libSegfault change avoids calling strsignal on the SIGFAULT signal handler (the current usage is already sketchy, adding a call that potentially invokes a locale internal function is even sketchier). Checked on x86_64-linux-gnu and i686-linux-gnu. I also ran a check-abi on all affected platforms. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
Adhemerval Zanella	e4e11b1dba	signal: Add signum-{generic,arch}.h Refactor how signals are defined by each architecture. Instead of including a generic header (bits/signum-generic.h) and undefining non-default values in an arch specific header (bits/signum.h), the new scheme uses a common definition (bits/signum-generic.h) and each architecture adds its specific definitions in a new header (bits/signum-arch.h). For Linux it requires copying some system default definitions to alpha, hppa, and sparc. They are historical values and newer ports use the generic Linux signum-arch.h. For Hurd the BSD signum is removed and moved to a new header (it is used currently only on Hurd). Checked on a build against all affected ABIs. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2020-07-07 14:10:58 -03:00
H.J. Lu	3f4b61a0b8	x86: Add thresholds for "rep movsb/stosb" to tunables Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables to update thresholds for "rep movsb" and "rep stosb" at run-time. Note that the user specified threshold for "rep movsb" smaller than the minimum threshold will be ignored. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-06 11:48:42 -07:00
Joseph Myers	6c010c5dde	Use C2x return value from getpayload of non-NaN (bug 26073). In TS 18661-1, getpayload had an unspecified return value for a non-NaN argument, while C2x requires the return value -1 in that case. This patch implements the return value of -1. I don't think this is worth having a new symbol version that's an alias of the old one, although occasionally we do that in such cases where the new function semantics are a refinement of the old ones (to avoid programs relying on the new semantics running on older glibc versions but not behaving as intended). Tested for x86_64 and x86; also ran math/ tests for aarch64 and powerpc.	2020-07-06 16:18:02 +00:00
H.J. Lu	28c13ae5bb	x86: Detect Extended Feature Disable (XFD) Extended feature disable (XFD) is an extension added for Intel AMX to the XSAVE feature set that allows an operating system to enable a feature while preventing specific user threads from using the feature.	2020-07-06 06:57:08 -07:00
H.J. Lu	f8b4630ef6	x86: Correct bit_cpu_CLFSH [BZ #26208 ] bit_cpu_CLFSH should be (1u << 19), not (1u << 20).	2020-07-06 06:38:05 -07:00
Florian Weimer	706ad1e7af	Add the __libc_single_threaded variable The variable is placed in libc.so, and it can be true only in an outer libc, not libcs loaded via dlmopen or static dlopen. Since thread creation from inner namespaces does not work, pthread_create can update __libc_single_threaded directly. Using __libc_early_init and its initial flag, implementation of this variable is very straightforward. A future version may reset the flag during fork (but not in an inner namespace), or after joining all threads except one. Reviewed-by: DJ Delorie <dj@redhat.com>	2020-07-06 11:15:58 +02:00
Mathieu Desnoyers	8f4632deb3	Linux: rseq registration tests These tests validate that rseq is registered from various execution contexts (main thread, destructor, other threads, other threads created from destructor, forked process (without exec), pthread_atfork handlers, pthread setspecific destructors, signal handlers, atexit handlers). tst-rseq.c only links against libc.so, testing registration of rseq in a non-multithreaded environment. tst-rseq-nptl.c also links against libpthread.so, testing registration of rseq in a multithreaded environment. See the Linux kernel selftests for extensive rseq stress-tests.	2020-07-06 10:21:35 +02:00
Mathieu Desnoyers	6e29cb3f61	Linux: Use rseq in sched_getcpu if available When available, use the cpu_id field from __rseq_abi on Linux to implement sched_getcpu(). Fall-back on the vgetcpu vDSO if unavailable. Benchmarks: x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading glibc sched_getcpu(): 13.7 ns (baseline) glibc sched_getcpu() using rseq: 2.5 ns (speedup: 5.5x) inline load cpuid from __rseq_abi TLS: 0.8 ns (speedup: 17.1x)	2020-07-06 10:21:32 +02:00
Mathieu Desnoyers	0c76fc3c2b	Linux: Perform rseq registration at C startup and thread creation Register rseq TLS for each thread (including main), and unregister for each thread (excluding main). "rseq" stands for Restartable Sequences. See the rseq(2) man page proposed here: https://lkml.org/lkml/2018/9/19/647 Those are based on glibc master branch commit `3ee1e0ec5c`. The rseq system call was merged into Linux 4.18. The TLS_STATIC_SURPLUS define is increased to leave additional room for dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working. The increase (76 bytes) is larger than 32 bytes because it has not been increased in quite a while. The cost in terms of additional TLS storage is quite significant, but it will also obscure some initial-exec-related dlopen failures.	2020-07-06 10:21:16 +02:00
Florian Weimer	5f40e4b1ba	Linux: Fix UTC offset setting in settimeofday for __TIMESIZE != 64 The time argument is NULL in this case, and attempting to convert it leads to a null pointer dereference. This fixes commit `d2e3b697da` ("y2038: linux: Provide __settimeofday64 implementation"). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-06-30 21:20:20 +02:00
Joseph Myers	3ee1e0ec5c	Update kernel version to 5.7 in tst-mman-consts.py. This patch updates the kernel version in the test tst-mman-consts.py to 5.7. (There are no new constants covered by this test in 5.7 that need any other header changes; there's a new MREMAP_DONTUNMAP, but this test doesn't yet cover MREMAP_*.) Tested with build-many-glibcs.py.	2020-06-29 14:06:32 +00:00
Tulio Magno Quites Machado Filho	d2ba3677da	powerpc: Add support for POWER10 1. Add the directories to hold POWER10 files. 2. Add support to select POWER10 libraries based on AT_PLATFORM. 3. Let submachine=power10 be set automatically.	2020-06-29 10:08:38 -03:00
Samuel Thibault	81b1c8cbb5	hurd: Simplify usleep timeout computation as suggested by Andreas Schwab * sysdeps/mach/usleep.c (usleep): Divide timeout in an overflow-safe way.	2020-06-29 10:10:32 +02:00
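An overflow-safe round-up division of this kind can be sketched as follows. This is an illustration only, not the code from sysdeps/mach/usleep.c; the helper name `usec_to_msec_ceil` and the microsecond-to-millisecond conversion are assumptions:

```c
#include <assert.h>
#include <limits.h>

/* The largest possible result is ULONG_MAX / 1000 + 1, which cannot wrap.  */
static_assert(ULONG_MAX / 1000 + 1 > ULONG_MAX / 1000,
              "rounded-up quotient fits in unsigned long");

/* Hypothetical helper: convert microseconds to milliseconds, rounding up.
   The naive form (usec + 999) / 1000 wraps around when usec is within
   999 of ULONG_MAX; dividing first and then adding the carry does not.  */
unsigned long usec_to_msec_ceil(unsigned long usec)
{
    return usec / 1000 + (usec % 1000 != 0);
}
```

Dividing before adding the carry means no explicit clamp is needed for inputs near ULONG_MAX.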
Samuel Thibault	269e4c17cd	htl: Enable cancel16 and cancel20 tests * nptl/tst-cancel16.c, tst-cancel20.c, tst-cancelx16.c, tst-cancelx20.c: Move to... * sysdeps/pthread: ... here. * nptl/Makefile: Move corresponding references and rules to... * sysdeps/pthread/Makefile: ... here. * sysdeps/mach/hurd/i386/Makefile: Xfail tst-cancel*16 for now: missing barrier pshared support, but test should be working otherwise.	2020-06-29 00:16:33 +00:00
Samuel Thibault	f512321130	hurd: Add remaining cancelation points * hurd/hurdselect.c: Include <sysdep-cancel.h>. (_hurd_select): Surround call to __mach_msg with enabling async cancel. * sysdeps/mach/hurd/accept4.c: Include <sysdep-cancel.h>. (__libc_accept4): Surround call to __socket_accept with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/connect.c: Include <sysdep-cancel.h>. (__connect): Surround call to __file_name_lookup and __socket_connect with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/fdatasync.c: Include <sysdep-cancel.h>. (fdatasync): Surround call to __file_sync with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/fsync.c: Include <sysdep-cancel.h>. (fsync): Surround call to __file_sync with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/ioctl.c: Include <sysdep-cancel.h>. (__ioctl): When request is TIOCDRAIN, surround call to send_rpc with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/msync.c: Include <sysdep-cancel.h>. (msync): Surround call to __vm_object_sync with enabling async cancel. * sysdeps/mach/hurd/sigsuspend.c: Include <sysdep-cancel.h>. (__sigsuspend): Surround call to __mach_msg with enabling async cancel. * sysdeps/mach/hurd/sigwait.c: Include <sysdep-cancel.h>. (__sigwait): Surround wait code with enabling async cancel. * sysdeps/mach/msync.c: Include <sysdep-cancel.h>. (msync): Surround call to __vm_msync with enabling async cancel. * sysdeps/mach/sleep.c: Include <sysdep-cancel.h>. (__sleep): Surround call to __mach_msg with enabling async cancel. * sysdeps/mach/usleep.c: Include <sysdep-cancel.h>. (usleep): Surround call to __vm_msync with enabling async cancel.	2020-06-28 22:46:21 +00:00
Samuel Thibault	1f3413338e	hurd: fix usleep(ULONG_MAX) * sysdeps/mach/usleep.c (usleep): Clamp timeout when rounding up.	2020-06-28 22:39:03 +00:00
Samuel Thibault	3c9f67e7a5	hurd: Make fcntl(F_SETLKW) cancellation points and add _nocancel variant. sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add fcntl_nocancel. * sysdeps/mach/hurd/fcntl.c [NOCANCEL]: Include <not-cancel.h>. [!NOCANCEL]: Include <sysdep-cancel.h>. (__libc_fcntl) [!NOCANCEL]: Surround __file_record_lock call with enabling async cancel, and use HURD_FD_PORT_USE_CANCEL instead of HURD_FD_PORT_USE. * sysdeps/mach/hurd/fcntl_nocancel.c: New file, defines __fcntl_nocancel by including fcntl.c. * sysdeps/mach/hurd/not-cancel.h (__fcntl64_nocancel): Replace macro with __fcntl_nocancel declaration with hidden proto, and make __fcntl64_nocancel call __fcntl_nocancel.	2020-06-28 18:24:37 +00:00
Samuel Thibault	09effdc9b0	hurd: make wait4 a cancellation point and add _nocancel variant. * sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add wait4_nocancel. * sysdeps/mach/hurd/wait4.c: Include <sysdep-cancel.h> (__wait4): Surround __proc_wait with enabling async cancel, and use __USEPORT_CANCEL instead of __USEPORT. * sysdeps/mach/hurd/wait4_nocancel.c: New file, contains previous implementation of __wait4. * sysdeps/mach/hurd/not-cancel.h (__waitpid_nocancel): Replace macro with __wait4_nocancel declaration with hidden proto, and make __waitpid_nocancel call __wait4_nocancel.	2020-06-28 18:04:27 +00:00
Samuel Thibault	d60fdd480d	hurd: Fix port definition in HURD_PORT_USE_CANCEL * sysdeps/hurd/include/hurd/port.h: Include <libc-lock.h>. (HURD_PORT_USE_CANCEL): Add local port variable.	2020-06-28 18:04:26 +00:00
Samuel Thibault	fd3df63fb6	hurd: make close a cancellation point and add _nocancel variant. * sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add close_nocancel. * sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add __close_nocancel. * sysdeps/mach/hurd/i386/localplt.data (__close_nocancel): Allow PLT. * sysdeps/mach/hurd/close.c: Include <sysdep-cancel.h> (__libc_close): Surround _hurd_fd_close with enabling async cancel. * sysdeps/mach/hurd/close_nocancel.c: New file. * sysdeps/mach/hurd/not-cancel.h (__close_nocancel): Replace macro with declaration with hidden proto.	2020-06-28 16:34:14 +00:00
Samuel Thibault	4cafcd839f	hurd: make open and openat cancellation points and add _nocancel variants. * sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add open_nocancel openat_nocancel. * sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add __open_nocancel. * sysdeps/mach/hurd/dl-sysdep.c (__open_nocancel): Add alias, check it is not hidden. * sysdeps/mach/hurd/i386/localplt.data (__open_nocancel): Allow PLT. * sysdeps/mach/hurd/not-cancel.h (__open_nocancel, __openat_nocancel: Replace macros with declarations with hidden proto. (__open64_nocancel, __openat64_nocancel): Call __open_nocancel and __openat_nocancel instead of __open64 and __openat64. * sysdeps/mach/hurd/open.c: Include <sysdep-cancel.h> (__libc_open): Surround __file_name_lookup with enabling async cancel. * sysdeps/mach/hurd/openat.c: Likewise. * sysdeps/mach/hurd/open_nocancel.c, sysdeps/mach/hurd/openat_nocancel.c: New files.	2020-06-28 15:11:23 +00:00
Samuel Thibault	67a78072e2	hurd: clean fd and port on thread cancel HURD_PORT_USE link fd and port with a stack-stored structure, so on thread cancel we need to cleanup this. hurd/fd-cleanup.c: New file. * hurd/port-cleanup.c (_hurd_port_use_cleanup): New function. * hurd/Makefile (routines): Add fd-cleanup. * sysdeps/hurd/include/hurd.h (__USEPORT_CANCEL): New macro. * sysdeps/hurd/include/hurd/fd.h (_hurd_fd_port_use_data): New structure. (_hurd_fd_port_use_cleanup): New prototype. (HURD_DPORT_USE_CANCEL, HURD_FD_PORT_USE_CANCEL): New macros. * sysdeps/hurd/include/hurd/port.h (_hurd_port_use_data): New structure. (_hurd_port_use_cleanup): New prototype. (HURD_PORT_USE_CANCEL): New macro. * hurd/hurd/fd.h (HURD_FD_PORT_USE): Also refer to HURD_FD_PORT_USE_CANCEL. * hurd/hurd.h (__USEPORT): Also refer to __USEPORT_CANCEL. * hurd/hurd/port.h (HURD_PORT_USE): Also refer to HURD_PORT_USE_CANCEL. * hurd/fd-read.c (_hurd_fd_read): Call HURD_FD_PORT_USE_CANCEL instead of HURD_FD_PORT_USE. * hurd/fd-write.c (_hurd_fd_write): Likewise. * sysdeps/mach/hurd/send.c (__send): Call HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE. * sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise. * sysdeps/mach/hurd/sendto.c (__sendto): Likewise. * sysdeps/mach/hurd/recv.c (__recv): Likewise. * sysdeps/mach/hurd/recvfrom.c (__recvfrom): Likewise. * sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Call __USEPORT_CANCEL instead of __USEPORT, and HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.	2020-06-28 00:38:46 +00:00
Samuel Thibault	6414eef6e0	htl: Move cleanup handling to non-private libc-lock This adds sysdeps/htl/libc-lock.h which augments sysdeps/mach/libc-lock.h with the htl-aware cleanup handling. Otherwise inclusion of libc-lock.h without libc-lockP.h would keep only the mach-aware handling. This also fixes cleanup getting called when the binary is statically-linked without libpthread. * sysdeps/htl/libc-lockP.h (__libc_cleanup_region_start, __libc_cleanup_end, __libc_cleanup_region_end, __pthread_get_cleanup_stack): Move to... * sysdeps/htl/libc-lock.h: ... new file. (__libc_cleanup_region_start): Always set handler and arg. (__libc_cleanup_end): Always call the cleanup handler. (__libc_cleanup_push, __libc_cleanup_pop): New macros.	2020-06-28 00:13:57 +00:00
Samuel Thibault	cf2c8cc2c6	htl: Fix includes for lockfile These only need exactly to use __libc_ptf_call. * sysdeps/htl/flockfile.c: Include <libc-lockP.h> instead of <libc-lock.h> * sysdeps/htl/ftrylockfile.c: Include <libc-lockP.h> instead of <errno.h>, <pthread.h>, <stdio-lock.h> * sysdeps/htl/funlockfile.c: Include <libc-lockP.h> instead of <pthread.h> and <stdio-lock.h>	2020-06-28 00:13:57 +00:00
Samuel Thibault	726117e01b	htl: avoid cancelling threads inside critical sections Like hurd_thread_cancel does. * sysdeps/mach/hurd/htl/pt-docancel.c: Include <hurd/signal.h> (__pthread_do_cancel): Lock target thread's critical_section_lock and ss lock around thread mangling.	2020-06-27 02:34:18 +02:00
H.J. Lu	4fdd4d41a1	x86: Detect Intel Advanced Matrix Extensions Intel Advanced Matrix Extensions (Intel AMX) is a new programming paradigm consisting of two components: a set of 2-dimensional registers (tiles) representing sub-arrays from a larger 2-dimensional memory image, and accelerators able to operate on tiles. Intel AMX is an extensible architecture. New accelerators can be added and the existing accelerator may be enhanced to provide higher performance. The initial features are AMX-BF16, AMX-TILE and AMX-INT8, which are usable only if the operating system supports both XTILECFG state and XTILEDATA state. Add AMX-BF16, AMX-TILE and AMX-INT8 support to HAS_CPU_FEATURE and CPU_FEATURE_USABLE.	2020-06-26 06:53:05 -07:00
Stefan Liebler	1d21fb1061	S390: Optimize __memset_z196. It turned out that an 256b-mvc instruction which depends on the result of a previous 256b-mvc instruction is counterproductive. Therefore this patch adjusts the 256b-loop by storing the first byte with stc and setting the remaining 255b with mvc. Now the 255b-mvc instruction depends on the stc instruction.	2020-06-26 09:45:11 +02:00
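The "store one byte, then let an overlapping copy propagate it" idea can be modeled in portable C. This sketch only mimics a byte-wise, left-to-right copy; it is not the s390 assembly, and the name `fill_by_propagation` is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: fill n bytes by storing the first byte (the role of
   stc) and then copying each byte from its left neighbour (the role of an
   overlapping, byte-wise left-to-right mvc).  */
void fill_by_propagation(unsigned char *dst, unsigned char value, size_t n)
{
    assert(dst != NULL || n == 0);
    if (n == 0)
        return;
    dst[0] = value;                 /* single-byte store */
    for (size_t i = 1; i < n; i++)
        dst[i] = dst[i - 1];        /* source overlaps destination by one byte */
}
```

The explicit loop is deliberate: memcpy has undefined behavior on overlapping ranges and memmove preserves the source bytes, so neither would propagate the value.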
Stefan Liebler	0792c8ae1a	S390: Optimize __memcpy_z196. This patch introduces an extra loop without pfd instructions as it turned out that the pfd instructions are useful for copies >=64KB but are counterproductive for smaller copies.	2020-06-26 09:45:11 +02:00
Florian Weimer	2034c70e64	elf: Include <stddef.h> (for size_t), <sys/stat.h> in <ldconfig.h> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-06-25 16:51:03 +02:00
Stefan Liebler	f6b955e8ba	S390: Regenerate ULPs. Updates needed after recent exp10f commits.	2020-06-24 14:51:06 +02:00
Florian Weimer	1fb7dc751e	htl: Add wrapper header for <semaphore.h> with hidden __sem_post This is required to avoid a check-localplt failure due to a sem_post call through the PLT. Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>	2020-06-24 13:38:08 +02:00
Samuel Thibault	1b90d52df9	htl: Fix case when sem_wait is canceled while holding a token sysdeps/htl/sem-timedwait.c (struct cancel_ctx): Add cancel_wake field. (cancel_hook): When unblocking thread, set cancel_wake field to 1. (__sem_timedwait_internal): Set cancel_wake field to 0 by default. On cancellation exit, check whether we hold a token, to be put back.	2020-06-24 02:20:42 +02:00
Samuel Thibault	eca16db02d	htl: Make sem_wait a cancellation point By aligning its implementation on pthread_cond_wait. sysdeps/htl/sem-timedwait.c (cancel_ctx): New structure. (cancel_hook): New function. (__sem_timedwait_internal): Check for cancellation and register cancellation hook that wakes the thread up, and check again for cancellation on exit. * nptl/tst-cancel13.c, nptl/tst-cancelx13.c: Move to... * sysdeps/pthread/: ... here. * nptl/Makefile: Move corresponding references and rules to... * sysdeps/pthread/Makefile: ... here.	2020-06-24 01:19:49 +02:00
Samuel Thibault	3513d5af3d	htl: Simplify non-cancel path of __pthread_cond_timedwait_internal Since __pthread_exit does not return, we do not need to indent the noncancel path * sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal): Move cancelled path before non-cancelled path, to avoid "else" indentation.	2020-06-24 01:19:48 +02:00
Samuel Thibault	9f6e508b42	htl: Enable tst-cancel25 test * nptl/tst-cancel25.c: Move to... * sysdeps/pthread/tst-cancel25.c: ... here. (tf2) Do not test for SIGCANCEL when it is not defined. * nptl/Makefile: Move corresponding reference to... * sysdeps/pthread/Makefile: ... here.	2020-06-24 00:02:31 +02:00
Tulio Magno Quites Machado Filho	ae725e3f9c	powerpc: Add new hwcap values Linux commit ID ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 reserved 2 new bits in AT_HWCAP2: - PPC_FEATURE2_ARCH_3_1 indicates the availability of the POWER ISA 3.1; - PPC_FEATURE2_MMA indicates the availability of the Matrix-Multiply Assist facility.	2020-06-23 18:15:06 -03:00
Alex Butler	03e1378f94	aarch64: MTE compatible strncmp Add support for MTE to strncmp. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Branislav Rankov <branislav.rankov@arm.com> Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-23 17:55:39 +01:00
Alex Butler	adac54ffc5	aarch64: MTE compatible strcmp Add support for MTE to strcmp. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Branislav Rankov <branislav.rankov@arm.com> Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-23 17:55:39 +01:00
Alex Butler	79160c06c7	aarch64: MTE compatible strrchr Add support for MTE to strrchr. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-23 17:55:39 +01:00
Alex Butler	df06b0d90f	aarch64: MTE compatible memrchr Add support for MTE to memrchr. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-23 17:55:39 +01:00
Alex Butler	7ff899969f	aarch64: MTE compatible memchr Add support for MTE to memchr. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Gabor Kertesz <gabor.kertesz@arm.com>	2020-06-23 17:55:39 +01:00
Alex Butler	bb2c12aecb	aarch64: MTE compatible strcpy Add support for MTE to strcpy. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-23 17:55:39 +01:00
Joseph Myers	8ec13b4639	Add MREMAP_DONTUNMAP from Linux 5.7 Add the new constant MREMAP_DONTUNMAP from Linux 5.7 to bits/mman-shared.h. Tested with build-many-glibcs.py.	2020-06-23 14:42:45 +00:00
H.J. Lu	ecbbadbf10	x86: Update CPU feature detection [BZ #26149 ] 1. Divide architecture features into the usable features and the preferred features. The usable features are for correctness and can be exported in a stable ABI. The preferred features are for performance and only for glibc internal use. 2. Change struct cpu_features to struct cpu_features { struct cpu_features_basic basic; unsigned int usable_p; struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; unsigned int usable[USABLE_FEATURE_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; and initialize usable_p to point to the usable array so that struct cpu_features { struct cpu_features_basic basic; unsigned int usable_p; struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; }; can be exported via a stable ABI. The cpuid and usable arrays can be expanded with backward binary compatibility for both .o and .so files. 3. Add COMMON_CPUID_INDEX_7_ECX_1 for AVX512_BF16. 4. Detect ENQCMD, PKS, AVX512_VP2INTERSECT, MD_CLEAR, SERIALIZE, HYBRID, TSXLDTRK, L1D_FLUSH, CORE_CAPABILITIES and AVX512_BF16. 5. Rename CAPABILITIES to ARCH_CAPABILITIES. 6. Check if AVX512_VP2INTERSECT, AVX512_BF16 and PKU are usable. 7. Update CPU feature detection test.	2020-06-22 13:09:33 -07:00
Adhemerval Zanella	ea04f02131	aarch64: Remove fpu Makefile The -fno-math-errno is already added by default and the minimum required GCC to build glibc (6.2) makes the -ffinite-math-only superfluous. Checked on aarch64-linux-gnu.	2020-06-22 11:09:50 -03:00
Adhemerval Zanella	9f21672b89	m68k: Use sqrt{f} builtin for coldfire Checked with a build for m68k-linux-gnu-coldfire.	2020-06-22 11:09:50 -03:00
Adhemerval Zanella	cbf3571f49	arm: Use sqrt{f} builtin Checked on arm-linux-gnueabi and armv7-linux-gnueabihf	2020-06-22 11:09:50 -03:00
Adhemerval Zanella	9dbb3fdfb7	riscv: Use sqrt{f} builtin Checked with a build for riscv64-linux-gnu-rv64imac-lp64 (no builtin support), riscv64-linux-gnu-rv64imafdc-lp64, and riscv64-linux-gnu-rv64imafdc-lp64d.	2020-06-22 11:09:50 -03:00
Adhemerval Zanella	3ca05a8e9e	s390: Use sqrt{f} builtin Checked on s390x-linux-gnu.	2020-06-22 11:09:50 -03:00
Adhemerval Zanella	c9a30f08e1	sparc: Use sqrt{f} builtin It also enables the use of fsqrtd on sparc64. Checked on sparcv9-linux-gnu and sparc64-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	32c65b28f3	mips: Use sqrt{f} builtin Checked with a build against mips-linux-gnu and mips64-linux-gnu and comparing the resulting binaries.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	8a7923b57e	alpha: Use builtin sqrt{f} The generic implementation is simplified by removing the 'optimization' for !_IEEE_FP_INEXACT (which handles neither inexact nor some values). Checked on alpha-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	b24381e50f	i386: Use builtin sqrtl Checked on i686-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	d19d25dd06	x86_64: Use builtin sqrt{f,l} Checked on x86_64-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	169ea8f928	powerpc: Use sqrt{f} builtin The powerpc sqrt implementation is also simplified: - the static constants are open coded within the implementation. - for !USE_SQRT_BUILTIN the function is implemented directly on __ieee754_sqrt (it avoids a superfluous extra jump). Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	a2e833667d	s390x: Use fma{f} builtin Checked on s390x-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	271afad8f4	aarch64: Use math-use-builtins for ceil{f} The define is already set on the math-use-builtins-ceil.h, the patch just removes the implementations (it was missed on `c9feb1be93`). Checked on aarch64-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	e80501a5c9	math: Decompose math-use-builtins.h Each symbol definition is moved to a separate file that covers all symbol type definitions (float, double, long double, and float128). This allows setting support for architectures without the boilerplate of copying default values. Checked with a build on the affected ABIs.	2020-06-22 11:09:45 -03:00
Samuel Thibault	c013d5d3aa	hurd: Add mremap * sysdeps/mach/hurd/mremap.c: New file. * sysdeps/mach/hurd/Makefile [misc] (sysdep_routines): Add mremap. * sysdeps/mach/hurd/Versions (libc.GLIBC_2.32): Add mremap. * sysdeps/mach/hurd/i386/libc.abilist: Add mremap.	2020-06-20 13:49:57 +00:00
Adhemerval Zanella	3297d019e1	ia64: Use generic exp10f The generic implementation is slightly worse (Itanium(R) Processor 9020): Before new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 3.61582e+08, "iterations": 2.384e+07, "reciprocal-throughput": 14.8334, "latency": 15.5006, "max-throughput": 6.74153e+07, "min-throughput": 6.45136e+07 } } With new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 3.85549e+08, "iterations": 2.384e+07, "reciprocal-throughput": 15.8391, "latency": 16.5056, "max-throughput": 6.31348e+07, "min-throughput": 6.05857e+07 } } However it fixes all the issues on both: math/test-float-exp10 math/test-float32-exp10 (all the issues are wrong results for non-default rounding modes). The existing ia64 libm interface uses matherrf and matherrl in addition to matherr for SVID error handling. However, there is no such error handling support for exp10f in ia64 libm. So replacing it with the generic implementation should be fine. Checked on ia64-linux-gnu.	2020-06-19 12:08:52 -03:00
Adhemerval Zanella	be668a8d78	New exp10f version without SVID compat wrapper This patch changes the exp10f error handling semantics to only set errno according to POSIX rules. New symbol version is introduced at GLIBC_2.32. The old wrappers are kept for compat symbols. There are some outliers that need special handling: - ia64 provides an optimized implementation of exp10f that uses ia64 specific routines to set SVID compatibility. The new symbol version is aliased to the exp10f one. - m68k also provides an optimized implementation, and the new version uses it instead of the sysdeps/ieee754/flt32 one. - riscv and csky use the generic template implementation that does not provide SVID support. For both cases a new exp10f version is not added, but rather the symbol version of the generic sysdeps/ieee754/flt32 is adjusted instead. Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu.	2020-06-19 12:08:47 -03:00
Adhemerval Zanella	4b2d8e4442	i386: Use generic exp10f The generic implementation is twice as fast. Using the exp10f benchmark: * master: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 1.02967e+09, "iterations": 4.768e+07, "reciprocal-throughput": 18.3579, "latency": 24.8331, "max-throughput": 5.44725e+07, "min-throughput": 4.02688e+07 } } * patched: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 1.01821e+09, "iterations": 6.1984e+07, "reciprocal-throughput": 13.1975, "latency": 19.6563, "max-throughput": 7.57719e+07, "min-throughput": 5.08743e+07 } } Checked on i686-linux-gnu.	2020-06-19 10:48:15 -03:00
Paul Zimmermann	6e98983c09	math: Optimized generic exp10f with wrappers It is inspired by expf and reuses its tables and internal functions. The error checks are inlined and errno setting is in separate tail called functions, but the wrappers are kept in this patch to handle the _LIB_VERSION==_SVID_ case. Double precision arithmetics is used which is expected to be faster on most targets (including soft-float) than using single precision and it is easier to get good precision result with it. Result for x86_64 (i7-4790K CPU @ 4.00GHz) are: Before new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.0414e+09, "iterations": 1.00128e+08, "reciprocal-throughput": 26.6818, "latency": 54.043, "max-throughput": 3.74787e+07, "min-throughput": 1.85038e+07 } With new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.11951e+09, "iterations": 1.23968e+08, "reciprocal-throughput": 21.0581, "latency": 45.4028, "max-throughput": 4.74876e+07, "min-throughput": 2.20251e+07 } Result for aarch64 (A72 @ 2GHz) are: Before new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.62362e+09, "iterations": 3.3376e+07, "reciprocal-throughput": 127.698, "latency": 149.365, "max-throughput": 7.831e+06, "min-throughput": 6.69501e+06 } With new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.29108e+09, "iterations": 6.6752e+07, "reciprocal-throughput": 51.2111, "latency": 77.3568, "max-throughput": 1.9527e+07, "min-throughput": 1.29271e+07 } Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu, and sparc64-linux-gnu.	2020-06-19 10:48:15 -03:00
H.J. Lu	27f8864bd4	x86: Update F16C detection [BZ #26133 ] Since F16C requires AVX, set F16C usable only when AVX is usable.	2020-06-18 07:01:58 -07:00
Sunil K Pandey	75870237ff	Fix avx2 strncmp offset compare condition check [BZ #25933 ] strcmp-avx2.S: In the avx2 strncmp function, strings are compared in chunks of 4 vector sizes (i.e. 32x4=128 bytes for avx2). After the first 4-vector comparison, the code must check whether it has already passed the given offset. This patch implements the avx2 offset check for strncmp when both strings compare equal over the first 4 vector sizes.	2020-06-17 07:07:38 -07:00
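The check described above can be modeled with a scalar loop that walks the strings in 128-byte chunks and, after each chunk compares equal, tests whether the length limit has been reached before starting the next one. This is a sketch of the control flow only, not the AVX2 assembly; `chunked_strncmp` and `CHUNK` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

enum { CHUNK = 128 };   /* 4 vectors of 32 bytes, as in the commit message */

/* Hypothetical scalar model of a chunked strncmp.  */
int chunked_strncmp(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    size_t off = 0;
    while (off < n) {               /* the "already passed the limit?" check */
        /* Clamp the chunk so no byte at or beyond index n is compared.  */
        size_t end = (n - off < CHUNK) ? n : off + CHUNK;
        for (size_t i = off; i < end; i++) {
            unsigned char ca = (unsigned char) a[i];
            unsigned char cb = (unsigned char) b[i];
            if (ca != cb)
                return ca < cb ? -1 : 1;
            if (ca == '\0')
                return 0;           /* both strings ended together */
        }
        off = end;                  /* chunk compared equal; advance */
    }
    return 0;                       /* first n bytes are equal */
}
```

Without the limit check between chunks, a difference located past the limit but inside the next chunk would wrongly be reported as a mismatch.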
H.J. Lu	a35a59036e	x86_64: Use %xmmN with vpxor to clear a vector register Since "vpxor %xmmN, %xmmN, %xmmN" clears the whole vector register, use %xmmN, instead of %ymmN, with vpxor to clear a vector register.	2020-06-17 05:44:02 -07:00
H.J. Lu	b7c9bb183b	x86: Correct bit_cpu_CLFLUSHOPT [BZ #26128 ] bit_cpu_CLFLUSHOPT should be (1u << 23), not (1u << 22).	2020-06-17 05:32:37 -07:00
Paul E. Murphy	b637306d3e	powerpc64le: refactor e_sqrtf128.c Combine both implementations into a single file to allow building twice with appropriate multiarch support when possible.	2020-06-16 13:50:44 -05:00
Joseph Myers	b67339d0bb	Update syscall-names.list for Linux 5.7. Linux 5.7 has no new syscalls. Update the version number in syscall-names.list to reflect that it is still current for 5.7. Tested with build-many-glibcs.py.	2020-06-15 22:58:22 +00:00
Vineet Gupta	e93c264336	ieee754/dbl-64: Reduce the scope of temporary storage variables This came to light when adding hard-float support to ARC glibc port without hardware sqrt support causing glibc build to fail: \| ../sysdeps/ieee754/dbl-64/e_sqrt.c: In function '__ieee754_sqrt': \| ../sysdeps/ieee754/dbl-64/e_sqrt.c:58:54: error: unused variable 'ty' [-Werror=unused-variable] \| double y, t, del, res, res1, hy, z, zz, p, hx, tx, ty, s; The reason being EMULV() macro uses the hardware provided __builtin_fma() variant, leaving temporary variables 'p, hx, tx, hy, ty' unused hence compiler warning and ensuing error. The intent of the patch was to fix that error, but EMULV is pervasive and used a fair bit indirectly via other macros, hence this patch. Functionally it should not result in code gen changes and if at all those would be better since the scope of those temporaries is greatly reduced now Built tested with aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf hppa-linux-gnu x86_64-linux-gnu arm-linux-gnueabihf riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64 powerpc-linux-gnu microblaze-linux-gnu nios2-linux-gnu hppa-linux-gnu Also as suggested by Joseph [1] used --strip and compared the libs with and w/o patch and they are byte-for-byte unchanged (with gcc 9). \| for i in `find . -name libm-2.31.9000.so`; \| do \| echo $i; diff $i /SCRATCH/vgupta/gnu2/install/glibcs/$i ; echo $?; \| done \| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so \| 0 \| ./arm-linux-gnueabi/lib/libm-2.31.9000.so \| 0 \| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so \| 0 \| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so \| 0 \| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so \| 0 \| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so \| 0 \| ./powerpc-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./microblaze-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./nios2-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./hppa-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./s390x-linux-gnu/lib64/libm-2.31.9000.so [1] https://sourceware.org/pipermail/libc-alpha/2019-November/108267.html	2020-06-15 13:09:21 -07:00
Samuel Thibault	c1dcc54113	hurd: Fix __writev_nocancel_nostatus * sysdeps/mach/hurd/Makefile [subdir=misc] (sysdep_routines): Add writev_nocancel writev_nocancel_nostatus. * sysdeps/mach/hurd/not-cancel.h (__writev_nocancel_nostatus): Replace macro with function declaration (with hidden prototype in libc). (__writev_nocancel): New function declaration (with hidden prototype in libc). * sysdeps/mach/hurd/writev_nocancel_nostatus.c: New file. * sysdeps/posix/writev_nocancel.c: New file, includes writev.c to make a nocancel variant that calls __write_nocancel. * sysdeps/posix/writev.c (writev): Do not define alias if __writev is renamed.	2020-06-14 17:45:04 +00:00
Samuel Thibault	0c46891442	hurd: Make send* cancellation points * sysdeps/mach/hurd/send.c (__send): Make the __socket_send call a cancellation point. * sysdeps/mach/hurd/sendto.c (__sendto): Likewise. * sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.	2020-06-14 17:11:22 +00:00
Samuel Thibault	45fce058fe	htl: Enable more cancellation tests * nptl/tst-cancel-self-cancelstate.c, tst-cancel-self.c, tst-cancel9.c, tst-cancelx9.c: Move to... * sysdeps/pthread: ... here. * nptl/Makefile: Move corresponding references and rules to... * sysdeps/pthread/Makefile: ... here.	2020-06-14 16:16:59 +00:00
Samuel Thibault	662de0889a	hurd: Make write and pwrite64 cancellation points and add _nocancel variants. * sysdeps/mach/hurd/write.c (__libc_write): Call __write_nocancel surrounded by enabling async cancel, to replace implementation moved to... * sysdeps/mach/hurd/write_nocancel.c (__write_nocancel): ... here. * sysdeps/mach/hurd/pwrite64.c (__libc_pwrite64): Call __pwrite64_nocancel surrounded by enabling async cancel, to replace implementation moved to... * sysdeps/mach/hurd/pwrite64_nocancel.c (__pwrite64_nocancel): ... here. * sysdeps/mach/hurd/Makefile (sysdep_routines): Add write_nocancel and pwrite64_nocancel. * sysdeps/mach/hurd/not-cancel.h (__write_nocancel, __pwrite64_nocancel): Replace macros with prototypes with a hidden proto on libc. * sysdeps/mach/hurd/dl-sysdep.c (__write_nocancel): New alias, check that it is not hidden. * sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __write_nocancel. (ld.GLIBC_PRIVATE): Add __write_nocancel. * sysdeps/mach/hurd/i386/localplt.data (__write_nocancel): Add reference.	2020-06-14 15:53:21 +00:00
Samuel Thibault	76fe4ef4be	htl: Fix cleanup support for IO locking * sysdeps/htl/stdio-lock.h: New file, registers locking cleanup to htl. * sysdeps/htl/libc-lockP.h: Include <libc-lock.h>. (__libc_cleanup_region_start, __libc_cleanup_end, __libc_cleanup_region_end): Override macros from <libc-lock.h> with versions which register cleanup to htl. (__pthread_get_cleanup_stack): Make reference weak for skipping registration in the static non-libpthread case.	2020-06-14 15:53:04 +00:00
Samuel Thibault	ea5cad3e37	htl: Add noreturn attribute on __pthread_exit forward * sysdeps/htl/pthread-functions.h (__pthread_exit): Add noreturn attribute. (struct pthread_functions): Add noreturn attribute on ptr___pthread_exit field.	2020-06-14 12:53:38 +00:00
Samuel Thibault	89edef7b39	hurd: Make recv* cancellation points * sysdeps/mach/hurd/recv.c (__recv): Make the __socket_recv call cancellable. * sysdeps/mach/hurd/recvfrom.c (__recvfrom): Make the __socket_recv and __socket_whatis_address calls cancellable. * sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Make the __socket_recv, __socket_whatis_address, __io_reauthenticate, and __auth_user_authenticate calls cancellable.	2020-06-14 00:19:35 +00:00
Paul E. Murphy	146fea0764	powerpc: Automatic CPU detection in preconfigure Added a check to detect the CPU value in preconfigure, so that glibc is built with the correct --with-cpu value. And move existing checks into preconfigure.ac. Co-Authored-By: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Co-Authored-By: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>	2020-06-11 17:15:49 -05:00
Samuel Thibault	62d97c3432	htl: Enable more cancel tests * nptl/tst-cancel11.c, tst-cancel21-static.c, tst-cancel21.c, tst-cancel6.c, tst-cancelx11.c, tst-cancelx21.c, tst-cancelx6.c: Move to... * sysdeps/pthread: ... here. * nptl/Makefile: Move corresponding references and rules to... * sysdeps/pthread/Makefile: ... here.	2020-06-10 21:34:19 +00:00
Andrea Corallo	a365ac45b7	aarch64: MTE compatible strlen Introduce an Arm MTE compatible strlen implementation. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance on modern cores. On cores with less efficient Advanced SIMD implementation such as Cortex-A53 it can be slower. Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-09 09:21:11 +01:00
Andrea Corallo	49beaaec1b	aarch64: MTE compatible strchr Introduce an Arm MTE compatible strchr implementation. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-09 09:20:27 +01:00
Andrea Corallo	f7de454f20	aarch64: MTE compatible strchrnul Introduce an Arm MTE compatible strchrnul implementation. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1. Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>	2020-06-09 09:20:27 +01:00
Krzysztof Koch	d1f75e9644	AArch64: Merge Falkor memcpy and memmove implementations Falkor's memcpy and memmove share some implementation details, therefore, the two routines are moved to a single source file for code reuse. The two routines now share code for small and medium copies (up to and including 128 bytes). Large copies in memcpy do not handle overlap correctly, consequently, the loops for moving/copying more than 128 bytes stay separate for memcpy and memmove. To increase code reuse a number of small modifications were made: 1. The old implementation of memcpy copied the first 16-bytes as soon as the size of data was determined to be greater than 32 bytes. For memcpy code to also work when copying small/medium overlapping data, the first load and store was moved to the large copy case. 2. Medium memcpy case no longer assumes that 16 bytes were already copied and uses 8 registers to copy up to 128 bytes. 3. Small case for memmove was enlarged to that of memcpy, which is less than or equal to 32 bytes. 4. Medium case for memmove was enlarged to that of memcpy, which is less than or equal to 128 bytes. Other changes include: 1. Improve alignment of existing loop bodies. 2. 'Delouse' memmove and memcpy input arguments. Make sure that upper 32-bits of input registers are zeroed if unused. 3. Do one more iteration in memmove loops and reduce the number of copies made from the start/end of the buffer, depending on the direction of the memmove loop. Benchmarking: Looking at the results from bench-memcpy-random.out, we can see that now memmove_falkor is about 5% faster than memcpy_falkor_old, while memmove_falkor_old was more than 15% slower. The memcpy implementation remained largely unmodified, so there is no significant performance change. The reason for such a significant memmove performance gain is the increase of the upper bound on the small copy case to 32 bytes and the increase of the upper bound on the medium copy case to 128 bytes. 
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-06-08 14:13:05 +01:00
Samuel Thibault	f112dcc506	hurd: document that gcc&gdb look at the trampoline code * sysdeps/mach/hurd/i386/trampoline.c (rpc_wait_trampoline): Document which gcc and gdb files look at the code of the trampoline.	2020-06-08 14:41:57 +02:00
Samuel Thibault	dd7a8ad7ba	pthread: Move back linking rules to nptl and htl `d6d74ec16` ('htl: Enable more tests') moved the linking rules from nptl/Makefile and htl/Makefile to the shared sysdeps/pthread/Makefile. But e.g. on powerpc some tests are added in sysdeps/powerpc/Makefile, which is included after sysdeps/pthread/Makefile, and thus the tests don't get affected by the rules and fail to link. For now let's just copy over the set of rules in both nptl/Makefile and htl/Makefile. * sysdeps/pthread/Makefile: Move libpthread linking rules to... * htl/Makefile: ... here and... * nptl/Makefile: ... there.	2020-06-08 14:34:22 +02:00
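
Commit d1f75e9644 notes that the large-copy loops stay separate because memcpy's large copies do not handle overlap correctly. The C sketch below is only an illustration of that general point, not the Falkor assembly: `sketch_forward_copy` and `sketch_memmove` are hypothetical names, and the byte-at-a-time loops are an assumption made for readability. It shows why a plain forward loop can corrupt overlapping data and how choosing the copy direction avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not glibc code: a plain forward byte copy.
   It is only correct when dst does not start inside [src, src + n).  */
static void *
sketch_forward_copy (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Hypothetical helper, not glibc code: an overlap-safe copy that picks
   the direction from the relative position of the two buffers.  */
static void *
sketch_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert (n == 0 || (dst != NULL && src != NULL));

  /* Unsigned subtraction wraps when dst is below src, so this test is
     true when dst lies outside [src, src + n): forward copy is safe.  */
  if ((uintptr_t) d - (uintptr_t) s >= n)
    return sketch_forward_copy (dst, src, n);

  /* dst starts inside the source range: copy from the end backwards so
     each source byte is read before it is overwritten.  */
  for (size_t i = n; i > 0; i--)
    d[i - 1] = s[i - 1];
  return dst;
}
```

With a buffer holding "abcdefgh", moving its first six bytes two positions to the right with `sketch_memmove` leaves "ababcdef", while `sketch_forward_copy` repeats the first two bytes because it overwrites source bytes before reading them.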

13547 Commits