glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-13 14:50:17 +00:00

Author	SHA1	Message	Date
Fangrui Song	a21d58a0dc	x86_64: Remove unneeded static PIE check for undefined weak diagnostic https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in -pie links. Arguably keeping the diagnostic like other ports is more correct, since statically resolving movl foo(%rip), %eax to the link-time zero address produces a corrupted output. It turns out that --enable-static-pie builds do not depend on the ld behavior. GCC generates GOT indirection for weak declarations for -fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't really matter. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-08-27 17:26:06 -07:00
Wilco Dijkstra	2d20ffe431	[PATCH 7/7] sin/cos slow paths: refactor sincos implementation Refactor the sincos implementation - rather than rely on odd partial inlining of preprocessed portions from sin and cos, explicitly write out the cases. This makes sincos much easier to maintain and provides an additional 16-20% speedup between 0 and 2^27. The overall speedup of sincos is 48% over this range. Between 0 and PI it is 66% faster. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (__sincos): Refactor using the same logic as sin and cos.	2021-08-27 17:26:06 -07:00
Wilco Dijkstra	c8aaaf67f6	[PATCH 6/7] sin/cos slow paths: refactor duplicated code into dosin Refactor duplicated code into do_sin. Since all calls to do_sin use copysign to set the sign of the result, move it inside do_sin. Small inputs use a separate polynomial, so move this into do_sin as well (the check is based on the more conservative case when doing large range reduction, but could be relaxed). * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small inputs. Return correct sign. (do_sincos): Remove small input check before do_sin, let do_sin set the sign. (__sin): Likewise. (__cos): Likewise.	2021-08-27 17:26:06 -07:00
Wilco Dijkstra	c015f0cc57	[PATCH 5/7] sin/cos slow paths: remove unused slowpath functions Remove all unused slowpath functions. * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove. (do_cos_slow): Likewise. (do_sin_slow): Likewise. (reduce_and_compute): Likewise. (slow): Likewise. (slow1): Likewise. (slow2): Likewise. (sloww): Likewise. (sloww1): Likewise. (sloww2): Likewise. (bslow): Likewise. (bslow1): Likewise. (bslow2): Likewise. (cslow2): Likewise.	2021-08-27 17:26:06 -07:00
Wilco Dijkstra	d4d26acd8a	[PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction For huge inputs use the improved do_sincos function as well. Now no cases use the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it. * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter. (do_cos): Remove corp parameter and calculations. (do_sin): Likewise. (do_sincos): Remove cor variable. (__sin): Use do_sincos for huge inputs. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. (reduce_and_compute_sincos): Remove unused function.	2021-08-27 17:26:05 -07:00
Wilco Dijkstra	76f9784421	[PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction This patch improves the accuracy of the range reduction. When the input is large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not enough. Improve range reduction accuracy to 136 bits. As a result the special checks for results close to zero can be removed. The ULP of the polynomials is at worst 0.55ULP, so there is no reason for the slow functions, and they can be removed. * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to reduce_sincos, improve accuracy to 136 bits. (do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions. (__sin): Use improved reduction and simplified do_sincos calculation. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.	2021-08-27 17:26:05 -07:00
Wilco Dijkstra	e525ff25df	[PATCH 2/7] sin/cos slow paths: remove large range reduction This patch removes the large range reduction code and defers to the huge range reduction code. The first level range reducer supports inputs up to 2^27, which is way too large given that inputs for sin/cos are typically small (< 10), and optimizing for a smaller range would give a significant speedup. Input values above 2^27 are practically never used, so there is no reason for supporting range reduction between 2^27 and 2^48. Removing it significantly simplifies code and enables further speedups. There is about a 2.3x slowdown in this range due to __branred being extremely slow (a better algorithm could easily more than double performance). * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function. (do_sincos_2): Likewise. (__sin): Remove middle range reduction case. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range reduction case.	2021-08-27 17:26:05 -07:00
Wilco Dijkstra	bc57e68bbb	[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs This series of patches removes the slow paths from sin, cos and sincos. Besides greatly simplifying the implementation, the new version is also much faster for inputs up to PI (41% faster) and for large inputs needing range reduction (27% faster). ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most of the range with mpsin and mpcos. The number of incorrectly rounded results (i.e. ULP >0.5) is at most ~2750 per million inputs between 0.125 and 0.5, the average is ~850 per million between 0 and PI. Tested on AArch64 and x86_64 with no regressions. The first patch removes the slow paths for the cases where the input is small and doesn't require range reduction. Update ULP tables for sin, cos and sincos on AArch64 and x86_64. * sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small inputs. (__cos): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.	2021-08-27 17:26:05 -07:00
Stan Shebs	d548adb4ef	Let time and gettimeofday use vdso by removing old clang workaround	2021-08-27 17:26:04 -07:00
Stan Shebs	2c9e5207e4	Do not use ppc-specific long double pack/unpack when compiling with clang	2021-08-27 17:26:04 -07:00
Stan Shebs	c3064d5f50	Remove old workaround in power7 logb functions, clang no longer crashes on the inline assembly	2021-08-27 17:26:04 -07:00
Josh Kunz	cb90884046	Additional fixes for llvm-as Unlike GCC, llvm always uses an integrated assembler, which attempts to recognize all `asm` statements written in the C code. glibc uses some syntactically invalid asm statements to emit constants into assembly that are later extracted with a sed or AWK script. This change fixes two such invalid `asm` statements by wrapping the output in a `.ascii` directive. This does not break the sed/AWK (the same special sequence is output) but it makes the statement syntactically valid. See cf8e3f8757 for a previous fix for the same issue.	2021-08-27 17:26:04 -07:00
Stan Shebs	144448d566	Add workaround for infinite looping in ppc vsyscall for sched_getcpu.	2021-08-27 17:26:03 -07:00
Stan Shebs	6a12504329	Add an LD_DEBUG=tls option to help debug thread-local storage handling in ld.so	2021-08-27 17:26:03 -07:00
Stan Shebs	c4d57c29b5	Make multi-arch ifunc support work with clang	2021-08-27 17:26:02 -07:00
Ambrose Feinstein	af63681769	Redesign the fastload support for additional performance	2021-08-27 17:26:02 -07:00
Stan Shebs	8d141ab782	Fix sense of a test in the static-linking version of ppc get_clockfreq	2021-08-27 17:26:02 -07:00
Shu-Chun Weng	e1c6d2b0f4	Makes it compile for AArch64 De-nesting fix in 83c02e85 changed function signature but AArch64 was untested.	2021-08-27 17:26:01 -07:00
Shu-Chun Weng	83bede0cfc	Makes AArch64 assembly acceptable to clang According to ARMv8 architecture reference manual section C7.2.188, SIMD MOV (to general) instruction format is MOV <Xd>, <Vn>.D[<index>] gas appears to accept "<Vn>.2D[<index>]" as well, but clang's assembler does not. C.f. https://community.arm.com/developer/ip-products/processors/f/cortex-a-forum/5214/aarch64-assembly-syntax-for-armclang	2021-08-27 17:26:01 -07:00
Siva Chandra Reddy	038be62f96	Include STATIC_PIE_BOOTSTRAP with !NESTING in powerpc64/dl-machine.h	2021-08-27 17:26:01 -07:00
Siva Chandra Reddy	738baca865	Enable relaxed relocations when building certain object files for x86_64.	2021-08-27 17:26:01 -07:00
Siva Chandra Reddy	0337af1396	Un-nest an include in dl-reloc-static-pie.c. A corresponding adjustment in sysdeps/x86_64/dl-machine.h has also been made.	2021-08-27 17:26:01 -07:00
Stan Shebs	43afb70033	Disable -mfloat128 for clang, lets power9 insns into power8 executables	2021-08-27 17:26:00 -07:00
Stan Shebs	895947a3ca	Also work around clang bctrl issue in get_clockfreq.c	2021-08-27 17:26:00 -07:00
Raman Tenneti	9e8081d123	Changes to compile glibc-2.27 on PPC (Power8) with clang. + Use DOT_MACHINE macro instead of ".machine" instruction. + Use __isinf and __isinff instead of builtin versions. + In s_logb, s_logbf and s_logbl functions, used float versions to calculate "ret = x & 0x7f800000;" expression.	2021-08-27 17:23:15 -07:00
Raman Tenneti	bb9e16c6ea	Undid the dl_enable_fastload environment variable changes.	2021-08-27 17:23:15 -07:00
Paul Pluzhnikov	590786950c	Add "fastload" support.	2021-08-27 17:23:15 -07:00
Stan Shebs	3372bfe221	Work around lack of mfppr in clang	2021-08-27 17:23:14 -07:00
Stan Shebs	960ba7975c	Work around mtfsb0 syntax limitation with clang	2021-08-27 17:23:14 -07:00
Stan Shebs	e04e10b431	Avoid passing gcc-specific options to clang	2021-08-27 17:23:14 -07:00
Stan Shebs	452fe68a53	Make asm-based constraints be gcc-only	2021-08-27 17:23:14 -07:00
Stan Shebs	4b86f820b8	Make xxland syntax gcc-only	2021-08-27 17:23:14 -07:00
Stan Shebs	5e4f72b895	Add a first approximation of float definitions for ppc clang	2021-08-27 17:23:14 -07:00
Stan Shebs	e21102f77e	Make powerpc .machine directives be gcc-only	2021-08-27 17:23:14 -07:00
Stan Shebs	bb112e11de	Make mutex hints gcc-only, improve a type in __arch_compare_and_exchange_bool_32_acq	2021-08-27 17:23:14 -07:00
Stan Shebs	7724302310	Make power6 directives be gcc-only	2021-08-27 17:23:13 -07:00
Stan Shebs	1e88b203b3	Add power9 flag to go with -mfloat128	2021-08-27 17:23:13 -07:00
Stan Shebs	6fd7bec86f	Disable more attempts to pass -mlong-double-128 to clang	2021-08-27 17:23:13 -07:00
Stan Shebs	d21dfbccdc	Disable attempts to pass -mlong-double-128 to clang	2021-08-27 17:23:13 -07:00
Stan Shebs	b2d69ea7ac	Add workaround for infinite looping in ppc vsyscalls	2021-08-27 17:23:13 -07:00
Stan Shebs	6ea6782b69	Work around clang crash by skipping apparently-unneeded asm	2021-08-27 17:23:13 -07:00
Stan Shebs	b35774068a	Work around clang problem with ifuncs and vdso	2021-08-27 17:23:12 -07:00
Stan Shebs	96509a9dce	Work around a ppc clang inlining bug	2021-08-27 17:23:12 -07:00
Stan Shebs	0f93e3333f	Change de-nesting fix to use added argument instead of globals	2021-08-27 17:23:12 -07:00
Stan Shebs	21991760c7	Fix regressions in async-safe TLS, add run-time control for debugging, add more comments	2021-08-27 17:23:12 -07:00
Stan Shebs	c0ab16f8cc	Fix TLS problems not handled by cherrypick	2021-08-27 17:23:12 -07:00
Brooks Moses	3e9a530aae	Revert upstream removal of async-safe TLS patches.	2021-08-27 17:23:11 -07:00
Andreas Schwab	c4fde9669a	Don't write beyond destination in __mempcpy_avx512_no_vzeroupper (bug 23196) When compiled as mempcpy, the return value is the end of the destination buffer, thus it cannot be used to refer to the start of it. (cherry picked from commit `9aaaab7c6e`)	2021-08-27 16:22:13 -07:00
Stefan Liebler	b3356fb4a1	Fix blocking pthread_join. [BZ #23137] On s390 (31bit) if glibc is built with -Os, pthread_join sometimes blocks indefinitely. This is e.g. observable with testcase intl/tst-gettext6. pthread_join is calling lll_wait_tid(tid), which performs the futex-wait syscall in a loop as long as tid != 0 (thread is alive). On s390 (and built with -Os), tid is loaded from memory before comparing against zero and then the tid is loaded a second time in order to pass it to the futex-wait-syscall. If the thread exits in between, then the futex-wait-syscall is called with the value zero and it waits until a futex-wake occurs. As the thread has already exited, there won't be a futex-wake. In lll_wait_tid, the tid is stored to the local variable __tid, which is then used as argument for the futex-wait-syscall. But unfortunately the compiler is allowed to reload the value from memory. With this patch, the tid is loaded with atomic_load_acquire. Then the compiler is not allowed to reload the value for __tid from memory. ChangeLog: [BZ #23137] * sysdeps/nptl/lowlevellock.h (lll_wait_tid): Use atomic_load_acquire to load __tid. (cherry picked from commit `1660901840`)	2021-08-27 16:22:12 -07:00
Joseph Myers	1ab675ca63	Add PTRACE_SECCOMP_GET_METADATA from Linux 4.16 to sys/ptrace.h. This patch adds the PTRACE_SECCOMP_GET_METADATA constant from Linux 4.16 to all relevant sys/ptrace.h files. A type struct __ptrace_seccomp_metadata, analogous to other such types, is also added. Tested for x86_64, and with build-many-glibcs.py. * sysdeps/unix/sysv/linux/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): New enum value and macro. * sysdeps/unix/sysv/linux/bits/ptrace-shared.h (struct __ptrace_seccomp_metadata): New type. * sysdeps/unix/sysv/linux/aarch64/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/arm/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/ia64/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/powerpc/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/s390/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/sparc/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/tile/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. * sysdeps/unix/sysv/linux/x86/sys/ptrace.h (PTRACE_SECCOMP_GET_METADATA): Likewise. (cherry picked from commit `9320ca88a1`)	2021-08-27 16:22:11 -07:00

11746 Commits