glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-27 23:40:10 +00:00

Author	SHA1	Message	Date
Noah Goldstein	475b63702e	x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h No bug. This patch doubles the rep_movsb_threshold when using ERMS. Based on benchmarks the vector copy loop, especially now that it handles 4k aliasing, is better for these medium ranged. On Skylake with ERMS: Size, Align1, Align2, dst>src,(rep movsb) / (vec copy) 4096, 0, 0, 0, 0.975 4096, 0, 0, 1, 0.953 4096, 12, 0, 0, 0.969 4096, 12, 0, 1, 0.872 4096, 44, 0, 0, 0.979 4096, 44, 0, 1, 0.83 4096, 0, 12, 0, 1.006 4096, 0, 12, 1, 0.989 4096, 0, 44, 0, 0.739 4096, 0, 44, 1, 0.942 4096, 12, 12, 0, 1.009 4096, 12, 12, 1, 0.973 4096, 44, 44, 0, 0.791 4096, 44, 44, 1, 0.961 4096, 2048, 0, 0, 0.978 4096, 2048, 0, 1, 0.951 4096, 2060, 0, 0, 0.986 4096, 2060, 0, 1, 0.963 4096, 2048, 12, 0, 0.971 4096, 2048, 12, 1, 0.941 4096, 2060, 12, 0, 0.977 4096, 2060, 12, 1, 0.949 8192, 0, 0, 0, 0.85 8192, 0, 0, 1, 0.845 8192, 13, 0, 0, 0.937 8192, 13, 0, 1, 0.939 8192, 45, 0, 0, 0.932 8192, 45, 0, 1, 0.927 8192, 0, 13, 0, 0.621 8192, 0, 13, 1, 0.62 8192, 0, 45, 0, 0.53 8192, 0, 45, 1, 0.516 8192, 13, 13, 0, 0.664 8192, 13, 13, 1, 0.659 8192, 45, 45, 0, 0.593 8192, 45, 45, 1, 0.575 8192, 2048, 0, 0, 0.854 8192, 2048, 0, 1, 0.834 8192, 2061, 0, 0, 0.863 8192, 2061, 0, 1, 0.857 8192, 2048, 13, 0, 0.63 8192, 2048, 13, 1, 0.629 8192, 2061, 13, 0, 0.627 8192, 2061, 13, 1, 0.62 Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:18:08 -05:00
Noah Goldstein	a6b7502ec0	x86: Optimize memmove-vec-unaligned-erms.S No bug. The optimizations are as follows: 1) Always align entry to 64 bytes. This makes behavior more predictable and makes other frontend optimizations easier. 2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have significant benefits in the case that: 0 < (dst - src) < [256, 512] 3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%] improvement and for FSRM [-10%, 25%]. In addition to these primary changes there is general cleanup throughout to optimize the aligning routines and control flow logic. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:18:03 -05:00
Noah Goldstein	ac759b1fbf	benchtests: Add partial overlap case in bench-memmove-walk.c This commit adds a new partial overlap benchmark. This is generally the most interesting performance case for memmove and was missing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:17:59 -05:00
Noah Goldstein	5e6cce9b34	benchtests: Add additional cases to bench-memcpy.c and bench-memmove.c This commit adds more benchmarks for the common memcpy/memmove benchmarks. The most significant cases are the half page offsets. The current versions leave dst and src near page aligned which leads to false 4k aliasing on x86_64. This can add noise due to false dependencies from one run to the next. As well, this seems like more of an edge case than common case so it shouldn't be the only thing Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:17:51 -05:00
Noah Goldstein	d585ba47fc	string: Make tests bidirectional test-memcpy.c This commit updates the memcpy tests to test both dst > src and dst < src. This is because there is logic in the code based on the Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:17:30 -05:00
H.J. Lu	d465e5e0da	Remove the last trace of generate-md5 [BZ #28554 ] generate-md5 was removed by commit `d73f5331ce` Author: Roland McGrath <roland@gnu.org> Date: Fri May 2 02:20:45 2003 +0000 2003-05-01 Roland McGrath <roland@redhat.com> Remove its last trace. This fixes BZ #28554.	2021-11-06 06:21:44 -07:00
Sunil K Pandey	2856829ee7	Revert "benchtests: Add acosf function to bench-math" This reverts commit `79d0fc6539`.	2021-11-05 16:13:12 -07:00
H.J. Lu	a586fe9c80	Configure GCC with --enable-initfini-array [BZ #27945 ] Starting from GCC 12, the .init_array and .fini_array sections are enabled unconditionally by commit 13a39886940331149173b25d6ebde0850668d8b9 Author: H.J. Lu <hjl.tools@gmail.com> Date: Tue Jun 8 16:09:24 2021 -0700 Always enable DT_INIT_ARRAY/DT_FINI_ARRAY on Linux configure GCC with --enable-initfini-array to enable them when using GCC release branches. Fixes BZ #27945.	2021-11-05 15:30:02 -07:00
Florian Weimer	ea32ec354c	elf: Earlier missing dynamic segment check in _dl_map_object_from_fd Separated debuginfo files have PT_DYNAMIC with p_filesz == 0. We need to check for that before the _dl_map_segments call because that could attempt to write to mappings that extend beyond the end of the file, resulting in SIGBUS. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-05 19:34:16 +01:00
Nikita Popov	ff012870b2	gconv: Do not emit spurious NUL character in ISO-2022-JP-3 (bug 28524) Bugfix 27256 has introduced another issue: In conversion from ISO-2022-JP-3 encoding, it is possible to force iconv to emit extra NUL character on internal state reset. To do this, it is sufficient to feed iconv with escape sequence which switches active character set. The simplified check 'data->__statep->__count != ASCII_set' introduced by the aforementioned bugfix picks that case and behaves as if '\0' character has been queued thus emitting it. To eliminate this issue, these steps are taken: * Restore original condition '(data->__statep->__count & ~7) != ASCII_set'. It is necessary since bits 0-2 may contain number of buffered input characters. * Check that queued character is not NUL. Similar step is taken for main conversion loop. Bundled test case follows following logic: * Try to convert ISO-2022-JP-3 escape sequence switching active character set * Reset internal state by providing NULL as input buffer * Ensure that nothing has been converted. Signed-off-by: Nikita Popov <npv1310@gmail.com>	2021-11-04 19:59:42 +01:00
Paul A. Clarke	9fea0f1a2a	[powerpc] Tighten constraints for asm constant parameters There are a few places where only known numeric values are acceptable for `asm` parameters, yet the constraint "i" is used. "i" can include "symbolic constants whose values will be known only at assembly time or later." Use "n" instead of "i" where known numeric values are required. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2021-11-03 09:17:28 -05:00
Adhemerval Zanella	d3bf2f5927	elf: Do not run DSO sorting if tunables is not enabled Since the algorithm selection requires tunables. Checked on x86_64-linux-gnu with --enable-tunables=no.	2021-11-03 09:25:06 -03:00
Adhemerval Zanella	09f214528c	riscv: Build with -mno-relax if linker does not support R_RISCV_ALIGN It allows building both glibc and tests with lld (since lld does not support R_RISCV_ALIGN linker relaxation). Checked with a build for riscv32-linux-gnu-rv32imafdc-ilp32d and riscv64-linux-gnu-rv64imafdc-lp64d. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Fangrui Song <maskray@google.com>	2021-11-03 09:25:06 -03:00
Fangrui Song	6720d36b66	x86-64: Replace movzx with movzbl Clang cannot assemble movzx in the AT&T dialect mode. ../sysdeps/x86_64/strcmp.S:2232:16: error: invalid operand for instruction movzx (%rsi), %ecx ^~~~ Change movzx to movzbl, which follows the AT&T dialect and is used elsewhere in the file. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-02 20:59:52 -07:00
Fangrui Song	fdcd177fd3	regex: Unnest nested functions in regcomp.c This refactor moves four functions out of a nested scope and converts them into static always_inline functions. collseqwc, table_size, symb_table, extra are now initialized to zero because they are passed as function arguments. On x86-64, .text is 16 byte larger likely due to the 4 stores. This is nothing compared to the amount of work that regcomp has to do looking up the collation weights, or other functions. If the non-buildable `sysdeps/generic/dl-machine.h` doesn't count, this patch removes the last `auto inline` usage from glibc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-11-02 10:07:59 -07:00
Joseph Myers	db432f033d	Use Linux 5.15 in build-many-glibcs.py This patch makes build-many-glibcs.py use Linux 5.15. Tested with build-many-glibcs.py (host-libraries, compilers and glibcs builds).	2021-11-02 16:54:59 +00:00
Adhemerval Zanella	f64f4ce069	elf: Assume disjointed .rela.dyn and .rela.plt for loader The patch removes the ELF_DURING_STARTUP optimization and assumes both .rel.dyn and .rel.plt might not be subsequent. This allows some code simplification since relocation will be handled independently where it is done on bootstrap. At least on x86_64, I can not measure any performance implications. Running 10000 times the command LD_DEBUG=statistics ./elf/ld.so ./libc.so and filtering the "total startup time in dynamic loader" result, the geometric mean is (patched / master): Ryzen 7 5900x: 24140 / 24952; i7-4510U: 45957 / 45982 (The results do show some variation, I did not make any statistical analysis). It also allows building arm with lld, since it inserts ".ARM.exidx" between ".rel.dyn" and ".rel.plt" for the loader. Checked on x86_64-linux-gnu and arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-02 11:21:57 -03:00
Florian Weimer	cca75bd8b5	i386: Explain why __HAVE_64B_ATOMICS has to be 0	2021-11-02 10:26:23 +01:00
Adhemerval Zanella	b8a6ee43bb	benchtests: Add hypotf Based on random input arguments. About 85% tuples have exponents of the two arguments close together (+-1 range). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-01 16:23:39 -03:00
Adhemerval Zanella	dba44dbe54	benchtests: Make hypot input random Instead of inputs based on the algorithm implementation details. About 85% tuples have exponents of the two arguments close together (+-1 range). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-01 16:23:22 -03:00
Adhemerval Zanella	613cb5c7b1	arm: Use have-mtls-dialect-gnu2 to check for ARM TLS descriptors support The lld linker does not support TLSDESC for arm. The have-arm-tls-desc is a leftover of `56583289b1` to support NaCL. Reviewed-by: Fangrui Song <maskray@google.com>	2021-11-01 16:23:15 -03:00
Adhemerval Zanella	d6dea8c847	arm: Use internal symbol for _dl_argv on _dl_start_user The lld does not support R_ARM_GOTOFF32 to preemptible symbol (_dl_argv has default visibility). Use the internal alias instead (one option would be to use HIDDEN_JUMPTARGET, but the macro is not defined for !__ASSEMBLER__ and I made this patch arm-specific to avoid having to check extensively whether this might break something on other architectures). Checked on arm-linux-gnueabihf. Reviewed-by: Fangrui Song <maskray@google.com>	2021-11-01 16:21:53 -03:00
H.J. Lu	14dbbf46a0	x86-64: Remove Prefer_AVX2_STRCMP Remove Prefer_AVX2_STRCMP to enable EVEX strcmp. When comparing 2 32-byte strings, EVEX strcmp has been improved to require 1 load, 1 VPTESTM, 1 VPCMP, 1 KMOVD and 1 INCL instead of 2 loads, 3 VPCMPs, 2 KORDs, 1 KMOVD and 1 TESTL while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU, 1 VPMOVMSKB and 1 TESTL. EVEX strcmp is now faster than AVX2 strcmp by up to 40% on Tiger Lake and Ice Lake.	2021-11-01 07:53:04 -07:00
H.J. Lu	c46e9afb2d	x86-64: Improve EVEX strcmp with masked load In strcmp-evex.S, to compare 2 32-byte strings, replace VMOVU (%rdi, %rdx), %YMM0 VMOVU (%rsi, %rdx), %YMM1 /* Each bit in K0 represents a mismatch in YMM0 and YMM1. */ VPCMP $4, %YMM0, %YMM1, %k0 VPCMP $0, %YMMZERO, %YMM0, %k1 VPCMP $0, %YMMZERO, %YMM1, %k2 /* Each bit in K1 represents a NULL in YMM0 or YMM1. */ kord %k1, %k2, %k1 /* Each bit in K1 represents a NULL or a mismatch. */ kord %k0, %k1, %k1 kmovd %k1, %ecx testl %ecx, %ecx jne L(last_vector) with VMOVU (%rdi, %rdx), %YMM0 VPTESTM %YMM0, %YMM0, %k2 /* Each bit cleared in K1 represents a mismatch or a null CHAR in YMM0 and 32 bytes at (%rsi, %rdx). */ VPCMP $0, (%rsi, %rdx), %YMM0, %k1{%k2} kmovd %k1, %ecx incl %ecx jne L(last_vector) It makes EVEX strcmp faster than AVX2 strcmp by up to 40% on Tiger Lake and Ice Lake. Co-Authored-By: Noah Goldstein <goldstein.w.n@gmail.com>	2021-11-01 07:52:56 -07:00
Sunil K Pandey	79d0fc6539	benchtests: Add acosf function to bench-math Add acosf function to bench-math and copy acosf-inputs to benchtests. Motivation for this patch is to prepare for upcoming libmvec new functions. Float and double version of libmvec functions stays together. acosf-inputs file generated from acos-inputs file using following scaling formula: f = d * (FLT_MAX/DBL_MAX) Where d is input(double) and f is output(float). If scaled float value is duplicate in new input file, nextafterf() function used to find next float value, ensuring no duplicates. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-10-29 08:52:30 -07:00
Wilco Dijkstra	f392915d1e	benchtests: Improve bench-memcpy-random Improve the random memcpy benchmark. Double the number of tests and increase the size of the memory region to test between 32KB and 1024KB. This improves accuracy on modern cores. Clean up formatting of the frequency array. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2021-10-29 15:45:53 +01:00
Joseph Myers	7ca9377bab	Disable -Waggressive-loop-optimizations warnings in tst-dynarray.c My build-many-glibcs.py bot shows -Waggressive-loop-optimizations errors building the glibc testsuite for 32-bit architectures with GCC mainline, which seem to have appeared between GCC commits 4abc0c196b10251dc80d0743ba9e8ab3e56c61ed and d8edfadfc7a9795b65177a50ce44fd348858e844: In function 'dynarray_long_noscratch_resize', inlined from 'test_long_overflow' at tst-dynarray.c:489:5, inlined from 'do_test' at tst-dynarray.c:571:3: ../malloc/dynarray-skeleton.c:391:36: error: iteration 1073741823 invokes undefined behavior [-Werror=aggressive-loop-optimizations] 391 | DYNARRAY_ELEMENT_INIT (&list->u.dynarray_header.array[i]); tst-dynarray.c:39:37: note: in definition of macro 'DYNARRAY_ELEMENT_INIT' 39 | #define DYNARRAY_ELEMENT_INIT(e) (*(e) = 23) | ^ In file included from tst-dynarray.c:42: ../malloc/dynarray-skeleton.c:389:37: note: within this loop 389 | for (size_t i = old_size; i < size; ++i) | ~~^~~~~~ In function 'dynarray_long_resize', inlined from 'test_long_overflow' at tst-dynarray.c:479:5, inlined from 'do_test' at tst-dynarray.c:571:3: ../malloc/dynarray-skeleton.c:391:36: error: iteration 1073741823 invokes undefined behavior [-Werror=aggressive-loop-optimizations] 391 | DYNARRAY_ELEMENT_INIT (&list->u.dynarray_header.array[i]); tst-dynarray.c:27:37: note: in definition of macro 'DYNARRAY_ELEMENT_INIT' 27 | #define DYNARRAY_ELEMENT_INIT(e) (*(e) = 17) | ^ In file included from tst-dynarray.c:28: ../malloc/dynarray-skeleton.c:389:37: note: within this loop 389 | for (size_t i = old_size; i < size; ++i) | ~~^~~~~~ I don't know what GCC change made these errors appear, or why they only appear for 32-bit architectures. 
However, the warnings appear to be both true (that iteration would indeed involve undefined behavior if executed) and useless in this particular case (that iteration is never executed, because the allocation size overflows and so the allocation fails - but the check for allocation size overflow is in a separate source file and so can't be seen by the compiler when compiling this test). So use the DIAG_* macros to disable -Waggressive-loop-optimizations around the calls in question to dynarray_long_resize and dynarray_long_noscratch_resize in this test. Tested with build-many-glibcs.py (GCC mainline) for arm-linux-gnueabi, where it restores a clean testsuite build.	2021-10-29 14:40:45 +00:00
Stafford Horne	6446c725d4	Fix compiler issue with mmap_internal Compiling mmap_internal fails when we use -1 for MMAP2_PAGE_UNIT on 32 bit architectures. The error is as follows: ../sysdeps/unix/sysv/linux/mmap_internal.h:30:8: error: unknown type name 'uint64_t' | 30 | static uint64_t page_unit; | | ^~~~~~~~ Fix by including stdint.h.	2021-10-29 09:21:37 -03:00
Adhemerval Zanella	04e8169f1d	Check if linker also support -mtls-dialect=gnu2 Since some linkers (for instance lld for i386) do not support it for all architectures. Checked on i686-linux-gnu. Reviewed-by: Fangrui Song <maskray@google.com>	2021-10-29 09:21:37 -03:00
Adhemerval Zanella	3d5ecb6246	Fix LIBC_PROG_BINUTILS for -fuse-ld=lld GCC does not print the correct linker when -fuse-ld=lld is used with the -print-prog-name=ld: $ gcc -v 2>&1 | tail -n 1 gcc version 11.2.0 (Ubuntu 11.2.0-7ubuntu2) $ gcc ld This is different than for gold: $ gcc -fuse-ld=gold -print-prog-name=ld ld.gold Using ld.lld as the static linker name prints the expected result. This is only required when -fuse-ld=lld is used; if lld is used as the 'ld' program (through a symlink) LIBC_PROG_BINUTILS works as expected. Checked on x86_64-linux-gnu. Reviewed-by: Fangrui Song <maskray@google.com>	2021-10-29 09:21:37 -03:00
Adhemerval Zanella	66a273d16a	elf: Disable ifuncmain{1,5,5pic,5pie} when using LLD These tests take the address of a protected symbol (foo_protected) and lld does not support copy relocations on protected data symbols. Checked on x86_64-linux-gnu. Reviewed-by: Fangrui Song <maskray@google.com>	2021-10-29 09:21:37 -03:00
Siddhesh Poyarekar	88e316b064	Handle NULL input to malloc_usable_size [BZ #28506 ] Hoist the NULL check for malloc_usable_size into its entry points in malloc-debug and malloc and assume non-NULL in all callees. This fixes BZ #28506 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2021-10-29 14:53:55 +05:30
Noah Goldstein	1d56fd3bae	x86_64: Add memcmpeq.S to fix disable-multi-arch build The following commit: commit `cf4fd28ea4` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Tue Oct 26 19:43:18 2021 -0500 Broke --disable-multi-arch build for x86_64 because x86_64/memcmpeq.S was not defined outside of multiarch and the alias for __memcmpeq in x86_64/memcmp.S was removed. This commit fixes that issue by adding x86_64/memcmpeq.S. make xcheck passes on x86_64 with and without --disable-multi-arch	2021-10-28 16:35:50 -05:00
Stafford Horne	b3cf94ef15	login: Add back libutil as an empty library There are several packages like sysvinit and buildroot that expect -lutil to work. Rather than impacting them with having to change the linker flags provide an empty libutil.a.	2021-10-29 06:18:55 +09:00
Fangrui Song	6838920383	riscv: Fix incorrect jal with HIDDEN_JUMPTARGET A non-local STV_DEFAULT defined symbol is by default preemptible in a shared object. j/jal cannot target a preemptible symbol. On other architectures, such a jump instruction either causes PLT [BZ #18822], or if short-ranged, sometimes rejected by the linker (but not by GNU ld's riscv port [ld PR/28509]). Use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead. With this patch, ld.so and libc.so can be linked with LLD if source files are compiled/assembled with -mno-relax/-Wa,-mno-relax. Acked-by: Palmer Dabbelt <palmer@dabbelt.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-10-28 11:39:49 -07:00
Noah Goldstein	9b7cfab180	x86_64: Add evex optimized __memcmpeq in memcmpeq-evex.S No bug. This commit adds new optimized __memcmpeq implementation for evex. The primary optimizations are: 1) skipping the logic to find the difference of the first mismatched byte. 2) not updating src/dst addresses as the non-equals logic does not need to be reused by different areas.	2021-10-27 13:03:46 -05:00
Noah Goldstein	b4ed69ba16	x86_64: Add avx2 optimized __memcmpeq in memcmpeq-avx2.S No bug. This commit adds new optimized __memcmpeq implementation for avx2. The primary optimizations are: 1) skipping the logic to find the difference of the first mismatched byte. 2) not updating src/dst addresses as the non-equals logic does not need to be reused by different areas.	2021-10-27 13:03:46 -05:00
Noah Goldstein	fa7f63d8d6	x86_64: Add sse2 optimized __memcmpeq in memcmp-sse2.S No bug. This commit does not modify any of the memcmp implementation. It just adds __memcmpeq ifdefs to skip obvious cases where computing the proper 1/-1 required by memcmp is not needed.	2021-10-27 13:03:46 -05:00
Noah Goldstein	cf4fd28ea4	x86_64: Add support for __memcmpeq using sse2, avx2, and evex No bug. This commit adds support for __memcmpeq to be implemented separately from memcmp. Support is added for versions optimized with sse2, avx2, and evex.	2021-10-27 13:03:46 -05:00
Noah Goldstein	cf3acd774f	Benchtests: Add benchtests for __memcmpeq No bug. This commit adds __memcmpeq benchmarks. The benchmarks just use the existing ones in memcmp. This will be useful for testing implementations of __memcmpeq that do not just alias memcmp.	2021-10-27 13:03:46 -05:00
Noah Goldstein	3592ccd472	String: Add __memcmpeq as build target No bug. This commit just adds __memcmpeq as a build target so that implementations for __memcmpeq that are not just aliases to memcmp can be supported.	2021-10-27 13:03:46 -05:00
Noah Goldstein	11c88336e3	NEWS: Add item for __memcmpeq	2021-10-26 16:51:29 -05:00
Noah Goldstein	d9283b71ac	String: Add tests for __memcmpeq No bug. This commit adds tests for the new function __memcmpeq. The new tests use the existing tests in 'test-memcmp.c' but relax the result requirement to only check for zero or non-zero returns. All string tests, including test-memcmpeq, are passing.	2021-10-26 16:51:29 -05:00
Noah Goldstein	9894127d20	String: Add hidden defs for __memcmpeq() to enable internal usage No bug. This commit adds hidden defs for all declarations of __memcmpeq. This enables usage of __memcmpeq without the PLT for usage internal to GLIBC.	2021-10-26 16:51:29 -05:00
Noah Goldstein	44829b3ddb	String: Add support for __memcmpeq() ABI on all targets No bug. This commit adds support for __memcmpeq() as a new ABI for all targets. In this commit __memcmpeq() is implemented only as an alias to the corresponding target's memcmp() implementation. __memcmpeq() is added as a new symbol starting with GLIBC_2.35 and defined in string.h with comments explaining its behavior. Basic tests that it is callable and works were added in string/tester.c. As discussed in the proposal "Add new ABI '__memcmpeq()' to libc" __memcmpeq() is essentially a reserved namespace for bcmp(). This means it shares the same specifications as memcmp() except the return value for non-equal byte sequences is any non-zero value. This is less strict than memcmp()'s return value specification and can be better optimized when a boolean return is all that is needed. __memcmpeq() is meant to only be called by compilers if they can prove that the return value of a memcmp() call is only used for its boolean value. All tests in string/tester.c passed. The build also succeeds on the x86_64-linux-gnu target.	2021-10-26 16:51:29 -05:00
Fangrui Song	8438135d34	configure: Don't check LD -v --help for LIBC_LINKER_FEATURE When LIBC_LINKER_FEATURE is used to check a linker option with the equal sign, it will likely fail because the LD -v --help output may look like `-z lam-report=[none|warning|error]` while the needle is something like `-z lam-report=warning`. The LD -v --help filter doesn't save much time, so just remove it.	2021-10-25 13:17:44 -07:00
H.J. Lu	f9b152c83f	elf: Make global.out depend on reldepmod4.so [BZ #28457 ] The global test is linked with globalmod1.so which dlopens reldepmod4.so. Make global.out depend on reldepmod4.so. This fixes BZ #28457. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2021-10-25 07:13:54 -07:00
Noah Goldstein	bad852b61b	x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'. It could potentially be dangerous to use SSE2 if this function is ever called without using 'vzeroupper' beforehand. While compilers appear to use 'vzeroupper' before function calls if AVX2 has been used, using SSE2 here is more brittle. Since it is not absolutely necessary it should be avoided. It costs 2 extra bytes but the extra bytes should only eat into alignment padding. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-10-23 13:02:42 -05:00
H.J. Lu	d8e7d06381	bench-math: Sort and put each bench per line Sort and put each math bench per line to prepare for new math benches. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2021-10-23 05:20:25 -07:00
Sunil K Pandey	4f690aad9e	x86_64: Add missing libmvec ABI tests Add vector ABI tests for cos, exp, log, pow and sin functions. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-10-22 06:46:49 -07:00
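
Several of the commits above (44829b3ddb, cf4fd28ea4 and the sse2/avx2/evex follow-ups) rely on the relaxed `__memcmpeq` contract: zero for equal byte sequences, any non-zero value otherwise. The sketch below illustrates only that contract in portable C. The names `memcmpeq_sketch` and `memcmpeq_sketch_self_check` are hypothetical and chosen for this example; this is not glibc's implementation, which aliases `memcmp` or uses the hand-written assembly listed in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference for the __memcmpeq contract described in
   commit 44829b3ddb: return 0 when the two n-byte regions are equal and
   some non-zero value otherwise.  Not glibc's implementation.  */
int
memcmpeq_sketch (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;
  unsigned char diff = 0;

  /* Accumulate the XOR of every byte pair.  Unlike memcmp, there is no
     need to locate the first mismatch or to compute an ordering.  */
  for (size_t i = 0; i < n; ++i)
    diff |= (unsigned char) (a[i] ^ b[i]);

  return diff;
}

/* Small usage check: the sketch and memcmp must agree on the
   equal / not-equal question, which is all a boolean caller needs.  */
void
memcmpeq_sketch_self_check (void)
{
  assert (memcmpeq_sketch ("glibc", "glibc", 5) == 0);
  assert (memcmpeq_sketch ("glibc", "glibd", 5) != 0);
  assert (memcmpeq_sketch ("abc", "xyz", 0) == 0);
  assert ((memcmp ("ab", "ac", 2) != 0)
          == (memcmpeq_sketch ("ab", "ac", 2) != 0));
}
```

A caller that only tests the result against zero, as in `if (memcmp (p, q, n) == 0)`, is the case the log says compilers may redirect to `__memcmpeq`. Code that depends on the sign of the result has to keep calling `memcmp`.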

38010 Commits