glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-14 01:00:07 +00:00

Author	SHA1	Message	Date
Noah Goldstein	d912127bde	x86: Rename strstr_sse2 to strstr_generic as it uses string/strstr.c This is in accordance with other files in the multiarch directory.	2022-06-27 08:35:51 -07:00
Noah Goldstein	d1e931125b	x86: Remove unused file wmemcmp-sse4 The memcmp-sse4 was removed in: commit `7cbc03d030` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Fri Apr 15 12:28:00 2022 -0500 x86: Remove memcmp-sse4.S so this file does nothing.	2022-06-27 08:35:51 -07:00
Noah Goldstein	afc6e4328f	x86: Put wcs{n}len-sse4.1 in the sse4.1 text section Previously was missing but the two implementations shouldn't get in the sse2 (generic) text section.	2022-06-27 08:35:51 -07:00
Noah Goldstein	227afaa672	x86: Align entry for memrchr to 64-bytes. The function was tuned around 64-byte entry alignment and performs better for all sizes with it. As well, different code paths were explicitly written to touch the minimum number of cache lines, i.e. sizes <= 32 touch only the entry cache line.	2022-06-27 08:35:51 -07:00
Fangrui Song	dbb0f06cc0	Makerules: Remove no-op -Wl,-d when linking libc_pic.os In GNU ld, -d assigns space to common symbols for -r (i.e. change common symbols to STB_GLOBAL definitions). This option was added in commit `da2d1bc5ad` (1998) perhaps because ld at that time had a bug that common symbols did not override shared object definitions. -d has been long unneeded and more so since -fno-common was added to +cflags.	2022-06-26 15:31:19 -07:00
Andreas Schwab	01c60dc90c	m68k: optimize RTLD_START	2022-06-25 00:22:02 +02:00
Adhemerval Zanella	baf2a265c7	misc: Optimize internal usage of __libc_single_threaded By adding an internal alias to avoid the GOT indirection. On some architectures, __libc_single_threaded may be accessed through copy relocations and thus it requires to update also the copies default copy. This is done by adding a new internal macro, libc_hidden_data_{proto,def}, which has an additional argument that specifies the alias name (instead of default __GI_ one). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Fangrui Song <maskray@google.com>	2022-06-24 17:45:58 -03:00
Adhemerval Zanella	5b41b2659d	linux: Add move_mount It was added on Linux 5.2 (2db154b3ea8e14b04fee23e3fdfd5e9d17fbc6ae) as a way to move a mount from one place to another and, in the next commit, to allow attaching an unattached mount tree. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 16:03:38 -03:00
Adhemerval Zanella	b4deb7beb8	linux: Add fsmount It was added on 5.2 (93766fbd2696c2c4453dd8e1070977e9cd4e6b6d) to provide a way by which a filesystem opened with fsopen and configured by a series of fsconfig calls can have a detached mount object created for it. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 16:03:31 -03:00
Adhemerval Zanella	6c0eedd97e	linux: Add fsopen It was added on Linux 5.2 (24dcb3d90a1f67fe08c68a004af37df059d74005) to start the process of preparing to create a superblock that will then be mountable, using an fd as a context handle. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 16:03:15 -03:00
Florian Weimer	77536da3de	resolv/tst-resolv-noaaaa: Support building for older C standards This avoids a compilation error: tst-resolv-noaaaa.c: In function 'response': tst-resolv-noaaaa.c:74:11: error: a label can only be part of a statement and a declaration is not a statement char ipv4[4] = {192, 0, 2, i + 1}; ^~~~ tst-resolv-noaaaa.c:79:11: error: a label can only be part of a statement and a declaration is not a statement char *name = xasprintf ("ptr-%d", i); ^~~~	2022-06-24 19:44:42 +02:00
Florian Weimer	f282cdbe7f	resolv: Implement no-aaaa stub resolver option Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 18:18:44 +02:00
Florian Weimer	62a321b12d	support: Change non-address output format of support_format_dns_packet It makes sense to include the owner name (LHS) and record type in the output, so that they can be checked for correctness. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 18:18:41 +02:00
Kito Cheng	58fc66a91c	riscv: Use elf_machine_rela_relative to handle R_RISCV_RELATIVE Minor clean-up: we need to change this part in a following patch, so clean this up to avoid duplicating the change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-06-23 21:07:19 -07:00
Noah Goldstein	bd42891bb3	x86: Remove faulty sanity tests for RTLD build with no multiarch The sanity tests were meant to ensure that the default implementation was only being built without multiarch, with the exception of the multiarch/rtld-*.S files. The code used IS_IN (rtld) to check if the build was for a multiarch/rtld-*.S file, which is incorrect as IS_IN (rtld) is set for the non-multiarch build as well.	2022-06-23 11:14:08 -07:00
Noah Goldstein	220b83d83d	stdlib: Fixup mbstowcs NULL __dst handling. [BZ #29279] commit `464d189b96` (origin/master, origin/HEAD) Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Wed Jun 22 08:24:21 2022 -0700 stdlib: Remove attr_write from mbstowcs if dst is NULL [BZ: 29265] Incorrectly called `__mbstowcs_chk` in the NULL __dst case which is incorrect as in the NULL __dst case we are explicitly skipping the objsize checks. As well, remove the `__always_inline` attribute which exists in `__fortify_function`. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-06-23 08:26:01 -07:00
Noah Goldstein	3079f652d7	x86: Replace all sse instructions with vex equivalent in avx+ files Most of these don't really matter as there was no dirty upper state but we should generally avoid stray sse when it's not needed. The one case that really matters is in svml_d_tanh4_core_avx2.S: blendvps %xmm0, %xmm8, %xmm7 When there was a dirty upper state. Tested on x86_64-linux	2022-06-22 19:42:17 -07:00
Noah Goldstein	3edda6a0f0	x86: Add support for compiling {raw|w}memchr with high ISA level 1. Refactor files so that all implementations are in the multiarch directory. - Essentially moved sse2 {raw|w}memchr.S implementation to multiarch/{raw|w}memchr-sse2.S - The non-multiarch {raw|w}memchr.S file now only includes one of the implementations in the multiarch directory based on the compiled ISA level (only used for non-multiarch builds. Otherwise we go through the ifunc selector). 2. Add ISA level build guards to different implementations. - I.e. memchr-avx2.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (memchr-evex{-rtm}.S). 3. Add new multiarch/rtld-{raw}memchr.S that just includes the non-multiarch {raw}memchr.S which will in turn select the best implementation based on the compiled ISA level. 4. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. - Guaranteed replacement essentially means that for any ISA level build there must be a function that the baseline of the ISA supports. So for {raw|w}memchr.S since there is no ISA level 2 function, the ISA level 2 build still includes the ISA level 1 (sse2) function. Once we reach the ISA level 3 build, however, {raw|w}memchr-avx2{-rtm}.S will always be sufficient so the ISA level 1 implementation ({raw|w}memchr-sse2.S) will not be built. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-06-22 19:41:35 -07:00
Noah Goldstein	703f434108	x86: Add defines / utilities for making ISA specific x86 builds 1. Factor out some of the ISA level defines in isa-level.c to standalone header isa-level.h 2. Add new headers with ISA level dependent macros for handling ifuncs. Note, this file does not change any code. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-06-22 19:41:35 -07:00
Noah Goldstein	464d189b96	stdlib: Remove attr_write from mbstowcs if dst is NULL [BZ: 29265] mbstowcs is defined if dst is NULL and is defined to be special-cased if dst is NULL, so the fortify objsize check is incorrect in that case. Tested on x86-64 linux. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-06-22 11:12:33 -07:00
Noah Goldstein	dd06af4f81	stdlib: Remove trailing whitespace from Makefile This causes precommit tests to fail when pushing commits that modify this file.	2022-06-22 11:12:25 -07:00
Andreas Schwab	dc30acf20b	debug: make __read_chk a cancellation point (bug 29274) The __read_chk function, as the implementation behind the fortified read function, must be a cancellation point, thus it cannot use INLINE_SYSCALL.	2022-06-22 17:00:44 +02:00
Sam James	2249ec60a9	s390: use LC_ALL=C for readelf call Let's use LC_ALL=C as we do elsewhere for consistency. Tested on s390x-ibm-linux-gnu. See: `72bd208846` Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Stefan Liebler <stli@linux.ibm.com>	2022-06-21 10:16:44 +02:00
Sam James	c376ff3287	s390: use $READELF We already check for it in root configure.ac with AC_CHECK_TOOL. Let's use the result. Tested on s390x-ibm-linux-gnu. Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Stefan Liebler <stli@linux.ibm.com>	2022-06-21 10:16:44 +02:00
Noah Goldstein	e5446dfea1	i386: Fix include paths for strspn, strcspn, and strpbrk commit `c22eb807b0` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Thu Jun 16 15:07:12 2022 -0700 x86: Rename generic functions with unique postfix for clarity Changed the names of the strspn-c, strcspn-c, and strpbrk-c files in a general refactor. It didn't change the include paths for the i386 files breaking the i386 build. This commit fixes that. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-06-17 16:25:27 -07:00
H.J. Lu	33ead02758	elf: Silence GCC 11/12 false positive warning Silence GCC 11/12 false positive warning with -mavx512f on dl-load.c: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106008 $ gcc -O2 -fPIC -march=x86-64 -mavx512f -S -Wall ... dl-load.c: In function ‘_dl_map_object_from_fd.constprop’: dl-load.c:1158:30: warning: ‘(((char *)loadcmds.113_68 + _933 + 16))[329406144173384849].mapend’ may be used uninitialized [-Wmaybe-uninitialized]	2022-06-17 15:18:10 -07:00
Noah Goldstein	c22eb807b0	x86: Rename generic functions with unique postfix for clarity No functions are changed. It just renames generic implementations from '{func}_sse2' to '{func}_generic'. This is just because the postfix "_sse2" was overloaded and was used for files that had hand-optimized sse2 assembly implementations and files that just redirected back to the generic implementation. Full xcheck passed on x86_64.	2022-06-16 20:17:45 -07:00
Noah Goldstein	8da9f346cb	x86: Add BMI1/BMI2 checks for ISA_V3 check BMI1/BMI2 are part of the ISA V3 requirements: https://en.wikipedia.org/wiki/X86-64 And defined by GCC when building with `-march=x86-64-v3`	2022-06-16 20:17:45 -07:00
Fangrui Song	4ef05df5ef	x86-64: Handle fewer relocation types for RTLD_BOOTSTRAP The RTLD_BOOTSTRAP branch is used to relocate ld.so itself. It only needs to handle RELATIVE, GLOB_DAT, and JUMP_SLOT. RELATIVE has been handled (by _ELF_DYNAMIC_DO_RELOC due to DT_RELACOUNT, or RELR), so the switch statement only needs to handle GLOB_DAT and JUMP_SLOT. We can drop these `#if[n]def RTLD_BOOTSTRAP` and add a large `# ifndef RTLD_BOOTSTRAP` instead.	2022-06-16 11:48:15 -07:00
Fangrui Song	e89913d0aa	aarch64: Handle fewer relocations for RTLD_BOOTSTRAP The RTLD_BOOTSTRAP branch is used to relocate ld.so itself. It only needs to handle RELATIVE, GLOB_DAT, and JUMP_SLOT. TLSDESC/TLS_DTPMOD/TLS_DTPREL handling can be removed. Remove `case AARCH64_R(RELATIVE)` as well as elf_machine_rela has checked it. Tested on aarch64-linux-gnu.	2022-06-15 19:21:53 -07:00
Fangrui Song	57919813e7	riscv: Change the relocations handled for RTLD_BOOTSTRAP The RTLD_BOOTSTRAP branch is used to relocate ld.so itself. It only needs to handle RELATIVE, GLOB_DAT, and the symbolic relocation type (R_RISCV_{32,64}). NONE and IRELATIVE can be removed. The code relies on ld.so having DT_RELACOUNT so that the RTLD_BOOTSTRAP branch does not need to handle RELATIVE. Drop this minor size optimization for clarity. Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-06-15 18:42:03 -07:00
Noah Goldstein	89a25c6f64	x86: Cleanup bounds checking in large memcpy case 1. Fix incorrect lower-bound threshold in L(large_memcpy_2x). Previously was using `__x86_rep_movsb_threshold` and should have been using `__x86_shared_non_temporal_threshold`. 2. Avoid reloading __x86_shared_non_temporal_threshold before the L(large_memcpy_4x) bounds check. 3. Document the second bounds check for L(large_memcpy_4x) more clearly.	2022-06-15 14:25:55 -07:00
Noah Goldstein	b446822b6a	x86: Add bounds `x86_non_temporal_threshold` The lower-bound (16448) and upper-bound (SIZE_MAX / 16) are assumed by memmove-vec-unaligned-erms. The lower-bound is needed because memmove-vec-unaligned-erms unrolls the loop aggressively in the L(large_memset_4x) case. The upper-bound is needed because memmove-vec-unaligned-erms right-shifts the value of `x86_non_temporal_threshold` by LOG_4X_MEMCPY_THRESH (4) which without a bound may overflow. The lack of lower-bound can be a correctness issue. The lack of upper-bound cannot.	2022-06-15 14:25:55 -07:00
Fangrui Song	686216945a	Remove remnant reference to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA This fixes nios2 build after commit `de38b2a343`.	2022-06-15 13:02:17 -07:00
Fangrui Song	de38b2a343	elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA If an executable has copy relocations for extern protected data, that can only work if the library containing the definition is built with assumptions (a) the compiler emits GOT-generating relocations (b) the linker produces R_*_GLOB_DAT instead of R_*_RELATIVE. Otherwise the library uses its own definition directly and the executable accesses a stale copy. Note: the GOT relocations defeat the purpose of protected visibility as an optimization, but allow rtld to make the executable and library use the same copy when copy relocations are present, but it turns out this never worked perfectly. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange semantics when both a.so and b.so define protected var and the executable copy relocates var: b.so accesses its own copy even with GLOB_DAT. The behavior change is from commit `62da1e3b00` (x86) and then copied to nios2 (`ae5eae7cfc`) and arc (`0e7d930c4c`). Without ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA, b.so accesses the copy relocated data like a.so. There is now a warning for copy relocation on protected symbol since commit `7374c02b68`. It's extremely unlikely anyone relies on the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA behavior, so let's remove it: this removes a check in the symbol lookup code.	2022-06-15 11:29:55 -07:00
Noah Goldstein	ff439c4717	x86: Add sse42 implementation to strcmp's ifunc This has been missing since the ifuncs were added. The performance of SSE4.2 is preferable to SSE2. Measured on Tigerlake with N = 20 runs. Geometric Mean of all benchmarks SSE4.2 / SSE2: 0.906	2022-06-14 20:58:09 -07:00
Noah Goldstein	0355915514	x86: Fix misordered logic for setting `rep_movsb_stop_threshold` Move the setting of `rep_movsb_stop_threshold` to after the tunables have been collected so that the `rep_movsb_stop_threshold` (which is used to redirect control flow to the non_temporal case) will use any user value for `non_temporal_threshold` (set using glibc.cpu.x86_non_temporal_threshold)	2022-06-14 20:58:07 -07:00
Fangrui Song	7374c02b68	elf: Refine direct extern access diagnostics to protected symbol Refine commit `349b0441da`: 1. Copy relocations for extern protected data do not work properly, regardless whether GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS is used. It makes sense to produce a warning unconditionally. 2. Non-zero value of an undefined function symbol may break pointer equality, but may be benign in many cases (many programs don't take the address in the shared object then compare it with the address in the executable). Reword the diagnostic to be clearer. 3. Remove the unneeded condition !(undef_map->l_1_needed & GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS). If the executable does not have GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (can only occur in error cases), the diagnostic should be emitted as well. When the defining shared object has GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS, report an error to apply the intended enforcement.	2022-06-14 13:07:27 -07:00
Stefan Liebler	876cdf517d	Avoid -Wstringop-overflow= warning in iconv module. On s390x when compiling with GCC 12, I get this warning: utf8-utf16-z9.c: ../iconv/loop.c: In function ‘__from_utf8_loop_etf3eh_single’: ../iconv/loop.c:445:22: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 445 | bytebuf[inlen++] = inptr++; | ~~~~~~~~~~~~~~~~~^~~~~~~~~~ ../iconv/loop.c:381:17: note: at offset 4 into destination object ‘bytebuf’ of size 4 381 | unsigned char bytebuf[MAX_NEEDED_INPUT]; | ^~~~~~~ ../iconv/loop.c:445:22: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 445 | bytebuf[inlen++] = inptr++; | ~~~~~~~~~~~~~~~~~^~~~~~~~~~ ../iconv/loop.c:381:17: note: at offset 5 into destination object ‘bytebuf’ of size 4 381 | unsigned char bytebuf[MAX_NEEDED_INPUT]; | ^~~~~~~ This patch tells the compiler that inend is always behind inptr which avoids the warning. Note that the SINGLE function is only used to implement the mbtowc() or wctomb() functions. Those functions use inptr and inend pointing to a variable on stack, compute the inend pointer or explicitly check the arguments which always leads to inptr < inend. Special notes for backporters (according to Siddhesh Poyarekar): If someone wants to backport this patch to release branches, they should also backport the following wcrtomb change. Otherwise the assumptions assumed by this patch are not true. commit `9bcd12d223` Author: Siddhesh Poyarekar <siddhesh@sourceware.org> Date: Fri May 13 19:10:15 2022 +0530 wcrtomb: Make behavior POSIX compliant Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-06-14 11:03:06 +02:00
Wilco Dijkstra	fdaf78656f	Add bounds check to __libc_ifunc_impl_list Add a proper bounds check to __libc_ifunc_impl_list. This makes MAX_IFUNC redundant and fixes several targets that will write outside the array. To avoid unnecessary large diffs, pass the maximum in the argument 'i' to IFUNC_IMPL_ADD - 'max' can be used in new ifunc definitions and existing ones can be updated if desired. Passes buildmanyglibc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-06-10 17:13:29 +01:00
Wilco Dijkstra	f107b7b30d	libio: Avoid RMW of flags2 outside lock (BZ #27842) Remove an unconditional RMW on flags2 in flockfile - we don't need to change _IO_FLAGS2_NEED_LOCK since it isn't used in flockfile or funlockfile. This fixes BZ #27842. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-06-10 13:35:57 +01:00
Noah Goldstein	cffb9414c5	x86: Optimize svml_s_tanhf4_core_sse4.S Optimizations are: 1. Reduce code size (-112 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata size (-4k+ rodata is shared with avx2). Result is roughly a 15-16% speedup: Function, New Time, Old Time, New / Old _ZGVbN4v_tanhf, 3.158, 3.749, 0.842	2022-06-09 12:51:25 -07:00
Noah Goldstein	bcc41f66a4	x86: Optimize svml_s_tanhf8_core_avx2.S Optimizations are: 1. Reduce code size (-81 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata size (-32 bytes). Result is roughly a 17-18% speedup: Function, New Time, Old Time, New / Old _ZGVdN8v_tanhf, 1.977, 2.402, 0.823	2022-06-09 12:51:22 -07:00
Noah Goldstein	3a49ce8799	x86: Add data file that can be shared by tanhf-avx2 and tanhf-sse4 tanhf-avx2 and tanhf-sse4 use the same data tables so we can save over 4kb using a shared datatable. This does increase the memory footprint of the sse4 version (as now all the targets are 32 bytes instead of 16), generally it seems worth the code size save. NB: This patch doesn't do anything itself, it is setup for future patches.	2022-06-09 12:51:15 -07:00
Noah Goldstein	e560b3c2d2	x86: Optimize svml_s_tanhf16_core_avx512.S Optimizations are: 1. Reduce code size (-67 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Reduce rodata usage (-448 bytes). Result is roughly a 14% speedup: Function, New Time, Old Time, New / Old _ZGVeN16v_tanhf, 0.649, 0.752, 0.863	2022-06-09 12:51:12 -07:00
Noah Goldstein	fe1915d4f6	x86: Improve svml_s_atanhf4_core_sse4.S Improvements are: 1. Reduce code size (-62 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata usage (-16 bytes). The throughput improvement is not significant as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVbN4v_atanhf, 8.821, 8.903, 0.991	2022-06-09 12:51:09 -07:00
Noah Goldstein	65897e9916	x86: Improve svml_s_atanhf8_core_avx2.S Improvements are: 1. Reduce code size (-60 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Shrink rodata usage (-32 bytes). The throughput improvement is not that significant (3-5%) as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVdN8v_atanhf, 2.799, 2.923, 0.958	2022-06-09 12:51:04 -07:00
Noah Goldstein	73bae395cf	x86: Improve svml_s_atanhf16_core_avx512.S Improvements are: 1. Reduce code size (-64 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Reduce rodata size ([-128, -188] bytes). The throughput improvement is not significant as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVeN16v_atanhf, 1.39, 1.408, 0.987	2022-06-09 12:50:58 -07:00
Noah Goldstein	0f91811333	x86: Align varshift table to 32-bytes This ensures the load will never split a cache line.	2022-06-09 12:50:26 -07:00
Noah Goldstein	4654e7fd5a	x86: Add copyright to strpbrk-c.c	2022-06-09 12:50:00 -07:00
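
The svml tanhf/atanhf commits above each report a "New / Old" timing ratio alongside a rough speedup percentage. As a minimal sketch (not part of the glibc tree), the Python below recomputes those ratios from the timings quoted in the commit messages. The helper names `new_over_old` and `time_reduction_percent` are hypothetical, and the sketch assumes that "speedup" in those messages means the fraction of the old run time saved.

```python
# Timings (New Time, Old Time) and the reported New / Old ratio, copied from
# the svml tanhf/atanhf commit messages in the log above.
reported = {
    "_ZGVbN4v_tanhf": (3.158, 3.749, 0.842),
    "_ZGVdN8v_tanhf": (1.977, 2.402, 0.823),
    "_ZGVeN16v_tanhf": (0.649, 0.752, 0.863),
    "_ZGVbN4v_atanhf": (8.821, 8.903, 0.991),
    "_ZGVdN8v_atanhf": (2.799, 2.923, 0.958),
    "_ZGVeN16v_atanhf": (1.39, 1.408, 0.987),
}


def new_over_old(new_time, old_time):
    """Ratio of new to old run time; values below 1.0 mean the new code is faster."""
    return new_time / old_time


def time_reduction_percent(new_time, old_time):
    """Percentage of the old run time saved by the new code (assumed meaning of 'speedup')."""
    return (1.0 - new_over_old(new_time, old_time)) * 100.0


for name, (new_time, old_time, ratio) in reported.items():
    computed = new_over_old(new_time, old_time)
    saved = time_reduction_percent(new_time, old_time)
    print(f"{name}: New / Old = {computed:.3f} (reported {ratio}), {saved:.1f}% less time")
```

Read this way, a ratio of 0.842 corresponds to a bit under 16% less run time, which is consistent with the "roughly a 15-16% speedup" wording in the tanhf4 commit. The atanhf ratios close to 1.0 match the messages' remark that throughput gains there are small.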

39192 Commits