glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-22 21:10:07 +00:00

Author	SHA1	Message	Date
Adhemerval Zanella	4a76fb1da8	powerpc: Remove power8 strcasestr optimization Similar to strstr (`1e9a550ba4`), power8 strcasestr does not show much improvement compared to the generic implementation. The geomean on bench-strcasestr shows: __strcasestr_power8 __strcasestr_ppc power10 1159 1120 power9 1640 1469 power8 1787 1904 The strcasestr uses the same 'trick' as power7 strstr to detect potential quadratic behavior, which only adds overhead for inputs that trigger quadratic behavior and is really a hack. Checked on powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-03-12 17:11:01 -03:00
Adhemerval Zanella	1e9a550ba4	powerpc: Remove power7 strstr optimization The optimization is not faster than the generic algorithm, using the bench-strstr the geometric mean running on a POWER10 machine using gcc 13.1.1 is 482.47 while the default __strstr_ppc is 340.97 (which uses the generic implementation). Also, there is no need to redirect the internal str/mem call to optimized version, internal ifunc is supported and enabled for internal calls (meaning that the generic implementation will use any asm optimization if available). Checked on powerpc64le-linux-gnu. Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-02-23 08:50:00 -03:00
Joseph Myers	83d8d289b2	Rename c2x / gnu2x tests to c23 / gnu23 Complete the internal renaming from "C2X" and related names in GCC by renaming -c2x and -gnu2x tests to -c23 and -gnu23. Tested for x86_64, and with build-many-glibcs.py for powerpc64le.	2024-02-01 17:55:57 +00:00
Adhemerval Zanella Netto	ae4b8d6a0e	string: Use builtins for ffs and ffsll This allows removing a lot of arch-specific implementations. Checked on x86_64, aarch64, powerpc64. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-02-01 09:31:33 -03:00
Joseph Myers	42cc619dfb	Refer to C23 in place of C2X in glibc WG14 decided to use the name C23 as the informal name of the next revision of the C standard (notwithstanding the publication date in 2024). Update references to C2X in glibc to use the C23 name. This is intended to update everything except where it involves renaming files (the changes involving renaming tests are intended to be done separately). In the case of the _ISOC2X_SOURCE feature test macro - the only user-visible interface involved - support for that macro is kept for backwards compatibility, while adding _ISOC23_SOURCE. Tested for x86_64.	2024-02-01 11:02:01 +00:00
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
Adhemerval Zanella	ecb1e7220d	powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending on the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make both functions follow C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for a value different than 0, instead of bailing out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:34 -03:00
Manjunath Matti	93a739d4a1	powerpc: Add space for HWCAP3/HWCAP4 in the TCB for future Power. This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc. These hardware capabilities bits will be used by future Power architectures. Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises the availability of the new HWCAP3/HWCAP4 data in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2023-12-15 20:20:14 -06:00
Amrita H S	90bcc8721e	powerpc: Fix performance issues of strcmp power10 The current implementation of strcmp for power10 has performance regressions for multiple small size and alignment combinations. Most of these performance issues are fixed by this patch. The compare loop is unrolled and page crossings of the unrolled loop are handled. Thanks to Paul E. Murphy for helping in fixing the performance issues. Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com> Co-Authored-By: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-12-15 16:42:40 -06:00
MAHESH BODAPATI	b9182c793c	powerpc : Add optimized memchr for POWER10 Optimized memchr for POWER10 based on existing rawmemchr and strlen. Reordering instructions and loop unrolling helped in getting better performance. Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-12-14 14:40:14 -06:00
Amrita H S	3367d8e180	powerpc: Optimized strcmp for power10 This patch is based on __strcmp_power9 and __strlen_power10. Improvements from __strcmp_power9: 1. Uses new POWER10 instructions - This code uses lxvp to decrease contention on load by loading 32 bytes per instruction. 2. Performance implication - This version has around 30% better performance on average. - Performance regression is seen for a specific combination of sizes and alignments. Some of them are observed without the changes too, while the rest may be induced by the patch. Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2023-12-07 11:10:40 -06:00
Adhemerval Zanella	55f41ef8de	elf: Remove LD_PROFILE for static binaries The _dl_non_dynamic_init does not parse LD_PROFILE, which does not enable profile for dlopen objects. Since dlopen is deprecated for static objects, it is better to remove the support. It also allows to trim down libc.a of profile support. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Manjunath Matti	4eac1825ed	fegetenv_and_set_rn now uses the builtins provided by GCC. On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the prologue get/save/set rounding mode operations for POWER9 and later by using 'mffscrn' where possible, this was introduced by commit `f1c56cdff0`. GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn which now returns the FPSCR fields in a double. This feature is available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined. GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds this feature. Changes are done to use __builtin_set_fpscr_rn instead of mffscrn or mffscrni in __fe_mffscrn(rn). Suggested-by: Carl Love <cel@us.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-09-27 13:55:36 -03:00
Samuel Thibault	41d8c3bc33	powerpc longjmp: Fix build after chk hidden builtin fix `04bf7d2d8a` ("chk: Add and fix hidden builtin definitions for _chk") added an #undef for longjmp and siglongjmp to compensate for the definition in include/setjmp.h, but missed doing so for the powerpc version too. Fixes: `04bf7d2d8a` ("chk: Add and fix hidden builtin definitions for _chk")	2023-08-04 10:03:59 +02:00
Mahesh Bodapati	21841f0d56	PowerPC: Influence cpu/arch hwcap features via GLIBC_TUNABLES This patch enables the option to influence hwcaps used by PowerPC. The environment variable, GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz...., can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature xxx and zzz, where the feature name is case-sensitive and has to match the ones mentioned in the file sysdeps/powerpc/dl-procinfo.c. Note that the hwcap tunables are only used in the IFUNC selection. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-08-01 07:41:17 -05:00
Adhemerval Zanella Netto	648c3b574d	powerpc: Fix powerpc64 strchrnul build with old gcc The compiler might not see that the internal definition is an alias due to the libc_ifunc macro, which redefines __strchrnul. With gcc 6 it fails with: In file included from <command-line>:0:0: ./../include/libc-symbols.h:472:33: error: ‘__EI___strchrnul’ aliased to undefined symbol ‘__GI___strchrnul’ extern thread __typeof (name) __EI_##name \ ^ ./../include/libc-symbols.h:468:3: note: in expansion of macro ‘__hidden_ver2’ __hidden_ver2 (, local, internal, name) ^~~~~~~~~~~~~ ./../include/libc-symbols.h:476:29: note: in expansion of macro ‘__hidden_ver1’ # define hidden_def(name) __hidden_ver1(__GI_##name, name, name); ^~~~~~~~~~~~~ ./../include/libc-symbols.h:557:32: note: in expansion of macro ‘hidden_def’ # define libc_hidden_def(name) hidden_def (name) ^~~~~~~~~~ ../sysdeps/powerpc/powerpc64/multiarch/strchrnul.c:38:1: note: in expansion of macro ‘libc_hidden_def’ libc_hidden_def (__strchrnul) ^~~~~~~~~~~~~~~ Use libc_ifunc_hidden as stpcpy. Checked on powerpc64 with gcc 6 and gcc 13. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-07-26 09:45:22 -03:00
Siddhesh Poyarekar	c6cb8783b5	configure: Use autoconf 2.71 Bump autoconf requirement to 2.71 to allow regenerating configure on more recent distributions. autoconf 2.71 has been in Fedora since F36 and is the current version in Debian stable (bookworm). It appears to be current in Gentoo as well. All sysdeps configure and preconfigure scripts have also been regenerated; all changes are trivial transformations that do not affect functionality. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-07-17 10:08:10 -04:00
Adhemerval Zanella	dddc88587a	sparc: Fix la_symbind for bind-now (BZ 23734) The sparc ABI has multiple cases on how to handle JMP_SLOT relocations, (sparc_fixup_plt/sparc64_fixup_plt). For BINDNOW, _dl_audit_symbind will be responsible for setting up the final relocation value; while for lazy binding _dl_fixup/_dl_profile_fixup will call the audit callback and tail call elf_machine_fixup_plt (which will call sparc64_fixup_plt). This patch fixes it by issuing the SPARC specific routine on bindnow and forwarding the audit value to elf_machine_fixup_plt for lazy resolution. It fixes the la_symbind for bind-now tests on sparc64 and sparcv9: elf/tst-audit24a elf/tst-audit24b elf/tst-audit24c elf/tst-audit24d Checked on sparc64-linux-gnu and sparcv9-linux-gnu. Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2023-07-12 15:29:08 -03:00
Frederic Berat	d636339306	sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: Fix warn unused result The fread routine return value needs to be checked when fortification is enabled, hence use xfread helper. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-06-22 00:21:17 -04:00
Paul Pluzhnikov	6b3ddc9ae5	Regenerate configure fragment -- BZ 25337. In commit `0b25c28e02` I updated configure.ac but neglected to regenerate updated configure. Fix this here.	2023-05-23 16:21:29 +00:00
Paul Pluzhnikov	0b25c28e02	Fix misspellings in sysdeps/powerpc -- BZ 25337 All fixes are in comments, so the binaries should be identical before/after this commit, but I can't verify this. Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-05-23 15:23:09 +00:00
Andreas Schwab	ea08d8dcea	Remove last remnants of have-protected	2023-05-22 13:31:04 +02:00
Mahesh Bodapati	36cc908ed5	powerpc:GCC(<10) doesn't allow -mlong-double-64 after -mabi=ieeelongdouble Removed -mabi=ieeelongdouble on failing tests. It resolves the error. error: ‘-mabi=ieeelongdouble’ requires ‘-mlong-double-128’	2023-05-19 17:35:01 -05:00
Sachin Monga	1a57ab0c92	Added Redirects to longdouble error functions [BZ #29033 ] This patch redirects the error functions to the appropriate longdouble variants which enables the compiler to optimize for the abi ieeelongdouble. Signed-off-by: Sachin Monga <smonga@linux.ibm.com>	2023-05-10 13:59:48 -05:00
Adhemerval Zanella	59db5735e6	powerpc: Disable stack protector in early static initialization Similar to `fb95c31638`, also disable for string-ppc64.c (pulled on rtld as the default string implementation). Checked on powerpc64-linux-gnu.	2023-04-03 17:42:08 -03:00
Adhemerval Zanella Netto	33237fe83d	Remove --enable-tunables configure option And make it always supported. The configure option was added in glibc 2.25 and some features require it (such as hwcap mask, huge pages support, and lock elision tuning). It also simplifies the build permutations. Changes from v1: * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs more discussion. * Cleanup more code. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-03-29 14:33:06 -03:00
Adhemerval Zanella Netto	92fdb11ae7	powerpc: Remove powerpc64 strncmp variants The default and power7 implementations just add word-aligned access when inputs have the same alignment. The unaligned case is still done by byte operations. This is already covered by the generic implementation, which also adds the unaligned input optimization. Checked on powerpc64-linux-gnu built without multi-arch for powerpc64, power7, power8, and power9 (build for le). Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-03-02 16:41:43 -03:00
Adhemerval Zanella Netto	a46bb1523d	powerpc: Remove strncmp variants The default, power4, and power7 implementations just add word-aligned access when inputs have the same alignment. The unaligned case is still done by byte operations. This is already covered by the generic implementation, which also adds the unaligned input optimization. Checked on powerpc-linux-gnu built without multi-arch for powerpc, power4, and power7. Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-03-02 16:41:43 -03:00
Mahesh Bodapati	56fc4b45c0	powerpc:Regenerate ulps for hypot For new inputs added in commit `3efbf11fdf`, regenerate the ulps of hypot from 0(default) to 1	2023-02-23 22:06:03 -06:00
Adhemerval Zanella	22999b2f0f	string: Add libc_hidden_proto for memrchr Although static linker can optimize it to local call, it follows the internal scheme to provide hidden proto and definitions. Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>	2023-02-08 17:13:58 -03:00
Adhemerval Zanella	7ea510127e	string: Add libc_hidden_proto for strchrnul Although static linker can optimize it to local call, it follows the internal scheme to provide hidden proto and definitions. Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>	2023-02-08 17:13:56 -03:00
Richard Henderson	080685c90f	powerpc: Add string-fza.h While ppc has the more important string functions in assembly, there are still a few generic routines used. Use the Power 6 CMPB insn for testing of zeros. Checked on powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-02-06 16:19:35 -03:00
Adhemerval Zanella	0f4254311e	string: Improve generic strnlen with memchr It also cleanups the multiple inclusion by leaving the ifunc implementation to undef the weak_alias and libc_hidden_def. Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-02-06 16:19:35 -03:00
Adhemerval Zanella	2a8867a17f	string: Improve generic memchr The new algorithm reads the first aligned address and masks off the unwanted bytes (this strategy is similar to arch-specific implementations used on powerpc, sparc, and sh). The loop now reads word-aligned addresses and checks using the has_eq macro. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-02-06 16:19:35 -03:00
Adhemerval Zanella	685e844a97	string: Improve generic strchrnul The new algorithm reads the first aligned address and masks off the unwanted bytes (this strategy is similar to arch-specific implementations used on powerpc, sparc, and sh). The loop now reads word-aligned addresses and checks using the has_zero_eq function. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu, and powerpc-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-02-06 16:19:35 -03:00
Richard Henderson	d45890b28c	Parameterize OP_T_THRES from memcopy.h It moves OP_T_THRES out of memcopy.h to its own header and adjust each architecture that redefines it. Checked with a build and check with run-built-tests=no for all major Linux ABIs. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2023-02-06 16:19:35 -03:00
Joseph Myers	6d7e8eda9b	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
Rajalakshmi Srinivasaraghavan	2f47198b04	powerpc64: Remove old strncmp optimization This patch cleans up the power4 strncmp optimization for powerpc64 which is unlikely to be used anywhere. Tested on ppc64le with and without --disable-multi-arch flag. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-12-02 14:26:41 -06:00
Alan Modra	94628de778	elf/tst-tlsopt-powerpc fails when compiled with -mcpu=power10 (BZ# 29776) Supports pcrel addressing of TLS GOT entry. Also tweak the non-pcrel asm constraint to better reflect how the reg is used.	2022-11-14 22:04:25 +10:30
Florian Weimer	1f34a23288	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249) This makes it more likely that the compiler can compute the strlen argument in _startup_fatal at compile time, which is required to avoid a dependency on strlen this early during process startup. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2022-11-03 17:28:03 +01:00
Adhemerval Zanella	5c5a8b99cf	Disable use of -fsignaling-nans if compiler does not support it Reviewed-by: Fangrui Song <maskray@google.com>	2022-11-01 09:46:08 -03:00
Joseph Myers	f66780ba46	Fix build with GCC 13 _FloatN, _FloatNx built-in functions GCC 13 has added more _FloatN and _FloatNx versions of existing <math.h> and <complex.h> built-in functions, for use in libstdc++-v3. This breaks the glibc build because of how those functions are defined as aliases to functions with the same ABI but different types. Add appropriate -fno-builtin-* options for compiling relevant files, as already done for the case of long double functions aliasing double ones and based on the list of files used there. I fixed some mistakes in that list of double files that I noticed while implementing this fix, but there may well be more such (harmless) cases, in this list or the new one (files that don't actually exist or don't define the named functions as aliases so don't need the options). I did try to exclude cases where glibc doesn't define certain functions for _FloatN or _FloatNx types at all from the new uses of -fno-builtin-* options. As with the options for double files (see the commit message for commit `49348beafe`, "Fix build with GCC 10 when long double = double."), it's deliberate that the options are used even if GCC currently doesn't have a built-in version of a given functions, so providing some level of future-proofing against more such built-in functions being added in future. Tested with build-many-glibcs.py for aarch64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu (compilers and glibcs builds) with GCC mainline.	2022-10-31 23:20:08 +00:00
Florian Weimer	58548b9d68	Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources In the future, this will result in a compilation failure if the macros are unexpectedly undefined (due to header inclusion ordering or header inclusion missing altogether). Assembler sources are more difficult to convert. In many cases, they are hand-optimized for the mangling and no-mangling variants, which is why they are not converted. sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c are special: These are C sources, but most of the implementation is in assembler, so the PTR_DEMANGLE macro has to be undefined in some cases, to match the assembler style. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-18 17:04:10 +02:00
Florian Weimer	88f4b6929c	Introduce <pointer_guard.h>, extracted from <sysdep.h> This allows us to define a generic no-op version of PTR_MANGLE and PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources, avoiding an unintended loss of hardening due to missing include files or unlucky header inclusion ordering. In i386 and x86_64, we can avoid a <tls.h> dependency in the C code by using the computed constant from <tcb-offsets.h>. <sysdep.h> no longer includes these definitions, so there is no cyclic dependency anymore when computing the <tcb-offsets.h> constants. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-18 17:03:55 +02:00
Adhemerval Zanella	5355f9ca7b	elf: Remove -fno-tree-loop-distribute-patterns usage on dl-support Besides the option being gcc specific, this approach is still fragile and not future proof since we do not know if this will be the only optimization option gcc will add that transforms loops to memset (or any libcall). This patch adds a new header, dl-symbol-redir-ifunc.h, that can be used to redirect the compiler generated libcalls to port the generic memset implementation if required. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-10-10 10:32:28 -03:00
Javier Pello	ab40f20364	elf: Remove _dl_string_hwcap Removal of legacy hwcaps support from the dynamic loader left no users of _dl_string_hwcap. Signed-off-by: Javier Pello <devel@otheo.eu> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-06 07:59:48 -03:00
Joseph Myers	3e5760fcb4	Update _FloatN header support for C++ in GCC 13 GCC 13 adds support for _FloatN and _FloatNx types in C++, so breaking the installed glibc headers that assume such support is not present. GCC mostly works around this with fixincludes, but that doesn't help for building glibc and its tests (glibc doesn't itself contain C++ code, but there's C++ code built for tests). Update glibc's bits/floatn-common.h and bits/floatn.h headers to handle the GCC 13 support directly. In general the changes match those made by fixincludes, though I think the ones in sysdeps/powerpc/bits/floatn.h, where the header tests __LDBL_MANT_DIG__ == 113 or uses #elif, wouldn't match the existing fixincludes patterns. Some places involving special C++ handling in relation to _FloatN support are not changed. There's no need to change the __HAVE_FLOATN_NOT_TYPEDEF definition (also in a form that wouldn't be matched by the fixincludes fixes) because it's only used in relation to macro definitions using features not supported for C++ (__builtin_types_compatible_p and _Generic). And there's no need to change the inline function overloads for issignaling, iszero and iscanonical in C++ because cases where types have the same format but are no longer compatible types are handled automatically by the C++ overload resolution rules. This patch also does not change the overload handling for iseqsig, and there I think changes are needed, beyond those in this patch or made by fixincludes. The way that overload is defined, via a template parameter to a structure type, requires overloads whenever the types are incompatible, even if they have the same format. So I think we need to add overloads with GCC 13 for every supported _FloatN and _FloatNx type, rather than just having one for _Float128 when it has a different ABI to long double as at present (but for older GCC, such overloads must not be defined for types that end up defined as typedefs for another type). 
Tested with build-many-glibcs.py: compilers build for aarch64-linux-gnu ia64-linux-gnu mips64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu; glibcs build for aarch64-linux-gnu ia64-linux-gnu i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu.	2022-09-28 20:10:08 +00:00
Wilco Dijkstra	22f4ab2d20	Use atomic_exchange_release/acquire Rename atomic_exchange_rel/acq to use atomic_exchange_release/acquire since these map to the standard C11 atomic builtins. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-26 16:58:08 +01:00
Jason A. Donenfeld	eaad4f9e8f	arc4random: simplify design for better safety Rather than buffering 16 MiB of entropy in userspace (by way of chacha20), simply call getrandom() every time. This approach is doubtlessly slower, for now, but trying to prematurely optimize arc4random appears to be leading toward all sorts of nasty properties and gotchas. Instead, this patch takes a much more conservative approach. The interface is added as a basic loop wrapper around getrandom(), and then later, the kernel and libc together can work together on optimizing that. This prevents numerous issues in which userspace is unaware of when it really must throw away its buffer, since we avoid buffering all together. Future improvements may include userspace learning more from the kernel about when to do that, which might make these sorts of chacha20-based optimizations more possible. The current heuristic of 16 MiB is meaningless garbage that doesn't correspond to anything the kernel might know about. So for now, let's just do something conservative that we know is correct and won't lead to cryptographic issues for users of this function. This patch might be considered along the lines of, "optimization is the root of all evil," in that the much more complex implementation it replaces moves too fast without considering security implications, whereas the incremental approach done here is a much safer way of going about things. Once this lands, we can take our time in optimizing this properly using new interplay between the kernel and userspace. getrandom(0) is used, since that's the one that ensures the bytes returned are cryptographically secure. But on systems without it, we fallback to using /dev/urandom. This is unfortunate because it means opening a file descriptor, but there's not much of a choice. 
Secondly, as part of the fallback, in order to get more or less the same properties of getrandom(0), we poll on /dev/random, and if the poll succeeds at least once, then we assume the RNG is initialized. This is a rough approximation, as the ancient "non-blocking pool" initialized after the "blocking pool", not before, and it may not port back to all ancient kernels, though it does to all kernels supported by glibc (≥3.2), so generally it's the best approximation we can do. The motivation for including arc4random, in the first place, is to have source-level compatibility with existing code. That means this patch doesn't attempt to litigate the interface itself. It does, however, choose a conservative approach for implementing it. Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Cristian Rodríguez <crrodriguez@opensuse.org> Cc: Paul Eggert <eggert@cs.ucla.edu> Cc: Mark Harris <mark.hsj@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: linux-crypto@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-07-27 08:58:27 -03:00
Adhemerval Zanella Netto	b7060acfe8	powerpc64: Add optimized chacha20 It adds vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-ppc.c. It targets POWER8 and it is used on default for LE. On a POWER8 it shows the following improvements (using formatted bench-arc4random data): POWER8 GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 138.77 arc4random_buf(16) [single-thread] 174.36 arc4random_buf(32) [single-thread] 228.11 arc4random_buf(48) [single-thread] 252.31 arc4random_buf(64) [single-thread] 270.11 arc4random_buf(80) [single-thread] 278.97 arc4random_buf(96) [single-thread] 287.78 arc4random_buf(112) [single-thread] 291.92 arc4random_buf(128) [single-thread] 295.25 POWER8 MB/s ----------------------------------------------- arc4random [single-thread] 198.06 arc4random_buf(16) [single-thread] 278.79 arc4random_buf(32) [single-thread] 448.89 arc4random_buf(48) [single-thread] 551.09 arc4random_buf(64) [single-thread] 646.12 arc4random_buf(80) [single-thread] 698.04 arc4random_buf(96) [single-thread] 756.06 arc4random_buf(112) [single-thread] 784.12 arc4random_buf(128) [single-thread] 808.04 ----------------------------------------------- Checked on powerpc64-linux-gnu and powerpc64le-linux-gnu. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2022-07-22 11:58:27 -03:00
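
Several of the string commits listed above (for example `2a8867a17f`, "string: Improve generic memchr") describe scanning memory one word at a time and testing each word for a matching byte. The following is only a minimal sketch of that general idea, not the glibc code: the names `word_has_zero`, `word_has_eq`, `repeat_byte`, and `sketch_memchr` are hypothetical, a 64-bit word is assumed, and the unaligned head is handled byte by byte instead of with the aligned-load-and-mask step the commit message mentions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names for illustration; not glibc's internal API.  */
typedef uint64_t word_t;

#define LSB_ONES ((word_t) 0x0101010101010101ULL)
#define MSB_ONES ((word_t) 0x8080808080808080ULL)

/* Broadcast one byte value into every byte of a word.  */
static inline word_t
repeat_byte (unsigned char c)
{
  return LSB_ONES * c;
}

/* True if any byte of X is zero (classic bit trick, exact as a boolean).  */
static inline bool
word_has_zero (word_t x)
{
  return ((x - LSB_ONES) & ~x & MSB_ONES) != 0;
}

/* True if X and Y have an equal byte at the same position.  */
static inline bool
word_has_eq (word_t x, word_t y)
{
  return word_has_zero (x ^ y);
}

/* Sketch of a word-at-a-time memchr.  Assumption: the head is scanned
   bytewise until the pointer is word-aligned, which is simpler than the
   mask-off approach described in the commit message.  */
void *
sketch_memchr (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  unsigned char ch = (unsigned char) c;
  word_t pattern = repeat_byte (ch);

  /* Head: advance bytewise until P is word-aligned.  */
  while (n > 0 && ((uintptr_t) p % sizeof (word_t)) != 0)
    {
      if (*p == ch)
        return (void *) p;
      p++;
      n--;
    }

  /* Body: test a whole word per iteration; stop at the first word that
     contains a matching byte.  */
  while (n >= sizeof (word_t))
    {
      word_t w;
      memcpy (&w, p, sizeof w);
      if (word_has_eq (w, pattern))
        break;
      p += sizeof w;
      n -= sizeof w;
    }

  /* Tail: locate the exact byte in the matching word or the remainder.  */
  while (n > 0)
    {
      if (*p == ch)
        return (void *) p;
      p++;
      n--;
    }
  return NULL;
}
```

Because the sketch only reports whether a word contains a match and then rescans that word bytewise, it does not depend on byte order; a production implementation would typically compute the byte index directly from the word.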

1548 Commits