glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-24 22:10:13 +00:00

Author	SHA1	Message	Date
Noah Goldstein	9a421348cd	elf: Optimize _dl_new_hash in dl-new-hash.h Unroll slightly and enforce good instruction scheduling. This improves performance on out-of-order machines. The unrolling allows for pipelined multiplies. As well, as an optional sysdep, reorder the operations and prevent reassociation for better scheduling and higher ILP. This commit only adds the barrier for x86, although it should be either no change or a win for any architecture. Unrolling further started to induce slowdowns for sizes [0, 4] but can help the loop so if larger sizes are the target further unrolling can be beneficial. Results for _dl_new_hash Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz Time as Geometric Mean of N=30 runs Geometric of all benchmark New / Old: 0.674 type, length, New Time, Old Time, New Time / Old Time fixed, 0, 2.865, 2.72, 1.053 fixed, 1, 3.567, 2.489, 1.433 fixed, 2, 2.577, 3.649, 0.706 fixed, 3, 3.644, 5.983, 0.609 fixed, 4, 4.211, 6.833, 0.616 fixed, 5, 4.741, 9.372, 0.506 fixed, 6, 5.415, 9.561, 0.566 fixed, 7, 6.649, 10.789, 0.616 fixed, 8, 8.081, 11.808, 0.684 fixed, 9, 8.427, 12.935, 0.651 fixed, 10, 8.673, 14.134, 0.614 fixed, 11, 10.69, 15.408, 0.694 fixed, 12, 10.789, 16.982, 0.635 fixed, 13, 12.169, 18.411, 0.661 fixed, 14, 12.659, 19.914, 0.636 fixed, 15, 13.526, 21.541, 0.628 fixed, 16, 14.211, 23.088, 0.616 fixed, 32, 29.412, 52.722, 0.558 fixed, 64, 65.41, 142.351, 0.459 fixed, 128, 138.505, 295.625, 0.469 fixed, 256, 291.707, 601.983, 0.485 random, 2, 12.698, 12.849, 0.988 random, 4, 16.065, 15.857, 1.013 random, 8, 19.564, 21.105, 0.927 random, 16, 23.919, 26.823, 0.892 random, 32, 31.987, 39.591, 0.808 random, 64, 49.282, 71.487, 0.689 random, 128, 82.23, 145.364, 0.566 random, 256, 152.209, 298.434, 0.51 Co-authored-by: Alexander Monakov <amonakov@ispras.ru> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
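The unrolling idea in the _dl_new_hash commit above can be illustrated with a minimal sketch. It assumes the usual GNU symbol hash recurrence (seed 5381, `h = h * 33 + c`); the function names `gnu_hash_simple` and `gnu_hash_unrolled` are hypothetical, and this is not the code from dl-new-hash.h (it has no scheduling barrier and unrolls only by two).

```c
#include <stdint.h>

/* Reference form: one multiply per byte, each depending on the previous h. */
static uint32_t gnu_hash_simple(const char *s)
{
  uint32_t h = 5381;
  for (const unsigned char *p = (const unsigned char *) s; *p != '\0'; ++p)
    h = h * 33 + *p;
  return h;
}

/* Hypothetical 2x unrolled form.  Because
   (h * 33 + c0) * 33 + c1 == h * 1089 + c0 * 33 + c1 (mod 2^32),
   the multiply c0 * 33 does not depend on h and can overlap with h * 1089. */
static uint32_t gnu_hash_unrolled(const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  while (p[0] != '\0' && p[1] != '\0')
    {
      h = h * (33u * 33u) + (uint32_t) p[0] * 33u + p[1];
      p += 2;
    }
  if (p[0] != '\0')   /* odd-length tail */
    h = h * 33 + p[0];
  return h;
}
```

Both functions return the same value for any NUL-terminated string; only the dependency chain between iterations differs.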
Noah Goldstein	3d155d4b6c	nss: Optimize nss_hash in nss_hash.c The prior unrolling didn't really do much as it left the dependency chain between iterations. Unrolled the loop for 4 so 4x multiplies could be pipelined in out-of-order machines. Results for __nss_hash Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz Time as Geometric Mean of N=25 runs Geometric of all benchmark New / Old: 0.845 type, length, New Time, Old Time, New Time / Old Time fixed, 0, 4.019, 3.729, 1.078 fixed, 1, 4.95, 5.707, 0.867 fixed, 2, 5.152, 5.657, 0.911 fixed, 3, 4.641, 5.721, 0.811 fixed, 4, 5.551, 5.81, 0.955 fixed, 5, 6.525, 6.552, 0.996 fixed, 6, 6.711, 6.561, 1.023 fixed, 7, 6.715, 6.767, 0.992 fixed, 8, 7.874, 7.915, 0.995 fixed, 9, 8.888, 9.767, 0.91 fixed, 10, 8.959, 9.762, 0.918 fixed, 11, 9.188, 9.987, 0.92 fixed, 12, 9.708, 10.618, 0.914 fixed, 13, 10.393, 11.14, 0.933 fixed, 14, 10.628, 12.097, 0.879 fixed, 15, 10.982, 12.965, 0.847 fixed, 16, 11.851, 14.429, 0.821 fixed, 32, 24.334, 34.414, 0.707 fixed, 64, 55.618, 86.688, 0.642 fixed, 128, 118.261, 224.36, 0.527 fixed, 256, 256.183, 538.629, 0.476 random, 2, 11.194, 11.556, 0.969 random, 4, 17.516, 17.205, 1.018 random, 8, 23.501, 20.985, 1.12 random, 16, 28.131, 29.212, 0.963 random, 32, 35.436, 38.662, 0.917 random, 64, 45.74, 58.868, 0.777 random, 128, 75.394, 121.963, 0.618 random, 256, 139.524, 260.726, 0.535 Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
Noah Goldstein	319dddc143	benchtests: Add benchtests for dl_elf_hash, dl_new_hash and nss_hash Benchtests are for throughput and include random / fixed size benchmarks. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
Noah Goldstein	5f2f0f6977	nss: Add tests for the nss_hash in nss_hash.h If we want to further optimize the function tests are needed. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
Noah Goldstein	6fd435485f	elf: Add tests for the dl hash funcs (_dl_new_hash and _dl_elf_hash) If we want to further optimize the functions tests are needed. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
Noah Goldstein	c4bd509d47	elf: Refactor dl_new_hash so it can be tested / benchmarked No change to the code other than moving the function to dl-new-hash.h. Changed the name so it's now in the reserved namespace. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 10:38:40 -05:00
Florian Weimer	93ec1cf0fe	locale: Add more cached data to LC_CTYPE This data will be used in number formatting. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	7ee41feba6	locale: Remove private union from struct __locale_data This avoids an alias violation later. This commit also fixes an incorrect double-checked locking idiom in _nl_init_era_entries. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	bbebe83a28	locale: Remove cleanup function pointer from struct __localedata We can call the cleanup functions directly from _nl_unload_locale if we pass the category to it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	0b6342e769	locale: Call _nl_unload_locale from _nl_archive_subfreeres The function performs the same steps for ld_archive locales (mapped from an archive), and this code is not performance-critical, so the specialization does not add value. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	0060a6de54	stdio-common: Add tst-memstream-string for open_memstream overflow This code path is exercised indirectly by some of the DNS stub resolver tests, via their own use of xopen_memstream for constructing strings describing result data. The relative lack of test suite coverage became apparent when these tests started failing after a printf change uncovered bug 28949. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	b094c52b1b	__printf_fphex always uses LC_NUMERIC There is no hexadecimal currency printing. strfmon uses __printf_fp_l exclusively. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	859e7a00af	vfprintf: Consolidate some multibyte/wide character processing form_character and form_string processing are sufficiently similar that the logic can be shared. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	5442ea7ffe	vfprintf: Move argument processing into vfprintf-process-arg.c This simplifies formatting and helps with debugging. It also allows the use of localized COMPILE_WPRINTF preprocessor conditionals. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	21bb8382b6	stdio-common: Add tst-vfprintf-width-i18n to cover numeric field width Related to bug 28943 and bug 28944. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Sergei Trofimovich	5a5f94af05	string.h: fix __fortified_attr_access macro call [BZ #29162] commit `e938c0274` "Don't add access size hints to fortifiable functions" converted a few '__attr_access ((...))' into '__fortified_attr_access (...)' calls. But one of the conversions had double parentheses of '__fortified_attr_access (...)'. Noticed as a gnat6 build failure: /<<NIX>>-glibc-2.34-210-dev/include/bits/string_fortified.h:110:50: error: macro "__fortified_attr_access" requires 3 arguments, but only 1 given The change fixes the parentheses. This is seen when using compilers that do not support __builtin___stpncpy_chk, e.g. gcc older than 4.7, clang older than 2.6 or some compiler not derived from gcc or clang. Signed-off-by: Sergei Trofimovich <slyich@gmail.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-23 13:56:43 +05:30
H.J. Lu	2d5ec6692f	Enable DT_RELR in glibc shared libraries and PIEs automatically Enable DT_RELR in glibc shared libraries and position independent executables (PIE) automatically if linker supports -z pack-relative-relocs. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-05-18 13:04:13 -07:00
Stefan Liebler	728894dba4	S390: Enable static PIE This commit enables static PIE on 64bit. On 31bit, static PIE is not supported. A new configure check in sysdeps/s390/s390-64/configure.ac also performs a minimal test for requirements in ld: Ensure you also have those patches for: - binutils (ld) - "[PR ld/22263] s390: Avoid dynamic TLS relocs in PIE" https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26b1426577b5dcb32d149c64cca3e603b81948a9 (Tested by configure check above) Otherwise there will be a R_390_TLS_TPOFF relocation, which fails to be processed in _dl_relocate_static_pie() as static TLS map is not setup. - "s390: Add DT_JMPREL pointing to .rela.[i]plt with static-pie" https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d942d8db12adf4c9e5c7d9ed6496a779ece7149e (We can't test it in configure as we are not able to link a static PIE executable if the system glibc lacks static PIE support) Otherwise there won't be DT_JMPREL, DT_PLTRELA, DT_PLTRELASZ entries and the IFUNC symbols are not processed, which leads to crashes. - kernel (the mentioned links to the commits belong to 5.19 merge window): - "s390/mmap: increase stack/mmap gap to 128MB" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=f2f47d0ef72c30622e62471903ea19446ea79ee2 - "s390/vdso: move vdso mapping to its own function" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=57761da4dc5cd60bed2c81ba0edb7495c3c740b8 - "s390/vdso: map vdso above stack" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=9e37a2e8546f9e48ea76c839116fa5174d14e033 - "s390/vdso: add vdso randomization" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=41cd81abafdc4e58a93fcb677712a76885e3ca25 (We can't test the kernel of the target system) Otherwise if /proc/sys/kernel/randomize_va_space is turned off (0), static PIE executables like ldconfig will crash. 
While startup sbrk is used to enlarge the HEAP. Unfortunately the underlying brk syscall fails as there is not enough space after the HEAP. Then the address of the TLS image is invalid and the following memcpy in __libc_setup_tls() leads to a segfault. If /proc/sys/kernel/randomize_va_space is activated (default: 2), there is enough space after HEAP. - glibc - "Linux: Define MMAP_CALL_INTERNAL" https://sourceware.org/git/?p=glibc.git;a=commit;h=c1b68685d438373efe64e5f076f4215723004dfb - "i386: Remove OPTIMIZE_FOR_GCC_5 from Linux libc-do-syscall.S" https://sourceware.org/git/?p=glibc.git;a=commit;h=6e5c7a1e262961adb52443ab91bd2c9b72316402 - "i386: Honor I386_USE_SYSENTER for 6-argument Linux system calls" https://sourceware.org/git/?p=glibc.git;a=commit;h=60f0f2130d30cfd008ca39743027f1e200592dff - "ia64: Always define IA64_USE_NEW_STUB as a flag macro" https://sourceware.org/git/?p=glibc.git;a=commit;h=18bd9c3d3b1b6a9182698c85354578d1d58e9d64 - "Linux: Implement a useful version of _startup_fatal" https://sourceware.org/git/?p=glibc.git;a=commit;h=a2a6bce7d7e52c1c34369a7da62c501cc350bc31 - "Linux: Introduce __brk_call for invoking the brk system call" https://sourceware.org/git/?p=glibc.git;a=commit;h=b57ab258c1140bc45464b4b9908713e3e0ee35aa - "csu: Implement and use _dl_early_allocate during static startup" https://sourceware.org/git/?p=glibc.git;a=commit;h=f787e138aa0bf677bf74fa2a08595c446292f3d7 The mentioned patch series by Florian Weimer avoids the mentioned failing sbrk syscall by falling back to mmap. This commit also adjusts startup code in start.S to be ready for static PIE. We have to add a wrapper function for main as we are not allowed to use GOT relocations before __libc_start_main is called. (Compare also to: - commit `14d886edbd` "aarch64: fix start code for static pie" - commit `3d1d79283e` "aarch64: fix static pie enabled libc when main is in a shared library" )	2022-05-18 14:31:26 +02:00
Adhemerval Zanella	d2a1ec2097	linux: Add tst-pidfd.c To check for the pidfd functions pidfd_open, pidfd_getfd, pidfd_send_signal, and waitid with P_PIDFD. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:36:59 -03:00
Adhemerval Zanella	b3528b0048	linux: Add P_PIDFD It was added on Linux 5.4 (3695eae5fee0605f316fbaad0b9e3de791d7dfaf) to extend waitid to wait on pidfd. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:34:36 -03:00
Adhemerval Zanella	56cf9e8eec	linux: Add pidfd_send_signal This was added on Linux 5.1 (3eb39f47934f9d5a3027fe00d906a45fe3a15fad) as a way to avoid the race condition of using kill (where the PID might be reused by the kernel between obtaining the pid and sending the signal). If the siginfo_t argument is NULL then pidfd_send_signal is equivalent to kill. If it is not NULL pidfd_send_signal is equivalent to rt_sigqueueinfo. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:33:46 -03:00
Adhemerval Zanella	32dd8c251a	linux: Add pidfd_getfd This was added on Linux 5.6 (8649c322f75c96e7ced2fec201e123b2b073bf09) as a way to retrieve a file descriptor from another process through a pidfd (created with either CLONE_PIDFD or pidfd_open). The functionality is similar to recvmsg with SCM_RIGHTS. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:33:07 -03:00
Adhemerval Zanella	97f5d19c45	linux: Add pidfd_open This was added on Linux 5.3 (32fcb426ec001cb6d5a4a195091a8486ea77e2df) as a way to retrieve a pid file descriptor for a process that was not created with CLONE_PIDFD (i.e. by the usual fork/clone). Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:32:28 -03:00
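As a rough illustration of the race-free signalling described in the pidfd_send_signal commit above, the sketch below opens a pidfd and signals through it. It assumes a Linux system and deliberately uses raw `syscall` numbers rather than the new glibc wrappers, so it also builds against older glibc; the helper name `signal_via_pidfd` is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send SIG to PID through a pidfd.
   Returns 0 on success, -1 with errno set on failure.  */
static int signal_via_pidfd(pid_t pid, int sig)
{
#if defined SYS_pidfd_open && defined SYS_pidfd_send_signal
  /* The pidfd keeps referring to this process even if the numeric PID
     is later recycled, which is what avoids the kill race.  */
  int fd = (int) syscall(SYS_pidfd_open, pid, 0U);
  if (fd < 0)
    return -1;
  /* A NULL siginfo_t makes this behave like kill, per the commit text.  */
  int rc = (int) syscall(SYS_pidfd_send_signal, fd, sig, NULL, 0U);
  int saved_errno = errno;
  close(fd);
  errno = saved_errno;
  return rc;
#else
  /* Kernel headers too old to know the syscall numbers.  */
  (void) pid;
  (void) sig;
  errno = ENOSYS;
  return -1;
#endif
}
```

Calling `signal_via_pidfd(getpid(), 0)` performs only the existence and permission check, as signal 0 does with kill; on kernels older than 5.3 it is expected to fail with ENOSYS.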
Szabolcs Nagy	1da064c015	aarch64: Move ld.so _start to separate file and drop _dl_skip_args A separate asm file is easier to maintain than a macro that expands to inline asm. The RTLD_START macro is only needed now because _dl_start is local in rtld.c, but _start has to call it, if _dl_start was made hidden then it could be empty. _dl_skip_args is no longer needed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-17 10:14:03 +01:00
Szabolcs Nagy	9faf5262c7	linux: Add a getauxval test [BZ #23293] This is for bug 23293 and it relies on the glibc test system running tests via explicit ld.so invocation by default. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-17 10:14:03 +01:00
Szabolcs Nagy	86147bbeec	rtld: Remove DL_ARGV_NOT_RELRO and make _dl_skip_args const _dl_skip_args is always 0, so the target specific code that modifies argv after relro protection is applied is no longer used. After the patch relro protection is applied to _dl_argv consistently on all targets. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-17 10:14:03 +01:00
Szabolcs Nagy	ad43cac44a	rtld: Use generic argv adjustment in ld.so [BZ #23293] When an executable is invoked as ./ld.so [ld.so-args] ./exe [exe-args] then the argv is adjusted in ld.so before calling the entry point of the executable so ld.so args are not visible to it. On most targets this requires moving argv, env and auxv on the stack to ensure correct stack alignment at the entry point. This had several issues: - The code for this adjustment on the stack is written in asm as part of the target specific ld.so _start code which is hard to maintain. - The adjustment is done after _dl_start returns, where it's too late to update GLRO(dl_auxv), as it is already readonly, so it points to memory that was clobbered by the adjustment. This is bug 23293. - _environ is also wrong in ld.so after the adjustment, but it is likely not used after _dl_start returns so this is not user visible. - _dl_argv was updated, but for this it was moved out of relro, which changes security properties across targets unnecessarily. This patch introduces a generic _dl_start_args_adjust function that handles the argument adjustments after ld.so processed its own args and before relro protection is applied. The same algorithm is used on all targets, _dl_skip_args is now 0, so existing target specific adjustment code is no longer used. The bug affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc, other targets don't need the change in principle, only for consistency. The GNU Hurd start code relied on _dl_skip_args after dl_main returned, now it checks directly if args were adjusted and fixes the Hurd startup data accordingly. Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO. Tested on aarch64-linux-gnu and cross tested on i686-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-17 10:14:03 +01:00
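The user-visible symptom of bug 23293 is a wrong auxiliary vector when a program is started through an explicit ld.so invocation. A minimal sanity check in that spirit is sketched below; it assumes Linux with `getauxval` available, it is not the test added to glibc, and the function name `auxv_page_size_matches` is hypothetical.

```c
#include <sys/auxv.h>
#include <unistd.h>

/* Hypothetical check: the page size read from the auxiliary vector
   should agree with the one reported by sysconf.  If the auxv pointer
   referred to clobbered memory, AT_PAGESZ would likely be missing (0)
   or wrong.  Returns 1 when the values agree, 0 otherwise.  */
static int auxv_page_size_matches(void)
{
  unsigned long from_auxv = getauxval(AT_PAGESZ);
  long from_sysconf = sysconf(_SC_PAGESIZE);
  return from_auxv != 0
         && from_sysconf > 0
         && from_auxv == (unsigned long) from_sysconf;
}
```

To exercise the affected path, a program calling this function would be run both directly and as `./ld.so ./exe`, as in the invocation form quoted in the commit message.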
Florian Weimer	d055481ce3	scripts/glibcelf.py: Add T_RISCV_ constants SHT_RISCV_ATTRIBUTES, PT_RISCV_ATTRIBUTES, DT_RISCV_VARIANT_CC were added in commit `0b6c675073` ("Update RISC-V specific ELF definitions"). This caused the elf/tst-glibcelf consistency check to fail. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-16 22:22:38 +02:00
Adhemerval Zanella	d2db60d8d8	Remove dl-librecon.h header. The Linux version used by i686 and m68k provides three overrides for generic code: 1. DISTINGUISH_LIB_VERSIONS to print additional information when libc5 is used by a dependency. 2. EXTRA_LD_ENVVARS to enable the LD_LIBRARY_VERSION environment variable. 3. EXTRA_UNSECURE_ENVVARS to add two environment variables related to aout support. None are really required: it has been some decades since libc5 or aout support was removed, and Linux has even removed support for aout files. The LD_LIBRARY_VERSION is also dead code, dl_correct_cache_id is not used anywhere. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-05-16 15:03:49 -03:00
Adhemerval Zanella	c628c22963	elf: Remove ldconfig kernel version check It has already been removed from libc.so.	2022-05-16 15:03:49 -03:00
Adhemerval Zanella	b46d250656	Remove kernel version check The kernel version check is used to prevent glibc from running on older kernels where some syscalls are not available and fallback code is not enabled to fail gracefully. However, it does not help if the kernel does not correctly advertise its version through the vDSO note, uname or procfs. Also, kernel version checks are sometimes not desired by users who want to deploy on different systems with different kernel versions, knowing that the minimum set of syscalls is always present on such systems. The kernel version check has been removed along with the LD_ASSUME_KERNEL environment variable. The minimum kernel used to build glibc is still provided through the NT_GNU_ABI_TAG ELF note and is also printed when libc.so is executed. Checked on x86_64-linux-gnu.	2022-05-16 15:03:49 -03:00
Adhemerval Zanella	97a912f7a8	linux: Use /sys/devices/system/cpu on __get_nprocs_conf (BZ#28991) Currently on Linux __get_nprocs_conf first tries to enumerate the CPUs present in the system by iterating over the /sys/devices/system/cpuX directories. This only enumerates the CPUs that are present in the system (but possibly offline), not taking into account CPUs that might be added to the system through hotplugging. Linux provides the maximum number of configured CPUs in the /sys/devices/system/cpu file. Although it might present a larger value of possible active CPUs on some systems (where the kernel either gets the information from firmware or is configured at boot time), the information is what the kernel presents to userland. This also changes the returned value of _SC_NPROCESSORS_CONF, which now aligns with the maximum number of configured CPUs in the system. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-05-16 14:26:49 -03:00
Florian Weimer	f787e138aa	csu: Implement and use _dl_early_allocate during static startup This implements mmap fallback for a brk failure during TLS allocation. scripts/tls-elf-edit.py is updated to support the new patching method. The script no longer requires that the input object is of ET_DYN type. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-16 18:42:03 +02:00
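From the application side, the value affected by the __get_nprocs_conf commit above is the one returned for `_SC_NPROCESSORS_CONF`. The sketch below only queries the two standard sysconf names and makes no assumption about how glibc computes them; the wrapper names are hypothetical.

```c
#include <unistd.h>

/* CPUs the system is configured for (may include offline or
   hot-pluggable CPUs, depending on the glibc version).  */
static long configured_cpus(void)
{
  return sysconf(_SC_NPROCESSORS_CONF);
}

/* CPUs currently online.  */
static long online_cpus(void)
{
  return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Code that sizes per-CPU data structures would typically use the configured count, while code that picks a thread-pool size would typically use the online count.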
Florian Weimer	b57ab258c1	Linux: Introduce __brk_call for invoking the brk system call Alpha and sparc can now use the generic implementation. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-16 18:41:52 +02:00
Jonathan Wakely	21244c70c2	sys/cdefs.h: Do not require C++ compilers to define __STDC__ The check for an ISO C compiler assumes that anything GCC-like will define __STDC__, even if it's actually a C++ compiler. That's currently true for G++ and compilers like clang++ that also define __GNUC__, but it might not always be true. The C++ standard leaves it implementation-defined whether or not __STDC__ is defined by C++ compilers. And really the check should be "ISO C or ISO C++ conforming compiler" anyway. So only give an error if __GNUC__ is defined and neither __STDC__ nor __cplusplus is defined. Reviewed-by: Fangrui Song <maskray@google.com>	2022-05-16 16:48:51 +01:00
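The condition described in the sys/cdefs.h commit above can be written as a short preprocessor sketch. The macro name `ISO_C_OR_CXX` and the error text are hypothetical; only the shape of the condition (error when `__GNUC__` is defined and neither `__STDC__` nor `__cplusplus` is) comes from the commit message.

```c
/* Hypothetical helper macro: 1 when the compiler claims ISO C or C++.  */
#if defined __STDC__ || defined __cplusplus
# define ISO_C_OR_CXX 1
#else
# define ISO_C_OR_CXX 0
#endif

/* Only reject GCC-like compilers that claim neither language.  */
#if defined __GNUC__ && !ISO_C_OR_CXX
# error "an ISO C or ISO C++ conforming compiler is required"
#endif
```

With this shape, a C++ compiler that defines `__GNUC__` but not `__STDC__` still passes the check.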
Siddhesh Poyarekar	61a8753010	fortify: Ensure that __glibc_fortify condition is a constant [BZ #29141 ] The fix `c8ee1c85` introduced a -1 check for object size without also checking that object size is a constant. Because of this, the tree optimizer passes in gcc fail to fold away one of the branches in __glibc_fortify and trips on a spurious Wstringop-overflow. The warning itself is incorrect and the branch does go away eventually in DCE in the rtl passes in gcc, but the constant check is a helpful hint to simplify code early, so add it in. Resolves: BZ #29141 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-16 20:10:08 +05:30
Andreas Schwab	0b6c675073	Update RISC-V specific ELF definitions The definitions are taken from the 1.0-rc2 version of the ELF psABI.	2022-05-16 15:51:57 +02:00
Adhemerval Zanella	9403b71ae9	x86_64: Remove bzero optimization Both symbols are marked as legacy in POSIX.1-2001 and removed in POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE or _DEFAULT_SOURCE. GCC also replaces bcopy with memmove and bzero with memset in its default configuration (to actually get a bzero libc call the code must omit the string.h inclusion and be built with -fno-builtin), so it is highly unlikely that programs are actually calling the libc bzero symbol. On a recent Linux distro (Ubuntu 22.04), there are no bzero calls by the installed binaries. $ cat count_bstring.sh #!/bin/bash files=`IFS=':';for i in $PATH; do test -d "$i" && find "$i" -maxdepth 1 -executable -type f; done` total=0 for file in $files; do symbols=`objdump -R $file 2>&1` if [ $? -eq 0 ]; then ncalls=`echo $symbols | grep -w $1 | wc -l` ((total=total+ncalls)) if [ $ncalls -gt 0 ]; then echo "$file: $ncalls" fi fi done echo "TOTAL=$total" $ ./count_bstring.sh bzero TOTAL=0 Checked on x86_64-linux-gnu.	2022-05-16 09:36:06 -03:00
Maciej W. Rozycki	7b1cfba79e	RISC-V: Use an autoconf template to produce `preconfigure' Avoid fiddling with autoconf internals and use AC_DEFINE_UNQUOTED to define macros in the configuration headers rather than handcoding an equivalent shell sequence with the use of the `as_echo' undocumented variable. Switch to using AC_MSG_ERROR rather than `echo' and `exit' directly for error handling. Without any kind of error annotation the message is difficult to spot in the flood of a parallel build, nor is it logged in `config.log'. Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-13 17:07:23 +01:00
Maciej W. Rozycki	353a1220e3	MIPS: Use an autoconf template to produce `preconfigure' Avoid fiddling with autoconf internals and use AC_DEFINE_UNQUOTED to define macros in the configuration headers rather than handcoding an equivalent shell sequence with the use of the `as_echo' undocumented variable. Similarly use AC_MSG_ERROR for error handling rather than the internal undocumented `as_fn_error' variable. Switch to using 1 as the exit code as it makes no sense to refer to $? in the contexts involved, it's not a command failure handled there.	2022-05-13 17:07:23 +01:00
Maciej W. Rozycki	fe7dd93db3	m68k: Use an autoconf template to produce `preconfigure' Switch to using AC_MSG_ERROR rather than `echo' and `exit' directly for error handling. Without any kind of error annotation the message is difficult to spot in the flood of a parallel build, nor is it logged in `config.log'.	2022-05-13 17:07:23 +01:00
Maciej W. Rozycki	7c20479d08	C-SKY: Use an autoconf template to produce `preconfigure' Avoid fiddling with autoconf internals and use AC_DEFINE_UNQUOTED to define macros in the configuration headers rather than handcoding an equivalent shell sequence with the use of the `as_echo' undocumented variable. Switch to using AC_MSG_ERROR rather than `echo' and `exit' directly for error handling. Without any kind of error annotation the message is difficult to spot in the flood of a parallel build, nor is it logged in `config.log'.	2022-05-13 17:07:23 +01:00
Adhemerval Zanella	f39ff483f3	Remove configure fno_unit_at_a_time Since it is not used any longer. Reviewed-by: Fangrui Song <maskray@google.com>	2022-05-13 10:54:41 -03:00
Adhemerval Zanella	6fad891dfd	stdio: Remove the usage of $(fno-unit-at-a-time) for siglist.c The siglist.c is built with -fno-toplevel-reorder to prevent the compiler from reordering the compat assembly directives due to an assembler issue [1] (fixed on 2.39). This patch removes the compiler flag by splitting the compat symbol generation into two phases. First, the __sys_siglist and __sys_sigabbrev definitions without any compat symbol directive are preprocessed to generate assembly source code. The generated assembly is then used as input to a platform-agnostic siglist.S, which then creates the compat definitions. This prevents the compiler from moving any compat directive before the _sys_errlist definition itself. Checked on a make check run-built-tests=no on all affected ABIs. Reviewed-by: Fangrui Song <maskray@google.com>	2022-05-13 10:54:41 -03:00
Adhemerval Zanella	900fa25736	stdio: Remove the usage of $(fno-unit-at-a-time) for errlist.c The errlist.c is built with -fno-toplevel-reorder to prevent the compiler from reordering the compat assembly directives due to an assembler issue [1] (fixed on 2.39). This patch removes the compiler flag by splitting the compat symbol generation into two phases. First, the _sys_errlist_internal definition without any compat symbol directive is preprocessed to generate assembly source code. The generated assembly is then used as input to a platform-agnostic errlist-data.S, which then creates the compat definitions. This prevents the compiler from moving any compat directive before the _sys_errlist_internal definition itself. Checked on a make check run-built-tests=no on all affected ABIs. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=29012	2022-05-13 10:54:41 -03:00
H.J. Lu	111254f3e1	Add declare_object_symbol_alias for assembly codes (BZ #28128 ) There are 2 problems in: #define declare_symbol_alias(symbol, original, type, size) \ declare_symbol_alias_1 (symbol, original, type, size) #ifdef __ASSEMBLER__ # define declare_symbol_alias_1(symbol, original, type, size) \ strong_alias (original, symbol); \ .type C_SYMBOL_NAME (symbol), %##type; \ .size C_SYMBOL_NAME (symbol), size 1. .type and .size are substituted by arguments. 2. %##type is expanded to "% type" due to the GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101613 But assembler doesn't support "% type". Workaround BZ #28128 by 1. Don't define declare_symbol_alias for assembly codes. 2. Define declare_object_symbol_alias for assembly codes. Reviewed-by: Fangrui Song <maskray@google.com>	2022-05-13 10:54:41 -03:00
Siddhesh Poyarekar	9bcd12d223	wcrtomb: Make behavior POSIX compliant The GNU implementation of wcrtomb assumes that there are at least MB_CUR_MAX bytes available in the destination buffer passed to wcrtomb as the first argument. This is not compatible with the POSIX definition, which only requires enough space for the input wide character. This does not break much in practice because when users supply buffers smaller than MB_CUR_MAX (e.g. in ncurses), they compute and dynamically allocate the buffer, which results in enough spare space (thanks to usable_size in malloc and padding in alloca) that no actual buffer overflow occurs. However when the code is built with _FORTIFY_SOURCE, it runs into the hard check against MB_CUR_MAX in __wcrtomb_chk and hence fails. It wasn't evident until now since dynamic allocations would result in wcrtomb not being fortified but since _FORTIFY_SOURCE=3, that limitation is gone, resulting in such code failing. To fix this problem, introduce an internal buffer that is MB_LEN_MAX long and use that to perform the conversion and then copy the resultant bytes into the destination buffer. Also move the fortification check into the main implementation, which checks the result after conversion and aborts if the resultant byte count is greater than the destination buffer size. One complication is that applications that assume the MB_CUR_MAX limitation to be gone may not be able to run safely on older glibcs if they use static destination buffers smaller than MB_CUR_MAX; dynamic allocations will always have enough spare space that no actual overruns will occur. One alternative to fixing this is to bump symbol version to prevent them from running on older glibcs but that seems too strict a constraint. Instead, since these users will only have made this decision on reading the manual, I have put a note in the manual warning them about the pitfalls of having static buffers smaller than MB_CUR_MAX and running them on older glibc. 
Benchmarking: The wcrtomb microbenchmark shows significant increases in maximum execution time for all locales, ranging from 10x for ar_SA.UTF-8 to 1.5x-2x for nearly everything else. The mean execution time however saw practically no impact, with some results even being quicker, indicating that cache locality has a much bigger role in the overhead. Given that the additional copy uses a temporary buffer inside wcrtomb, it's likely that a hot path will end up putting that buffer (which is responsible for the additional overhead) in a similar place on stack, giving the necessary cache locality to negate the overhead. However in situations where wcrtomb ends up getting called at wildly different spots on the call stack (or is on different call stacks, e.g. with threads or different execution contexts) and is still a hotspot, the performance lag will be visible. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-13 19:15:46 +05:30
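The "convert into an internal MB_LEN_MAX buffer, then copy" approach from the wcrtomb commit above can be sketched as a caller-side wrapper. This is only an illustration under stated assumptions: the name `bounded_wcrtomb` and its explicit `dstlen` parameter are hypothetical, and it reports ERANGE where the fortified glibc implementation aborts instead.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper: convert WC into a temporary buffer that is
   always large enough, then copy only the bytes actually produced.
   Returns the byte count, or (size_t) -1 with errno set.  Note that
   *PS has already been updated when the ERANGE case is reported.  */
static size_t bounded_wcrtomb(char *dst, size_t dstlen, wchar_t wc,
                              mbstate_t *ps)
{
  char tmp[MB_LEN_MAX];
  size_t n = wcrtomb(tmp, wc, ps);
  if (n == (size_t) -1)
    return n;                /* encoding error, errno set by wcrtomb */
  if (n > dstlen)
    {
      errno = ERANGE;        /* result does not fit in the destination */
      return (size_t) -1;
    }
  memcpy(dst, tmp, n);
  return n;
}
```

The point of the design is that the destination only needs room for the bytes of this particular character, not for MB_CUR_MAX bytes.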
Wangyang Guo	8162147872	nptl: Add backoff mechanism to spinlock loop When multiple threads are waiting for a lock at the same time, once the lock owner releases the lock, the waiters will all see the lock as available and try to acquire it, which may cause an expensive CAS storm. Binary exponential backoff with random jitter is introduced. As the number of try-lock attempts increases, it is more likely that a larger number of threads are competing for the adaptive mutex lock, so the wait time is increased exponentially. A random jitter is also added to avoid synchronous try-lock attempts from other threads. v2: Remove read-check before try-lock for performance. v3: 1. Restore read-check since it works well in some platform. 2. Make backoff arch dependent, and enable it for x86_64. 3. Limit max backoff to reduce latency in large critical section. v4: Fix strict-prototypes error in sysdeps/nptl/pthread_mutex_backoff.h v5: Commit log updated for regression in large critical section. Result of pthread-mutex-locks bench Test Platform: Xeon 8280L (2 socket, 112 CPUs in total) First Row: thread number First Col: critical section length Values: backoff vs upstream, time based, low is better non-critical-length: 1 1 2 4 8 16 32 64 112 140 0 0.99 0.58 0.52 0.49 0.43 0.44 0.46 0.52 0.54 1 0.98 0.43 0.56 0.50 0.44 0.45 0.50 0.56 0.57 2 0.99 0.41 0.57 0.51 0.45 0.47 0.48 0.60 0.61 4 0.99 0.45 0.59 0.53 0.48 0.49 0.52 0.64 0.65 8 1.00 0.66 0.71 0.63 0.56 0.59 0.66 0.72 0.71 16 0.97 0.78 0.91 0.73 0.67 0.70 0.79 0.80 0.80 32 0.95 1.17 0.98 0.87 0.82 0.86 0.89 0.90 0.90 64 0.96 0.95 1.01 1.01 0.98 1.00 1.03 0.99 0.99 128 0.99 1.01 1.01 1.17 1.08 1.12 1.02 0.97 1.02 non-critical-length: 32 1 2 4 8 16 32 64 112 140 0 1.03 0.97 0.75 0.65 0.58 0.58 0.56 0.70 0.70 1 0.94 0.95 0.76 0.65 0.58 0.58 0.61 0.71 0.72 2 0.97 0.96 0.77 0.66 0.58 0.59 0.62 0.74 0.74 4 0.99 0.96 0.78 0.66 0.60 0.61 0.66 0.76 0.77 8 0.99 0.99 0.84 0.70 0.64 0.66 0.71 0.80 0.80 16 0.98 0.97 0.95 0.76 0.70 0.73 0.81 0.85 0.84 32 1.04 1.12 1.04 0.89 0.82 0.86 0.93 0.91 0.91 64 0.99 
1.15 1.07 1.00 0.99 1.01 1.05 0.99 0.99 128 1.00 1.21 1.20 1.22 1.25 1.31 1.12 1.10 0.99 non-critical-length: 128 1 2 4 8 16 32 64 112 140 0 1.02 1.00 0.99 0.67 0.61 0.61 0.61 0.74 0.73 1 0.95 0.99 1.00 0.68 0.61 0.60 0.60 0.74 0.74 2 1.00 1.04 1.00 0.68 0.59 0.61 0.65 0.76 0.76 4 1.00 0.96 0.98 0.70 0.63 0.63 0.67 0.78 0.77 8 1.01 1.02 0.89 0.73 0.65 0.67 0.71 0.81 0.80 16 0.99 0.96 0.96 0.79 0.71 0.73 0.80 0.84 0.84 32 0.99 0.95 1.05 0.89 0.84 0.85 0.94 0.92 0.91 64 1.00 0.99 1.16 1.04 1.00 1.02 1.06 0.99 0.99 128 1.00 1.06 0.98 1.14 1.39 1.26 1.08 1.02 0.98 There is regression in large critical section. But adaptive mutex is aimed for "quick" locks. Small critical section is more common when users choose to use adaptive pthread_mutex. Signed-off-by: Wangyang Guo <wangyang.guo@intel.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2022-05-09 14:38:40 -07:00
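The backoff scheme described in the nptl commit above (binary exponential backoff, random jitter, a read-check before the CAS, and a cap on the maximum backoff) can be sketched with C11 atomics. This is a simplified spin lock, not the adaptive mutex code: all names and the `MAX_BACKOFF` value are hypothetical, the jitter is passed in by the caller, and there is no fallback to a blocking wait.

```c
#include <stdatomic.h>

#define MAX_BACKOFF 64u   /* hypothetical cap; must be a power of two */

/* Double the backoff after each failed attempt, up to the cap.  */
static unsigned next_backoff(unsigned backoff)
{
  unsigned doubled = backoff * 2u;
  return doubled > MAX_BACKOFF ? MAX_BACKOFF : doubled;
}

/* BACKOFF is a power of two, so BACKOFF - 1 is a mask and the result
   lies in [BACKOFF, 2 * BACKOFF - 1].  */
static unsigned spins_with_jitter(unsigned backoff, unsigned jitter)
{
  return backoff + (jitter & (backoff - 1u));
}

static void backoff_lock(atomic_int *lock, unsigned jitter)
{
  unsigned backoff = 1u;
  for (;;)
    {
      int expected = 0;
      /* Read-check first so waiters do not all issue a CAS at once.  */
      if (atomic_load_explicit(lock, memory_order_relaxed) == 0
          && atomic_compare_exchange_weak_explicit(lock, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed))
        return;
      /* Busy-wait for a jittered, exponentially growing interval.  */
      unsigned spins = spins_with_jitter(backoff, jitter);
      for (volatile unsigned i = 0; i < spins; ++i)
        ;
      backoff = next_backoff(backoff);
    }
}

static void backoff_unlock(atomic_int *lock)
{
  atomic_store_explicit(lock, 0, memory_order_release);
}
```

Each waiting thread would pass a different `jitter` value (for example, something derived from a per-thread random number) so that retries from different threads do not line up.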
Florian Weimer	a2a6bce7d7	Linux: Implement a useful version of _startup_fatal On i386 and ia64, the TCB is not available at this point. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-09 18:15:16 +02:00
Florian Weimer	18bd9c3d3b	ia64: Always define IA64_USE_NEW_STUB as a flag macro And keep the previous definition if it exists. This allows disabling IA64_USE_NEW_STUB while keeping USE_DL_SYSINFO defined. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-09 18:15:16 +02:00


38904 Commits