glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-12 06:10:10 +00:00

Author	SHA1	Message	Date
Joseph Myers	5146b73d72	Add ARPHRD_CAN, ARPHRD_MCTP to net/if_arp.h Add the constant ARPHRD_MCTP, from Linux 5.15, to net/if_arp.h, along with ARPHRD_CAN which was added to Linux in version 2.6.25 (commit cd05acfe65ed2cf2db683fa9a6adb8d35635263b, "[CAN]: Allocate protocol numbers for PF_CAN") but apparently missed for glibc at the time. Tested for x86_64. (cherry picked from commit `a94d9659cd`)	2022-05-03 11:07:10 +02:00
Joseph Myers	fd5dbfd1cd	Update kernel version to 5.15 in tst-mman-consts.py This patch updates the kernel version in the test tst-mman-consts.py to 5.15. (There are no new MAP_* constants covered by this test in 5.15 that need any other header changes.) Tested with build-many-glibcs.py. (cherry picked from commit `5c3ece451d`)	2022-05-03 11:07:07 +02:00
Joseph Myers	bc6fba3c80	Add PF_MCTP, AF_MCTP from Linux 5.15 to bits/socket.h Linux 5.15 adds a new address / protocol family PF_MCTP / AF_MCTP; add these constants to bits/socket.h. Tested for x86_64. (cherry picked from commit `bdeb7a8fa9`)	2022-05-03 11:07:03 +02:00
DJ Delorie	c66c92181d	posix/glob.c: update from gnulib Copied from gnulib/lib/glob.c in order to fix rhbz 1982608 Also fixes swbz 25659 Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `7c477b57a3`)	2022-04-28 11:57:23 -04:00
Adhemerval Zanella	88a8637cb4	linux: Fix fchmodat with AT_SYMLINK_NOFOLLOW for 64 bit time_t (BZ#29097) The AT_SYMLINK_NOFOLLOW emulation uses the default 32 bit stat internal calls, which fails with EOVERFLOW if the file contains timestamps beyond 2038. Checked on i686-linux-gnu. (cherry picked from commit `118a2aee07`)	2022-04-28 10:10:30 -03:00
Carlos O'Donell	55640ed3fd	i386: Regenerate ulps These failures were caught while building glibc master for Fedora Rawhide which is built with '-mtune=generic -msse2 -mfpmath=sse' using gcc 11.3 (gcc-11.3.1-2.fc35) on a Cascadelake Intel Xeon processor. (cherry picked from commit `e465d97653`)	2022-04-27 21:20:43 -04:00
Adhemerval Zanella	9681691402	linux: Fix missing internal 64 bit time_t stat usage These are two missing spots initially done by `52a5fe70a2`. Checked on i686-linux-gnu. (cherry picked from commit `834ddd0432`)	2022-04-27 14:52:26 -03:00
Noah Goldstein	c796418d00	x86: Optimize L(less_vec) case in memcmp-evex-movbe.S No bug. Optimizations are twofold. 1) Replace page cross and 0/1 checks with masked load instructions in L(less_vec). In applications this reduces branch-misses in the hot [0, 32] case. 2) Change control flow so that L(less_vec) case gets the fall through. Change 2) helps copies in the [0, 32] size range but comes at the cost of copies in the [33, 64] size range. From profiles of GCC and Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this appears to be the right tradeoff. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `abddd61de0`)	2022-04-26 18:18:16 -07:00
H.J. Lu	f3a99b2216	x86: Don't set Prefer_No_AVX512 for processors with AVX512 and AVX-VNNI Don't set Prefer_No_AVX512 on processors with AVX512 and AVX-VNNI since they won't lower CPU frequency when ZMM load and store instructions are used. (cherry picked from commit `ceeffe968c`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	4bbd0f866a	x86-64: Use notl in EVEX strcmp [BZ #28646 ] Must use notl %edi here as lower bits are for CHAR comparisons potentially out of range thus can be 0 without indicating mismatch. This fixes BZ #28646. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `4df1fa6ddc`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	7cb126e7e7	x86: Shrink memcmp-sse4.S code size No bug. This implementation refactors memcmp-sse4.S primarily with minimizing code size in mind. It does this by removing the lookup table logic and removing the unrolled check from (256, 512] bytes. memcmp-sse4 code size reduction : -3487 bytes wmemcmp-sse4 code size reduction: -1472 bytes The current memcmp-sse4.S implementation has a large code size cost. This has serious adverse effects on the ICache / ITLB. While in micro-benchmarks the implementation appears fast, traces of real-world code have shown that the speed in micro benchmarks does not translate when the ICache/ITLB are not primed, and that the cost of the code size has measurable negative effects on overall application performance. See https://research.google/pubs/pub48320/ for more details. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `2f9062d717`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	cecbac5212	x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h No bug. This patch doubles the rep_movsb_threshold when using ERMS. Based on benchmarks the vector copy loop, especially now that it handles 4k aliasing, is better for these medium ranged. On Skylake with ERMS: Size, Align1, Align2, dst>src,(rep movsb) / (vec copy) 4096, 0, 0, 0, 0.975 4096, 0, 0, 1, 0.953 4096, 12, 0, 0, 0.969 4096, 12, 0, 1, 0.872 4096, 44, 0, 0, 0.979 4096, 44, 0, 1, 0.83 4096, 0, 12, 0, 1.006 4096, 0, 12, 1, 0.989 4096, 0, 44, 0, 0.739 4096, 0, 44, 1, 0.942 4096, 12, 12, 0, 1.009 4096, 12, 12, 1, 0.973 4096, 44, 44, 0, 0.791 4096, 44, 44, 1, 0.961 4096, 2048, 0, 0, 0.978 4096, 2048, 0, 1, 0.951 4096, 2060, 0, 0, 0.986 4096, 2060, 0, 1, 0.963 4096, 2048, 12, 0, 0.971 4096, 2048, 12, 1, 0.941 4096, 2060, 12, 0, 0.977 4096, 2060, 12, 1, 0.949 8192, 0, 0, 0, 0.85 8192, 0, 0, 1, 0.845 8192, 13, 0, 0, 0.937 8192, 13, 0, 1, 0.939 8192, 45, 0, 0, 0.932 8192, 45, 0, 1, 0.927 8192, 0, 13, 0, 0.621 8192, 0, 13, 1, 0.62 8192, 0, 45, 0, 0.53 8192, 0, 45, 1, 0.516 8192, 13, 13, 0, 0.664 8192, 13, 13, 1, 0.659 8192, 45, 45, 0, 0.593 8192, 45, 45, 1, 0.575 8192, 2048, 0, 0, 0.854 8192, 2048, 0, 1, 0.834 8192, 2061, 0, 0, 0.863 8192, 2061, 0, 1, 0.857 8192, 2048, 13, 0, 0.63 8192, 2048, 13, 1, 0.629 8192, 2061, 13, 0, 0.627 8192, 2061, 13, 1, 0.62 Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `475b63702e`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	a7392db2ff	x86: Optimize memmove-vec-unaligned-erms.S No bug. The optimizations are as follows: 1) Always align entry to 64 bytes. This makes behavior more predictable and makes other frontend optimizations easier. 2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have significant benefits in the case that: 0 < (dst - src) < [256, 512] 3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%] improvement and for FSRM [-10%, 25%]. In addition to these primary changes there is general cleanup throughout to optimize the aligning routines and control flow logic. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `a6b7502ec0`)	2022-04-26 18:18:16 -07:00
Fangrui Song	2e64237a87	x86-64: Replace movzx with movzbl Clang cannot assemble movzx in the AT&T dialect mode. ../sysdeps/x86_64/strcmp.S:2232:16: error: invalid operand for instruction movzx (%rsi), %ecx ^~~~ Change movzx to movzbl, which follows the AT&T dialect and is used elsewhere in the file. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `6720d36b66`)	2022-04-26 18:18:16 -07:00
H.J. Lu	a182bb7a39	x86-64: Remove Prefer_AVX2_STRCMP Remove Prefer_AVX2_STRCMP to enable EVEX strcmp. When comparing 2 32-byte strings, EVEX strcmp has been improved to require 1 load, 1 VPTESTM, 1 VPCMP, 1 KMOVD and 1 INCL instead of 2 loads, 3 VPCMPs, 2 KORDs, 1 KMOVD and 1 TESTL while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU, 1 VPMOVMSKB and 1 TESTL. EVEX strcmp is now faster than AVX2 strcmp by up to 40% on Tiger Lake and Ice Lake. (cherry picked from commit `14dbbf46a0`)	2022-04-26 18:18:16 -07:00
H.J. Lu	f35ad30da4	x86-64: Improve EVEX strcmp with masked load In strcmp-evex.S, to compare 2 32-byte strings, replace VMOVU (%rdi, %rdx), %YMM0 VMOVU (%rsi, %rdx), %YMM1 /* Each bit in K0 represents a mismatch in YMM0 and YMM1. */ VPCMP $4, %YMM0, %YMM1, %k0 VPCMP $0, %YMMZERO, %YMM0, %k1 VPCMP $0, %YMMZERO, %YMM1, %k2 /* Each bit in K1 represents a NULL in YMM0 or YMM1. */ kord %k1, %k2, %k1 /* Each bit in K1 represents a NULL or a mismatch. */ kord %k0, %k1, %k1 kmovd %k1, %ecx testl %ecx, %ecx jne L(last_vector) with VMOVU (%rdi, %rdx), %YMM0 VPTESTM %YMM0, %YMM0, %k2 /* Each bit cleared in K1 represents a mismatch or a null CHAR in YMM0 and 32 bytes at (%rsi, %rdx). */ VPCMP $0, (%rsi, %rdx), %YMM0, %k1{%k2} kmovd %k1, %ecx incl %ecx jne L(last_vector) It makes EVEX strcmp faster than AVX2 strcmp by up to 40% on Tiger Lake and Ice Lake. Co-Authored-By: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `c46e9afb2d`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	baf3ece634	x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'. It could potentially be dangerous to use SSE2 if this function is ever called without using 'vzeroupper' beforehand. While compilers appear to use 'vzeroupper' before function calls if AVX2 has been used, using SSE2 here is more brittle. Since it is not absolutely necessary it should be avoided. It costs 2 extra bytes but the extra bytes should only eat into alignment padding. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `bad852b61b`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	6d18a93dbb	x86: Optimize memset-vec-unaligned-erms.S No bug. Optimizations are 1. Change control flow for L(more_2x_vec) to fall through to loop and jump for L(less_4x_vec) and L(less_8x_vec). This uses less code size and saves jumps for length > 4x VEC_SIZE. 2. For EVEX/AVX512 move L(less_vec) closer to entry. 3. Avoid complex address mode for length > 2x VEC_SIZE 4. Slightly better aligning code for the loop from the perspective of code size and uops. 5. Align targets so they make full use of their fetch block and if possible cache line. 6. Try and reduce total number of icache lines that will need to be pulled in for a given length. 7. Include "local" version of stosb target. For AVX2/EVEX/AVX512 jumping to the stosb target in the sse2 code section will almost certainly be to a new page. The new version does increase code size marginally by duplicating the target but should get better iTLB behavior as a result. test-memset, test-wmemset, and test-bzero are all passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `e59ced2384`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	5ec3416853	x86: Optimize memcmp-evex-movbe.S for frontend behavior and size No bug. The frontend optimizations are to: 1. Reorganize logically connected basic blocks so they are either in the same cache line or adjacent cache lines. 2. Avoid cases when basic blocks unnecessarily cross cache lines. 3. Try and 32 byte align any basic blocks possible without sacrificing code size. Smaller / Less hot basic blocks are used for this. Overall code size shrunk by 168 bytes. This should make up for any extra costs due to aligning to 64 bytes. In general performance before deviated a great deal depending on whether entry alignment % 64 was 0, 16, 32, or 48. These changes essentially make it so that the current implementation is at least equal to the best alignment of the original for any arguments. The only additional optimization is in the page cross case. Branch on equals case was removed from the size == [4, 7] case. As well the [4, 7] and [2, 3] cases were swapped as [4, 7] is likely a more hot argument size. test-memcmp and test-wmemcmp are both passing. (cherry picked from commit `1bd8b8d58f`)	2022-04-26 18:18:16 -07:00
Noah Goldstein	b5a44a6a47	x86: Modify ENTRY in sysdep.h so that p2align can be specified No bug. This change adds a new macro ENTRY_P2ALIGN which takes a second argument, log2 of the desired function alignment. The old ENTRY(name) macro is just ENTRY_P2ALIGN(name, 4) so this doesn't affect any existing functionality. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `fc5bd179ef`)	2022-04-26 18:18:16 -07:00
H.J. Lu	16245986fb	x86-64: Optimize load of all bits set into ZMM register [BZ #28252 ] Optimize loads of all bits set into ZMM register in AVX512 SVML codes by replacing vpbroadcastq .L_2il0floatpacket.16(%rip), %zmmX and vmovups .L_2il0floatpacket.13(%rip), %zmmX with vpternlogd $0xff, %zmmX, %zmmX, %zmmX This fixes BZ #28252. (cherry picked from commit `78c9ec9000`)	2022-04-26 18:18:15 -07:00
Florian Weimer	83cc145830	scripts/glibcelf.py: Mark as UNSUPPORTED on Python 3.5 and earlier enum.IntFlag and enum.EnumMeta._missing_ support are not part of earlier Python versions. (cherry picked from commit `b571f3adff`)	2022-04-26 15:59:12 +02:00
Florian Weimer	bc56ab1f4a	dlfcn: Do not use rtld_active () to determine ld.so state (bug 29078) When audit modules are loaded, ld.so initialization is not yet complete, and rtld_active () returns false even though ld.so is mostly working. Instead, the static dlopen hook is used, but that does not work at all because this is not a static dlopen situation. Commit `466c1ea15f` ("dlfcn: Rework static dlopen hooks") moved the hook pointer into _rtld_global_ro, which means that separate protection is not needed anymore and the hook pointer can be checked directly. The guard for disabling libio vtable hardening in _IO_vtable_check should stay for now. Fixes commit `8e1472d2c1` ("ld.so: Examine GLRO to detect inactive loader [BZ #20204]"). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `8dcb6d0af0`)	2022-04-26 15:28:39 +02:00
Florian Weimer	0d477e92c4	INSTALL: Rephrase -with-default-link documentation Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `c935789bdf`)	2022-04-26 15:27:43 +02:00
Joan Bruguera	ca0faa140f	misc: Fix rare fortify crash on wchar funcs. [BZ 29030] If `__glibc_objsize (__o) == (size_t) -1` (i.e. `__o` is unknown size), fortify checks should pass, and `__whatever_alias` should be called. Previously, `__glibc_objsize (__o) == (size_t) -1` was explicitly checked, but on commit `a643f60c53`, this was moved into `__glibc_safe_or_unknown_len`. A comment says the -1 case should work as: "The -1 check is redundant because since it implies that __glibc_safe_len_cond is true.". But this fails when: * `__s > 1` * `__osz == -1` (i.e. unknown size at compile time) * `__l` is big enough * `__l * __s <= __osz` can be folded to a constant (I only found this to be true for `mbsrtowcs` and other functions in wchar2.h) In this case `__l * __s <= __osz` is false, and `__whatever_chk_warn` will be called by `__glibc_fortify` or `__glibc_fortify_n` and crash the program. This commit adds the explicit `__osz == -1` check again. moc crashes on startup due to this, see: https://bugs.archlinux.org/task/74041 Minimal test case (test.c): #include <wchar.h> int main (void) { const char *hw = "HelloWorld"; mbsrtowcs (NULL, &hw, (size_t)-1, NULL); return 0; } Build with: gcc -O2 -Wp,-D_FORTIFY_SOURCE=2 test.c -o test && ./test Output: *** buffer overflow detected ***: terminated Fixes: BZ #29030 Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> (cherry picked from commit `33e03f9cd2`)	2022-04-25 18:44:27 +05:30
Florian Weimer	f0c71b34f9	Default to --with-default-link=no (bug 25812) This is necessary to place the libio vtables into the RELRO segment. New tests elf/tst-relro-ldso and elf/tst-relro-libc are added to verify that this is what actually happens. The new tests fail on ia64 due to lack of (default) RELRO support in binutils, so they are XFAILed there. (cherry picked from commit `198abcbb94`)	2022-04-22 11:31:14 +02:00
Florian Weimer	3e0a91b79b	scripts: Add glibcelf.py module Hopefully, this will lead to tests that are easier to maintain. The current approach of parsing readelf -W output using regular expressions is not necessarily easier than parsing the ELF data directly. This module is still somewhat incomplete (e.g., coverage of relocation types and versioning information is missing), but it is sufficient to perform basic symbol analysis or program header analysis. The EM_* mapping for architecture-specific constant classes (e.g., SttX86_64) is not yet implemented. The classes are defined for the benefit of elf/tst-glibcelf.py. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> (cherry picked from commit `30035d6772`)	2022-04-22 11:28:57 +02:00
Adhemerval Zanella	71326f1f2f	nptl: Fix pthread_cancel cancelhandling atomic operations The `404656009b` reversion did not set up the atomic loop to set the cancel bits correctly. The fix is essentially what pthread_cancel did prior to `26cfbb7162`. Checked on x86_64-linux-gnu and aarch64-linux-gnu. (cherry picked from commit `62be968167`)	2022-04-20 12:22:34 -03:00
=Joshua Kinard	b87b697f15	mips: Fix mips64n32 64 bit time_t stat support (BZ#29069) Add missing support initially added by `4e8521333b` (which missed n32 stat). (cherry picked from commit `78fb888273`)	2022-04-18 13:16:21 -03:00
Samuel Thibault	5d8c777634	hurd: Fix arbitrary error code ELIBBAD is Linux-specific. (cherry picked from commit `67ab66541d`)	2022-04-18 17:54:13 +02:00
Adhemerval Zanella	290db09546	nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029) Some Linux interfaces never restart after being interrupted by a signal handler, regardless of the use of SA_RESTART [1]. It means that for pthread cancellation, if the target thread disables cancellation with pthread_setcancelstate and calls such interfaces (like poll or select), it should not see spurious EINTR failures due to the internal SIGCANCEL. However recent changes made pthread_cancel always send the internal signal, regardless of the target thread cancellation status or type. To fix it, the previous semantic is restored, where the cancel signal is only sent if the target thread has cancellation enabled in asynchronous mode. The cancel state and cancel type are moved back to cancelhandling and atomic operations are used to synchronize between threads. The patch essentially reverts the following commits: `8c1c0aae20` nptl: Move cancel type out of cancelhandling `2b51742531` nptl: Move cancel state out of cancelhandling `26cfbb7162` nptl: Remove CANCELING_BITMASK However I changed the atomic operation to follow the internal C11 semantic and removed the MACRO usage, it simplifies a bit the resulting code (and removes another usage of the old atomic macros). Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and powerpc64-linux-gnu. [1] https://man7.org/linux/man-pages/man7/signal.7.html Reviewed-by: Florian Weimer <fweimer@redhat.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> (cherry-picked from commit `404656009b`)	2022-04-15 09:52:54 -03:00
Stefan Liebler	0c03cb54c8	S390: Add new s390 platform z16. The new IBM z16 is added to platform string array. The macro _DL_PLATFORMS_COUNT is incremented. _dl_hwcaps_subdir is extended by "z16" if HWCAP_S390_VXRS_PDE2 is set. HWCAP_S390_NNPA is not tested in _dl_hwcaps_subdirs_active as those instructions may be replaced or removed in future. tst-glibc-hwcaps.c is extended in order to test z16 via new marker5. A fatal glibc error is dumped if glibc was build with architecture level set for z16, but run on an older machine. (See dl-hwcap-check.h) (cherry picked from commit `2376944b9e`)	2022-04-14 14:21:57 +02:00
Carlos O'Donell	ceed89d089	NEWS: Update fixed bug list for LD_AUDIT backports.	2022-04-12 13:49:31 -04:00
Adhemerval Zanella	4dca2d3a7b	hppa: Fix bind-now audit (BZ #28857 ) On hppa, a function pointer returned by la_symbind is actually a function descriptor that has the plabel bit set (bit 30). This must be cleared to get the actual address of the descriptor. If the descriptor has been bound, the first word of the descriptor is the physical address of the function, otherwise, the first word of the descriptor points to a trampoline in the PLT. This patch also adds a workaround on tests because on hppa (and it seems to be the only ABI where I have seen it), some shared library adds a dynamic PLT relocation to an empty symbol name: $ readelf -r elf/tst-audit25mod1.so [...] Relocation section '.rela.plt' at offset 0x464 contains 6 entries: Offset Info Type Sym.Value Sym. Name + Addend 00002008 00000081 R_PARISC_IPLT 508 [...] It breaks some assumptions on the test, where a symbol with an empty name ("") is passed on la_symbind. Checked on x86_64-linux-gnu and hppa-linux-gnu. (cherry picked from commit `9e94f57484`)	2022-04-12 13:33:17 -04:00
H.J. Lu	aabdad371f	elf: Replace tst-audit24bmod2.so with tst-audit24bmod2 Replace tst-audit24bmod2.so with tst-audit24bmod2 to silence: make[2]: Entering directory '/export/gnu/import/git/gitlab/x86-glibc/elf' Makefile:2201: warning: overriding recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so' ../Makerules:765: warning: ignoring old recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so' (cherry picked from commit `fa7ad1df19`)	2022-04-12 13:33:17 -04:00
Szabolcs Nagy	165e7ad459	Fix elf/tst-audit25a with default bind now toolchains This test relies on lazy binding for the executable so request that explicitly in case the toolchain defaults to bind now. (cherry picked from commit `80a08d0faa`)	2022-04-12 13:33:17 -04:00
Ben Woodard	b118bce87a	elf: Fix runtime linker auditing on aarch64 (BZ #26643 ) The rtld audit support shows two problems on aarch64: 1. _dl_runtime_resolve does not preserve x8, the indirect result location register, which might generate wrong result calls depending on the function signature. 2. The NEON Q registers pushed onto the stack by _dl_runtime_resolve were twice the size of D registers extracted from the stack frame by _dl_runtime_profile. While 2. might result in wrong information passed on the PLT tracing, 1. generates wrong runtime behaviour. The aarch64 rtld audit support is changed to: * Both La_aarch64_regs and La_aarch64_retval are expanded to include both x8 and the full sized NEON V registers, as defined by the ABI. * dl_runtime_profile needed to extract registers saved by _dl_runtime_resolve and put them into the new correctly sized La_aarch64_regs structure. * The LAV_CURRENT check is changed to only accept new audit modules to avoid the undefined behavior of not saving/restoring x8. * Different than other architectures, audit modules older than LAV_CURRENT are rejected (both La_aarch64_regs and La_aarch64_retval changed their layout and there are no requirements to support multiple audit interfaces with the inherent aarch64 issues). * A new field is also reserved on both La_aarch64_regs and La_aarch64_retval to support variant pcs symbols. Similar to x86, a new La_aarch64_vector type to represent the NEON register is added on the La_aarch64_regs (so each type can be accessed directly). Since LAV_CURRENT was already bumped to support bind-now, there is no need to increase it again. Checked on aarch64-linux-gnu. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `ce9a68c57c`) Resolved conflicts: NEWS elf/rtld.c	2022-04-12 13:33:10 -04:00
Adhemerval Zanella	056fc1c0e3	elf: Issue la_symbind for bind-now (BZ #23734 ) The audit symbind callback is not called for binaries built with -Wl,-z,now or when LD_BIND_NOW=1 is used, nor the PLT tracking callbacks (plt_enter and plt_exit) since this would change the expected program semantics (where no PLT is expected) and would have performance implications (such as for BZ#15533). LAV_CURRENT is also bumped to indicate the audit ABI change (where la_symbind flags are set by the loader to indicate no possible PLT trace). To handle powerpc64 ELFv1 function descriptor, _dl_audit_symbind requires to know whether bind-now is used so the symbol value is updated to function text segment instead of the OPD (for lazy binding this is done by PPC64_LOAD_FUNCPTR on _dl_runtime_resolve). Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, powerpc64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `32612615c5`) Resolved conflicts: NEWS - Manual merge.	2022-04-12 13:32:59 -04:00
Adhemerval Zanella	efb21b5fb2	elf: Fix initial-exec TLS access on audit modules (BZ #28096 ) For audit modules and dependencies with initial-exec TLS, we can not set the initial TLS image on default loader initialization because it would already be set by the audit setup. However, subsequent thread creation would need to follow the default behaviour. This patch fixes it by setting l_auditing link_map field not only for the audit modules, but also for all its dependencies. This is used on _dl_allocate_tls_init to avoid the static TLS initialization at load time. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `254d3d5aef`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	98047ba95c	elf: Add la_activity during application exit la_activity is not called during application exit, even though la_objclose is. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `5fa11a2bc9`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	2255621f0e	elf: Do not fail for failed dlmopen on audit modules (BZ #28061 ) The dl_main sets the LM_ID_BASE to RT_ADD just before starting to load new shared objects. The state is set to RT_CONSISTENT just after all objects are loaded. However if an audit module tries to dlmopen a nonexistent module, the _dl_open will assert that the namespace is in an inconsistent state. This is different than dlopen, since first it will not use LM_ID_BASE and second _dl_map_object_from_fd is solely responsible for setting and resetting the r_state value. So the assert on _dl_open can not really be seen if the state is consistent, since _dt_main resets it. This patch removes the assert. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `484e672dda`) Resolved conflicts: elf/Makefile elf/dl-open.c	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	d1b9bee29a	elf: Issue audit la_objopen for vDSO The vDSO is listed in the link_map chain, but is never the subject of an la_objopen call. A new internal flag __RTLD_VDSO is added that acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate' extra space for the 'struct link_map'. The return value from the callback is currently ignored, since there is no PLT call involved by glibc when using the vDSO, nor are the vDSO symbols exported directly. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `f0e23d34a7`) Resolved conflicts: elf/Makefile	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	02c6a3d353	elf: Add audit tests for modules with TLSDESC Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `d1b38173c9`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	29496b3103	elf: Avoid unnecessary slowdown from profiling with audit (BZ#15533) The rtld-audit interfaces introduce a slowdown due to enabling profiling instrumentation (as if LD_AUDIT implied LD_PROFILE). However, instrumenting is only necessary if one of the audit libraries provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise, the slowdown can be avoided. The following patch adjusts the logic that enables profiling to iterate over all audit modules and check if any of those provides a PLT hook. To keep la_symbind working even without PLT callbacks, _dl_fixup now calls the audit callback if the module implements it. Co-authored-by: Alexander Monakov <amonakov@ispras.ru> Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `063f9ba220`) Resolved conflicts: NEWS elf/Makefile	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	a8e211daea	elf: Add _dl_audit_pltexit It consolidates the code required to call la_pltexit audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `8c0664e2b8`) Resolved conflicts: sysdeps/hppa/dl-runtime.c	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	fd9c4e8a1b	elf: Add _dl_audit_pltenter It consolidates the code required to call la_pltenter audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `eff687e846`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	31473c273b	elf: Add _dl_audit_preinit It consolidates the code required to call la_preinit audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `0b98a87487`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	b2d99731b6	elf: Add _dl_audit_symbind_alt and _dl_audit_symbind It consolidates the code required to call la_symbind{32,64} audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `cda4f265c6`)	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	198660741b	elf: Add _dl_audit_objclose It consolidates the code required to call la_objclose audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `311c9ee54e`)	2022-04-08 14:18:11 -04:00
Adhemerval Zanella	ec0fc2a153	elf: Add _dl_audit_objsearch It consolidates the code required to call la_objsearch audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `c91008d349`)	2022-04-08 14:18:11 -04:00
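
Commit ca0faa140f above describes a fortify length check that wrongly fails when the destination size is unknown. The sketch below is a small Python model of that reasoning, not glibc's actual macros: `safe_len` and `safe_or_unknown_len` are illustrative names, `SIZE_MAX` assumes a 64-bit `size_t`, the 4-byte element size stands in for `wchar_t`, and the comparison is written in division form so the model does not need to emulate multiplication overflow.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t, so (size_t) -1 == SIZE_MAX


def safe_len(length, elem_size, obj_size):
    """Model of the length condition: do `length` elements fit in `obj_size` bytes?"""
    return length <= obj_size // elem_size


def safe_or_unknown_len(length, elem_size, obj_size):
    """Model of the restored check: an unknown object size always passes."""
    return obj_size == SIZE_MAX or safe_len(length, elem_size, obj_size)


# Mirrors the mbsrtowcs test case from the commit message:
# length (size_t) -1, multi-byte elements, destination of unknown size.
length, elem_size, obj_size = SIZE_MAX, 4, SIZE_MAX
print(safe_len(length, elem_size, obj_size))             # length condition alone
print(safe_or_unknown_len(length, elem_size, obj_size))  # with the explicit -1 check
```

In this model the length condition alone holds for an unknown size only when the element size is 1, which is consistent with the commit's observation that the problem showed up in the wchar functions.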

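Commit 3e0a91b79b above argues that parsing ELF data directly need not be harder than matching `readelf -W` output with regular expressions. As a rough illustration of that idea only, the following sketch reads a few header fields with the standard `struct` module. It does not use or reproduce the glibcelf.py API; `read_elf_ident` is a hypothetical helper, and the field offsets follow the generic ELF header layout.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_elf_ident(data):
    """Return (class_bits, byte_order, e_machine) from the start of an ELF image.

    Hypothetical helper for illustration; not part of glibcelf.py.
    """
    if len(data) < 20 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    try:
        class_bits = {1: 32, 2: 64}[data[4]]             # EI_CLASS
        byte_order = {1: "little", 2: "big"}[data[5]]    # EI_DATA
    except KeyError:
        raise ValueError("unknown ELF class or data encoding")
    fmt = ("<" if byte_order == "little" else ">") + "HH"
    # e_type and e_machine are 16-bit fields at offsets 16 and 18.
    _e_type, e_machine = struct.unpack_from(fmt, data, 16)
    return class_bits, byte_order, e_machine
```

A caller could pass the first bytes of a file opened in binary mode, for example `read_elf_ident(open(path, "rb").read(20))`, where `path` is whatever object is being inspected.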

37932 Commits