Optimize loads of all bits set into ZMM register in AVX512 SVML codes
by replacing
vpbroadcastq .L_2il0floatpacket.16(%rip), %zmmX
and
vmovups .L_2il0floatpacket.13(%rip), %zmmX
with
vpternlogd $0xff, %zmmX, %zmmX, %zmmX
This fixes BZ #28252.
The Autoconf documentation for the AC_CACHE_CHECK macro states:
The commands-to-set-it must have no side effects except for setting
the variable cache-id, see below.
However, the tests for support of -msahf and -mmovbe were embedded in
the commands-to-set-it for lib_cv_include_x86_isa_level. This had the
consequence that libc_cv_have_x86_lahf_sahf and libc_cv_have_x86_movbe
were not defined whenever lib_cv_include_x86_isa_level was read from
cache. These variables' being undefined meant that their unquoted use
in later test expressions led to the 'test' built-in's misparsing its
arguments and emitting errors like "test: =: unexpected operator" or
"test: =: unary operator expected", depending on the particular shell.
This commit refactors the tests for LAHF/SAHF and MOVBE instruction
support into their own AC_CACHE_CHECK macro invocations to obey the
rule that the commands-to-set-it must have no side effects other than
setting the variable named by cache-id.
Signed-off-by: Matt Whitlock <sourceware@mattwhitlock.name>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
address of _DYNAMIC. &__ehdr_start is a better way to get the load address.
This is similar to commits b37b75d269
(x86-64) and 43d06ed218 (aarch64).
Reviewed-by: Joseph Myers <joseph@codesourcery.com>
&__ehdr_start is a better way to get the load address.
This is similar to commits b37b75d269
(x86-64) and 43d06ed218 (aarch64).
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing. Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
address of _DYNAMIC. &__ehdr_start is a better way to get the load address.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.
Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definition are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.
When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.
As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_IE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.
Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Gnumach's 0650a4ee30e3 implements support for high bits being set in the
mask parameter of vm_map. This allows to remove the fmh kludge that was
masking away the address range by mapping a dumb area there.
In "mips: align stack in clone [BZ #28223]"
(commit 1f51cd9a86) I made a mistake: I
misbelieved one "word" was 2-byte and "doubleword" should be 4-byte.
But in MIPS ABI one "word" is defined 32-bit (4-byte), so "doubleword" is
8-byte [1], and "quadword" is 16-byte [2].
[1]: "System V Application Binary Interface: MIPS(R) RISC Processor
Supplement, 3rd edition", page 3-31
[2]: "MIPSpro(TM) 64-Bit Porting and Transition Guide", page 23
The MIPS O32 ABI requires 4 byte aligned stack, and the MIPS N64 and N32
ABI require 8 byte aligned stack. Previously if the caller passed an
unaligned stack to clone the the child misbehaved.
Fixes bug 28223.
When the memory object is read-only, the kernel would be right in
refusing max vmprot containing VM_PROT_WRITE.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
The AArch64 ABI is largely platform agnostic and does not specify
_GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so turns out to be probably the
only user of _GLOBAL_OFFSET_TABLE_[0] and GNU ld defines the value
to the link-time address _DYNAMIC. [2]
In 2012, __ehdr_start was implemented in GNU ld and gold in binutils
2.23. Using adrp+add / (-mcmodel=tiny) adr to access
__ehdr_start/_DYNAMIC gives us a robust way to get the load address and
the link-time address of _DYNAMIC.
[1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2
[2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the
link-time address _DYNAMIC.
LLD is widely used on aarch64 Android and ChromeOS devices. Software
just works without the need for _GLOBAL_OFFSET_TABLE_[0].
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Simplify handling of remaining bytes. Avoid lots of taken branches and complex
whilelo computations, instead unconditionally write vectors from the end.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Improve performance of large memsets. Simplify alignment code. For zero memset
use DC ZVA, which almost doubles performance. For non-zero memsets use the
unroll8 loop which is about 10% faster.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Improve performance of small memsets by reducing instruction counts and
improving code alignment. Bench-memset shows 35-45% performance gain for
small sizes.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Linux 5.13 adds a PTRACE_GET_RSEQ_CONFIGURATION constant, with an
associated ptrace_rseq_configuration structure.
Add this constant to the various sys/ptrace.h headers in glibc, with
the structure in bits/ptrace-shared.h (named struct
__ptrace_rseq_configuration in glibc, as with other such structures).
Tested for x86_64, and with build-many-glibcs.py.
Helper thread frees copied attribute on NOTIFY_REMOVED message
received from the OS kernel. Unfortunately, it fails to check whether
copied attribute actually exists (data.attr != NULL). This worked
earlier because free() checks passed pointer before actually
attempting to release corresponding memory. But
__pthread_attr_destroy assumes pointer is not NULL.
So passing NULL pointer to __pthread_attr_destroy will result in
segmentation fault. This scenario is possible if
notification->sigev_notify_attributes == NULL (which means default
thread attributes should be used).
Signed-off-by: Nikita Popov <npv1310@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
We'd like to support processors without Altivec or VSX, so check
the relevant hwcap bits before selecting them.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
A number of optimised memset routines assume the cacheline size is 128B,
so we better check before using them.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
We use PPC_FEATURE_HAS_VSX to select a number of POWER7 optimised
functions. These functions don't use any VSX instructions, so
PPC_FEATURE_ARCH_2_06 seems like a better fit.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
__REDIRECT and __THROW are not compatible with C++ due to the ordering of the
__asm__ alias and the throw specifier. __REDIRECT_NTH has to be used
instead.
Fixes commit 8a40aff86b ("io: Add time64 alias
for fcntl"), commit 82c395d91e ("misc: Add
time64 alias for ioctl"), commit b39ffab860
("Linux: Add time64 alias for prctl").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It turned that the generic implementation of brk() does not work
for sparc, since on failure kernel will just return the previous
input value without setting the conditional register.
This patches adds back a sparc32 and sparc64 implementation removed
by 720480934a.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
labellist and precedencelist could get freed a second time if there
are allocation failures, so set them to NULL to avoid a double-free.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
commit 3ec5d83d2a
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sat Jan 25 14:19:40 2020 -0800
x86-64: Avoid rep movsb with short distance [BZ #27130]
introduced some regressions on Intel processors without Fast Short REP
MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with
short distance only on Intel processors with FSRM. bench-memmove-large
on Skylake server shows that cycles of __memmove_evex_unaligned_erms
improves for the following data size:
before after Improvement
length=4127, align1=3, align2=0: 479.38 349.25 27%
length=4223, align1=9, align2=5: 405.62 333.25 18%
length=8223, align1=3, align2=0: 786.12 496.38 37%
length=8319, align1=9, align2=5: 727.50 501.38 31%
length=16415, align1=3, align2=0: 1436.88 840.00 41%
length=16511, align1=9, align2=5: 1375.50 836.38 39%
length=32799, align1=3, align2=0: 2890.00 1860.12 36%
length=32895, align1=9, align2=5: 2891.38 1931.88 33%
1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes
<bits/platform/x86.h>.
2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the
processor has the feature.
3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the
feature is active. There may be other preconditions, like sufficient
stack space or further setup for AMX, which must be satisfied before the
feature can be used.
This fixes BZ #27958.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Remove unused code and declare __libc_mallopt when !IS_IN (libc) to
allow the debug hook to build with --disable-tunables.
Also, run tst-ifunc-isa-2* tests only when tunables are enabled since
the result depends on it.
Tested on x86_64.
Reported-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
84f7ce8447 ("posix: Add glob64 with 64-bit time_t support") replaced
GLOB_NO_LSTAT with defining GLOB_LSTAT and GLOB_LSTAT64, but the posix
and gnu versions of the change were missing in the commit.
These deprecated functions are only safe to call from
__malloc_initialize_hook and as a result, are not useful in the
general case. Move the implementations to libc_malloc_debug so that
existing binaries that need it will now have to preload the debug DSO
to work correctly.
This also allows simplification of the core malloc implementation by
dropping all the undumping support code that was added to make
malloc_set_state work.
One known breakage is that of ancient emacs binaries that depend on
this. They will now crash when running with this libc. With
LD_BIND_NOW=1, it will terminate immediately because of not being able
to find malloc_set_state but with lazy binding it will crash in
unpredictable ways. It will need a preloaded libc_malloc_debug.so so
that its initialization hook is executed to allow its malloc
implementation to work properly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The malloc-check debugging feature is tightly integrated into glibc
malloc, so thanks to an idea from Florian Weimer, much of the malloc
implementation has been moved into libc_malloc_debug.so to support
malloc-check. Due to this, glibc malloc and malloc-check can no
longer work together; they use altogether different (but identical)
structures for heap management. This should not make a difference
though since the malloc check hook is not disabled anywhere.
malloc_set_state does, but it does so early enough that it shouldn't
cause any problems.
The malloc check tunable is now in the debug DSO and has no effect
when the DSO is not preloaded.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Wean mtrace away from the malloc hooks and move them into the debug
DSO. Split the API away from the implementation so that we can add
the API to libc.so as well as libc_malloc_debug.so, with the libc
implementations being empty.
Update localplt data since memalign no longer has any callers after
this change.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Split the mcheck implementation into the debugging hooks and API so
that the API can be replicated in libc and libc_malloc_debug.so. The
libc APIs always result in failure.
The mcheck implementation has also been moved entirely into
libc_malloc_debug.so and with it, all of the hook initialization code
can now be moved into the debug library. Now the initialization can
be done independently of libc internals.
With this patch, libc_malloc_debug.so can no longer be used with older
libcs, which is not its goal anyway. tst-vfork3 breaks due to this
since it spawns shell scripts, which in turn execute using the system
glibc. Move the test to tests-container so that only the built glibc
is used.
This move also fixes bugs in the mcheck version of memalign and
realloc, thus allowing removal of the tests from tests-mcheck
exclusion list.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so. With this, the hooks now no
longer have any effect on the core library.
libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.
Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.
The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.
Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Any FPU_STATUS write needs setting the FWE bit (31) whcih just provides
a "control signal" to enable explicit write (vs. the side-effect of FPU
instructions). However this bit is RAZ and write-only, thus effectively
never stored in FPU_STATUS register. Thus when reading the register
there is no need to clear it. This shaves off a BCLR instruction from
the fe*exceptino family of functions and while no big deal still makes
sense to do.
This came up when debugging a race in math/test-fenv-tls [1]
[1]: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The SSBD feature is implemented in 2 different ways on AMD processors:
newer systems (Zen3) provides AMD_SSBD (function 8000_0008, EBX[24]),
while older system provides AMD_VIRT_SSBD (function 8000_0008, EBX[25]).
However for AMD_VIRT_SSBD, kernel shows both 'ssdb' and 'virt_ssdb' on
/proc/cpuinfo; while for AMD_SSBD only 'ssdb' is provided.
This now check is AMD_SSBD is set to check for 'ssbd', otherwise check
if AMD_VIRT_SSDB is set to check for 'virt_ssbd'.
Checked on x86_64-linux-gnu on a Ryzen 9 5900x.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This switches to public symbols without __ prefixes, due to improved
namespace management in glibc.
The script was used with --no-new-version to move the symbols
__res_nquery, __res_nquerydomain, __res_nsearch, __res_query,
__res_querydomain, __res_search, res_query, res_querydomain,
res_search. The public symbols res_nquery, res_nquerydomain,
res_nsearch, res_ownok, res_query, res_querydomain, res_search
were added with make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This switches to public symbols without __ prefixes, due to improved
namespace management in glibc.
The symbols res_mkquery, __res_mkquery, __res_nmkquery were
moved with the script (using --no-new-version).
res_mkquery@@GLIBC_2.34, res_nmkquery@@GLIBC_2.34 were added using
make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Switch to public symbols without __ prefix (due to improved
namespace management).
__res_send, __res_nsend were moved using the script (with
--no-new-version). res_send@@GLIBC_2.34 and res_nsend@@GLIBC_2.34
were added using make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This reflects what the remaining functions in the file do.
The __res_dnok, __res_hnok, __res_mailok, __res_ownok were moved
with the script, using --no-new-version, and turned into compat
symbols. __libc_res_dnok@@GLIBC_PRIVATE and
__libc_res_hnok@@GLIBC_PRIVATE are added for internal use, to avoid
accidentally binding to compatibility symbols. The new public
symbols res_dnok, res_hnok, res_mailok, res_ownok were added using
make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat it to GNU style.
dn_skipname is used outside glibc, so do not deprecate it,
and export it as dn_skipname (not __dn_skipname). Due to internal
users, provide a __libc_dn_skipname alias, and keep __dn_skipname
as a pure compatibility symbol.
__dn_skipname@GLIBC_2.0 was moved using the script, and
dn_skipname@@GLIBC_2.34 was added using make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat it to GNU style.
dn_comp is used in various programs, so keep it as a non-deprecated
symbol. Switch to dn_comp (not __dn_comp) for the ABI name. There
are no internal users, so interposition is not a problem.
The __dn_comp symbol was moved with scripts/move-symbol-to-libc.py
--no-new-version. dn_comp@@GLIBC_2.34 was added with
make update-all-abi.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style.
This switches back to the dn_expand name for the ABI symbol and turns
__dn_expand into a compatibility symbol. With the improved namespace
management in current glibc, it is no longer necessary to use a
private namespace symbol. To avoid old code binding to a
GLIBC_PRIVATE symbol by accident, use __libc_dn_expand for the
internal symbol name.
The symbols dn_expand, __dnexpand were moved using
scripts/move-symbol-to-libc.py, followed by an adjustment to make
dn_expand the only GLIBC_2.34 symbol.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style, and eliminate the labellen function.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style, and eliminate the digits variable.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style. Check for negative error returns
(instead of -1).
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And reformat to GNU style. Avoid out-of-bounds pointer arithmetic.
This also results in a fix of bug 28091 due to the additional packet
length checks.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@systemhalted.org>
Reformat to GNU style. Avoid out-of-bounds buffer arithmetic.
Eliminate the labellen function.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reformat to GNU style. Avoid out-of-bounds pointer arithmetic
(e.g., use eom - dn < 2 instead of dn + 1 >= eom). Inline the
labellen function and fold the compression pointer check into
the length check (l >= 64). Assume ASCII encoding.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This is updated version of the 572bd547d5 (reverted by 40ebfd016a)
that fixes the _dl_next_tls_modid issues.
This issue with 572bd547d5 patch is the DTV entry will be only
update on dl_open_worker() with the update_tls_slotinfo() call after
all dependencies are being processed by _dl_map_object_deps(). However
_dl_map_object_deps() itself might call _dl_next_tls_modid(), and since
the _dl_tls_dtv_slotinfo_list::map is not yet set the entry will be
wrongly reused.
This patch fixes by renaming the _dl_next_tls_modid() function to
_dl_assign_tls_modid() and by passing the link_map so it can set
the slotinfo value so a subsequente _dl_next_tls_modid() call will
see the entry as allocated.
The intermediary value is cleared up on remove_slotinfo() for the case
a library fails to load with RTLD_NOW.
This patch fixes BZ #27135.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The clone3 system call (since Linux 5.3) provides a superset of the
functionality of clone and clone2. It also provides a number of API
improvements, including the ability to specify the size of the child's
stack area which can be used by kernel to compute the shadow stack size
when allocating the shadow stack. Add:
extern int __clone_internal (struct clone_args *__cl_args,
int (*__func) (void *__arg), void *__arg);
to provide an abstract interface for clone, clone2 and clone3.
1. Simplify stack management for thread creation by passing both stack
base and size to create_thread.
2. Consolidate clone vs clone2 differences into a single file.
3. Call __clone3 if HAVE_CLONE3_WAPPER is defined. If __clone3 returns
-1 with ENOSYS, fall back to clone or clone2.
4. Use only __clone_internal to clone a thread. Since the stack size
argument for create_thread is now unconditional, always pass stack size
to create_thread.
5. Enable the public clone3 wrapper in the future after it has been
added to all targets.
NB: Sandbox will return ENOSYS on clone3 in both Chromium:
The following revision refers to this bug:
218438259d
commit 218438259dd795456f0a48f67cbe5b4e520db88b
Author: Matthew Denton <mpdenton@chromium.org>
Date: Thu Jun 03 20:06:13 2021
Linux sandbox: return ENOSYS for clone3
Because clone3 uses a pointer argument rather than a flags argument, we
cannot examine the contents with seccomp, which is essential to
preventing sandboxed processes from starting other processes. So, we
won't be able to support clone3 in Chromium. This CL modifies the
BPF policy to return ENOSYS for clone3 so glibc always uses the fallback
to clone.
Bug: 1213452
Change-Id: I7c7c585a319e0264eac5b1ebee1a45be2d782303
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2936184
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Commit-Queue: Matthew Denton <mpdenton@chromium.org>
Cr-Commit-Position: refs/heads/master@{#888980}
[modify] https://crrev.com/218438259dd795456f0a48f67cbe5b4e520db88b/sandbox/linux/seccomp-bpf-helpers/baseline_policy.cc
and Firefox:
https://hg.mozilla.org/integration/autoland/rev/ecb4011a0c76
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
1. Align struct hdr to MALLOC_ALIGNMENT bytes so that malloc hooks in
libmcheck align memory to MALLOC_ALIGNMENT bytes.
2. Remove tst-mallocalign1 from tests-exclude-mcheck for i386 and x32.
3. Add tst-pvalloc-fortify and tst-reallocarray to tests-exclude-mcheck
since they use malloc_usable_size (see BZ #22057).
This fixed BZ #28068.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The previous approach defeats the vDSO optimization on older kernels
because a failing clock_gettime64 system call is performed on every
function call. It also results in a clobbered errno value, exposing
an OpenJDK bug (JDK-8270244).
This patch fixes by open-code INLINE_VSYSCALL macro and replace all
INLINE_SYSCALL_CALL with INTERNAL_SYSCALL_CALLS. Now for
__clock_gettime64x, the 64-bit vDSO is used and the 32-bit vDSO is
tried before falling back to 64-bit syscalls.
The previous code preferred 64-bit syscall for the case where the kernel
provides 64-bit time_t syscalls *and* also a 32-bit vDSO (in this case
the *64-bit* syscall should be preferable over the vDSO). All
architectures that provides 32-bit vDSO (i386, mips, powerpc, s390)
modulo sparc; but I am not sure if some kernels versions do provide
only 32-bit vDSO while still providing 64-bit time_t syscall.
Regardless, for such cases the 64-bit time_t syscall is used if the
vDSO returns overflowed 32-bit time_t.
Tested on i686-linux-gnu (with a time64 and non-time64 kernel),
x86_64-linux-gnu. Built with build-many-glibcs.py.
Co-authored-by: Florian Weimer <fweimer@redhat.com>
<limits.h> used to be a header file with no declarations.
GCC's libgomp includes it in a #pragma GCC visibility hidden block.
Including <unistd.h> from <limits.h> (indirectly) declares everything
in <unistd.h> with hidden visibility, resulting in linker failures.
This commit avoids C declarations in assembler mode and only declares
__sysconf in <limits.h> (and not the entire contents of <unistd.h>).
The __sysconf symbol is already part of the ABI. PTHREAD_STACK_MIN
is no longer defined for __USE_DYNAMIC_STACK_SIZE && __ASSEMBLER__
because there is no possible definition.
Additionally, PTHREAD_STACK_MIN is now defined by <pthread.h> for
__USE_MISC because this is what developers expect based on the macro
name. It also helps to avoid libgomp linker failures in GCC because
libgomp includes <pthread.h> before its visibility hacks.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This was put in __libc_fork by c32c868ab8 ("posix: Add _Fork [BZ #4737]")
so we need to avoid locking them again in _Fork called by __libc_lock, otherwise
we deadlock.
The constant PTHREAD_STACK_MIN may be too small for some processors.
Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE. When
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).
Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
provide a constant target specific PTHREAD_STACK_MIN value.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
As a result, is not necessary to specify __attribute__ ((nocommon))
on individual definitions.
GCC 10 defaults to -fno-common on all architectures except ARC,
but this change is compatible with older GCC versions and ARC, too.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
1. Add sysdeps/generic/malloc-size.h to define size related macros for
malloc.
2. Move x86_64/tst-mallocalign1.c to malloc and replace ALIGN_MASK with
MALLOC_ALIGN_MASK.
3. Add tst-mallocalign1 to tests-exclude-mcheck for i386 and x32 since
mcheck doesn't honor MALLOC_ALIGNMENT.
This slightly reduces code size, as can be seen below.
__libc_lock_unlock is usually used along with __libc_lock_lock in
the same function. __libc_lock_lock already has an out-of-line
slow path, so this change should not introduce many additional
non-leaf functions.
This change also fixes a link failure in 32-bit Arm thumb mode
because commit 1f9c804fbd
("nptl: Use internal low-level lock type for !IS_IN (libc)")
introduced __libc_do_syscall calls outside of libc.
Before x86-64:
text data bss dec hex filename
1937748 20456 54896 2013100 1eb7ac libc.so.6
25601 856 12768 39225 9939 nss/libnss_db.so.2
40310 952 25144 66406 10366 nss/libnss_files.so.2
After x86-64:
text data bss dec hex filename
1935312 20456 54896 2010664 1eae28 libc.so.6
25559 864 12768 39191 9917 nss/libnss_db.so.2
39764 960 25144 65868 1014c nss/libnss_files.so.2
Before i686:
2110961 11272 39144 2161377 20fae1 libc.so.6
27243 428 12652 40323 9d83 nss/libnss_db.so.2
43062 476 25028 68566 10bd6 nss/libnss_files.so.2
After i686:
2107347 11272 39144 2157763 20ecc3 libc.so.6
26929 432 12652 40013 9c4d nss/libnss_db.so.2
43132 480 25028 68640 10c20 nss/libnss_files.so.2
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The configure script checks for -mlong-double-128 but mentions -mlongdouble
when it fails.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
5 years ago, commit 8f1b841e45
unintentionally added an ifunc to the loader.
That modification has not caused any harm so far, but it doesn't add any
value either, because the hwcap information is available later during
libc initialization.
Suggested-by: Anton Blanchard <anton@ozlabs.org>
The following commit
commit 6f573a27b6
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Wed Jun 23 01:19:34 2021 -0400
x86-64: Add wcslen optimize for sse4.1
Added wcsnlen-sse4.1 to the wcslen ifunc implementation list and did
not add wcslen-sse4.1 to wcslen ifunc implementation list. This commit
fixes that by removing wcsnlen-sse4.1 from the wcslen ifunc
implementation list and adding wcslen-sse4.1 to the ifunc
implementation list.
Testing:
test-wcslen.c, test-rsi-wcslen.c, and test-rsi-strlen.c are passing as
well as all other tests in wcsmbs and string.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
commit 6f573a27b6
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Wed Jun 23 01:19:34 2021 -0400
x86-64: Add wcslen optimize for sse4.1
added wcsnlen-sse4.1 to the wcslen ifunc implementation list. Since the
random value in the the RSI register is larger than the wide-character
string length in the existing wcslen test, it didn't trigger the wcslen
test failure. Add a test to force 0 into the RSI register before calling
wcslen.
https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld
diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in
-pie links. Arguably keeping the diagnostic like other ports is more
correct, since statically resolving movl foo(%rip), %eax to the
link-time zero address produces a corrupted output.
It turns out that --enable-static-pie builds do not depend on the ld
behavior. GCC generates GOT indirection for weak declarations for
-fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't
really matter.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch adds a way to close a range of file descriptors on
posix_spawn as a new file action. The API is similar to the one
provided by Solaris 11 [1], where the file action causes the all open
file descriptors greater than or equal to input on to be closed when
the new process is spawned.
The function posix_spawn_file_actions_addclosefrom_np is safe to be
implemented by iterating over /proc/self/fd, since the Linux spawni.c
helper process does not use CLONE_FILES, so its has own file descriptor
table and any failure (in /proc operation) aborts the process creation
and returns an error to the caller.
I am aware that this file action might be redundant to the current
approach of POSIX in promoting O_CLOEXEC in more interfaces. However
O_CLOEXEC is still not the default and for some specific usages, the
caller needs to close all possible file descriptors to avoid them
leaking. Some examples are CPython (discussed in BZ#10353) and OpenJDK
jspawnhelper [2] (where OpenJDK spawns a helper process to exactly
closes all file descriptors). Most likely any environment which calls
functions that might open file descriptor under the hood and aim to use
posix_spawn might face the same requirement.
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
[1] https://docs.oracle.com/cd/E36784_01/html/E36874/posix-spawn-file-actions-addclosefrom-np-3c.html
[2] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82
The function closes all open file descriptors greater than or equal to
input argument. Negative values are clamped to 0, i.e, it will close
all file descriptors.
As indicated by the bug report, this is a common symbol provided by
different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although
its has inherent issues with not taking in consideration internal libc
file descriptors (such as syslog), this is also a common feature used
in multiple projects [1][2][3][4][5].
The Linux fallback implementation iterates over /proc and close all
file descriptors sequentially. Although it was raised the questioning
whether getdents on /proc/self/fd might return disjointed entries
when file descriptor are closed; it does not seems the case on my
testing on multiple kernel (v4.18, v5.4, v5.9) and the same strategy
is used on different projects [1][2][3][5].
Also, the interface is set a fail-safe meaning that a failure in the
fallback results in a process abort.
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
[1] 5238e95759/src/basic/fd-util.c (L217)
[2] ddf4b77e11/src/lxc/start.c (L236)
[3] 9e4f2f3a6b/Modules/_posixsubprocess.c (L220)
[4] 5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)
[5] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82
It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC
added on 5.11 (582f1fb6b721f). Although FreeBSD has added the same
syscall, this only adds the symbol on Linux ports. This syscall is
required to provided a fail-safe way to implement the closefrom
symbol (BZ #10353).
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.13. (There are no new MAP_* constants covered by this test in
5.13 that need any other header changes.)
Tested with build-many-glibcs.py.
Now that there are no internal users anymore, these new symbol
versions can be removed from the public ABI. The compatibility
symbols remain.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This avoids an ABI hazard (types changing between different modules
of glibc) without introducing linknamespace issues. In particular,
NSS modules now call __lll_lock_wait_private@@GLIBC_PRIVATE to wait
on internal locks (the unlock path is inlined and performs a direct
system call).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch consolidates the setsockopt implementation on
sysdeps/unix/sysv/linux/getsockopt.c. The changes are:
1. Remove it from auto-generation syscalls.list on all architectures.
2. Add __ASSUME_SETSOCKOPT_SYSCALL as default and undef if for
specific kernel versions on some architectures.
This also fix a potential issue where 32-bit time_t ABI should use the
linux setsockopt which overrides the underlying SO_* constants used for
socket timestamping for _TIME_BITS=64.
Checked on x86_64-linux-gnu and i686-linux-gnu.
This patch consolidates the getsockopt Linux syscall implementation on
sysdeps/unix/sysv/linux/getsockopt.c. The changes are:
1. Remove it from auto-generation syscalls.list on all architectures.
2. Add __ASSUME_GETSOCKOPT_SYSCALL as default and undef if for
specific kernel versions on some architectures.
This also fix a potential issue where 32-bit time_t ABI should use the
linux getsockopt which overrides the underlying SO_* constants used for
socket timestamping for _TIME_BITS=64.
Checked on x86_64-linux-gnu and i686-linux-gnu.
This avoids crashes in libc when cmsg is null and refrencing msg
structure when it is null
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbols gai_cancel, gai_error, gai_suspend, getaddrinfo_a,
__gai_suspend_time64 were moved using scripts/move-symbol-to-libc.py.
For Hurd (which remains !PTHREAD_IN_LIBC), a few #define redirects
had to be added because several pthread functions are not available
under __. (Linux uses __ prefixes for most hidden aliases, and has
to in some cases to avoid linknamespace issues.)
This patch modifies the current POWER9 implementation of strcpy and
stpcpy to optimize it for POWER9/10.
Since no new POWER10 instructions are used, the original POWER9 strcpy is
modified instead of creating a new implementation for POWER10. This
implementation is based on both the original POWER9 implementation of
strcpy and the preamble of the new POWER10 implementation of strlen.
The changes also affect stpcpy, which uses the same implementation with
some additional code before returning.
On POWER9, averaging improvements across the benchmark
inputs (length/source alignment/destination alignment), for an
experiment that ran the benchmark five times, bench-strcpy showed an
improvement of 5.23%, and bench-stpcpy showed an improvement of 6.59%.
On POWER10, bench-strcpy showed 13.16%, and bench-stpcpy showed 13.59%.
The changes are:
1. Removed the null string optimization.
Although this results in a few extra cycles for the null string, in
combination with the second change, this resulted in improvements for
for other cases.
2. Adapted the preamble from strlen for POWER10.
This is the part of the function that handles up to the first 16 bytes
of the string.
3. Increased number of unrolled iterations in the main loop to 6.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Tested-by: Matheus Castanho <msc@linux.ibm.com>
From
https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html
* Intel TSX will be disabled by default.
* The processor will force abort all Restricted Transactional Memory (RTM)
transactions by default.
* A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated,
which is set to indicate to updated software that the loaded microcode is
forcing RTM abort.
* On processors that enumerate support for RTM, the CPUID enumeration bits
for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to
be set by default after microcode update.
* Workloads that were benefited from Intel TSX might experience a change
in performance.
* System software may use a new bit in Model-Specific Register (MSR) 0x10F
TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock
Elision (HLE) and RTM bits to indicate to software that Intel TSX is
disabled.
1. Add RTM_ALWAYS_ABORT to CPUID features.
2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set. This skips the
string/tst-memchr-rtm etc. testcases on the affected processors, which
always fail after a microcde update.
3. Check RTM feature, instead of usability, against /proc/cpuinfo.
This fixes BZ #28033.
Linux 5.13 has three new syscalls (landlock_create_ruleset,
landlock_add_rule, landlock_restrict_self). Update syscall-names.list
and regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Tested with build-many-glibcs.py.
On s390 (31bit), the pointer to the first byte after s always wraps
around with n >= 0x80000000 and can lead to stop searching before
end of s.
Thus this patch just use NULL as byte after s in this case and
the srst instruction stops searching with "not found" when wrapping
around from top address to zero.
This is observable with testcase string/test-memchr
starting with commit "String: Add overflow tests for strnlen, memchr,
and strncat [BZ #27974]"
https://sourceware.org/git/?p=glibc.git;a=commit;h=da5a6fba0febbfc90896ce1b2eb75c6d8a88a72d
Starting with recent commit 84f7ce8447
"posix: Add glob64 with 64-bit time_t support", elf/check-localplt
fails due to extra PLT reference __glob64_time64 in __glob64_time64
itself.
This is observable with gcc 7.5 on x86_64 with -m32 or s390x with
-m31. E.g. if build with gcc 10, gcc is generating a call to
__glob64_time64.localalias.
This patch is adding a hidden version of __glob64_time64 in the
same way as for __globfree64_time64.
Add hp-timing.h using the cntvct_el0 counter. Return timing in nanoseconds
so it is fully compatible with generic hp-timing. Don't set HP_TIMING_INLINE
in the dynamic linker since it adds unnecessary overheads and some ancient
kernels may not handle emulating cntcvt correctly. Currently cntvct_el0 is
only used for timing in the benchtests.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimize strnlen by avoiding UMINV which is slow on most cores. On Neoverse N1
large strings are 1.8x faster than the current version, and bench-strnlen is
50% faster overall. This version is MTE compatible.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
malloc initialization depends on __get_nprocs, so using
scratch buffers in __get_nprocs may result in infinite recursion.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The symbols forkpty, login, login_tty, logout, logwtmp, openpty
were moved using scripts/move-symbol-to-libc.py.
This is a single commit because most of the symbols are tied together
via forkpty, for example.
Several changes to use hidden prototypes are needed. This commit
also updates pseudoterminal terminology on modified lines.
For 390 (31-bit), this commit follows the existing style for the
compat symbol version creation.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
RFC 8335 defines the network utility PROBE, which builds off of the
capabilities of Ping to query more detailed interface information from
networking nodes.
The definitions included in this patchset have been accepted into the
linux net-next branch and will be included in Linux 5.13. This
patchset adds the same definitions to the glibc for use in the
iputils package.
The relevant commits for the Linux definitions can be found here:
e542d29ca8750f4fc2a1
These changes have been tested by running the glibc tests on x86_64
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After recent commit
447954a206
"math: redirect roundeven function", building on
s390x fails with:
Error: symbol `__roundevenl' is already defined
Similar to aarch64/riscv fix, this patch redirects target
specific functions for s390x:
commit 3213ed770c
"Update math: redirect roundeven function"
Austin Group issue 62 [1] dropped the async-signal-safe requirement
for fork and provided a async-signal-safe _Fork replacement that
does not run the atfork handlers. It will be included in the next
POSIX standard.
It allow to close a long standing issue to make fork AS-safe (BZ#4737).
As indicated on the bug, besides the internal lock for the atfork
handlers itself; there is no guarantee that the handlers itself will
not introduce more AS-safe issues.
The idea is synchronize fork with the required internal locks to allow
children in multithread processes to use mostly of standard function
(even though POSIX states only AS-safe function should be used). On
signal handles, _Fork should be used intead and only AS-safe functions
should be used.
For testing, the new tst-_Fork only check basic usage. I also added
a new tst-mallocfork3 which uses the same strategy to check for
deadlock of tst-mallocfork2 but using threads instead of subprocesses
(and it does deadlock if it replaces _Fork with fork).
[1] https://austingroupbugs.net/view.php?id=62
The valgrind/helgrind test suite needs a way to make stack dealloction
more prompt, and this feature seems to be generally useful.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
librt.so is no longer installed for PTHREAD_IN_LIBC, and tests
are not linked against it. $(librt) is introduced globally for
shared tests that need to be linked for both PTHREAD_IN_LIBC
and !PTHREAD_IN_LIBC.
GLIBC_PRIVATE symbols that were needed during the transition are
removed again.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The symbols were moved using scripts/move-symbol-to-libc.py.
The way the ABI intransition is implemented is changed with this
commit: the implementation is now consolidated in one file with a
TIMER_T_WAS_INT_COMPAT check.
The shared librt is now empty, so this commit adds a placeholder
symbol at the base version, GLIBC_2.2, and potentially at the
GLIBC_2.3.3 version as well (the leftover from the int/timer_t ABI
transition).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
The way the ABI intransition is implemented is changed with this
commit: the implementation is now consolidated in one file with a
TIMER_T_WAS_INT_COMPAT check.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
The way the ABI intransition is implemented is changed with this
commit: the implementation is now consolidated in one file with a
TIMER_T_WAS_INT_COMPAT check.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
timer_create and timer_delete are tied together via the int/timer_t
compatibility code. The way the ABI intransition is implemented
is changed with this commit: the implementation is now consolidated
in one file with a TIMER_T_WAS_INT_COMPAT check.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is almost equivalent to __WORDSIZE == 64
&& OTHER_SHLIB_COMPAT (librt, GLIBC_2_1, GLIBC_2_3_3), except
that this expression is true for mips64/n64 targets as well,
even though those did not undergo the timer_t transition.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch is using the corresponding GCC builtin for roundevenf,
roundeven and roundevenl if the USE_FUNCTION_BUILTIN macros are defined
to one in math-use-builtins.h.
These builtin functions is supported since GCC 10.
The code of the generic implementation is not changed.
Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch redirect roundeven function for futhermore changes.
Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This adds several temporary GLIBC_PRIVATE exports. The symbol names
are changed so that they all start with __timer_.
It is now possible to invoke the fork handler directly, so
pthread_atfork is no longer necessary. The associated error cannot
happen anymore, and cancellation handling can be removed from
the helper thread routine.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
A placeholder symbol is needed on some architectures for the
GLIBC_2.3.4 version.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
A placeholder symbol is required to keep the GLIBC_2.7 version.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
An explicit call from fork into the mq_notify implementation replaces
the previous use of pthread_atfork.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
To introduce the proper symbol versioning, the implementation of
the system call wrapper us moved to a C file.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
Placeholder symbols are needed on some architectures, to keep the
GLIBC_2.1 and GLIBC_2.4 symbol versions around.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
Move the common code into rt/lio_listio-common.c and include
the file in both rt/lio_listio.c and rt/lio_listio64.c. The common
code automatically defines both public symbols for __WORDSIZE == 64.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
Both symbols have to be moved at the same time because they
are intertwined for __WORDSIZE == 64. The treatment of this case
is also changed to match more closely how the other files suppress
the declaration of the *64 identifier.
The symbols were moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
There is a minor oddity here: This is generic code shared with Hurd,
and Hurd does not have time64 support. This is why the
versioned_symbol export for __aio_suspend_time64 is restricted to
the PTHREAD_IN_LIBC code.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
Both symbols have to be moved at the same time because they
are intertwined for __WORDSIZE == 64. The treatment of this case
is also changed to match more closely how the other files suppress
the declaration of the *64 identifier.
The symbols were moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
A version placeholder symbol is needed on alpha and sparc because
of the additional symbols formerly at version GLIBC_2.3.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>:
This commit also moves the aio_misc and aio_sigquue helper,
so GLIBC_PRIVATE exports need to be added.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
No bug. The way wcsnlen will check if near the end of maxlen
is the following macro:
mov %r11, %rsi; \
subq %rax, %rsi; \
andq $-64, %rax; \
testq $-64, %rsi; \
je L(strnlen_ret)
Which words independently of s + maxlen overflowing. So the
second overflow check is unnecissary for correctness and
just extra overhead in the common no overflow case.
test-strlen.c, test-wcslen.c, test-strnlen.c and test-wcsnlen.c are
all passing
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The pthread_atfork is similar between Linux and Hurd, only the compat
version bits differs. The generic version is place at sysdeps/pthread
with a common name.
It also fixes an issue with Hurd license, where the static-only object
did not use LGPL + exception.
Checked on x86_64-linux-gnu, i686-linux-gnu, and with a build for
i686-gnu.
The Linux nptl implementation is used as base for generic fork
implementation to handle the internal locks and mutexes. The
system specific bits are moved a new internal _Fork symbol.
(This new implementation will be used to provide a async-signal-safe
_Fork now that POSIX has clarified that fork might not be
async-signal-safe [1]).
For Hurd it means that the __nss_database_fork_prepare_parent and
__nss_database_fork_subprocess will be run in a slight different
order.
[1] https://austingroupbugs.net/view.php?id=62
AMD define different flags for IRPB, IBRS, and STIPBP [1], so new
x86_64_cpu are added and IBRS_IBPB is only tested for Intel.
The SSDB is also defined and implemented different on AMD [2],
and also a new AMD_SSDB flag is added. It should map to the
cpuinfo 'ssdb' on recent AMD cpus.
It fixes tst-cpu-features-cpuinfo and tst-cpu-features-cpuinfo-static
on recent AMD cpus.
Checked on x86_64-linux-gnu on AMD Ryzen 9 5900X.
[1] https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf
[2] https://bugzilla.kernel.org/show_bug.cgi?id=199889
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
IBT and SHSTK usable bits are copied from CPUID feature bits and later
cleared if kernel doesn't support CET. Copy IBT and SHSTK usable only
if CET is enabled so that they aren't set on CET capable processors
with non-CET enabled glibc.
This commit fixes the bug mentioned in the previous commit.
The previous implementations of wmemchr in these files relied
on maxlen * sizeof(wchar_t) which was not guranteed by the standard.
The new overflow tests added in the previous commit now
pass (As well as all the other tests).
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This commit fixes the bug mentioned in the previous commit.
The previous implementations of wmemchr in these files relied
on n * sizeof(wchar_t) which was not guranteed by the standard.
The new overflow tests added in the previous commit now
pass (As well as all the other tests).
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
No bug. This comment adds the ifunc / build infrastructure
necessary for wcslen to prefer the sse4.1 implementation
in strlen-vec.S. test-wcslen.c is passing.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Since strlen.S contains SSE2 version of strlen/strnlen and SSE4.1
version of wcslen/wcsnlen, move strlen.S to multiarch/strlen-vec.S
and include multiarch/strlen-vec.S from SSE2 and SSE4.1 variants.
This also removes the unused symbols, __GI___strlen_sse2 and
__GI___wcsnlen_sse4_1.
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
The large timeout are already tests by io/tst-utimensat-skeleton.c.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
It breaks the usage case of live migration like CRIU or similar
and most usages can be optimized away by either building glibc with
a minimum 5.1 kernel or by using the 32-bit syscall for the common
case.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
It breaks the usage case of live migration like CRIU or similar.
The performance drawback is it would require an extra syscall
on older kernels without 64-bit time support.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
It breaks the usage case of live migration like CRIU or similar.
The performance drawback is it would require an extra syscall
on older kernels without 64-bit time support.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one. This also avoids the need
to use supports_time64() (which breaks the usage case of live migration
like CRIU or similar).
It also fixes an issue on 32-bit select call for !__ASSUME_PSELECT
(microblase with older kernels only) where the expected timeout
is a 'struct timeval' instead of 'struct timespec'.
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one. This also avoids the need
to use supports_time64() (which breaks the usage case of live migration
like CRIU or similar).
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one. The 64-bit usage should
be rare since the timeout is a relative one. This also avoids the need
to use supports_time64() (which breaks the usage case of live migration
like CRIU or similar).
Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For the legacy ABI with supports 32-bit time_t it calls the 64-bit
time directly, since the LFS symbols calls the 64-bit time_t ones
internally.
Checked on i686-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
This mirrors the situation on Hurd. These directories are on
the include search part, so #include <pthreadP.h> works after this
change on both Hurd and nptl.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The pthread-based implementation is the generic one. Replacing
the stubs makes it clear that they do not have to be adjusted for
the libpthread move.
Result of:
git mv -f sysdeps/pthread/aio_misc.h sysdeps/generic/
git mv sysdeps/pthread/timer_routines.c sysdeps/htl/
git mv -f sysdeps/pthread/{aio,lio,timer}_*.c rt/
Followed by manual adjustment of the #include paths in
sysdeps/unix/sysv/linux/wordsize-64, and a move of the version
definitions formerly in sysdeps/pthread/Versions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This function has no dependency on libpthread, so the move is also
applied to Hurd.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This function has no dependency on libpthread, so the move is also
applied to Hurd.
To avoid localplt failures, use __open64_nocancel instead of
pthread_setcancelstate and open.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Result of: git mv -f sysdeps/posix/shm_unlink.c rt
and manual removal of the _POSIX_MAPPED_FILES preprocessor condition.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Result of: git mv -f sysdeps/posix/shm_open.c rt
and manual removal of the _POSIX_MAPPED_FILES preprocessor condition.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These were turned into compat symbols as part of the libpthread
move. It turns out they are used by language run-time libraries
(e.g., the GCC D front end), so it makes to preserve them as
external symbols even though they are not declared in any header
file.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Starting with recent commit 92a7d13439
"x86-64: Align child stack to 16 bytes [BZ #27902]"
the new test misc/tst-misalign-clone has failed on s390x/s390.
This patch is now aligning the stack to a double
word boundary as also done in start.S files.
Similar to fts, ftw routines passes a stat pointer that might
differ of size and layout when 64-bit time API is used.
Checked on i686-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Similar to glob, fts routines passes a stat pointer that might
differ of size and layout when 64-bit time API is used.
Checked on i686-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The glob might pass a different stat struct for gl_stat and gl_lstat
when GLOB_ALTDIRFUNC is used. This requires add a new 64-bit time
version that also uses 64-bit time stat functions.
Checked on i686-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
A new build flag, _TIME_BITS, enables the usage of the newer 64-bit
time symbols for legacy ABI (where 32-bit time_t is default). The 64
bit time support is only enabled if LFS (_FILE_OFFSET_BITS=64) is
also used.
Different than LFS support, the y2038 symbols are added only for the
required ABIs (armhf, csky, hppa, i386, m68k, microblaze, mips32,
mips64-n32, nios2, powerpc32, sparc32, s390-32, and sh). The ABIs with
64-bit time support are unchanged, both for symbol and types
redirection.
On Linux the full 64-bit time support requires a minimum of kernel
version v5.1. Otherwise, the 32-bit fallbacks are used and might
results in error with overflow return code (EOVERFLOW).
The i686-gnu does not yet support 64-bit time.
This patch exports following rediretions to support 64-bit time:
* libc:
adjtime
adjtimex
clock_adjtime
clock_getres
clock_gettime
clock_nanosleep
clock_settime
cnd_timedwait
ctime
ctime_r
difftime
fstat
fstatat
futimens
futimes
futimesat
getitimer
getrusage
gettimeofday
gmtime
gmtime_r
localtime
localtime_r
lstat_time
lutimes
mktime
msgctl
mtx_timedlock
nanosleep
nanosleep
ntp_gettime
ntp_gettimex
ppoll
pselec
pselect
pthread_clockjoin_np
pthread_cond_clockwait
pthread_cond_timedwait
pthread_mutex_clocklock
pthread_mutex_timedlock
pthread_rwlock_clockrdlock
pthread_rwlock_clockwrlock
pthread_rwlock_timedrdlock
pthread_rwlock_timedwrlock
pthread_timedjoin_np
recvmmsg
sched_rr_get_interval
select
sem_clockwait
semctl
semtimedop
sem_timedwait
setitimer
settimeofday
shmctl
sigtimedwait
stat
thrd_sleep
time
timegm
timerfd_gettime
timerfd_settime
timespec_get
utime
utimensat
utimes
utimes
wait3
wait4
* librt:
aio_suspend
mq_timedreceive
mq_timedsend
timer_gettime
timer_settime
* libanl:
gai_suspend
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It is only used for !__USE_MISC, the default way uses the kernel
headers. The patch also adds the SO_TIMESTAMP, SO_TIMESTAMPNS, and
SO_TIMESTAMPING which uses new values for 64-bit time_t kernel
interfaces.
The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit stat implementations.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Instead of replicate the same definitions from struct_shmid64_ds.h
on the multiple struct_shmid_ds.h, use a common header which is included
when required (struct_shmid64_ds_helper.h).
The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit semctl implementation.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Instead of replicate the same definitions from struct_semid64_ds.h
on the multiple struct_semid_ds.h, use a common header which is included
when required (struct_semid64_ds_helper.h).
The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit semctl implementation.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Instead of replicate the same definitions from struct_msqid64_ds.h
on the multiple struct_msqid_ds.h, use a common header which is included
when required (struct_msqid64_ds_helper.h).
The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit stat implementations.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Instead of replicate the same definitions from struct_stat_time64.h
on the multiple struct_stat.h, use a common header which is included
when required (struct_stat_time64_helper.h). The 64-bit time support
is added only for LFS support.
The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit stat implementations.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The __USE_TIME_BITS64 is not defined internally yet.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Handle the SO_TIMESTAMP{NS} similar to recvmsg: for
!__ASSUME_TIME64_SYSCALLS it converts the first 32-bit time SO_TIMESTAMP
or SO_TIMESTAMPNS and appends it to the control buffer if has extra
space or returns MSG_CTRUNC otherwise. The 32-bit time field is kept
as-is.
Also for !__ASSUME_TIME64_SYSCALLS it limits the maximum number of
'struct mmsghdr *' to IOV_MAX (and also increases the stack size
requirement to IOV_MAX times sizeof (socklen_t)). The Linux imposes
a similar limit to sendmmsg, so bound the array size on recvmmsg is not
unreasonable. And this will be used only on older when building with
32-bit time support.
Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15
kernel).
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The recvmsg handling is more complicated because it requires check the
returned kernel control message and make some convertions. For
!__ASSUME_TIME64_SYSCALLS it converts the first 32-bit time SO_TIMESTAMP
or SO_TIMESTAMPNS and appends it to the control buffer if has extra
space or returns MSG_CTRUNC otherwise. The 32-bit time field is kept
as-is.
Calls with __TIMESIZE=32 will see the converted 64-bit time control
messages as spurious control message of unknown type. Calls with
__TIMESIZE=64 running on pre-time64 kernels will see the original
message as a spurious control ones of unknown typ while running on
kernel with native 64-bit time support will only see the time64 version
of the control message.
Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15
kernel).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The constant values will be changed for __TIMESIZE=64, so binaries built
with 64-bit time support might fail to work properly on old kernels.
Both {get,set}sockopt will retry the syscall with the old constant
values and the timeout value adjusted when kernel returns ENOTPROTOPT.
It also adds an internal only SO_{RCV,SND}TIMEO where
COMPAT_SO_{RCV,SND}TIMEO_OLD indicates pre 32-bit time support and
COMPAT_SO_{RCV,SND}TIMEO_NEW indicates time64 support. It allows to
refer to constant independently of the time_t ABI and kernel version
used.
Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15
kernel).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The s390 will require the 64-bit time symbols for y2038 support.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The n32 will require the 64-bit time symbols for y2038 support.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The n32 will require the 64-bit time symbols for y2038 support.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Commit 68ab82f566 added support for the scv
syscall ABI on powerpc. Since then systems that have kernel and processor
support started using scv. However adding the proper support for a new syscall
ABI requires changes to several other projects (e.g. qemu, valgrind, strace,
kernel), which are gradually receiving support.
Meanwhile, having a way to disable scv on glibc at build time can be useful for
distros that may encounter conflicts with projects that still do not support the
scv ABI, buying time until proper support is added.
This commit adds a --disable-scv option that disables scv support and uses sc
for all syscalls, like before commit 68ab82f566.
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
Now that pthread_kill is provided by libc.so it is possible to
implement the generic POSIX implementation as
'pthread_kill(pthread_self(), sig)'.
For Linux implementation, pthread_kill read the targeting TID from
the TCB. For raise, this it not possible because it would make raise
fail when issue after vfork (where creates the resulting process
has a different TID from the parent, but its TCB is not updated as
for pthread_create). To make raise use pthread_kill, it is make
usable from vfork by getting the target thread id through gettid
syscall.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Now that the thread cancellation type is not accessed concurrently
anymore, it is possible to move it out the cancelhandling.
By removing the cancel state out of the internal thread cancel handling
state there is no need to check if cancelled bit was set in CAS
operation.
It allows simplifing the cancellation wrappers and the
CANCEL_CANCELED_AND_ASYNCHRONOUS is removed.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Now that thread cancellation state is not accessed concurrently anymore,
it is possible to move it out the 'cancelhandling'.
The code is also simplified: CANCELLATION_P is replaced with a
internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is
removed.
With this behavior pthread_setcancelstate does not require to act on
cancellation if cancel type is asynchronous (is already handled either
by pthread_setcanceltype or by the signal handler).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Since commit 0c1c3a771e
("dlfcn: Move dlopen into libc") libdl.a is empty, so linking
against it is no longer necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Move all gconv-modules configuration files to gconv-modules.conf.
That is, the S390 extensions now become gconv-modules-s390.conf. Move
both configuration files into gconv-modules.d.
Now GCONV_PATH/gconv-modules is read only for backward compatibility
for third-party gconv modules directories.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add inline assembler for the roundeven functions.
Passes GLIBC regression. Note GCC does not inline the builtin (PR100966),
so this cannot be used for now.
This patch replaced obsolete AC_TRY_COMPILE to AC_COMPILE_IFELSE or
AC_PREPROC_IFELSE.
It has been confirmed that GNU 'autoconf' 2.69 suppressed obsolete
warnings, updated the following files:
- configure
- sysdeps/mach/configure
- sysdeps/mach/hurd/configure
- sysdeps/s390/configure
- sysdeps/unix/sysv/linux/configure
and didn't change the following files:
- sysdeps/ieee754/ldbl-opt/configure
- sysdeps/unix/sysv/linux/powerpc/configure
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Consolidate all hooks structures into a single one. There are
no static dlopen ABI concerns because glibc 2.34 already comes
with substantial ABI-incompatible changes in this area. (Static
dlopen requires the exact same dynamic glibc version that was used
for static linking.)
The new approach uses a pointer to the hooks structure into
_rtld_global_ro and initalizes it in __rtld_static_init. This avoids
a back-and-forth with various callback functions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This commit removes the ELF constructor and internal variables from
dlfcn/dlfcn.c. The file now serves the same purpose as
nptl/libpthread-compat.c, so it is renamed to dlfcn/libdl-compat.c.
The use of libdl-shared-only-routines ensures that libdl.a is empty.
This commit adjusts the test suite not to use $(libdl). The libdl.so
symbolic link is no longer installed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
In elf/Makefile, remove the $(libdl) dependency from testobj1.so
because it the unused libdl DSO now causes elf/tst-unused-deps to
fail.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
There is a minor functionality enhancement: dlerror now sets
errno if it was set as part of the exception. (This is the result
of using %m in asprintf, to avoid the strerror PLT call.) The
previous errno value upon function return was unpredictable.
Documenting this as a feature is premature; we need to make sure
that the error codes are meaningful when they are set by the dynamic
loader.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some targets have a GLIBC_2.0 baseline for libdl, while using
GLIBC_2.2 for libc. This means that the generated libc.map file
does not have any version nodes for GLIBC_2.0 or GLIBC_2.1. However,
moving symbols from libdl into libc needs such version nodes.
(Future symbol moves from librt will need this as well.)
This kludge is only necessary for symbols predating GLIBC_2.2 because
the affected targets use GLIBC_2.2 as the baseline for libc. Given
the small number and fixed set of affected architectures, no generic
mechanism is implemented, and instead the map file fragment is
hard-coded in scripts/versions.mk.
The compat_symbol macro already emits the appropriate version strings,
so no adjustments are needed there.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some symbols have explicit versioned_symbol or compat_symbol markers
in the sources, but no corresponding entry in the Versions files.
This presently works because the local: * directive is only applied
to the base version.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
__pthread_attr_copy can fail and does not initialize the attribute
structure in that case.
If __pthread_attr_copy is never called and there is no allocated
attribute, pthread_attr_destroy should not be called, otherwise
there is a null pointer dereference in rt/tst-mqueue6.
Fixes commit 42d3593505
("Use __pthread_attr_copy in mq_notify (bug 27896)").
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The symbol has never been exported, so no compatibility symbol is
needed. Removing this file prevents ld from creation an exported
symbol in case GLIBC_2_0 expands to a symbol version which
does not have a local: *; directive in the symbol version map file.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch was based on the __memcmp_power8 and the recent
__strlen_power10.
Improvements from __memcmp_power8:
1. Don't need alignment code.
On POWER10 lxvp and lxvl do not generate alignment interrupts, so
they are safe for use on caching-inhibited memory. Notice that the
comparison on the main loop will wait for both VSR to be ready.
Therefore aligning one of the input address does not improve
performance. In order to align both registers a vperm is necessary
which add too much overhead.
2. Uses new POWER10 instructions
This code uses lxvp to decrease contention on load by loading 32 bytes
per instruction.
The vextractbm is used to have a smaller tail code for calculating the
return value.
3. Performance improvement
This version has around 35% better performance on average. I saw no
performance regressions for any length or alignment.
Thanks Matheus for helping me out with some details.
Co-authored-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
This patch optimizes the performance of memset for A64FX [1] which
implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB cache
per NUMA node.
The performance optimization makes use of Scalable Vector Register
with several techniques such as loop unrolling, memory access
alignment, cache zero fill and prefetch.
SVE assembler code for memset is implemented as Vector Length Agnostic
code so theoretically it can be run on any SOC which supports ARMv8-A
SVE standard.
We confirmed that all testcases have been passed by running 'make
check' and 'make xcheck' not only on A64FX but also on ThunderX2.
And also we confirmed that the SVE 512 bit vector register performance
is roughly 4 times better than Advanced SIMD 128 bit register and 8
times better than scalar 64 bit register by running 'make bench'.
[1] https://github.com/fujitsu/A64FX
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
This patch optimizes the performance of memcpy/memmove for A64FX [1]
which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB
cache per NUMA node.
The performance optimization makes use of Scalable Vector Register
with several techniques such as loop unrolling, memory access
alignment, cache zero fill, and software pipelining.
SVE assembler code for memcpy/memmove is implemented as Vector Length
Agnostic code so theoretically it can be run on any SOC which supports
ARMv8-A SVE standard.
We confirmed that all testcases have been passed by running 'make
check' and 'make xcheck' not only on A64FX but also on ThunderX2.
And also we confirmed that the SVE 512 bit vector register performance
is roughly 4 times better than Advanced SIMD 128 bit register and 8
times better than scalar 64 bit register by running 'make bench'.
[1] https://github.com/fujitsu/A64FX
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
This patch is a test helper script to change Vector Length for child
process. This script can be used as test-wrapper for 'make check'.
Usage examples:
~/build$ make check subdirs=string \
test-wrapper='~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16'
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16 \
make test t=string/test-memcpy
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 32 \
./debugglibc.sh string/test-memmove
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 64 \
./testrun.sh string/test-memset
This patch defines BTI_C and BTI_J macros conditionally for
performance.
If HAVE_AARCH64_BTI is true, BTI_C and BTI_J are defined as HINT
instruction for ARMv8.5 BTI (Branch Target Identification).
If HAVE_AARCH64_BTI is false, both BTI_C and BTI_J are defined as
NOP.
Since the variable expands to nothing under Linux, it is no longer
necessary to clutter the makefiles with it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When using scv for templated ASM syscalls, current code interprets any
negative return value as error, but the only valid error codes are in
the range -4095..-1 according to the ABI.
This commit also fixes 'signal.gen.test' strace test, where the issue
was first identified.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
1. Replace
if ((((uintptr_t) &_d) & (__alignof (double) - 1)) != 0)
which may be optimized out by compiler, with
int
__attribute__ ((weak, noclone, noinline))
is_aligned (void *p, int align)
{
return (((uintptr_t) p) & (align - 1)) != 0;
}
2. Add TEST_STACK_ALIGN_INIT to TEST_STACK_ALIGN.
3. Add a common TEST_STACK_ALIGN_INIT to check 16-byte stack alignment
for both i386 and x86-64.
4. Update powerpc to use TEST_STACK_ALIGN_INIT.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Only the placeholder compatibility symbols are left now.
The __errno_location symbol was removed (moved) using
scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbols were moved using scripts/move-symbol-to-libc.py.
The libpthread placeholder symbols need some changes because some
symbol versions have gone away completely. But
__errno_location@@GLIBC_2.0 still exists, so the GLIBC_2.0 version
is still there.
The internal __pthread_create symbol now points to the correct
function, so the sysdeps/nptl/thrd_create.c override is no longer
necessary.
There was an issue how the hidden alias of pthread_getattr_default_np
was defined, so this commit cleans up that aspects and removes the
GLIBC_PRIVATE export altogether.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>