Simplify handling of remaining bytes. Avoid lots of taken branches and complex
whilelo computations, instead unconditionally write vectors from the end.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Improve performance of large memsets. Simplify alignment code. For zero memset
use DC ZVA, which almost doubles performance. For non-zero memsets use the
unroll8 loop which is about 10% faster.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Improve performance of small memsets by reducing instruction counts and
improving code alignment. Bench-memset shows 35-45% performance gain for
small sizes.
Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
Linux 5.13 adds a PTRACE_GET_RSEQ_CONFIGURATION constant, with an
associated ptrace_rseq_configuration structure.
Add this constant to the various sys/ptrace.h headers in glibc, with
the structure in bits/ptrace-shared.h (named struct
__ptrace_rseq_configuration in glibc, as with other such structures).
Tested for x86_64, and with build-many-glibcs.py.
Helper thread frees copied attribute on NOTIFY_REMOVED message
received from the OS kernel. Unfortunately, it fails to check whether
copied attribute actually exists (data.attr != NULL). This worked
earlier because free() checks passed pointer before actually
attempting to release corresponding memory. But
__pthread_attr_destroy assumes pointer is not NULL.
So passing NULL pointer to __pthread_attr_destroy will result in
segmentation fault. This scenario is possible if
notification->sigev_notify_attributes == NULL (which means default
thread attributes should be used).
Signed-off-by: Nikita Popov <npv1310@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
We'd like to support processors without Altivec or VSX, so check
the relevant hwcap bits before selecting them.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
A number of optimised memset routines assume the cacheline size is 128B,
so we better check before using them.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
We use PPC_FEATURE_HAS_VSX to select a number of POWER7 optimised
functions. These functions don't use any VSX instructions, so
PPC_FEATURE_ARCH_2_06 seems like a better fit.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
__REDIRECT and __THROW are not compatible with C++ due to the ordering of the
__asm__ alias and the throw specifier. __REDIRECT_NTH has to be used
instead.
Fixes commit 8a40aff86b ("io: Add time64 alias
for fcntl"), commit 82c395d91e ("misc: Add
time64 alias for ioctl"), commit b39ffab860
("Linux: Add time64 alias for prctl").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux 5.13 adds an INADDR_DUMMY definition; add a corresponding
definition to glibc's netinet/in.h. (This isn't strictly a new kernel
interface, rather a value defined in RFC 7600.)
Tested for x86_64.
It turned that the generic implementation of brk() does not work
for sparc, since on failure kernel will just return the previous
input value without setting the conditional register.
This patches adds back a sparc32 and sparc64 implementation removed
by 720480934a.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
This test depends on the "last" function being called in a
different thread than the "first" function, as "last" posts
a semaphore that "first" is waiting on. However, if pthread_create
fails - for example, if running in an older container before
the clone3()-in-container-EPERM fixes - exit() is called in the
same thread as everything else, the semaphore never gets posted,
and first hangs.
The fix is to pre-post that semaphore before a single-threaded
exit.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
labellist and precedencelist could get freed a second time if there
are allocation failures, so set them to NULL to avoid a double-free.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
If close() on infd and outfd succeeded, reset the fd numbers so that
we don't attempt to close them again.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
The allocated `conf` would leak if we have to skip over the file due
to the underlying filesystem not supporting dt_type.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
The test nptl/tst-thread_local1.cc fails to build with GCC mainline
because of changes to what libstdc++ headers implicitly include what
other headers:
tst-thread_local1.cc: In function 'int do_test()':
tst-thread_local1.cc:177:5: error: variable 'std::array<std::pair<const char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X' has initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~
Fix this by adding an explicit include of <array>.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
If pos >= count but realloc fails, tmp will not have been placed in
getnames[pos] yet, and so will not be freed in free_null. Detected
by Coverity.
Also remove misleading comment from nis_getnames(), since it actually
did properly release getnames when out of memory.
Tested-by: Carlos O'Donell <carlos@redhat.com>
These comments refer to slow paths that were removed in
glibc 2.34 or earlier. The corresponding "names" that yield
separate workload traces for "make bench" are thus obsolete.
We are however keeping the corresponding inputs.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Commit 03e187a41d added a regression when an audit module does not have
libc as DT_NEEDED (although unusual it is possible).
Checked on x86_64-linux-gnu.
commit 3ec5d83d2a
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sat Jan 25 14:19:40 2020 -0800
x86-64: Avoid rep movsb with short distance [BZ #27130]
introduced some regressions on Intel processors without Fast Short REP
MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with
short distance only on Intel processors with FSRM. bench-memmove-large
on Skylake server shows that cycles of __memmove_evex_unaligned_erms
improves for the following data size:
before after Improvement
length=4127, align1=3, align2=0: 479.38 349.25 27%
length=4223, align1=9, align2=5: 405.62 333.25 18%
length=8223, align1=3, align2=0: 786.12 496.38 37%
length=8319, align1=9, align2=5: 727.50 501.38 31%
length=16415, align1=3, align2=0: 1436.88 840.00 41%
length=16511, align1=9, align2=5: 1375.50 836.38 39%
length=32799, align1=3, align2=0: 2890.00 1860.12 36%
length=32895, align1=9, align2=5: 2891.38 1931.88 33%
The benchmark and tests must fail in case of allocation failure in the
implementation array. Also annotate the x* allocators in support.h so
that the compiler has more information about them.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tell the compiler that xmalloc family of allocators always return
non-NULL. xrealloc in locale/programs also always returns non-NULL,
but that conflicts with default realloc behaviour and that of xrealloc
in libsupport, so keep it as is for now and resolve the differences
later.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
All references to libraries in the manual are without the .so prefix,
so do the same for libc_malloc_debug.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
mcheck and malloc-check no longer work with static binaries, so drop
those tests.
Reported-by: Samuel Thibault <samuel.thibault@gnu.org>
Tested-by: Samuel Thibault <samuel.thibault@gnu.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
These functions call the core allocator functions (realloc and malloc
respectively) and are hence guaranteed to allocate memory using the
correct functions when multiple allocators are interposed. Having
these functions interposed in one allocator and not another may result
in confusion, hence discourage interposing them altogether.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes
<bits/platform/x86.h>.
2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the
processor has the feature.
3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the
feature is active. There may be other preconditions, like sufficient
stack space or further setup for AMX, which must be satisfied before the
feature can be used.
This fixes BZ #27958.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>