Unsigned branch instructions could be used for r2 to fix the wrong
behavior when a negative length is passed to memcpy.
This commit fixes the armv7 version.
Unsigned branch instructions could be used for r2 to fix the wrong
behavior when a negative length is passed to memcpy and memmove.
This commit fixes the generic arm implementation of memcpy amd memmove.
The strerrorname_np returns error number name (e.g. "EINVAL" for EINVAL)
while strerrordesc_np returns string describing error number (e.g
"Invalid argument" for EINVAL). Different than strerror,
strerrordesc_np does not attempt to translate the return description,
both functions return NULL for an invalid error number.
They should be used instead of sys_errlist and sys_nerr, both are
thread and async-signal safe. These functions are GNU extensions.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The sigabbrev_np returns the abbreviated signal name (e.g. "HUP" for
SIGHUP) while sigdescr_np returns the string describing the error
number (e.g "Hangup" for SIGHUP). Different than strsignal,
sigdescr_np does not attempt to translate the return description and
both functions return NULL for an invalid signal number.
They should be used instead of sys_siglist or sys_sigabbrev and they
are both thread and async-signal safe. They are added as GNU
extensions on string.h header (same as strsignal).
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Use snprintf instead of mempcpy plus itoa_word and remove unused
definitions. There is no potential for infinite recursion because
snprintf only use strerror_r for the %m specifier.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The buffer allocation uses the same strategy of strsignal.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If the thread is terminated then __libc_thread_freeres will free the
storage via __glibc_tls_internal_free.
It is only within the calling thread that this matters. It makes
strerror MT-safe.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The per-thread state is refactored two use two strategies:
1. The default one uses a TLS structure, which will be placed in the
static TLS space (using __thread keyword).
2. Linux allocates via struct pthread and access it through THREAD_*
macros.
The default strategy has the disadvantage of increasing libc.so static
TLS consumption and thus decreasing the possible surplus used in
some scenarios (which might be mitigated by BZ#25051 fix).
It is used only on Hurd, where accessing the thread storage in the in
single thread case is not straightforward (afaiu, Hurd developers could
correct me here).
The fallback static allocation used for allocation failure is also
removed: defining its size is problematic without synchronizing with
translated messages (to avoid partial translation) and the resulting
usage is not thread-safe.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The __NSIG_WORDS value is based on minimum number of words to hold
the maximum number of signals supported by the architecture.
This patch also adds __NSIG_BYTES, which is the number of bytes
required to represent the supported number of signals. It is used in
syscalls which takes a sigset_t.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The symbol is deprecated by strerror since its usage imposes some issues
such as copy relocations.
Its internal name is also changed to _sys_errlist_internal to avoid
static linking usage. The compat code is also refactored by removing
the over enginered errlist-compat.c generation from manual entried and
extra comment token in linker script file. It disantangle the code
generation from manual and simplify both Linux and Hurd compat code.
The definitions from errlist.c are moved to errlist.h and a new test
is added to avoid a new errno entry without an associated one in manual.
Checked on x86_64-linux-gnu and i686-linux-gnu. I also run a check-abi
on all affected platforms.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The symbol was deprecated by strsignal and its usage imposes issues
such as copy relocations.
Its internal name is changed to __sys_siglist and __sys_sigabbrev to
avoid static linking usage. The compat code is also refactored, since
both Linux and Hurd usage the same strategy: export the same array with
different object sizes.
The libSegfault change avoids calling strsignal on the SIGFAULT signal
handler (the current usage is already sketchy, adding a call that
potentially issue locale internal function is even sketchier).
Checked on x86_64-linux-gnu and i686-linux-gnu. I also run a check-abi
on all affected platforms.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It refactor how signals are defined by each architecture. Instead of
include a generic header (bits/signum-generic.h) and undef non-default
values in an arch specific header (bits/signum.h) the new scheme uses a
common definition (bits/signum-generic.h) and each architectures add
its specific definitions on a new header (bits/signum-arch.h).
For Linux it requires copy some system default definitions to alpha,
hppa, and sparc. They are historical values and newer ports uses
the generic Linux signum-arch.h.
For Hurd the BSD signum is removed and moved to a new header (it is
used currently only on Hurd).
Checked on a build against all affected ABIs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables
to update thresholds for "rep movsb" and "rep stosb" at run-time.
Note that the user specified threshold for "rep movsb" smaller than
the minimum threshold will be ignored.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
In TS 18661-1, getpayload had an unspecified return value for a
non-NaN argument, while C2x requires the return value -1 in that case.
This patch implements the return value of -1. I don't think this is
worth having a new symbol version that's an alias of the old one,
although occasionally we do that in such cases where the new function
semantics are a refinement of the old ones (to avoid programs relying
on the new semantics running on older glibc versions but not behaving
as intended).
Tested for x86_64 and x86; also ran math/ tests for aarch64 and
powerpc.
An extension called extended feature disable (XFD) is an extension added
for Intel AMX to the XSAVE feature set that allows an operating system
to enable a feature while preventing specific user threads from using
the feature.
The variable is placed in libc.so, and it can be true only in
an outer libc, not libcs loaded via dlmopen or static dlopen.
Since thread creation from inner namespaces does not work,
pthread_create can update __libc_single_threaded directly.
Using __libc_early_init and its initial flag, implementation of this
variable is very straightforward. A future version may reset the flag
during fork (but not in an inner namespace), or after joining all
threads except one.
Reviewed-by: DJ Delorie <dj@redhat.com>
These tests validate that rseq is registered from various execution
contexts (main thread, destructor, other threads, other threads created
from destructor, forked process (without exec), pthread_atfork handlers,
pthread setspecific destructors, signal handlers, atexit handlers).
tst-rseq.c only links against libc.so, testing registration of rseq in
a non-multithreaded environment.
tst-rseq-nptl.c also links against libpthread.so, testing registration
of rseq in a multithreaded environment.
See the Linux kernel selftests for extensive rseq stress-tests.
When available, use the cpu_id field from __rseq_abi on Linux to
implement sched_getcpu(). Fall-back on the vgetcpu vDSO if unavailable.
Benchmarks:
x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading
glibc sched_getcpu(): 13.7 ns (baseline)
glibc sched_getcpu() using rseq: 2.5 ns (speedup: 5.5x)
inline load cpuid from __rseq_abi TLS: 0.8 ns (speedup: 17.1x)
Register rseq TLS for each thread (including main), and unregister for
each thread (excluding main). "rseq" stands for Restartable Sequences.
See the rseq(2) man page proposed here:
https://lkml.org/lkml/2018/9/19/647
Those are based on glibc master branch commit 3ee1e0ec5c.
The rseq system call was merged into Linux 4.18.
The TLS_STATIC_SURPLUS define is increased to leave additional room for
dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working.
The increase (76 bytes) is larger than 32 bytes because it has not been
increased in quite a while. The cost in terms of additional TLS storage
is quite significant, but it will also obscure some initial-exec-related
dlopen failures.
The time argument is NULL in this case, and attempt to convert it
leads to a null pointer dereference.
This fixes commit d2e3b697da
("y2038: linux: Provide __settimeofday64 implementation").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch updates the kernel version in the test tst-mman-consts.py
to 5.7. (There are no new constants covered by this test in 5.7 that
need any other header changes; there's a new MREMAP_DONTUNMAP, but
this test doesn't yet cover MREMAP_*.)
Tested with build-many-glibcs.py.
1. Add the directories to hold POWER10 files.
2. Add support to select POWER10 libraries based on AT_PLATFORM.
3. Let submachine=power10 be set automatically.
* hurd/hurdselect.c: Include <sysdep-cancel.h>.
(_hurd_select): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/accept4.c: Include <sysdep-cancel.h>.
(__libc_accept4): Surround call to __socket_accept with enabling async cancel,
and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/connect.c: Include <sysdep-cancel.h>.
(__connect): Surround call to __file_name_lookup and __socket_connect
with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.
* sysdeps/mach/hurd/fdatasync.c: Include <sysdep-cancel.h>.
(fdatasync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/fsync.c: Include <sysdep-cancel.h>.
(fsync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/ioctl.c: Include <sysdep-cancel.h>.
(__ioctl): When request is TIOCDRAIN, surround call to send_rpc with enabling
async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_object_sync with enabling async cancel.
* sysdeps/mach/hurd/sigsuspend.c: Include <sysdep-cancel.h>.
(__sigsuspend): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/sigwait.c: Include <sysdep-cancel.h>.
(__sigwait): Surround wait code with enabling async cancel.
* sysdeps/mach/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_msync with enabling async cancel.
* sysdeps/mach/sleep.c: Include <sysdep-cancel.h>.
(__sleep): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/usleep.c: Include <sysdep-cancel.h>.
(usleep): Surround call to __vm_msync with enabling async cancel.
and add _nocancel variant.
* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add fcntl_nocancel.
* sysdeps/mach/hurd/fcntl.c [NOCANCEL]: Include <not-cancel.h>.
[!NOCANCEL]: Include <sysdep-cancel.h>.
(__libc_fcntl) [!NOCANCEL]: Surround __file_record_lock call with enabling async cancel, and use HURD_FD_PORT_USE_CANCEL instead of HURD_FD_PORT_USE.
* sysdeps/mach/hurd/fcntl_nocancel.c: New file, defines __fcntl_nocancel by including fcntl.c.
* sysdeps/mach/hurd/not-cancel.h (__fcntl64_nocancel): Replace macro with
__fcntl_nocancel declaration with hidden proto, and make
__fcntl64_nocancel call __fcntl_nocancel.
and add _nocancel variant.
* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add wait4_nocancel.
* sysdeps/mach/hurd/wait4.c: Include <sysdep-cancel.h>
(__wait4): Surround __proc_wait with enabling async cancel, and use
__USEPORT_CANCEL instead of __USEPORT.
* sysdeps/mach/hurd/wait4_nocancel.c: New file, contains previous
implementation of __wait4.
* sysdeps/mach/hurd/not-cancel.h (__waitpid_nocancel): Replace macro with
__wait4_nocancel declaration with hidden proto, and make
__waitpid_nocancel call __wait4_nocancel.
HURD_*PORT_USE link fd and port with a stack-stored structure, so on
thread cancel we need to cleanup this.
* hurd/fd-cleanup.c: New file.
* hurd/port-cleanup.c (_hurd_port_use_cleanup): New function.
* hurd/Makefile (routines): Add fd-cleanup.
* sysdeps/hurd/include/hurd.h (__USEPORT_CANCEL): New macro.
* sysdeps/hurd/include/hurd/fd.h (_hurd_fd_port_use_data): New
structure.
(_hurd_fd_port_use_cleanup): New prototype.
(HURD_DPORT_USE_CANCEL, HURD_FD_PORT_USE_CANCEL): New macros.
* sysdeps/hurd/include/hurd/port.h (_hurd_port_use_data): New structure.
(_hurd_port_use_cleanup): New prototype.
(HURD_PORT_USE_CANCEL): New macro.
* hurd/hurd/fd.h (HURD_FD_PORT_USE): Also refer to HURD_FD_PORT_USE_CANCEL.
* hurd/hurd.h (__USEPORT): Also refer to __USEPORT_CANCEL.
* hurd/hurd/port.h (HURD_PORT_USE): Also refer to HURD_PORT_USE_CANCEL.
* hurd/fd-read.c (_hurd_fd_read): Call HURD_FD_PORT_USE_CANCEL instead
of HURD_FD_PORT_USE.
* hurd/fd-write.c (_hurd_fd_write): Likewise.
* sysdeps/mach/hurd/send.c (__send): Call HURD_DPORT_USE_CANCEL instead
of HURD_DPORT_USE.
* sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.
* sysdeps/mach/hurd/sendto.c (__sendto): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Likewise.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Call __USEPORT_CANCEL
instead of __USEPORT, and HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.
This adds sysdeps/htl/libc-lock.h which augments sysdeps/mach/libc-lock.h with
the htl-aware cleanup handling. Otherwise inclusion of libc-lock.h
without libc-lockP.h would keep only the mach-aware handling.
This also fixes cleanup getting called when the binary is
statically-linked without libpthread.
* sysdeps/htl/libc-lockP.h (__libc_cleanup_region_start,
__libc_cleanup_end, __libc_cleanup_region_end,
__pthread_get_cleanup_stack): Move to...
* sysdeps/htl/libc-lock.h: ... new file.
(__libc_cleanup_region_start): Always set handler and arg.
(__libc_cleanup_end): Always call the cleanup handler.
(__libc_cleanup_push, __libc_cleanup_pop): New macros.
These only need exactly to use __libc_ptf_call.
* sysdeps/htl/flockfile.c: Include <libc-lockP.h> instead of
<libc-lock.h>
* sysdeps/htl/ftrylockfile.c: Include <libc-lockP.h> instead of
<errno.h>, <pthread.h>, <stdio-lock.h>
* sysdeps/htl/funlockfile.c: Include <libc-lockP.h> instead of
<pthread.h> and <stdio-lock.h>
Like hurd_thread_cancel does.
* sysdeps/mach/hurd/htl/pt-docancel.c: Include <hurd/signal.h>
(__pthread_do_cancel): Lock target thread's critical_section_lock and ss
lock around thread mangling.
Intel Advanced Matrix Extensions (Intel AMX) is a new programming
paradigm consisting of two components: a set of 2-dimensional registers
(tiles) representing sub-arrays from a larger 2-dimensional memory image,
and accelerators able to operate on tiles. Intel AMX is an extensible
architecture. New accelerators can be added and the existing accelerator
may be enhanced to provide higher performance. The initial features are
AMX-BF16, AMX-TILE and AMX-INT8, which are usable only if the operating
system supports both XTILECFG state and XTILEDATA state.
Add AMX-BF16, AMX-TILE and AMX-INT8 support to HAS_CPU_FEATURE and
CPU_FEATURE_USABLE.
It turned out that an 256b-mvc instruction which depends on the
result of a previous 256b-mvc instruction is counterproductive.
Therefore this patch adjusts the 256b-loop by storing the
first byte with stc and setting the remaining 255b with mvc.
Now the 255b-mvc instruction depends on the stc instruction.
This patch introduces an extra loop without pfd instructions
as it turned out that the pfd instructions are usefull
for copies >=64KB but are counterproductive for smaller copies.
* sysdeps/htl/sem-timedwait.c (struct cancel_ctx): Add cancel_wake
field.
(cancel_hook): When unblocking thread, set cancel_wake field to 1.
(__sem_timedwait_internal): Set cancel_wake field to 0 by default.
On cancellation exit, check whether we hold a token, to be put back.
By aligning its implementation on pthread_cond_wait.
* sysdeps/htl/sem-timedwait.c (cancel_ctx): New structure.
(cancel_hook): New function.
(__sem_timedwait_internal): Check for cancellation and register
cancellation hook that wakes the thread up, and check again for
cancellation on exit.
* nptl/tst-cancel13.c, nptl/tst-cancelx13.c: Move to...
* sysdeps/pthread/: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
Since __pthread_exit does not return, we do not need to indent the
noncancel path
* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Move cancelled path before non-cancelled path, to avoid "else"
indentation.
* nptl/tst-cancel25.c: Move to...
* sysdeps/pthread/tst-cancel25.c: ... here.
(tf2) Do not test for SIGCANCEL when it is not defined.
* nptl/Makefile: Move corresponding reference to...
* sysdeps/pthread/Makefile: ... here.
Linux commit ID ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 reserved 2 new
bits in AT_HWCAP2:
- PPC_FEATURE2_ARCH_3_1 indicates the availability of the POWER ISA
3.1;
- PPC_FEATURE2_MMA indicates the availability of the Matrix-Multiply
Assist facility.
Add support for MTE to strncmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Add support for MTE to strcmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Add support for MTE to strrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Add support for MTE to memrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Add support for MTE to memchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Gabor Kertesz <gabor.kertesz@arm.com>
Add support for MTE to strcpy. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
1. Divide architecture features into the usable features and the preferred
features. The usable features are for correctness and can be exported in
a stable ABI. The preferred features are for performance and only for
glibc internal use.
2. Change struct cpu_features to
struct cpu_features
{
struct cpu_features_basic basic;
unsigned int *usable_p;
struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
unsigned int usable[USABLE_FEATURE_INDEX_MAX];
unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
...
};
and initialize usable_p to pointer to the usable arary so that
struct cpu_features
{
struct cpu_features_basic basic;
unsigned int *usable_p;
struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
};
can be exported via a stable ABI. The cpuid and usable arrays can be
expanded with backward binary compatibility for both .o and .so files.
3. Add COMMON_CPUID_INDEX_7_ECX_1 for AVX512_BF16.
4. Detect ENQCMD, PKS, AVX512_VP2INTERSECT, MD_CLEAR, SERIALIZE, HYBRID,
TSXLDTRK, L1D_FLUSH, CORE_CAPABILITIES and AVX512_BF16.
5. Rename CAPABILITIES to ARCH_CAPABILITIES.
6. Check if AVX512_VP2INTERSECT, AVX512_BF16 and PKU are usable.
7. Update CPU feature detection test.
The -fno-math-errno is already added by default and the minimum
required GCC to build glibc (6.2) make the -ffinite-math-only
superflous.
Checked on aarch64-linux-gnu.
Checked with a build for riscv64-linux-gnu-rv64imac-lp64 (no
builtin support), riscv64-linux-gnu-rv64imafdc-lp64, and
riscv64-linux-gnu-rv64imafdc-lp64d.
The generic implementation is simplified by removing the
'optimization' for !_IEEE_FP_INEXACT (which does not handle
inexact neither some values).
Checked on alpha-linux-gnu.
The powerpc sqrt implementation is also simplified:
- the static constants are open coded within the implementation.
- for !USE_SQRT_BUILTIN the function is implemented directly on
__ieee754_sqrt (it avoid an superflous extra jump).
Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
The define is already set on the math-use-builtins-ceil.h, the patch
just removes the implementations (it was missed on c9feb1be93).
Checked on aarch64-linux-gnu.
Each symbol definitions are moved on a separated file and it
cover all symbol type definitions (float, double, long double,
and float128).
It allows to set support for architectures without the boiler
place of copying default values.
Checked with a build on the affected ABIs.
The generic implementation is slight worse (Itanium(R) Processor 9020):
Before new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 3.61582e+08,
"iterations": 2.384e+07,
"reciprocal-throughput": 14.8334,
"latency": 15.5006,
"max-throughput": 6.74153e+07,
"min-throughput": 6.45136e+07
}
}
With new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 3.85549e+08,
"iterations": 2.384e+07,
"reciprocal-throughput": 15.8391,
"latency": 16.5056,
"max-throughput": 6.31348e+07,
"min-throughput": 6.05857e+07
}
}
However it fixes all the issues on both:
math/test-float-exp10
math/test-float32-exp10
(all the issues wrong results for non default rounding modes).
The existing ia64 libm interface uses matherrf and matherrl in addition
to matherr for SVID error handling. However, there is no such error
handling support for exp10f in ia64 libm. So replacing it with the
generic implementation should be fine.
Checked on ia64-linux-gnu.
This patch changes the exp10f error handling semantics to only set
errno according to POSIX rules. New symbol version is introduced at
GLIBC_2.32. The old wrappers are kept for compat symbols.
There are some outliers that need special handling:
- ia64 provides an optimized implementation of exp10f that uses ia64
specific routines to set SVID compatibility. The new symbol version
is aliased to the exp10f one.
- m68k also provides an optimized implementation, and the new version
uses it instead of the sysdeps/ieee754/flt32 one.
- riscv and csky uses the generic template implementation that
does not provide SVID support. For both cases a new exp10f
version is not added, but rather the symbols version of the
generic sysdeps/ieee754/flt32 is adjusted instead.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
powerpc64le-linux-gnu.
It is inspired by expf and reuses its tables and internal functions.
The error checks are inlined and errno setting is in separate tail
called functions, but the wrappers are kept in this patch to handle
the _LIB_VERSION==_SVID_ case.
Double precision arithmetics is used which is expected to be faster on
most targets (including soft-float) than using single precision and it
is easier to get good precision result with it.
Result for x86_64 (i7-4790K CPU @ 4.00GHz) are:
Before new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 4.0414e+09,
"iterations": 1.00128e+08,
"reciprocal-throughput": 26.6818,
"latency": 54.043,
"max-throughput": 3.74787e+07,
"min-throughput": 1.85038e+07
}
With new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 4.11951e+09,
"iterations": 1.23968e+08,
"reciprocal-throughput": 21.0581,
"latency": 45.4028,
"max-throughput": 4.74876e+07,
"min-throughput": 2.20251e+07
}
Result for aarch64 (A72 @ 2GHz) are:
Before new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 4.62362e+09,
"iterations": 3.3376e+07,
"reciprocal-throughput": 127.698,
"latency": 149.365,
"max-throughput": 7.831e+06,
"min-throughput": 6.69501e+06
}
With new code:
"exp10f": {
"workload-spec2017.wrf (adapted)": {
"duration": 4.29108e+09,
"iterations": 6.6752e+07,
"reciprocal-throughput": 51.2111,
"latency": 77.3568,
"max-throughput": 1.9527e+07,
"min-throughput": 1.29271e+07
}
Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu,
and sparc64-linux-gnu.
strcmp-avx2.S: In avx2 strncmp function, strings are compared in
chunks of 4 vector size(i.e. 32x4=128 byte for avx2). After first 4
vector size comparison, code must check whether it already passed
the given offset. This patch implement avx2 offset check condition
for strncmp function, if both string compare same for first 4 vector
size.
Linux 5.7 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 5.7.
Tested with build-many-glibcs.py.
This came to light when adding hard-flaot support to ARC glibc port
without hardware sqrt support causing glibc build to fail:
| ../sysdeps/ieee754/dbl-64/e_sqrt.c: In function '__ieee754_sqrt':
| ../sysdeps/ieee754/dbl-64/e_sqrt.c:58:54: error: unused variable 'ty' [-Werror=unused-variable]
| double y, t, del, res, res1, hy, z, zz, p, hx, tx, ty, s;
The reason being EMULV() macro uses the hardware provided
__builtin_fma() variant, leaving temporary variables 'p, hx, tx, hy, ty'
unused hence compiler warning and ensuing error.
The intent of the patch was to fix that error, but EMULV is pervasive
and used fair bit indirectly via othe rmacros, hence this patch.
Functionally it should not result in code gen changes and if at all
those would be better since the scope of those temporaries is greatly
reduced now
Built tested with aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf hppa-linux-gnu x86_64-linux-gnu arm-linux-gnueabihf riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64 powerpc-linux-gnu microblaze-linux-gnu nios2-linux-gnu hppa-linux-gnu
Also as suggested by Joseph [1] used --strip and compared the libs with
and w/o patch and they are byte-for-byte unchanged (with gcc 9).
| for i in `find . -name libm-2.31.9000.so`;
| do
| echo $i; diff $i /SCRATCH/vgupta/gnu2/install/glibcs/$i ; echo $?;
| done
| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so
[1] https://sourceware.org/pipermail/libc-alpha/2019-November/108267.html
* sysdeps/mach/hurd/Makefile [subdir=misc] (sysdep_routines): Add
writev_nocancel writev_nocancel_nostatus.
* sysdeps/mach/hurd/not-cancel.h (__writev_nocancel_nostatus): Replace
macro with function declaration (with hidden prototype in libc).
(__writev_nocancel): New function declaration (with hidden prototype in libc).
* sysdeps/mach/hurd/writev_nocancel_nostatus.c: New file.
* sysdeps/posix/writev_nocancel.c: New file, includes writev.c to make a
nocancel variant that calls __write_nocancel.
* sysdeps/posix/writev.c (writev): Do not define alias if __writev is
renamed.
and add _nocancel variants.
* sysdeps/mach/hurd/write.c (__libc_write): Call __write_nocancel
surrounded by enabling async cancel, to replace implementation moved
to...
* sysdeps/mach/hurd/write_nocancel.c (__write_nocancel): ... here.
* sysdeps/mach/hurd/pwrite64.c (__libc_pwrite64): Call
__pwrite64_nocancel surrounded by enabling async cancel, to replace
implementation moved to...
* sysdeps/mach/hurd/pwrite64_nocancel.c (__pwrite64_nocancel): ... here.
* sysdeps/mach/hurd/Makefile (sysdep_routines): Add write_nocancel and
pwrite64_nocancel.
* sysdeps/mach/hurd/not-cancel.h (__write_nocancel,
__pwrite64_nocancel): Replace macros with prototypes with a hidden proto on
libc.
* sysdeps/mach/hurd/dl-sysdep.c (__write_nocancel): New alias, check
that it is not hidden.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __write_nocancel.
(ld.GLIBC_PRIVATE): Add __write_nocancel.
* sysdeps/mach/hurd/i386/localplt.data (__write_nocancel): Add
reference.
* sysdeps/htl/stdio-lock.h: New file, registers locking cleanup to htl.
* sysdeps/htl/libc-lockP.h: Include <libc-lock.h>.
(__libc_cleanup_region_start, __libc_cleanup_end,
__libc_cleanup_region_end): Override macros from <libc-lock.h> with
versions which register cleanup to htl.
(__pthread_get_cleanup_stack): Make reference weak for skipping
registration on in the static non-libpthread case.
* sysdeps/mach/hurd/recv.c (__recv): Make the __socket_recv call
cancellable.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Make the __socket_recv and
__socket_whatis_address calls cancellable.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Make the __socket_recv,
__socket_whatis_address, __io_reauthenticate, and __auth_user_authenticate calls
cancellable.
Added a check to detect the CPU value in preconfigure, so that glibc is
built with the correct --with-cpu value. And move existing checks into
preconfigure.ac.
Co-Authored-By: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Co-Authored-By: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Introduce an Arm MTE compatible strlen implementation.
The existing implementation assumes that any access to the pages in
which the string resides is safe. This assumption is not true when
MTE is enabled. This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance on modern cores. On cores with less
efficient Advanced SIMD implementation such as Cortex-A53 it can
be slower.
Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Introduce an Arm MTE compatible strchr implementation.
The existing implementation assumes that any access to the pages in
which the string resides is safe. This assumption is not true when
MTE is enabled. This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.
Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Introduce an Arm MTE compatible strchrnul implementation.
The existing implementation assumes that any access to the pages in
which the string resides is safe. This assumption is not true when
MTE is enabled. This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.
Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
Falkor's memcpy and memmove share some implementation details,
therefore, the two routines are moved to a single source file
for code reuse.
The two routines now share code for small and medium copies
(up to and including 128 bytes). Large copies in memcpy do not
handle overlap correctly, consequently, the loops for
moving/copying more than 128 bytes stay separate for memcpy
and memmove.
To increase code reuse a number of small modifications were made:
1. The old implementation of memcpy copied the first 16-bytes as
soon as the size of data was determined to be greater than 32 bytes.
For memcpy code to also work when copying small/medium overlapping
data, the first load and store was moved to the large copy case.
2. Medium memcpy case no longer assumes that 16 bytes were already
copied and uses 8 registers to copy up to 128 bytes.
3. Small case for memmove was enlarged to that of memcpy, which is
less than or equal to 32 bytes.
4. Medium case for memmove was enlarged to that of memcpy, which is
less than or equal to 128 bytes.
Other changes include:
1. Improve alignment of existing loop bodies.
2. 'Delouse' memmove and memcpy input arguments. Make sure that
upper 32-bits of input registers are zeroed if unused.
3. Do one more iteration in memmove loops and reduce the number of
copies made from the start/end of the buffer, depending on
the direction of the memmove loop.
Benchmarking:
Looking at the results from bench-memcpy-random.out, we can see that
now memmove_falkor is about 5% faster than memcpy_falkor_old, while
memmove_falkor_old was more than 15% slower. The memcpy implementation
remained largely unmodified, so there is no significant performance
change.
The reason for such a significant memmove performance gain is the
increase of the upper bound on the small copy case to 32 bytes and
the increase of the upper bound on the medium copy case to 128 bytes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
d6d74ec16 ('htl: Enable more tests') moved the linking rules from
nptl/Makefile and htl/Makefile to the shared sysdeps/pthread/Makefile. But
e.g. on powerpc some tests are added in sysdeps/powerpc/Makefile, which is
included *after* sysdeps/pthread/Makefile, and thus the tests don't get
affected by the rules and fail to link. For now let's just copy over the
set of rules in both nptl/Makefile and htl/Makefile.
* sysdeps/pthread/Makefile: Move libpthread linking rules to...
* htl/Makefile: ... here and...
* nptl/Makefile: ... there.
We really need modules to use their own pthread_atfork so that
__dso_handle properly identifies them.
* sysdeps/htl/pt-atfork.c (__pthread_atfork): Hide function.
(pthread_atfork): Hide alias.
* sysdeps/htl/old_pt-atfork.c (pthread_atfork): Rename macro to
__pthread_atfork to fix building the compatibility alias.
* sysdeps/htl/pthreadP.h: Include <link.h>
(__pthread_init_static_tls): New prototype.
* htl/pt-alloc.c (__pthread_init_static_tls): New function.
* sysdeps/mach/hurd/htl/pt-sysdep.c (_init_routine): Initialize tcb
field of initial thread. Set GL(dl_init_static_tls) to
&__pthread_init_static_tls.
and add _nocancel variants.
* sysdeps/mach/hurd/pread64.c (__libc_pread64): Call __pread64_nocancel
surrounded by enabling async cancel, to replace implementation moved to...
* sysdeps/mach/hurd/pread64_nocancel.c (__pread64_nocancel): ... here.
* sysdeps/mach/hurd/read.c (__libc_read): Call __read_nocancel surrounded by
enabling async cancel, to replace implementation moved to...
* sysdeps/mach/hurd/read_nocancel.c (__read_nocancel): ... here.
* sysdeps/mach/hurd/Makefile (sysdep_routines): Add read_nocancel and
pread64_nocancel.
* sysdeps/mach/hurd/not-cancel.h (__read_nocancel, __pread64_nocancel):
Replace macros with prototypes with a hidden proto on libc.
* sysdeps/mach/hurd/dl-sysdep.c: Include <not-cancel.h>.
(__pread64_nocancel): New alias, check that it is not hidden.
(__read_nocancel): New alias, check that it is not hidden.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __read_nocancel and
__pread64_nocancel.
(ld.GLIBC_2.1): Add __pread64.
(ld.GLIBC_PRIVATE): Add __read_nocancel and __pread64_nocancel.
* sysdeps/mach/hurd/i386/ld.abilist (__pread64): Add symbol.
* sysdeps/mach/hurd/i386/localplt.data (__read_nocancel, __pread64,
__pread64_nocancel): Add references.
* sysdeps/i386/htl/Makefile: New file.
* sysdeps/i386/htl/tcb-offsets.sym: New file.
* sysdeps/mach/hurd/i386/Makefile [setjmp] (gen-as-const-headers): Add
signal-defines.sym.
* sysdeps/mach/hurd/i386/____longjmp_chk.S: Include tcb-offsets.h.
(____longjmp_chk): Harmonize with i386's __longjmp. Clear SS_ONSTACK
when jumping off the alternate stack.
* sysdeps/mach/hurd/i386/__longjmp.S: New file.
The existing macros are fragile and expect local variables with a
certain name. Fix this by defining them as functions with default
implementation in a new header dl-runtime.h which arches can override
if need be.
This came up during ARC port review, hence the need for argument pltgot
in reloc_index() which is not needed by existing ports.
This patch potentially only affects hppa/x86 ports,
build tested for both those configs and a few more.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This started as a trivial change to Anton's rawmemchr. I got
carried away. This is a hybrid between P8's asympotically
faster 64B checks with extremely efficient small string checks
e.g <64B (and sometimes a little bit more depending on alignment).
The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This allieviates the need to check page boundaries.
Finally, explicly use the P7 strlen with the runtime loader when building
P9. We need to be cautious about vector/vsx extensions here on P9 only
builds.
This defines the macro such that it should behave best on all
supported powerpc targets. Likewise, this allows us to remove the
ppc64le specific s_fmaf128.c.
I have verified powerpc64le multiarch and powerpc64le power9
no-multiarch builds continue to generate optimize fmaf128.
commit e9698175b0
Author: Lukasz Majewski <lukma@denx.de>
Date: Mon Mar 16 08:31:41 2020 +0100
y2038: Replace __clock_gettime with __clock_gettime64
breaks benchtests with sysdeps/generic/hp-timing.h:
In file included from ./bench-timing.h:23,
from ./bench-skeleton.c:25,
from
/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/benchtests/bench-rint.c:45:
./bench-skeleton.c: In function ‘main’:
../sysdeps/generic/hp-timing.h:37:23: error: storage size of ‘tv’ isn’t known
37 | struct __timespec64 tv; \
| ^~
Define HP_TIMING_NOW with clock_gettime in sysdeps/generic/hp-timing.h
if _ISOMAC is defined. Don't define __clock_gettime in bench-timing.h
since it is no longer needed.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The build uses an undefined macro evaluation for fmaf128 build.
For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0.
Checked with a build for:
powerpc64le-linux-gnu-power9-disable-multi-arch
powerpc64le-linux-gnu-power9
powerpc64le-linux-gnu
powerpc64-linux-gnu-power8
powerpc64-linux-gnu
powerpc-linux-gnu-power4
powerpc-linux-gnu
timer_create needs to create threads with all signals blocked,
including SIGTIMER (which happens to equal SIGCANCEL).
Fixes commit b3cae39dcb ("nptl: Start
new threads with all signals blocked [BZ #25098]").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This introduces the function __pthread_attr_extension to allocate the
extension space, which is freed by pthread_attr_destroy.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This allows to reuse the storage after calling pthread_cond_destroy.
* sysdeps/htl/bits/types/struct___pthread_cond.h (__pthread_cond):
Replace unused struct __pthread_condimpl *__impl field with unsigned int
__wrefs.
(__PTHREAD_COND_INITIALIZER): Update accordingly.
* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Register as waiter in __wrefs field. On unregistering, wake any pending
pthread_cond_destroy.
* sysdeps/htl/pt-cond-destroy.c (__pthread_cond_destroy): Register wake
request in __wrefs.
* nptl/Makefile (tests): Move tst-cond20 tst-cond21 to...
* sysdeps/pthread/Makefile (tests): ... here.
* nptl/tst-cond20.c nptl/tst-cond21.c: Move to...
* sysdeps/pthread/tst-cond20.c sysdeps/pthread/tst-cond21.c: ... here.
Linux overrides this file via sysdeps/unix/sysv/linux/i386/sysdep.c.
Hurd does not have sysdeps/unix/i386 on its search path, so it uses
csu/sysdep.c instead.
_hurdsig_preemptors and _hurdsig_preempted_set are not ABI symbols,
so do not declare them. HURD_PREEMPT_SIGNAL_P is an implementation
detail, so move it as well.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
This fixes various build errors due to deprecation warnings.
Fixes commit 02802fafcf
("signal: Deprecate additional legacy signal handling functions").
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
This change makes it easier to set a breakpoint on these calls.
This also addresses the issue that including <ldsodefs.h> without
<unistd.h> does not result usable _dl_*printf macros because of the
use of the STD*_FILENO macros there.
(The private symbol for _dl_fatal_printf will go away again
once the exception handling implementation is unified between
libc and ld.so.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Also add the private type union pthread_attr_transparent, to reduce
the amount of casting that is required.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This is part of the libpthread removal project:
<https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>
Use __getline instead of __getdelim to avoid a localplt failure.
Likewise for __getrlimit/getrlimit.
The abilist updates were performed by:
git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
| while read x ; do
echo "GLIBC_2.32 pthread_getattr_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py --only-linux pthread_getattr_np
The private export of __pthread_getaffinity_np is no longer needed, but
the hidden alias still necessary so that the symbol can be exported with
versioned_symbol.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This is part of the libpthread removal project:
<https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>
The abilist updates were performed by:
git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
| while read x ; do
echo "GLIBC_2.32 pthread_getaffinity_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py pthread_getaffinity_np
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This is part of the libpthread removal project:
<https://sourceware.org/ml/libc-alpha/2019-10/msg00080.html>
The symbol did not previously exist in libc, so a new GLIBC_2.32
symbol is needed, to get correct dependency for binaries which
use the symbol but no longer link against libpthread.
The abilist updates were performed by:
git ls-files 'sysdeps/unix/sysv/linux/**/libc.abilist' \
| while read x ; do
echo "GLIBC_2.32 pthread_attr_setaffinity_np F" >> $x
done
python3 scripts/move-symbol-to-libc.py pthread_attr_setaffinity_np
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The stubs for pthread_getaffinity_np, pthread_getname_np,
pthread_setaffinity_np, pthread_setname_np are replaced, and corresponding
tests are moved.
After the removal of the NaCl port, nptl is Linux-specific, and the stubs
are no longer needed. This effectively reverts commit
c76d1ff514 ("NPTL: Add stubs for Linux-only
extension functions.").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This fixes a build error:
../sysdeps/unix/sysv/linux/ntp_gettime.c: In function ‘__ntp_gettime’:
../sysdeps/unix/sysv/linux/ntp_gettime.c:56:10: error: ‘ntv64.tai’ is used uninitialized in this function [-Werror=uninitialized]
56 | *ntv = valid_ntptimeval64_to_ntptimeval (ntv64);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The __clock_gettime internal function is not supporting 64 bit time on
architectures with __WORDSIZE == 32 and __TIMESIZE != 64 (like e.g. ARM 32
bit).
The __clock_gettime64 function shall be used instead in the glibc itself as
it supports 64 bit time on those systems.
This patch does not bring any changes to systems with __WORDSIZE == 64 as
for them the __clock_gettime64 is aliased to __clock_gettime (in
./include/time.h).
This patch provides new __ntp_gettimex64 explicit 64 bit function for getting
time parameters via NTP interface.
The call to __adjtimex in __ntp_gettime64 function has been replaced with
direct call to __clock_adjtime64 syscall, to simplify the code.
Moreover, a 32 bit version - __ntp_gettimex has been refactored to internally
use __ntp_gettimex64.
The __ntp_gettimex is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
ntptimeval and 64 bit struct __ntptimeval64.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __ntp_gettimex64 and __ntp_gettimex.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch provides new __ntp_gettime64 explicit 64 bit function for getting
time parameters via NTP interface.
Internally, the __clock_adjtime64 syscall is used instead of __adjtimex. This
patch is necessary for having architectures with __WORDSIZE == 32 Y2038 safe.
Moreover, a 32 bit version - __ntp_gettime has been refactored to internally
use __ntp_gettime64.
The __ntp_gettime is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
ntptimeval and 64 bit struct __ntptimeval64.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __ntp_gettime64 and __ntp_gettime.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Those functions allow easy conversion between Y2038 safe, glibc internal
struct __ntptimeval64 and struct ntptimeval.
The reserved fields (i.e. __glibc_reserved{1234}) during conversion are
zeroed as well, to provide behavior similar to one in ntp_gettimex function
(where those are cleared before the struct ntptimeval is returned).
Those functions are put in Linux specific sys/timex.h file, as putting
them into glibc's local include/time.h would cause build break on HURD as
it doesn't support struct timex related syscalls.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This type is a glibc's "internal" type to get time parameters data from
Linux kernel (NTP daemon interface). It stores time in struct __timeval64
rather than struct timeval, which makes it Y2038-proof.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch provides new __adjtime64 explicit 64 bit function for adjusting
Linux kernel clock.
Internally, the __clock_adjtime64 syscall is used instead of __adjtimex. This
patch is necessary for having architectures with __WORDSIZE == 32 Y2038 safe.
Moreover, a 32 bit version - __adjtime has been refactored to internally use
__adjtime64.
The __adjtime is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
timeval and 64 bit struct __timeval64.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both __adjtime64 and __adjtime.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch provides new ___adjtimex64 explicit 64 bit function for adjusting
Linux kernel clock.
Internally, the __clock_adjtime64 syscall is used. This patch is necessary
for having architectures with __WORDSIZE == 32 Y2038 safe.
Moreover, a 32 bit version - ___adjtimex has been refactored to internally
use ___adjtimex64.
The ___adjtimex is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between struct
timex and 64 bit struct __timex64.
Last but not least, in ___adjtimex64 function the __clock_adjtime syscall has
been replaced with __clock_adjtime64 to support 64 bit time on architectures
with __WORDSIZE == 32 and __TIMESIZE != 64.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without to
test the proper usage of both ___adjtimex64 and ___adjtimex.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch replaces auto generated wrapper (as described in
sysdeps/unix/sysv/linux/syscalls.list) for clock_adjtime with one which adds
extra support for reading 64 bit time values on machines with __TIMESIZE != 64.
To achieve this goal new __clock_adjtime64 explicit 64 bit function for
adjusting Linux clock has been added.
Moreover, a 32 bit version - __clock_adjtime has been refactored to internally
use __clock_adjtime64.
The __clock_adjtime is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversions between 64 bit
struct __timespec64 and struct timespec.
The new __clock_adjtime64 syscall available from Linux 5.1+ has been used, when
applicable.
Up till v5.4 in the Linux kernel there was a bug preventing this call from
obtaining correct struct's timex time.tv_sec time after time_t overflow
(i.e. not being Y2038 safe).
Build tests:
- ./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Linux kernel, headers and minimal kernel version for glibc build test matrix:
- Linux v5.1 (with clock_adjtime64) and glibc build with v5.1 as
minimal kernel version (--enable-kernel="5.1.0")
The __ASSUME_TIME64_SYSCALLS flag defined.
- Linux v5.1 and default minimal kernel version
The __ASSUME_TIME64_SYSCALLS not defined, but kernel supports clock_adjtime64
syscall.
- Linux v4.19 (no clock_adjtime64 support) with default minimal kernel version
for contemporary glibc (3.2.0)
This kernel doesn't support clock_adjtime64 syscall, so the fallback to
clock_adjtime is tested.
Above tests were performed with Y2038 redirection applied as well as without
(so the __TIMESIZE != 64 execution path is checked as well).
No regressions were observed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This version uses vector instructions and is up to 60% faster on medium
matches and up to 90% faster on long matches, compared to the POWER7
version. A few examples:
__rawmemchr_power9 __rawmemchr_power7
Length 32, alignment 0: 2.27566 3.77765
Length 64, alignment 2: 2.46231 3.51064
Length 1024, alignment 0: 17.3059 32.6678
When CET is enabled, it is an error to dlopen a non CET enabled shared
library in CET enabled application. It may be desirable to make CET
permissive, that is disable CET when dlopening a non CET enabled shared
library. With the new --enable-cet=permissive configure option, CET is
disabled when dlopening a non CET enabled shared library.
Add DEFAULT_DL_X86_CET_CONTROL to config.h.in:
/* The default value of x86 CET control. */
#define DEFAULT_DL_X86_CET_CONTROL cet_elf_property
which enables CET features based on ELF property note.
--enable-cet=permissive it to
/* The default value of x86 CET control. */
#define DEFAULT_DL_X86_CET_CONTROL cet_permissive
which enables CET features permissively.
Update tst-cet-legacy-5a, tst-cet-legacy-5b, tst-cet-legacy-6a and
tst-cet-legacy-6b to check --enable-cet and --enable-cet=permissive.
This was originally added to support binutils older than version
2.22:
<https://sourceware.org/ml/libc-alpha/2010-12/msg00051.html>
Since 2.22 is older than the minimum required binutils version
for building glibc, we no longer need this. (The changes do
not impact the statically linked startup code.)
Add stpcpy support to the POWER9 strcpy. This is up to 40% faster on
small strings and up to 90% faster on long relatively unaligned strings,
compared to the POWER8 version. A few examples:
__stpcpy_power9 __stpcpy_power8
Length 20, alignments in bytes 4/ 4: 2.58246 4.8788
Length 1024, alignments in bytes 1/ 6: 24.8186 47.8528
This version uses VSX store vector with length instructions and is
significantly faster on small strings and relatively unaligned large
strings, compared to the POWER8 version. A few examples:
__strcpy_power9 __strcpy_power8
Length 16, alignments in bytes 0/ 0: 2.52454 4.62695
Length 412, alignments in bytes 4/ 0: 11.6 22.9185
1. Include <dl-procruntime.c> to get architecture specific initializer in
rtld_global.
2. Change _dl_x86_feature_1[2] to _dl_x86_feature_1.
3. Add _dl_x86_feature_control after _dl_x86_feature_1, which is a
struct of 2 bitfields for IBT and SHSTK control
This fixes [BZ #25887].
The getcpu cache was removed from the kernel in Linux 2.6.24. glibc
support from the sched_getcpu implementation was removed in commit
dd26c44403 ("Consolidate sched_getcpu").
This patch fixes the optimized implementation of strcpy and strnlen
on a big-endian arm64 machine.
The optimized method uses neon, which can process 128bit with one
instruction. On a big-endian machine, the bit order should be reversed
for the whole 128-bits double word. But with instuction
rev64 datav.16b, datav.16b
it reverses 64bits in the two halves rather than reversing 128bits.
There is no such instruction as rev128 to reverse the 128bits, but we
can fix this by loading the data registers accordingly.
Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
2911cb68ed3d("aarch64: Optimized implementation of strnlen").
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
After using "make update-syscall-lists" to update arch-syscall.h for
new kernel versions, sysd-syscalls will not be not be regenerated.
This will cause a compile error because the new data is not being
picked up.
Fixes commit a1bd5f8673
("Linux: Use system call tables during build").
Reviewed-by: Florian Weimer <fweimer@redhat.com>
When using outline atomics (-moutline-atomics, the default for ARMv8-A
starting with GCC 10), libgcc contains an ELF constructor which calls
__getauxval. This code is built outside of glibc, so none of its
internal PLT avoidance schemes can be applied to it. This change
suppresses the elf/check-localplt failure.
The script can now be called to query the definition status of
system call numbers across all architectures, like this:
$ python3 sysdeps/unix/sysv/linux/glibcsyscalls.py query-syscall sync_file_range sync_file_range2
sync_file_range:
defined: aarch64 alpha csky hppa i386 ia64 m68k microblaze mips/mips32 mips/mips64/n32 mips/mips64/n64 nios2 riscv/rv64 s390/s390-32 s390/s390-64 sh sparc/sparc32 sparc/sparc64 x86_64/64 x86_64/x32
undefined: arm powerpc/powerpc32 powerpc/powerpc64
sync_file_range2:
defined: arm powerpc/powerpc32 powerpc/powerpc64
undefined: aarch64 alpha csky hppa i386 ia64 m68k microblaze mips/mips32 mips/mips64/n32 mips/mips64/n64 nios2 riscv/rv64 s390/s390-32 s390/s390-64 sh sparc/sparc32 sparc/sparc64 x86_64/64 x86_64/x32
This command lists the headers containing the system call numbers:
$ python3 sysdeps/unix/sysv/linux/glibcsyscalls.py list-headers
The argument parser code is based on a suggestion from Adhemerval Zanella.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since __x86_shared_non_temporal_threshold is defined as
long int __x86_shared_non_temporal_threshold;
and long int is 4 bytes for x32, use RDX_LP to compare against
__x86_shared_non_temporal_threshold in assembly code.
Only alpha and ia64 do not support __NR_umount2 (defined as
__NR_umount), but recent kernel fixes (74cd2184833f for ia64, and
12b57c5c70f39 for alpha) add the required alias.
Checked with a build against all affected ABIs.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This consolidates the copy-pasted arch specific semaphore header into
single version (based on s390) which suffices 32-bit and and 64-bit
arch/ABI based on the canonical WORDSIZE.
For now I've left out arches which use alternate defines to choose for
32 vs 64-bit builds (aarch64, mips) which in theory can also use the same
header.
Passes build-many for
aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf
riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64
x86_64-linux-gnu microblaze-linux-gnu nios2-linux-gnu
Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Those functions allow easy conversion between Y2038 safe, glibc internal
struct __timex64 and struct timex.
Those functions are put in Linux specific sys/timex.h file, as putting
them into glibc's local include/time.h would cause build break on HURD as
it doesn't support struct timex related syscalls.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The introduced glibc's 'internal' struct __timex64 is a copy of Linux kernel's
struct __kernel_timex (v5.6) introduced for properly handling data for
clock_adjtime64 syscall.
As the struct's __kernel_timex size is the same as for archs with
__WORDSIZE == 64, proper padding and data types conversion (i.e. long to long
long) had to be added for architectures with __WORDSIZE == 32 &&
__TIMESIZE != 64.
Moreover, it stores time in struct __timeval64 rather than struct
timeval, which makes it Y2038-proof.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For Linux glibc ports the __TIMESIZE == 64 ensures proper aliasing for
__clock_gettime64 (to __clock_gettime).
When __TIMESIZE != 64 (like ARM32, PPC) the glibc expects separate definition
of the __clock_gettime64.
The HURD port only provides __clock_gettime, so this patch adds
__clock_gettime64 as a tiny wrapper on it.
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
strcmp is used while resolving PLT references. Vector registers
should not be used during this. The P9 strcmp makes heavy use of
vector registers, so it should be avoided in rtld.
This prevents quiet vector register corruption when glibc is configured
with --disable-multi-arch and --with-cpu=power9. This can be seen with
test-float64x-compat_totalordermag during the first call into
totalordermagf64x@GLIBC_2.27.
Add a guard to fallback to the power8 implementation when building
power9 strcmp for libraries other than libc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The minimum GCC version has been raised to 6.2 for building
glibc. Therefore, follow the advice inside the implementation
and remove the GCC < 6 codepath.
Likewise, remove the hidden_proto as all internal usages should
inline now.
Commit a98dc92dd1 ("x86: Add cache
information support for Zhaoxin processors") introduced an unused
variable warning in the default i686-linux-gnu build:
In file included from ../sysdeps/i386/cacheinfo.c:3:
../sysdeps/x86/cacheinfo.c: In function 'init_cacheinfo':
../sysdeps/x86/cacheinfo.c:762:16: error: unused variable 'eax' [-Werror=unused-variable]
762 | unsigned int eax;
| ^~~
Add a C wrapper to pass arguments in
/* Control process execution. */
extern int prctl (int __option, ...) __THROW;
to prctl syscall:
extern int prctl (int, unsigned long int, unsigned long int,
unsigned long int, unsigned long int);
On platforms where long double may have two different formats, i.e.: the
same format as double (64-bits) or something else (128-bits), building
with -mlong-double-128 is the default and function calls in the user
program match the name of the function in Glibc. When building with
-mlong-double-64, Glibc installed headers redirect such calls to the
appropriate function.
Likewise, the internals of glibc are now built against IEEE long double.
However, the only (minimally) notable usage of long double is difftime.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
GCC 7.5.0 (PR94200) will refuse to compile if both -mabi=% and
-mlong-double-128 are passed on the command line. Surprisingly,
it will work happily if the latter is not. For the sake of
maintaining status quo, test for and blacklist such compilers.
Tested with a GCC 8.3.1 and GCC 7.5.0 compiler for ppc64le.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
This is a small step up from 2.25 which brings in support for
rewriting the .gnu.attributes section of libc/libm.so.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Add compiler feature tests to ensure we can build ieee128 long double.
These test for -mabi=ieeelongdouble, -mno-gnu-attribute, and -Wno-psabi.
Likewise, verify some compiler bugs have been addressed. These aren't
helpful for building glibc, but may cause test failures when testing
the new long double. See notes below from Raji.
On powerpc64le, some older compiler versions give error for the function
signbit() for 128-bit floating point types. This is fixed by PR83862
in gcc 8.0 and backported to gcc6 and gcc7. This patch adds a test
to check compiler version to avoid compiler errors during make check.
Likewise, test for -mno-gnu-attribute support which was
On powerpc64le, a few files are built on IEEE long double mode
(-mabi=ieeelongdouble), whereas most are built on IBM long double mode
(-mabi=ibmlongdouble, the default for -mlong-double-128). Since binutils
2.31, linking object files with different long double modes causes
errors similar to:
ld: libc_pic.a(s_isinfl.os) uses IBM long double,
libc_pic.a(ieee128-qefgcvt.os) uses IEEE long double.
collect2: error: ld returned 1 exit status
make[2]: *** [../Makerules:649: libc_pic.os] Error 1
The warnings are fair and correct, but in order for glibc to have
support for both long double modes on powerpc64le, they have to be
ignored. This can be accomplished with the use of -mno-gnu-attribute
option when building the few files that require IEEE long double mode.
However, -mno-gnu-attribute is not available in GCC 6, the minimum
version required to build glibc, so this patch adds a test for this
feature in powerpc64le builds, and fails early if it's not available.
Co-Authored-By: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
Co-Authored-By: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Improve the commentary to aid future developers who will stumble
upon this novel, yet not always perfect, mechanism to support
alternative formats for long double.
Likewise, rename __LONG_DOUBLE_USES_FLOAT128 to
__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI now that development work
has settled down. The command used was
git grep -l __LONG_DOUBLE_USES_FLOAT128 ':!./ChangeLog*' | \
xargs sed -i 's/__LONG_DOUBLE_USES_FLOAT128/__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI/g'
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
To obtain Zhaoxin CPU cache information, add a new function
handle_zhaoxin().
Add a new function get_common_cache_info() that extracts the code
in init_cacheinfo() to get the value of the variable shared, threads.
Add Zhaoxin branch in init_cacheinfo() for initializing variables,
such as __x86_shared_cache_size.
Since the the U marker can only be applied to 2 unsigned long arguments
in syscalls.list files, add a C wrapper for process_vm_readv and
process_vm_writev syscals which have more than 2 unsigned long arguments.
Update the default typesizes.h to match the new kernel sizes for 32-bit
architectures with a 64-bit time_t and friends. This follows the sizes
used for RV32 which is a y2038 safe architecture added after Linux 5.1.
Reviewed-by: Vineet Gupta <vgupta@synopsys.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>
Remove the sem-pad.h file and instead have architectures override the
struct semid_ds via the bits/types/struct_semid_ds.h file.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Split out the struct semid_ds into it's own file. This will allow us to
have architectures specify their own version.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Mark unsigned long arguments in mmap, read, recv, recvfrom, send, sendto,
write, ioperm, sendfile64, setxattr, lsetxattr, fsetxattr, getxattr,
lgetxattr, fgetxattr, listxattr, llistxattr and flistxattr with U in
syscalls.list files.
X32 has 32-bit long and pointer with 64-bit off_t. Since x32 psABI
requires that pointers passed in registers must be zero-extended to
64bit, x32 can share many syscall interfaces with LP64. When a LP64
syscall with long and unsigned long int arguments is used for x32, these
arguments must be properly extended to 64-bit. Otherwise if the upper
32 bits of the register have undefined value, such a syscall will be
rejected by kernel.
For syscalls implemented in assembly codes, 'U' is added to syscall
signature key letters for unsigned long, which is zero-extended to
64-bit types. SYSCALL_ULONG_ARG_1 and SYSCALL_ULONG_ARG_2 are passed
to syscall-template.S for the first and the second unsigned long int
arguments if PSEUDOS_HAVE_ULONG_INDICES is defined. They are used by
x32 to zero-extend 32-bit arguments to 64 bits.
Tested on i386, x86-64 and x32 as well as with build-many-glibcs.py.
This change should not have an effect because the system call was
never defined. Also add the misssing attribute_compat_text_section
attribute to the sstk function (a minor optimization). Also update the
NEWS file to document the change.
Fixes commit 9cc93ba097
("misc: Turn sstk into a compat symbol").
This patch removes the IEEE_DOUBLE_BIG_ENDIAN and
IEEE_DOUBLE_MIXED_ENDIAN macros from gmp-impl.h and gmp-mparam.h, and
the ieee_double_extract union from gmp-impl.h. The macros were used
only in defining the union, which was used nowhere in glibc. As GMP's
gmp-impl.h is over 5000 lines, the file in glibc is so far from the
GMP version that it doesn't seem to make sense to keep things there
that are not relevant in glibc. (I expect there is plenty more in the
header after this patch that is also not relevant in glibc and can be
cleaned up later.)
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
Most gmp-mparam.h headers in glibc define various macros to the same
values they would be defined to by the generic version of that header,
plus macros IEEE_DOUBLE_BIG_ENDIAN or IEEE_DOUBLE_MIXED_ENDIAN related
to the representation of double. The latter macros are in turn only
used in gmp-impl.h to define union ieee_double_extract, which is not
used in glibc. Thus all of these headers, except for the generic one
and those that define _LONG_LONG_LIMB for ILP32 configurations with
64-bit registers, are redundant, and this patch removes them.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
This function is defined in libc.so, and the dynamic loader calls
right after relocation has been finished, before any ELF constructors
or the preinit function is invoked. It is also used in the static
build for initializing parts of the static libc.
To locate __libc_early_init, a direct symbol lookup function is used,
_dl_lookup_direct. It does not search the entire symbol scope and
consults merely a single link map. This function could also be used
to implement lookups in the vDSO (as an optimization).
A per-namespace variable (libc_map) is added for locating libc.so,
to avoid repeated traversals of the search scope. It is similar to
GL(dl_initfirst). An alternative would have been to thread a context
argument from _dl_open down to _dl_map_object_from_fd (where libc.so
is identified). This could have avoided the global variable, but
the change would be larger as a result. It would not have been
possible to use this to replace GL(dl_initfirst) because that global
variable is used to pass the function pointer past the stack switch
from dl_main to the main program. Replacing that requires adding
a new argument to _dl_init, which in turn needs changes to the
architecture-specific libc.so startup code written in assembler.
__libc_early_init should not be used to replace _dl_var_init (as
it exists today on some architectures). Instead, _dl_lookup_direct
should be used to look up a new variable symbol in libc.so, and
that should then be initialized from the dynamic loader, immediately
after the object has been loaded in _dl_map_object_from_fd (before
relocation is run). This way, more IFUNC resolvers which depend on
these variables will work.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
MIPS needs to ignore certain existing symbols during symbol lookup.
The old scheme uses the ELF_MACHINE_SYM_NO_MATCH macro, with an
inline function, within its own header, with a sysdeps override for
MIPS. This allows re-use of the function from another file (without
having to include <dl-machine.h> or providing the default definition
for ELF_MACHINE_SYM_NO_MATCH).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The upper bits of the sigset_t s not fully initialized in the signal
mask calls that return information from kernel (sigprocmask,
sigpending, and pthread_sigmask), since the exported sigset_t size
(1024 bits) is larger than Linux support one (64 or 128 bits).
It might make sigisemptyset/sigorset/sigandset fail if the mask
is filled prior the call.
This patch changes the internal signal function to handle up to
supported Linux signal number (_NSIG), the remaining bits are
untouched.
Checked on x86_64-linux-gnu and i686-linux-gnu.
It is required because __libc_unwind_longjmp (used on thread
cancellation) calls __sigprocmask. Replace with a direct call.
They are required because __libc_unwind_longjmp (used for thread
cancellation) calls __sigprocmask. Replace this with a direct call.
The sigblock function is not exported and is not used internally, so
it can be removed.
Checked on cross build for ia64-linux-gnu.