asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.
These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.
The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.
Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.
Also remove code comments claiming correctly rounded results.
After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4) (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.
Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.
Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.
The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.
Tested on x86_64.
The len variable is only used in the else branch.
We don't need the call to strlen if the name is 0 or 1 characters long.
2019-10-02 Lode Willems <Lode.Willems@UGent.be>
* tdlib/getenv.c: Move the call to strlen into the branch it's used.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.10. (There are no new MAP_* constants covered by this test in
5.10 that need any other header changes.)
Tested with build-many-glibcs.py.
GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache
commits, e.g.:
785969a047
elf: Implement a string table for ldconfig, with tail merging
If glibc is build with gcc 6.5.0:
__builtin_add_overflow is used in
<glibc>/elf/stringtable.c:stringtable_finalize()
which leads to ldconfig failing with "String table is too large".
This is also recognizable in following tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable
See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small
uint32_t values incorrectly detects overflow"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
The secondary/non-primary/inner libc (loaded via dlmopen, LD_AUDIT,
static dlopen) must not use sbrk to allocate member because that would
interfere with allocations in the outer libc. On Linux, this does not
matter because sbrk itself was changed to fail in secondary libcs.
_dl_addr occasionally shows up in profiles, but had to be used before
because __libc_multiple_libs was unreliable. So this change achieves
a slight reduction in startup time.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Change sbrk to fail for !__libc_initial (in the generic
implementation). As a result, sbrk is (relatively) safe to use
for the __libc_initial case (from the main libc). It is therefore
no longer necessary to avoid using it in that case (or updating the
brk cache), and the __libc_initial flag does not need to be updated
as part of dlmopen or static dlopen.
As before, direct brk system calls on Linux may lead to memory
corruption.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
GCC 11 with
commit 6fbec038f7a7ddf29f074943611b53210d17c40c
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Feb 3 11:55:43 2020 -0800
Use SHF_GNU_RETAIN to preserve symbol definitions
places used symbols in SECTION_RETAIN sections if assembler supports it.
Mark __libc_freeres_fn as used to avoid
gconv_dl.c: In function 'free_mem':
gconv_dl.c:191:1: error: 'do_release_all' without 'used' attribute and 'free_mem' with 'used' attribute are placed in a section with the same name [-Werror=attributes]
191 | do_release_all (void *nodep)
| ^~~~~~~~~~~~~~
In file included from <command-line>:
gconv_dl.c:202:18: note: 'free_mem' was declared here
202 | libc_freeres_fn (free_mem)
| ^~~~~~~~
./../include/libc-symbols.h:316:15: note: in definition of macro 'libc_freeres_fn'
316 | static void name (void)
| ^~~~
cc1: all warnings being treated as errors
Linux 5.10 has one new syscall, process_madvise. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Otherwise, it will not participate in the dependency sorting.
Fixes commit 9ffa50b26b
("elf: Include libc.so.6 as main program in dependency sort
(bug 20972)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch makes build-many-glibcs.py use the recent GMP 6.2.1
release.
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
This symbol is not in the implementation reserved namespace for static
linking and it was never used: it seems it was mistakenly added in the
orignal strlen_asimd commit 436e4d5b96
The failure paths in _dl_map_object_from_fd did not clean every
potentially allocated resource up.
Handle l_phdr, l_libname and mapped segments in the common failure
handling code.
There are various bits that may not be cleaned properly on failure
(e.g. executable stack, incomplete dl_map_segments) fixing those
need further changes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
_dl_map_object_from_fd has complex error handling with cleanups.
It was managed by a separate function to avoid code bloat at
every failure case, but since the code was changed to use gotos
there is no longer such code bloat from inlining.
Maintaining a separate error handling function is harder as it
needs to access local state which has to be passed down. And the
same lose function was used in open_verify which is error prone.
The goto labels are changed since there is no longer a call.
The new code generates slightly smaller binary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
inttypes.h has inline implementations of the strtoimax, strtoumax,
wcstoimax and wcstoumax functions, despite the corresponding stdlib.h
and wchar.h inlines having been removed in 2007 (commit
9b2e9577b2).
Remove those inlines, thereby eliminating all references to the
corresponding __*_internal functions from installed headers (so they
could be made into compat symbols in future if desired).
Tested for x86_64 and x86.
Some internal functions need to know if a database has a nonzero
list of actions; success getting the database does not guarantee
that. Add checks for such as needed.
Skip the ":" in each nsswitch.conf line so as not to add a dummy
action libnss_:.so
See also https://bugzilla.redhat.com/show_bug.cgi?id=1906066
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since we can't tell if the tunable value is set by user or not:
https://sourceware.org/bugzilla/show_bug.cgi?id=27069
remove the default REP MOVSB threshold tunable value so that the correct
default value will be set correctly by init_cacheinfo ().
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Since elf.h is a public header file copied to other projects,
try to make it free from spelling typos.
This change fixes the following spelling typos in comments of elf.h:
Auxialiary -> Auxiliary
tenatively -> tentatively
compatability -> compatibility
If linked-list of tcache contains a loop, it invokes infinite
loop in _int_free when freeing tcache. The PoC which invokes
such infinite loop is on the Bugzilla(#27052). This loop
should terminate when the loop exceeds mp_.tcache_count and
the program should abort. The affected glibc version is
2.29 or later.
Reviewed-by: DJ Delorie <dj@redhat.com>
_dl_map_object_deps always sorts the initially loaded object first
during dependency sorting. This means it is relocated last in
dl_open_worker. This results in crashes in IFUNC resolvers without
lazy bindings if libraries are preloaded that refer to IFUNCs in
libc.so.6: the resolvers are called when libc.so.6 has not been
relocated yet, so references to _rtld_global_ro etc. crash.
The fix is to check against the libc.so.6 link map recorded by the
__libc_early_init framework, and let it participate in the dependency
sort.
This fixes bug 20972.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.
If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored. To protect
the main executable even when mprotect is filtered the linux kernel
will have to be changed to add PROT_BTI to it.
The delayed failure reporting is mainly needed because currently
_dl_process_gnu_properties does not propagate failures such that
the required cleanups happen. Using the link_map_machine struct for
error propagation is not ideal, but this seemed to be the least
intrusive solution.
Fixes bug 26831.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Program headers are processed in two pass: after the first pass
load segments are mmapped so in the second pass target specific
note processing logic can access the notes.
The second pass is moved later so various link_map fields are
set up that may be useful for note processing such as l_phdr.
The second pass should be before the fd is closed so that is
available.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Handle unaligned executable load segments (the bfd linker is not
expected to produce such binaries, but other linkers may).
Computing the mapping bounds follows _dl_map_object_from_fd more
closely now.
Fixes bug 26988.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The _dl_open_check and _rtld_main_check hooks are not called on the
dependencies of a loaded module, so BTI protection was missed on
every module other than the main executable and directly dlopened
libraries.
The fix just iterates over dependencies to enable BTI.
Fixes bug 26926.
Starting with recent commits, I get 43 conform/.../linknamespace FAILs:
- nss: Introduce <nss_module.h>
- <nss_action.h>: New abstraction for combining NSS modules and NSS actions
- nss: Implement <nss_database.h> (see nss/nss_database.c)
- nsswitch: use new internal API (core)
- nsswitch: user new internal API (tests)
- nsswitch: use new internal API (callers)
e.g. conform/XPG42/wordexp.h/linknamespace.out
[initial] wordexp -> [libc.a(wordexp.o)] __getpwnam_r -> [libc.a(getpwnam_r.o)] __nss_database_custom -> [libc.a(nsswitch.o)] __nss_database_get -> [libc.a(nss_database.o)] feof_unlocked
[initial] wordexp -> [libc.a(wordexp.o)] __getpwnam_r -> [libc.a(getpwnam_r.o)] __nss_database_custom -> [libc.a(nsswitch.o)] __nss_database_get -> [libc.a(nss_database.o)] ferror_unlocked
This patch is just using __ferror_unlocked and __feof_unlocked instead of the
non "__" prefixed ones.
Reviewed-by: DJ Delorie <dj@redhat.com>
It removes all the arch-specific assembly implementation. The
outliers are alpha, where its kernel ABI explict return -ENOMEM
in case of failure; and i686, where it can't use
"call *%gs:SYSINFO_OFFSET" during statup in static PIE.
Also some ABIs exports an additional ___brk_addr symbol and to
handle it an internal HAVE_INTERNAL_BRK_ADDR_SYMBOL is added.
Checked on x86_64-linux-gnu, i686-linux-gnu, adn with builsd for
the affected ABIs.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Subdirectories z13, z14, z15 can be selected, mostly based on the
level of support for vector instructions.
Co-Authored-By: Stefan Liebler <stli@linux.ibm.com>
The misattributed dependencies can cause failures in parallel testing
if the dependencies have not been built yet.
Fixes commit a332bd1518
("elf: Add elf/tst-dlopenfail-2 [BZ #25396]").
If glibc is build with -O3 on at least 390 (-m31) or x86 (-m32),
gcc 11 dumps this warning:
svc_tcp.c: In function 'rendezvous_request':
svc_tcp.c:274:3: error: 'memcpy' offset [0, 15] is out of the bounds [0, 0] [-Werror=array-bounds]
274 | memcpy (&xprt->xp_raddr, &addr, sizeof (addr));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
In out-of-memory case, if one of the mallocs in makefd_xprt function
returns NULL, a message is dumped, makefd_xprt returns NULL
and the subsequent memcpy would copy to NULL.
Instead of a segfaulting, we delay a bit (see also __svc_accept_failed
and Bug 14889 (CVE-2011-4609) - svc_run() produces high cpu usage when
accept() fails with EMFILE (CVE-2011-4609).
The same applies to svc_unix.c.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
float_t supposedly represents the type that is used to evaluate float
expressions internally. While the isa supports single-precision float
operations, the port of glibc to s390 incorrectly deferred to the
generic definitions which, back then, tied float_t to double. gcc by
default evaluates float in single precision, so that scenario violates
the C standard (sections 5.2.4.2.2 and 7.12 in C11/C17). With
-fexcess-precision=standard, gcc evaluates float in double precision,
which aligns with the standard yet at the cost of added conversion
instructions.
With this patch, we drop the s390-specific definition of float_t and
defer to the default behavior, which aligns float_t with the
compiler-defined FLT_EVAL_METHOD in a standard-compliant way.
Checked on s390x-linux-gnu with 31-bit and 64-bit builds.