Commit Graph

38139 Commits

Author SHA1 Message Date
Adhemerval Zanella
691d9ae9e6 Remove ununsed tcb-offset
Some architectures do not use the auto-generated tcb-offsets.h.
2021-12-17 17:47:29 -03:00
Aurelien Jarno
225da459ce riscv: align stack before calling _dl_init [BZ #28703]
Align the stack pointer to 128 bits during the call to _dl_init() as
specified by the RISC-V ABI [1]. This fixes the elf/tst-align2 test.

Fixes bug 28703.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
2021-12-17 20:29:34 +01:00
Aurelien Jarno
d2e594d715 riscv: align stack in clone [BZ #28702]
The RISC-V ABI [1] mandates that "the stack pointer shall be aligned to
a 128-bit boundary upon procedure entry". This as not the case in clone.

This fixes the misc/tst-misalign-clone-internal and
misc/tst-misalign-clone tests.

Fixes bug 28702.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
2021-12-17 20:29:32 +01:00
Aurelien Jarno
94058f6cde elf: Fix tst-cpu-features-cpuinfo for KVM guests on some AMD systems [BZ #28704]
On KVM guests running on some AMD systems, the IBRS feature is reported
as a synthetic feature using the Intel feature, while the cpuinfo entry
keeps the same. Handle that by first checking the presence of the Intel
feature on AMD systems.

Fixes bug 28704.
2021-12-17 20:20:15 +01:00
Matheus Castanho
ae91d3df24 powerpc64[le]: Allocate extra stack frame on syscall.S
The syscall function does not allocate the extra stack frame for scv like other
assembly syscalls using DO_CALL_SCV. So after commit d120fb9941 changed the
offset that is used to save LR, syscall ended up using an invalid offset,
causing regressions on powerpc64. So make sure the extra stack frame is
allocated in syscall.S as well to make it consistent with other uses of
DO_CALL_SCV and avoid similar issues in the future.

Tested on powerpc, powerpc64, and powerpc64le (with and without scv)

Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
2021-12-17 15:40:53 -03:00
Maxim Kuvyrkov
c16dc431c8 Update copyright header in recently merged ab_GE locale
ab_GE locale was committed under DCO and this header
proposed in [1] suits it better.

[1] https://sourceware.org/pipermail/libc-alpha/2021-September/130692.html

Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
2021-12-17 18:22:21 +00:00
Siddhesh Poyarekar
2bbd07c715 fortify: Fix spurious warning with realpath
The length and object size arguments were swapped around for realpath.
Also add a smoke test so that any changes in this area get caught in
future.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-12-17 18:49:27 +05:30
Florian Weimer
b99b0f93ee nss: Use "files dns" as the default for the hosts database (bug 28700)
This matches what is currently in nss/nsswitch.conf.  The new ordering
matches what most distributions use in their installed configuration
files.

It is common to add localhost to /etc/hosts because the name does not
exist in the DNS, but is commonly used as a host name.

With the built-in "dns [!UNAVAIL=return] files" default, dns is
searched first and provides an answer for "localhost" (NXDOMAIN).
We never look at the files database as a result, so the contents of
/etc/hosts is ignored.  This means that "getent hosts localhost"
fail without a /etc/nsswitch.conf file, even though the host name
is listed in /etc/hosts.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-12-17 12:01:25 +01:00
Florian Weimer
ce1e5b1122 arm: Guard ucontext _rtld_global_ro access by SHARED, not PIC macro
Due to PIE-by-default, PIC is now defined in more cases.  libc.a
does not have _rtld_global_ro, and statically linking setcontext
fails.  SHARED is the right condition to use, so that libc.a
references _dl_hwcap instead of _rtld_global_ro.

For static PIE support, the !SHARED case would still have to be made
PIC.  This patch does not achieve that.

Fixes commit 23645707f1
("Replace --enable-static-pie with --disable-default-pie").

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-17 11:48:44 +01:00
Siddhesh Poyarekar
72e4a717bd Fix The GNU ToolChain Authors copyright notice
I (and maybe one or two others) added a (C) to the copyright notice
regardless of the contribution checklist[1] not mentioning it.  Fix all
these instances so that the notice reads as "Copyright The GNU Toolchain
Authors" across the source code.

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-17 07:59:45 +05:30
Patrick McGehearty
0a4df6f534 Remove upper limit on tunable MALLOC_MMAP_THRESHOLD
The current limit on MALLOC_MMAP_THRESHOLD is either 1 Mbyte (for
32-bit apps) or 32 Mbytes (for 64-bit apps).  This value was set by a
patch dated 2006 (15 years ago).  Attempts to set the threshold higher
are currently ignored.

The default behavior is appropriate for many highly parallel
applications where many processes or threads are sharing RAM. In other
situations where the number of active processes or threads closely
matches the number of cores, a much higher limit may be desired by the
application designer. By today's standards on personal computers and
small servers, 2 Gbytes of RAM per core is commonly available. On
larger systems 4 Gbytes or more of RAM is sometimes available.
Instead of raising the limit to match current needs, this patch
proposes to remove the limit of the tunable, leaving the decision up
to the user of a tunable to judge the best value for their needs.

This patch does not change any of the defaults for malloc tunables,
retaining the current behavior of the dynamic malloc mmap threshold.

bugzilla 27801 - Remove upper limit on tunable MALLOC_MMAP_THRESHOLD
Reviewed-by: DJ Delorie <dj@redhat.com>

malloc/
        malloc.c changed do_set_mmap_threshold to remove test
        for HEAP_MAX_SIZE.
2021-12-16 17:24:37 +00:00
Nart Tlisha
a16c5ab139 localedata: add new locale ab_GE
Add the Abkhazian language in the Georgia territory

The ab_GE was just recently added to CLDR, it should be available
in CLDR v41, https://github.com/unicode-org/cldr/pull/1402

The Abkhazian language has been added to Gnome for localization

The locale has been tested on Ubuntu 20.04, Mint 20.2 and Fedora 35 Beta

Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
2021-12-16 14:37:14 +00:00
Stefan Liebler
ff3cb03f38 Fix __minimal_malloc segfaults in __mmap due to stack-protector
Starting with commit b05fae4d8e
"elf: Use the minimal malloc on tunables_strdup",
I get lots of segfaults in static tests on s390x when also using, e.g.:
export GLIBC_TUNABLES="glibc.elision.enable=1"

tunables_strdup callls __minimal_malloc which tries to call __mmap
due to insufficient space left. __mmap itself first setups a new
stack frame and segfaults when copying the stack-protector canary
from thread-pointer. The latter one is not yet setup.

Thus this patch also turns off stack-protection for mmap.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-16 15:19:28 +01:00
Siddhesh Poyarekar
ae23fa3e5f __glibc_unsafe_len: Fix comment
We know that the length is *unsafe*.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-16 07:21:43 +05:30
Adhemerval Zanella
0f982c1827 malloc: Enable huge page support on main arena
This patch adds support huge page support on main arena allocation,
enable with tunable glibc.malloc.hugetlb=2.  The patch essentially
disable the __glibc_morecore() sbrk() call (similar when memory
tag does when sbrk() call does not support it) and fallback to
default page size if the memory allocation fails.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
0849eed45d malloc: Move MORECORE fallback mmap to sysmalloc_mmap_fallback
So it can be used on hugepage code as well.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
c1beb51d08 malloc: Add Huge Page support to arenas
It is enabled as default for glibc.malloc.hugetlb set to 2 or higher.
It also uses a non configurable minimum value and maximum value,
currently set respectively to 1 and 4 selected huge page size.

The arena allocation with huge pages does not use MAP_NORESERVE.  As
indicate by kernel internal documentation [1], the flag might trigger
a SIGBUS on soft page faults if at memory access there is no left
pages in the pool.

On systems without a reserved huge pages pool, is just stress the
mmap(MAP_HUGETLB) allocation failure.  To improve test coverage it is
required to create a pool with some allocated pages.

Checked on x86_64-linux-gnu with no reserved pages, 10 reserved pages
(which trigger mmap(MAP_HUGETBL) failures) and with 256 reserved pages
(which does not trigger mmap(MAP_HUGETLB) failures).

[1] https://www.kernel.org/doc/html/v4.18/vm/hugetlbfs_reserv.html#resv-map-modifications

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
98d5fcb8d0 malloc: Add Huge Page support for mmap
With the morecore hook removed, there is not easy way to provide huge
pages support on with glibc allocator without resorting to transparent
huge pages.  And some users and programs do prefer to use the huge pages
directly instead of THP for multiple reasons: no splitting, re-merging
by the VM, no TLB shootdowns for running processes, fast allocation
from the reserve pool, no competition with the rest of the processes
unlike THP, no swapping all, etc.

This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means and specific page size that is matched
against the supported ones by the system.

Currently only memory allocated on sysmalloc() is handled, the arenas
still uses the default system page size.

To test is a new rule is added tests-malloc-hugetlb2, which run the
addes tests with the required GLIBC_TUNABLE setting.  On systems without
a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB)
allocation failure.  To improve test coverage it is required to create
a pool with some allocated pages.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:38 -03:00
Adhemerval Zanella
6cc3ccc67e malloc: Move mmap logic to its own function
So it can be used with different pagesize and flags.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:15 -03:00
Adhemerval Zanella
7478c9959a malloc: Add THP/madvise support for sbrk
To increase effectiveness with Transparent Huge Page with madvise, the
large page size is use instead page size for sbrk increment for the
main arena.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:15 -03:00
Adhemerval Zanella
5f6d8d97c6 malloc: Add madvise support for Transparent Huge Pages
Linux Transparent Huge Pages (THP) current supports three different
states: 'never', 'madvise', and 'always'.  The 'never' is
self-explanatory and 'always' will enable THP for all anonymous
pages.  However, 'madvise' is still the default for some system and
for such case THP will be only used if the memory range is explicity
advertise by the program through a madvise(MADV_HUGEPAGE) call.

To enable it a new tunable is provided, 'glibc.malloc.hugetlb',
where setting to a value diffent than 0 enables the madvise call.

This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call at sysmalloc() with sizes larger than the default huge
page size.  The madvise() call is disable is system does not support
THP or if it has the mode set to "never" and on Linux only support
one page size for THP, even if the architecture supports multiple
sizes.

To test is a new rule is added tests-malloc-hugetlb1, which run the
addes tests with the required GLIBC_TUNABLE setting.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:14 -03:00
Florian Weimer
cb976fba4c powerpc: Use global register variable in <thread_pointer.h>
A local register variable is merely a compiler hint, and so not
appropriate in this context.  Move the global register variable into
<thread_pointer.h> and include it from <tls.h>, as there can only
be one global definition for one particular register.

Fixes commit 8dbeb0561e
("nptl: Add <thread_pointer.h> for defining __thread_pointer").

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
2021-12-15 16:06:25 +01:00
Adhemerval Zanella
a6d2f948b7 Use LFS and 64 bit time for installed programs (BZ #15333)
The installed programs are built with a combination of different
values for MODULE_NAME, as below.  To enable both Long File Support
and 64 bt time, -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 is added for
nonlibi, nscd, lddlibc4, libresolv, ldconfig, locale_programs,
iconvprogs, libnss_files, libnss_compat, libnss_db, libnss_hesiod,
libutil, libpcprofile, and libSegFault.

  nscd/nscd
    nscd/nscd.o                           MODULE_NAME=nscd
    nscd/connections.o                    MODULE_NAME=nscd
    nscd/pwdcache.o                       MODULE_NAME=nscd
    nscd/getpwnam_r.o                     MODULE_NAME=nscd
    nscd/getpwuid_r.o                     MODULE_NAME=nscd
    nscd/grpcache.o                       MODULE_NAME=nscd
    nscd/getgrnam_r.o                     MODULE_NAME=nscd
    nscd/getgrgid_r.o                     MODULE_NAME=nscd
    nscd/hstcache.o                       MODULE_NAME=nscd
    nscd/gethstbyad_r.o                   MODULE_NAME=nscd
    nscd/gethstbynm3_r.o                  MODULE_NAME=nscd
    nscd/getsrvbynm_r.o                   MODULE_NAME=nscd
    nscd/getsrvbypt_r.o                   MODULE_NAME=nscd
    nscd/servicescache.o                  MODULE_NAME=nscd
    nscd/dbg_log.o                        MODULE_NAME=nscd
    nscd/nscd_conf.o                      MODULE_NAME=nscd
    nscd/nscd_stat.o                      MODULE_NAME=nscd
    nscd/cache.o                          MODULE_NAME=nscd
    nscd/mem.o                            MODULE_NAME=nscd
    nscd/nscd_setup_thread.o              MODULE_NAME=nscd
    nscd/xmalloc.o                        MODULE_NAME=nscd
    nscd/xstrdup.o                        MODULE_NAME=nscd
    nscd/aicache.o                        MODULE_NAME=nscd
    nscd/initgrcache.o                    MODULE_NAME=nscd
    nscd/gai.o                            MODULE_NAME=nscd
    nscd/res_hconf.o                      MODULE_NAME=nscd
    nscd/netgroupcache.o                  MODULE_NAME=nscd
    nscd/cachedumper.o                    MODULE_NAME=nscd
  elf/lddlibc4
    elf/lddlibc4                          MODULE_NAME=lddlibc4
  elf/pldd
    elf/pldd.o                            MODULE_NAME=nonlib
    elf/xmalloc.o                         MODULE_NAME=nonlib
  elf/sln
    elf/sln.o                             MODULE_NAME=nonlib
    elf/static-stubs.o                    MODULE_NAME=nonlib
  elf/sprof                               MODULE_NAME=nonlib
  elf/ldconfig
    elf/ldconfig.o                        MODULE_NAME=ldconfig
    elf/cache.o                           MODULE_NAME=nonlib
    elf/readlib.o                         MODULE_NAME=nonlib
    elf/xmalloc.o                         MODULE_NAME=nonlib
    elf/xstrdup.o                         MODULE_NAME=nonlib
    elf/chroot_canon.o                    MODULE_NAME=nonlib
    elf/static-stubs.o                    MODULE_NAME=nonlib
    elf/stringtable.o                     MODULE_NAME=nonlib
  io/pwd
    io/pwd.o                              MODULE_NAME=nonlib
  locale/locale
    locale/locale.o                       MODULE_NAME=locale_programs
    locale/locale-spec.o                  MODULE_NAME=locale_programs
    locale/charmap-dir.o                  MODULE_NAME=locale_programs
    locale/simple-hash.o                  MODULE_NAME=locale_programs
    locale/xmalloc.o                      MODULE_NAME=locale_programs
    locale/xstrdup.o                      MODULE_NAME=locale_programs
    locale/record-status.o                MODULE_NAME=locale_programs
    locale/xasprintf.o                    MODULE_NAME=locale_programs
  locale/localedef
    locale/localedef.o                    MODULE_NAME=locale_programs
    locale/ld-ctype.o                     MODULE_NAME=locale_programs
    locale/ld-messages.o                  MODULE_NAME=locale_programs
    locale/ld-monetary.o                  MODULE_NAME=locale_programs
    locale/ld-numeric.o                   MODULE_NAME=locale_programs
    locale/ld-time.o                      MODULE_NAME=locale_programs
    locale/ld-paper.o                     MODULE_NAME=locale_programs
    locale/ld-name.o                      MODULE_NAME=locale_programs
    locale/ld-address.o                   MODULE_NAME=locale_programs
    locale/ld-telephone.o                 MODULE_NAME=locale_programs
    locale/ld-measurement.o               MODULE_NAME=locale_programs
    locale/ld-identification.o            MODULE_NAME=locale_programs
    locale/ld-collate.o                   MODULE_NAME=locale_programs
    locale/charmap.o                      MODULE_NAME=locale_programs
    locale/linereader.o                   MODULE_NAME=locale_programs
    locale/locfile.o                      MODULE_NAME=locale_programs
    locale/repertoire.o                   MODULE_NAME=locale_programs
    locale/locarchive.o                   MODULE_NAME=locale_programs
    locale/md5.o                          MODULE_NAME=locale_programs
    locale/charmap-dir.o                  MODULE_NAME=locale_programs
    locale/simple-hash.o                  MODULE_NAME=locale_programs
    locale/xmalloc.o                      MODULE_NAME=locale_programs
    locale/xstrdup.o                      MODULE_NAME=locale_programs
    locale/record-status.o                MODULE_NAME=locale_programs
    locale/xasprintf.o                    MODULE_NAME=locale_programs
  catgets/gencat
    catgets/gencat.o                      MODULE_NAME=nonlib
    catgets/xmalloc.o                     MODULE_NAME=nonlib
  nss/makedb
    nss/makedb.o                          MODULE_NAME=nonlib
    nss/xmalloc.o                         MODULE_NAME=nonlib
    nss/hash-string.o                     MODULE_NAME=nonlib
  nss/getent
    nss/getent.o                          MODULE_NAME=nonlib
  posix/getconf
    posix/getconf.o                       MODULE_NAME=nonlib
  login/utmpdump
    login/utmpdump.o                      MODULE_NAME=nonlib
  debug/pcprofiledump
    debug/pcprofiledump.o                 MODULE_NAME=nonlib
  timezone/zic
    timezone/zic.o                        MODULE_NAME=nonlib
  timezone/zdump
    timezone/zdump.o                      MODULE_NAME=nonlib
  iconv/iconv_prog
    iconv/iconv_prog.o                    MODULE_NAME=nonlib
    iconv/iconv_charmap.o                 MODULE_NAME=iconvprogs
    iconv/charmap.o                       MODULE_NAME=iconvprogs
    iconv/charmap-dir.o                   MODULE_NAME=iconvprogs
    iconv/linereader.o                    MODULE_NAME=iconvprogs
    iconv/dummy-repertoire.o              MODULE_NAME=iconvprogs
    iconv/simple-hash.o                   MODULE_NAME=iconvprogs
    iconv/xstrdup.o                       MODULE_NAME=iconvprogs
    iconv/xmalloc.o                       MODULE_NAME=iconvprogs
    iconv/record-status.o                 MODULE_NAME=iconvprogs
  iconv/iconvconfig
    iconv/iconvconfig.o                   MODULE_NAME=nonlib
    iconv/strtab.o                        MODULE_NAME=iconvprogs
    iconv/xmalloc.o                       MODULE_NAME=iconvprogs
    iconv/hash-string.o                   MODULE_NAME=iconvprogs
  nss/libnss_files.so                     MODULE_NAME=libnss_files
  nss/libnss_compat.so.2                  MODULE_NAME=libnss_compat
  nss/libnss_db.so                        MODULE_NAME=libnss_db
  hesiod/libnss_hesiod.so                 MODULE_NAME=libnss_hesiod
  login/libutil.so                        MODULE_NAME=libutil
  debug/libpcprofile.so                   MODULE_NAME=libpcprofile
  debug/libSegFault.so                    MODULE_NAME=libSegFault

Also, to avoid adding both LFS and 64 bit time support on internal
tests they are moved to a newer 'testsuite-internal' module.  It
should be similar to 'nonlib' regarding internal definition and
linking namespace.

This patch also enables LFS and 64 bit support of libsupport container
programs (echo-container, test-container, shell-container, and
true-container).

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 09:01:01 -03:00
H.J. Lu
4435c29892 Support target specific ALIGN for variable alignment test [BZ #28676]
Add <tst-file-align.h> to support target specific ALIGN for variable
alignment test:

1. Alpha: Use 0x10000.
2. MicroBlaze and Nios II: Use 0x8000.
3. All others: Use 0x200000.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-14 14:50:33 -08:00
H.J. Lu
f6ff87868a NEWS: Document LD_PREFER_MAP_32BIT_EXEC as x86-64 only 2021-12-14 07:58:05 -08:00
H.J. Lu
fd6062ede3 elf: Align argument of __munmap to page size [BZ #28676]
On Linux/x86-64, for elf/tst-align3, we now get

munmap(0x7f88f9401000, 1126424)         = 0

instead of

munmap(0x7f1615200018, 544768)          = -1 EINVAL (Invalid argument)

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-14 07:16:51 -08:00
Florian Weimer
0884724a95 elf: Use new dependency sorting algorithm by default
The default has to change eventually, and there are no known failures
that require a delay.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-14 14:44:04 +01:00
Khem Raj
f8392bb766 intl: Emit no lines in bison generated files
Improve reproducibility:
Do not put any #line preprocessor commands in bison generated files.
These lines contain absolute paths containing file locations on
the host build machine.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-14 09:33:25 -03:00
Samuel Thibault
ec06717856 hurd: Do not set PIE_UNSUPPORTED
This is now supported.
2021-12-14 08:38:05 +01:00
H.J. Lu
1f3d460761 NEWS: Move LD_PREFER_MAP_32BIT_EXEC
Move LD_PREFER_MAP_32BIT_EXEC to

Deprecated and removed features, and other changes affecting compatibility:
2021-12-13 16:33:57 -08:00
Samuel Thibault
cf44f08379 mach: Fix spurious inclusion of stack_chk_fail_local in libmachuser.a
When linking programs statically, stack_chk_fail_local already comes
from libc_nonshared, so we don't need it in lib{mach,hurd}user.a.
2021-12-14 01:01:48 +01:00
H.J. Lu
57e349b1b0 Disable DT_RUNPATH on NSS tests [BZ #28455]
The glibc internal NSS functions should always load NSS modules from
the system.  For testing purpose, disable DT_RUNPATH on NSS tests so
that the glibc internal NSS functions can load testing NSS modules
via DT_RPATH.

This partially fixes BZ #28455.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-12-13 07:32:04 -08:00
Akila Welihinda
3b1402b3fc sysdeps: Simplify sin Taylor Series calculation
The macro TAYLOR_SIN adds the term `-0.5*da*a^2 + da` in hopes
of regaining some precision as a function of da. However the
comment says we add the term `-0.5*da*a^2 + 0.5*da` which is
different. This fix updates the comment to reflect the
code and also simplifies the calculation by replacing `a` with `x`
because they always have the same value.

Signed-off-by: Akila Welihinda <akilawelihinda@ucla.edu>
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-12-13 15:31:05 +01:00
Adhemerval Zanella
104d2005d5 math: Remove the error handling wrapper from hypot and hypotf
The error handling is moved to sysdeps/ieee754 version with no SVID
support.  The compatibility symbol versions still use the wrapper with
SVID error handling around the new code.  There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

Only ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
2021-12-13 10:08:46 -03:00
Wilco Dijkstra
2f44eef584 math: Use fmin/fmax on hypot
It optimizes for architectures that provides fast builtins.

Checked on aarch64-linux-gnu.
2021-12-13 10:08:46 -03:00
Adhemerval Zanella
ecb94e9587 aarch64: Add math-use-builtins-f{max,min}.h
It allows to remove the arch-specific implementations.
2021-12-13 10:08:46 -03:00
Adhemerval Zanella
583c4d424e math: Add math-use-builtinds-fmin.h
It allows the architecture to use the builtin instead of generic
implementation.
2021-12-13 10:08:43 -03:00
Adhemerval Zanella
72ab1eaec7 math: Add math-use-builtinds-fmax.h
It allows the architecture to use the builtin instead of generic
implementation.
2021-12-13 09:08:07 -03:00
Adhemerval Zanella
2eb1cd2f47 math: Remove powerpc e_hypot
The generic implementation is shows only slight worse performance:

POWER10    reciprocal-throughput    latency
master                   8.28478    13.7253
new hypot                7.21945    13.1933

POWER9     reciprocal-throughput    latency
master                   13.4024    14.0967
new hypot                14.8479    15.8061

POWER8     reciprocal-throughput    latency
master                   15.5767    16.8885
new hypot                16.5371    18.4057

One way to improve might to make gcc generate xsmaxdp/xsmindp for
fmax/fmin (it onl does for -ffast-math, clang does for default
options).

Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu
(power9).
2021-12-13 09:08:07 -03:00
Adhemerval Zanella
a1d3c9b642 i386: Move hypot implementation to C
The generic hypotf is slight slower, mostly due the tricks the assembly
does to optimize the isinf/isnan/issignaling.  The generic hypot is way
slower, since the optimized implementation uses the i386 default
excessive precision to issue the operation directly.  A similar
implementation is provided instead of using the generic implementation:

Checked on i686-linux-gnu.
2021-12-13 09:08:02 -03:00
Adhemerval Zanella
c212d6397e math: Use an improved algorithm for hypotl (ldbl-128)
This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1] using the MyHypot3 with the following changes:

  - Handle qNaN and sNaN.
  - Tune the 'widely varying operands' to avoid spurious underflow
    due the multiplication and fix the return value for upwards
    rounding mode.
  - Handle required underflow exception for subnormal results.

The main advantage of the new algorithm is its precision.  With a
random 1e9 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc
current implementation shows around 0.05% results with an error of
1 ulp (453266 results) while the new implementation only shows
0.0001% of total (1280).

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

[1] https://arxiv.org/pdf/1904.09481.pdf
2021-12-13 09:02:34 -03:00
Adhemerval Zanella
aa9c28cde3 math: Use an improved algorithm for hypotl (ldbl-96)
This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1] using the MyHypot3 with the following changes:

 - Handle qNaN and sNaN.
 - Tune the 'widely varying operands' to avoid spurious underflow
   due the multiplication and fix the return value for upwards
   rounding mode.
 - Handle required underflow exception for subnormal results.

The main advantage of the new algorithm is its precision.  With a
random 1e8 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc
current implementation shows around 0.02% results with an error of
1 ulp (23158 results) while the new implementation only shows
0.0001% of total (111).

[1] https://arxiv.org/pdf/1904.09481.pdf
2021-12-13 09:02:34 -03:00
Wilco Dijkstra
ccfa865a82 math: Improve hypot performance with FMA
Improve hypot performance significantly by using fma when available. The
fma version has twice the throughput of the previous version and 70% of
the latency.  The non-fma version has 30% higher throughput and 10%
higher latency.

Max ULP error is 0.949 with fma and 0.792 without fma.

Passes GLIBC testsuite.
2021-12-13 09:02:34 -03:00
Wilco Dijkstra
6c848d7038 math: Use an improved algorithm for hypot (dbl-64)
This implementation is based on the 'An Improved Algorithm for
hypot(a,b)' by Carlos F. Borges [1] using the MyHypot3 with the
following changes:

 - Handle qNaN and sNaN.
 - Tune the 'widely varying operands' to avoid spurious underflow
   due the multiplication and fix the return value for upwards
   rounding mode.
 - Handle required underflow exception for denormal results.

The main advantage of the new algorithm is its precision: with a
random 1e9 input pairs in the range of [DBL_MIN, DBL_MAX], glibc
current implementation shows around 0.34% results with an error of
1 ulp (3424869 results) while the new implementation only shows
0.002% of total (18851).

The performance result are also only slight worse than current
implementation.  On x86_64 (Ryzen 5900X) with gcc 12:

Before:

  "hypot": {
   "workload-random": {
    "duration": 3.73319e+09,
    "iterations": 1.12e+08,
    "reciprocal-throughput": 22.8737,
    "latency": 43.7904,
    "max-throughput": 4.37184e+07,
    "min-throughput": 2.28361e+07
   }
  }

After:

  "hypot": {
   "workload-random": {
    "duration": 3.7597e+09,
    "iterations": 9.8e+07,
    "reciprocal-throughput": 23.7547,
    "latency": 52.9739,
    "max-throughput": 4.2097e+07,
    "min-throughput": 1.88772e+07
   }
  }

Co-Authored-By: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

[1] https://arxiv.org/pdf/1904.09481.pdf
2021-12-13 09:02:34 -03:00
Adhemerval Zanella
7fe0ace3e2 math: Simplify hypotf implementation
Use a more optimized comparison for check for NaN and infinite and
add an inlined issignaling implementation for float.  With gcc it
results in 2 FP comparisons.

The file Copyright is also changed to use  GPL, the implementation was
completely changed by 7c10fd3515 to use double precision instead of
scaling and this change removes all the GET_FLOAT_WORD usage.

Checked on x86_64-linux-gnu.
2021-12-13 09:02:30 -03:00
Siddhesh Poyarekar
5afe4c0d69 Cleanup encoding in comments
Replace non-UTF-8 and non-ASCII characters in comments with their UTF-8
equivalents so that files don't end up with mixed encodings.  With this,
all files (except tests that actually test different encodings) have a
single encoding.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-13 10:01:45 +05:30
Siddhesh Poyarekar
23645707f1 Replace --enable-static-pie with --disable-default-pie
Build glibc programs and tests as PIE by default and enable static-pie
automatically if the architecture and toolchain supports it.

Also add a new configuration option --disable-default-pie to prevent
building programs as PIE.

Only the following architectures now have PIE disabled by default
because they do not work at the moment.  hppa, ia64, alpha and csky
don't work because the linker is unable to handle a pcrel relocation
generated from PIE objects.  The microblaze compiler is currently
failing with an ICE.  GNU hurd tries to enable static-pie, which does
not work and hence fails.  All these targets have default PIE disabled
at the moment and I have left it to the target maintainers to enable PIE
on their targets.

build-many-glibcs runs clean for all targets.  I also tested x86_64 on
Fedora and Ubuntu, to verify that the default build as well as
--disable-default-pie work as expected with both system toolchains.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-12-13 08:08:59 +05:30
Samuel Thibault
556a6126f8 hurd: Add rules for static PIE build
This fixes [BZ #28671].
2021-12-12 00:42:13 +01:00
Samuel Thibault
26803075e4 hurd: Fix gmon-static
We need to use crt0 for gmon-static too.
2021-12-12 00:42:12 +01:00
H.J. Lu
ea5814467a x86-64: Remove LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]
Remove the LD_PREFER_MAP_32BIT_EXEC environment variable support since
the first PT_LOAD segment is no longer executable due to defaulting to
-z separate-code.

This fixes [BZ #28656].

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-10 14:01:34 -08:00