Commit Graph

1629 Commits

Author SHA1 Message Date
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
H.J. Lu
d4877540e5 i686: Don't include multiarch memove in libc.a
On i686, there is no multiarch memove in libc.a, don't include multiarch
memove in ifunc-impl-list.c in libc.a.
2021-08-30 05:57:49 -07:00
Fangrui Song
710ba420fd Remove sysdeps/*/tls-macros.h
They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing.  Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-08-18 09:15:20 -07:00
Arjun Shankar
e785361ce3 i386: Regenerate ulps
These failures were caught while building glibc master for Fedora Rawhide
which is built with `-mtune=generic -msse2 -mfpmath=sse'.
2021-07-25 22:29:27 +02:00
H.J. Lu
84ea6ea24b mcheck: Align struct hdr to MALLOC_ALIGNMENT bytes [BZ #28068]
1. Align struct hdr to MALLOC_ALIGNMENT bytes so that malloc hooks in
libmcheck align memory to MALLOC_ALIGNMENT bytes.
2. Remove tst-mallocalign1 from tests-exclude-mcheck for i386 and x32.
3. Add tst-pvalloc-fortify and tst-reallocarray to tests-exclude-mcheck
since they use malloc_usable_size (see BZ #22057).

This fixed BZ #28068.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-07-12 18:13:32 -07:00
H.J. Lu
dc76a059fd Add a generic malloc test for MALLOC_ALIGNMENT
1. Add sysdeps/generic/malloc-size.h to define size related macros for
malloc.
2. Move x86_64/tst-mallocalign1.c to malloc and replace ALIGN_MASK with
MALLOC_ALIGN_MASK.
3. Add tst-mallocalign1 to tests-exclude-mcheck for i386 and x32 since
mcheck doesn't honor MALLOC_ALIGNMENT.
2021-07-09 06:39:30 -07:00
H.J. Lu
79aec84102 Properly check stack alignment [BZ #27901]
1. Replace

if ((((uintptr_t) &_d) & (__alignof (double) - 1)) != 0)

which may be optimized out by compiler, with

int
__attribute__ ((weak, noclone, noinline))
is_aligned (void *p, int align)
{
  return (((uintptr_t) p) & (align - 1)) != 0;
}

2. Add TEST_STACK_ALIGN_INIT to TEST_STACK_ALIGN.
3. Add a common TEST_STACK_ALIGN_INIT to check 16-byte stack alignment
for both i386 and x86-64.
4. Update powerpc to use TEST_STACK_ALIGN_INIT.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-24 07:42:12 -07:00
Adhemerval Zanella
db373e4c57 Remove architecture specific sched_cpucount optimizations
And replace the generic algorithm with the Brian Kernighan's one.
GCC optimize it with popcnt if the architecture supports, so there
is no need to add the extra POPCNT define to enable it.

This is really a micro-optimization that only adds complexity:
recent ABIs already support it (x86-64-v2 or power64le) and it
simplifies the code for internal usage, since i686 does not allow an
internal iFUNC call.

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.
2021-05-07 13:35:29 -03:00
Florian Weimer
4baf02b332 nptl: Move pthread_spin_trylock into libc
The symbol was moved using scripts/move-symbol-to-libc.py.
2021-04-23 17:06:48 +02:00
Florian Weimer
da8e3710d8 nptl: Move pthread_spin_lock into libc
The symbol was moved using scripts/move-symbol-to-libc.py.
2021-04-23 17:06:46 +02:00
Florian Weimer
ce4b3b7bef nptl: Move pthread_spin_init, Move pthread_spin_unlock into libc
For some architectures, the two functions are aliased, so these
symbols need to be moved at the same time.

The symbols were moved using scripts/move-symbol-to-libc.py.
2021-04-23 17:06:44 +02:00
Florian Weimer
60d5e40ab2 x86: Remove low-level lock optimization
The current approach is to do this optimizations at a higher level,
in generic code, so that single-threaded cases can be specifically
targeted.

Furthermore, using IS_IN (libc) as a compile-time indicator that
all locks are private is no longer correct once process-shared lock
implementations are moved into libc.

The generic <lowlevellock.h> is not compatible with assembler code
(obviously), so it's necessary to remove two long-unused #includes.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-04-21 19:49:51 +02:00
Szabolcs Nagy
2208066603 elf: Remove lazy tlsdesc relocation related code
Remove generic tlsdesc code related to lazy tlsdesc processing since
lazy tlsdesc relocation is no longer supported.  This includes removing
GL(dl_load_lock) from _dl_make_tlsdesc_dynamic which is only called at
load time when that lock is already held.

Added a documentation comment too.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-04-21 14:35:53 +01:00
Szabolcs Nagy
a75a02a696 i386: Remove lazy tlsdesc relocation related code
Like in commit e75711ebfa976d5468ec292282566a18b07e4d67 for x86_64,
remove unused lazy tlsdesc relocation processing code:

  _dl_tlsdesc_resolve_abs_plus_addend
  _dl_tlsdesc_resolve_rel
  _dl_tlsdesc_resolve_rela
  _dl_tlsdesc_resolve_hold
2021-04-15 09:47:59 +01:00
Szabolcs Nagy
ddcacd91cc i386: Avoid lazy relocation of tlsdesc [BZ #27137]
Lazy tlsdesc relocation is racy because the static tls optimization and
tlsdesc management operations are done without holding the dlopen lock.

This similar to the commit b7cf203b5c
for aarch64, but it fixes a different race: bug 27137.

On i386 the code is a bit more complicated than on x86_64 because both
rel and rela relocs are supported.
2021-04-15 09:47:43 +01:00
Adhemerval Zanella
30c2a0e41b i386: Update ulps
Required after 43576de04a "Improve the accuracy of tgamma
(BZ #26983)"
2021-04-13 16:33:27 -03:00
Adhemerval Zanella
1d64e962ab i386: Update ulps
Required after 9acda61d94 "Fix the inaccuracy of j0f/j1f/y0f/y1f
[BZ #14469, #14470, #14471, #14472]".
2021-04-05 10:02:15 -03:00
H.J. Lu
3e2f285c5f nptl: Remove MULTI_PAGE_ALIASING [BZ #23554]
MULTI_PAGE_ALIASING was introduced to mitigate an aliasing issue on
Pentium 4.  It is no longer needed for processors after Pentium 4.
2021-03-19 15:04:17 -07:00
Florian Weimer
f01a61e138 i386: Regenerate ulps 2021-03-02 15:41:29 +01:00
Florian Weimer
fd19b84640 i386: Implement backtrace on top of <unwind-link.h>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-03-01 15:58:58 +01:00
Florian Weimer
9fc813e1a3 Implement <unwind-link.h> for dynamically loading the libgcc_s unwinder
This will be used to consolidate the libgcc_s access for backtrace
and pthread_cancel.

Unlike the existing backtrace implementations, it provides some
hardening based on pointer mangling.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-03-01 15:58:01 +01:00
Florian Weimer
035c012e32 Reduce the statically linked startup code [BZ #23323]
It turns out the startup code in csu/elf-init.c has a perfect pair of
ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A
New Method to Bypass 64-bit Linux ASLR").  These functions are not
needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY
are already processed by the dynamic linker.  However, the dynamic
linker skipped the main program for some reason.  For maximum
backwards compatibility, this is not changed, and instead, the main
map is consulted from __libc_start_main if the init function argument
is a NULL pointer.

For statically linked binaries, the old approach based on linker
symbols is still used because there is nothing else available.

A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because
new binaries running on an old libc would not run their ELF
constructors, leading to difficult-to-debug issues.
2021-02-25 12:13:02 +01:00
H.J. Lu
89de9d3958 x86: Use x86/nptl/pthreaddef.h
1. Move sysdeps/i386/nptl/pthreaddef.h to sysdeps/x86/nptl/pthreaddef.h.
2. Remove sysdeps/x86_64/nptl/pthreaddef.h.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-02-22 15:52:56 -08:00
Siddhesh Poyarekar
d46c51e9f9 i686: Regenerate ULPs 2021-02-03 23:16:39 +05:30
Szabolcs Nagy
374cef32ac configure: Check for static PIE support
Add SUPPORT_STATIC_PIE that targets can define if they support
static PIE. This requires PI_STATIC_AND_HIDDEN support and various
linker features as described in

  commit 9d7a3741c9
  Add --enable-static-pie configure option to build static PIE [BZ #19574]

Currently defined on x86_64, i386 and aarch64 where static PIE is
known to work.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-21 15:54:50 +00:00
H.J. Lu
6ea5b57afa x86: Check IFUNC definition in unrelocated executable [BZ #20019]
Calling an IFUNC function defined in unrelocated executable also leads to
segfault.  Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.
2021-01-04 12:01:01 -08:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
94547d9209 x86 long double: Support pseudo numbers in isnanl
This syncs up isnanl behaviour with gcc.  Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:40 +05:30
Siddhesh Poyarekar
b7f8815617 x86 long double: Support pseudo numbers in fpclassifyl
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:26 +05:30
Florian Weimer
bca0283815 i386: Regenerate ulps
For new inputs added in commit cad5ad81d2.
2020-12-21 18:19:03 +01:00
Samuel Thibault
407765e9f2 hurd: Fix ELF_MACHINE_USER_ADDRESS_MASK value
x86 binaries are linked at 0x08000000, so we need to let them get mapped
there.
2020-12-20 01:47:47 +01:00
Jakub Jelinek
1d9cbb9608 x86: Fix THREAD_SELF definition to avoid ld.so crash (bug 27004)
The previous definition of THREAD_SELF did not tell the compiler
that %fs (or %gs) usage is invalid for the !DL_LOOKUP_GSCOPE_LOCK
case in _dl_lookup_symbol_x.  As a result, ld.so could try to use the
TCB before it was initialized.

As the comment in tls.h explains, asm volatile is undesirable here.
Using the __seg_fs (or __seg_gs) namespace does not interfere with
optimization, and expresses that THREAD_SELF is potentially trapping.
2020-12-03 13:48:55 +01:00
Florian Weimer
1daccf403b nptl: Move stack list variables into _rtld_global
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-16 19:33:30 +01:00
Florian Weimer
0f34d426ac x86: Remove UP macro. Define LOCK_PREFIX unconditionally.
The UP macro is never defined.  Also define LOCK_PREFIX
unconditionally, to the same string.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-13 15:20:03 +01:00
Samuel Thibault
3d3316b1de hurd: keep only required PLTs in ld.so
We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so.
See Roland's comment in
https://sourceware.org/bugzilla/show_bug.cgi?id=15605
"in the Hurd it's crucial that calls like __mmap be the libc ones
instead of the rtld-local ones after the bootstrap phase, when the
dynamic linker is being used for dlopen and the like."

We used to just avoid all hidden use in the rtld ; this commit switches to
keeping only those that should use PLT calls, i.e. essentially those defined in
sysdeps/mach/hurd/dl-sysdep.c:

__assert_fail
__assert_perror_fail
__*stat64
_exit

This fixes a few startup issues, notably the call to __tunable_get_val that is
made before PLTs are set up.
2020-11-11 02:36:22 +01:00
H.J. Lu
0f09154c64 x86: Initialize CPU info via IFUNC relocation [BZ 26203]
X86 CPU features in ld.so are initialized by init_cpu_features, which is
invoked by DL_PLATFORM_INIT from _dl_sysdep_start.  But when ld.so is
loaded by static executable, DL_PLATFORM_INIT is never called.  Also
x86 cache info in libc.o and libc.a is initialized by a constructor
which may be called too late.  Since some fields in _rtld_global_ro
in ld.so are initialized by dynamic relocation, we can also initialize
x86 CPU features in _rtld_global_ro in ld.so and cache info in libc.so
by initializing dummy function pointers in ld.so and libc.so via IFUNC
relocation.

Key points:

1. IFUNC is always supported, independent of --enable-multi-arch or
--disable-multi-arch.  Linker generates IFUNC relocations from input
IFUNC objects and ld.so performs IFUNC relocations.
2. There are no IFUNC dependencies in ld.so before dynamic relocation
have been performed,
3. The x86 CPU features in ld.so is initialized by DL_PLATFORM_INIT
in dynamic executable and by IFUNC relocation in dlopen in static
executable.
4. The x86 cache info in libc.o is initialized by IFUNC relocation.
5. In libc.a, both x86 CPU features and cache info are initialized from
ARCH_INIT_CPU_FEATURES, not by IFUNC relocation, before __libc_early_init
is called.

Note: _dl_x86_init_cpu_features can be called more than once from
DL_PLATFORM_INIT and during relocation in ld.so.
2020-10-16 16:17:53 -07:00
Szabolcs Nagy
238032ead6 aarch64: enforce >=64K guard size [BZ #26691]
There are several compiler implementations that allow large stack
allocations to jump over the guard page at the end of the stack and
corrupt memory beyond that. See CVE-2017-1000364.

Compilers can emit code to probe the stack such that the guard page
cannot be skipped, but on aarch64 the probe interval is 64K by default
instead of the minimum supported page size (4K).

This patch enforces at least 64K guard on aarch64 unless the guard
is disabled by setting its size to 0.  For backward compatibility
reasons the increased guard is not reported, so it is only observable
by exhausting the address space or parsing /proc/self/maps on linux.

On other targets the patch has no effect. If the stack probe interval
is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can
be defined to get large enough stack guard on libc allocated stacks.

The patch does not affect threads with user allocated stacks.

Fixes bug 26691.
2020-10-02 09:57:44 +01:00
Florian Weimer
90ccfdf176 x86: Use one ldbl2mpn.c file for both i386 and x86_64 2020-09-22 17:58:39 +02:00
H.J. Lu
9620398097 x86: Install <sys/platform/x86.h> [BZ #26124]
Install <sys/platform/x86.h> so that programmers can do

 #if __has_include(<sys/platform/x86.h>)
 #include <sys/platform/x86.h>
 #endif
 ...

   if (CPU_FEATURE_USABLE (SSE2))
 ...
   if (CPU_FEATURE_USABLE (AVX2))
 ...

<sys/platform/x86.h> exports only:

enum
{
  COMMON_CPUID_INDEX_1 = 0,
  COMMON_CPUID_INDEX_7,
  COMMON_CPUID_INDEX_80000001,
  COMMON_CPUID_INDEX_D_ECX_1,
  COMMON_CPUID_INDEX_80000007,
  COMMON_CPUID_INDEX_80000008,
  COMMON_CPUID_INDEX_7_ECX_1,
  /* Keep the following line at the end.  */
  COMMON_CPUID_INDEX_MAX
};

struct cpuid_features
{
  struct cpuid_registers cpuid;
  struct cpuid_registers usable;
};

struct cpu_features
{
  struct cpu_features_basic basic;
  struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
};

/* Get a pointer to the CPU features structure.  */
extern const struct cpu_features *__x86_get_cpu_features
  (unsigned int max) __attribute__ ((const));

Since all feature checks are done through macros, programs compiled with
a newer <sys/platform/x86.h> are compatible with the older glibc binaries
as long as the layout of struct cpu_features is identical.  The features
array can be expanded with backward binary compatibility for both .o and
.so files.  When COMMON_CPUID_INDEX_MAX is increased to support new
processor features, __x86_get_cpu_features in the older glibc binaries
returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false on the
new processor feature.  No new symbol version is neeeded.

Both CPU_FEATURE_USABLE and HAS_CPU_FEATURE are provided.  HAS_CPU_FEATURE
can be used to identify processor features.

Note: Although GCC has __builtin_cpu_supports, it only supports a subset
of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE.  It
doesn't support HAS_CPU_FEATURE.
2020-09-11 17:20:52 -07:00
Patsy Griffin
86a912c863 Update i686 ulps.
Without this ULP patch these 3 tests fail on i686:
FAIL: math/test-float128-j0
FAIL: math/test-float64x-j0
FAIL: math/test-ldouble-j0

CPU info:
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel Xeon Processor (Cascadelake)
2020-09-02 10:00:29 -04:00
Adhemerval Zanella
5ff35e9544 math: Update x86_64 ulps
From new j0 test.
2020-08-08 16:43:11 -03:00
H.J. Lu
107e6a3c22 x86: Support usable check for all CPU features
Support usable check for all CPU features with the following changes:

1. Change struct cpu_features to

struct cpuid_features
{
  struct cpuid_registers cpuid;
  struct cpuid_registers usable;
};

struct cpu_features
{
  struct cpu_features_basic basic;
  struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
...
};

so that there is a usable bit for each cpuid bit.
2. After the cpuid bits have been initialized, copy the known bits to the
usable bits.  EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for
CPU feature detection.
3. Clear the usable bits which require OS support.
4. If the feature is supported by OS, copy its cpuid bit to its usable
bit.
5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE
and CPU_FEATURE_USABLE_P to check if a feature is usable.
6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13.
7. Unset MPX feature since it has been deprecated.

The results are

1. If the feature is known and doesn't requre OS support, its usable bit
is copied from the cpuid bit.
2. Otherwise, its usable bit is copied from the cpuid bit only if the
feature is known to supported by OS.
3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the
feature can be used.
4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports
the feature.
2020-07-13 06:05:16 -07:00
H.J. Lu
9016b6f389 x86: Remove the unused __x86_prefetchw
Since

commit c867597bff
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Jun 8 13:57:50 2016 -0700

    X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove

removed the only usage of __x86_prefetchw, we can remove the unused
__x86_prefetchw.
2020-07-11 09:34:03 -07:00
Patsy Franklin
b21c2c24ed Update i686 libm-test-ulps
Without my ULP patch these 18 tests fail on i686:
 https://koji.fedoraproject.org/koji/taskinfo?taskID=46467301

+ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel Xeon Processor (Cascadelake)

FAIL: math/test-double-j0
FAIL: math/test-double-y0
FAIL: math/test-float-erfc
FAIL: math/test-float-j0
FAIL: math/test-float-j1
FAIL: math/test-float-lgamma
FAIL: math/test-float-tgamma
FAIL: math/test-float-y0
FAIL: math/test-float32-erfc
FAIL: math/test-float32-j0
FAIL: math/test-float32-j1
FAIL: math/test-float32-lgamma
FAIL: math/test-float32-tgamma
FAIL: math/test-float32-y0
FAIL: math/test-float32x-j0
FAIL: math/test-float32x-y0
FAIL: math/test-float64-j0
FAIL: math/test-float64-y0

With my ULP patch applied these tests now pass:
 https://koji.fedoraproject.org/koji/taskinfo?taskID=46436310
2020-07-09 23:43:25 -04:00
Adhemerval Zanella
b24381e50f i386: Use builtin sqrtl
Checked on i686-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
4b2d8e4442 i386: Use generic exp10f
The generic implementation is twice as fast.  Using the exp10f
benchmark:

 * master:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.02967e+09,
    "iterations": 4.768e+07,
    "reciprocal-throughput": 18.3579,
    "latency": 24.8331,
    "max-throughput": 5.44725e+07,
    "min-throughput": 4.02688e+07
   }
  }

 * patched:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.01821e+09,
    "iterations": 6.1984e+07,
    "reciprocal-throughput": 13.1975,
    "latency": 19.6563,
    "max-throughput": 7.57719e+07,
    "min-throughput": 5.08743e+07
   }
  }

Checked on i686-linux-gnu.
2020-06-19 10:48:15 -03:00
Samuel Thibault
02937d825a hurd: fix clearing SS_ONSTACK when longjmp-ing from sighandler
* sysdeps/i386/htl/Makefile: New file.
* sysdeps/i386/htl/tcb-offsets.sym: New file.
* sysdeps/mach/hurd/i386/Makefile [setjmp] (gen-as-const-headers): Add
signal-defines.sym.
* sysdeps/mach/hurd/i386/____longjmp_chk.S: Include tcb-offsets.h.
(____longjmp_chk): Harmonize with i386's __longjmp. Clear SS_ONSTACK
when jumping off the alternate stack.
* sysdeps/mach/hurd/i386/__longjmp.S: New file.
2020-06-06 20:24:30 +02:00
Florian Weimer
fff30716a7 i386: Remove NO_TLS_DIRECT_SEG_REFS handling
This was needed for 32-bit PV Xen, which has been superseded by this
point according to Xen developers.
2020-05-28 11:53:08 +02:00
Samuel Thibault
415d0b0b3f Update i386 libm-test-ulps 2020-05-26 13:21:57 +02:00
H.J. Lu
674ea88294 x86: Move CET control to _dl_x86_feature_control [BZ #25887]
1. Include <dl-procruntime.c> to get architecture specific initializer in
rtld_global.
2. Change _dl_x86_feature_1[2] to _dl_x86_feature_1.
3. Add _dl_x86_feature_control after _dl_x86_feature_1, which is a
struct of 2 bitfields for IBT and SHSTK control

This fixes [BZ #25887].
2020-05-18 06:15:02 -07:00
Joseph Myers
8d9ffbb9d0 Remove most gmp-mparam.h headers.
Most gmp-mparam.h headers in glibc define various macros to the same
values they would be defined to by the generic version of that header,
plus macros IEEE_DOUBLE_BIG_ENDIAN or IEEE_DOUBLE_MIXED_ENDIAN related
to the representation of double.  The latter macros are in turn only
used in gmp-impl.h to define union ieee_double_extract, which is not
used in glibc.  Thus all of these headers, except for the generic one
and those that define _LONG_LONG_LIMB for ILP32 configurations with
64-bit registers, are redundant, and this patch removes them.

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
2020-04-24 22:08:59 +00:00
H.J. Lu
93a0959ef2 i386: Remove build support for GCC older than GCC 6
Since GCC 6.2 or later is required to build glibc, remove build support
for GCC older than GCC 6.

Testd with GCC 6.4 and GCC 9.3.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-04-07 06:44:37 -07:00
Paul Zimmermann
a9d42c09a3 math: Add inputs that yield larger errors for float type (x86_64)
The corner cases included were generated using exhaustive search
for all float/binary32 values on x86_64 (comparing to MPFR for
correct rounding to nearest).

For the j0/j1/y0 functions, only cases with ulp error <= 9 were
included.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-03-31 21:48:54 -04:00
Adhemerval Zanella
1c15464ca0 math: Remove inline math tests
With mathinline removal there is no need to keep building and testing
inline math tests.

The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries.  The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-03-19 11:45:44 -03:00
Adhemerval Zanella
b5b7fb76e1 i386: Use comdat instead of .gnu.linkonce for i386 setup pic register (BZ #20543)
GCC has moved from using .gnu.linkonce for i386 setup pic register with
minimum current version (as for binutils minimum binutils that support
comdat).

Trying to pinpoint when binutils has added comdat support for i686, it
seems it was around 2004 [1].  I also checking with some ancient
binutils older than 2.16 I see:

test.o: In function `__x86.get_pc_thunk.bx':
test.o(.text.__x86.get_pc_thunk.bx+0x0): multiple definition of `__x86.get_pc_thunk.bx'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../i386-linux-gnu/crti.o(.gnu.linkonce.t.__x86.get_pc_thunk.bx+0x0): first defined here

Which seems that such version can not handle either comdat at all or
a mix of linkonce and comdat.  For binutils 2.16.1 I am getting a
different issue trying to link a binary with and more recent
ctri.o (unrecognized relocation (0x2b) in section `.init', which is
R_386_GOT32X and old binutils won't generate it anyway).

So I think that either unlikely someone will use an older binutils than
the one used to glibc and even this scenario may fail with some issue
as the R_386_GOT32X.  Also, 2.16.1 is quite old and not really supported
(glibc itself required 2.25).

Checked on i686-linux-gnu.

[1] https://gcc.gnu.org/ml/gcc/2004-05/msg00030.html
2020-02-28 13:58:08 -03:00
Florian Weimer
fe49a73316 x86: Avoid single-argument _Static_assert in <tls.h>
Older GCC versions do not support this extension.  Fixes commit f1bdee6179
("x86 tls: Use _Static_assert for TLS access size assertion").
2020-02-17 11:12:03 +01:00
Samuel Thibault
f1bdee6179 x86 tls: Use _Static_assert for TLS access size assertion 2020-02-17 00:40:39 +01:00
Adhemerval Zanella
bc2eb9321e linux: Remove INTERNAL_SYSCALL_DECL
With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, the INTERNAL_SYSCALL_DECL is an empty declaration
on all ports.

This patch removes the 'err' argument on INTERNAL_SYSCALL* macro
and remove the INTERNAL_SYSCALL_DECL usage.

Checked with a build against all affected ABIs.
2020-02-14 21:12:45 -03:00
Adhemerval Zanella
fcb78a5505 linux: Consolidate INLINE_SYSCALL
With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, there is no need to replicate the INLINE_SYSCALL.

The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__,
which is ok now and it allows cleanup some archaic code that assume
otherwise.

Checked with a build against all affected ABIs.
2020-02-14 21:09:12 -03:00
H.J. Lu
0455f251f4 i386: Use ENTRY/END in assembly codes
Use ENTRY and END in assembly codes so that ENDBR32 will be added at
function entries when CET is enabled.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
825b58f3fb i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__
Since _mcount and __fentry__ don't use ENTRY, we need to add _CET_ENDBR
by hand.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
4031d7484a i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump target
Add a missing _CET_ENDBR to indirect jump targe in sysdeps/i386/sub_n.S.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
Samuel Thibault
196e62cbe4 htl: Add type sizes in bits/pthreadtypes-arch.h and check them 2020-01-13 01:24:43 +01:00
Wilco Dijkstra
220622dde5 Add libm_alias_finite for _finite symbols
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol.  It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).

The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.

Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).

Passes buildmanyglibc.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:04 -03:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Florian Weimer
4db71d2f98 elf: Do not run IFUNC resolvers for LD_DEBUG=unused [BZ #24214]
This commit adds missing skip_ifunc checks to aarch64, arm, i386,
sparc, and x86_64.  A new test case ensures that IRELATIVE IFUNC
resolvers do not run in various diagnostic modes of the dynamic
loader.

Reviewed-By: Szabolcs Nagy <szabolcs.nagy@arm.com>
2019-12-02 14:55:22 +01:00
Adhemerval Zanella
48dbce60cf nptl: Add tests for internal pthread_rwlock_t offsets
This patch new build tests to check for internal fields offsets for
internal pthread_rwlock_t definition.  Althoug the '__data.__flags'
field layout should be preserved due static initializators, the patch
also adds tests for the futexes that may be used in a shared memory
(although using different libc version in such scenario is not really
supported).

Checked with a build against all affected ABIs.

Change-Id: Iccc103d557de13d17e4a3f59a0cad2f4a640c148
2019-11-26 13:53:36 +00:00
Adhemerval Zanella
71d260c107 nptl: Cleanup mutex internal offset tests
The offsets of pthread_mutex_t __data.__nusers, __data.__spins,
__data.elision, __data.list are not required to be constant over
the releases.  Only the __data.__kind is used for static
initializers.

This patch also adds an additional size check for __data.__kind.

Checked with a build against affected ABIs.

Change-Id: I7a4e48cc91b4c4ada57e9a5d1b151fb702bfaa9f
2019-11-26 13:53:36 +00:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Andreas Schwab
b72971845a Update i386 libm-test-ulps 2019-08-20 17:06:02 +02:00
Andreas Schwab
56e098118a Update i386 libm-test-ulps 2019-08-15 12:25:51 +02:00
H.J. Lu
d039da1c00 x86: Add sysdeps/x86/dl-lookupcfg.h
Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are
identical, we can replace them with sysdeps/x86/dl-lookupcfg.h.

	* sysdeps/i386/dl-lookupcfg.h: Moved to ...
	* sysdeps/x86/dl-lookupcfg.h: Here.
	* sysdeps/x86_64/dl-lookupcfg.h: Removed.
2019-06-26 15:07:28 -07:00
Joseph Myers
e0cb7b6131 Add and move fall-through comments in system-specific code.
This patch fixes -Wimplicit-fallthrough warnings in system-specific
code that show up building glibc with -Wextra, by adding fall-through
comments, or moving existing such comments to the place required for
them to work (immediately before the case label being fallen through).

Tested with build-many-glibcs.py.

	* sysdeps/i386/dl-machine.h (elf_machine_rela): Add fall-through
	comments.
	* sysdeps/m68k/m680x0/fpu/s_cexp_template.c (s(__cexp)): Likewise.
	* sysdeps/m68k/memcopy.h (WORD_COPY_FWD): Likewise.
	(WORD_COPY_BWD): Likewise.
	* sysdeps/mach/hurd/ioctl.c (__ioctl): Likewise.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/s390/iso-8859-1_cp037_z900.c (TR_LOOP): Likewise.
	* sysdeps/mips/dl-machine.h (elf_machine_reloc): Move fall-through
	comment.
	* sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise.
2019-02-26 02:09:18 +00:00
Andreas Schwab
65f7767a91 Fix handling of collating elements in fnmatch (bug 17396, bug 16976)
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for
regexp matching.  As a side effect it also removes the use of an unbound
VLA.
2019-02-04 15:45:02 +01:00
Adhemerval Zanella
0b13e25581 i386: Remove bogus THREAD_ATOMIC_* macros
The x86 defines optimized THREAD_ATOMIC_* macros where reference always
the current thread instead of the one indicated by input 'descr' argument.
It work as long the input is the self thread pointer, however it generates
wrong code if the semantic is to set a bit atomicialy from another thread.

This is not an issue for current GLIBC usage, however the new cancellation
code expects that some synchronization code to atomically set bits from
different threads.

If some usage indeed proves to be a hotspot we can add an extra macro
with a more descriptive name (THREAD_ATOMIC_BIT_SET_SELF for instance)
where i386 might optimize it.

Checked on i686-linux-gnu.

	* sysdeps/i686/nptl/tls.h (THREAD_ATOMIC_CMPXCHG_VAL,
	THREAD_ATOMIC_AND, THREAD_ATOMIC_BIT_SET): Remove macros.
2019-01-03 18:38:15 -02:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
H.J. Lu
cd815050e5 x86: Merge i386/x86_64 atomic-machine.h
Merge i386 and x86_64 atomic-machine.h to x86 atomic-machine.h.

Tested on i686 and x86_64 as well as with build-many-glibcs.py.

	* sysdeps/i386/atomic-machine.h: Merged with ...
	* sysdeps/x86_64/atomic-machine.h: To ...
	* sysdeps/x86/atomic-machine.h: This.  New file.
2018-12-18 04:25:26 -08:00
Szabolcs Nagy
a502c5294b Remove the error handling wrapper from pow
Introduce new pow symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_pow.c and enabled for targets with their own pow implementation or
ifunc dispatch on __ieee754_pow by including math/w_pow.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously powl was an alias of pow, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __pow_finite symbol is now an alias of pow.  Both __pow_finite and
pow set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that
may affect that header.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add pow.
	* math/w_pow_compat.c (__pow_compat): Change to versioned compat
	symbol.
	* math/w_pow.c: New file.
	* sysdeps/i386/fpu/w_pow.c: New file.
	* sysdeps/ia64/fpu/e_pow.S: Add versioned symbols.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_pow.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_pow.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to
	__pow.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.
2018-11-21 09:58:36 +00:00
Szabolcs Nagy
718d6542f2 Remove the error handling wrapper from log2
Introduce new log2 symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log2.c and enabled for targets with their own log2 implementation by
including math/w_log2.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously log2l was an alias of log2, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __log2_finite symbol is now an alias of log2.  Both __log2_finite
and log2 set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add log2.
	* math/w_log2_compat.c (__log2_compat): Change to versioned compat
	symbol.
	* math/w_log2.c: New file.
	* sysdeps/i386/fpu/w_log2.c: New file.
	* sysdeps/ia64/fpu/e_log2.S: Add versioned symbols.
	* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_log2.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_log2.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
2018-11-21 09:57:21 +00:00
Szabolcs Nagy
f29b7c492d Remove the error handling wrapper from log
Introduce new log symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log.c and enabled for targets with their own log implementation by
including math/w_log.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously logl was an alias of log, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __log_finite symbol is now an alias of log.  Both __log_finite and
log set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that may
affect that header.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add log.
	* math/w_log_compat.c (__log_compat): Change to versioned compat
	symbol.
	* math/w_log.c: New file.
	* sysdeps/i386/fpu/w_log.c: New file.
	* sysdeps/ia64/fpu/e_log.S: Update.
	* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_log.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_log.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to
	__log.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/w_log.c: New file.
2018-11-21 09:56:27 +00:00
Szabolcs Nagy
c20a10561a Remove the error handling wrapper from exp and exp2
Introduce new exp and exp2 symbol version that don't do SVID compatible
error handling.  The standard errno and fp exception based error handling
is inline in the new code and does not have significant overhead.

The double precision wrappers are disabled for sysdeps/ieee754/dbl-64
by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and
math/w_exp2.c files use the wrapper template and can be included by
targets that have their own exp and exp2 implementations or use ifunc
on the glibc internal __ieee754_exp symbol.

The compatibility symbol versions still use the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously expl and exp2l were aliases of exp and exp2,
now they point to the compatibility symbols with the wrapper, because
they still need the SVID compatible error handling.  This affects
NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets
as well.

The _finite symbols are now aliases of the standard symbols (they have
no performance advantage anymore).  Both the standard symbols and
_finite symbols set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that may
affect that header (the new macro name is __exp instead of __ieee754_exp
which breaks some math.h macros).

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add exp and exp2.
	* math/w_exp2_compat.c (__exp2_compat): Change to versioned compat
	symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly.
	* math/w_exp_compat.c (__exp_compat): Likewise.
	* math/w_exp.c: New file.
	* math/w_exp2.c: New file.
	* sysdeps/i386/fpu/w_exp.c: New file.
	* sysdeps/i386/fpu/w_exp2.c: New file.
	* sysdeps/ia64/fpu/e_exp.S: Add versioned symbols.
	* sysdeps/ia64/fpu/e_exp2.S: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_exp.c: New file.
	* sysdeps/ieee754/dbl-64/w_exp2.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp2.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to
	__exp.
	* sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.
2018-11-21 09:55:02 +00:00
Joseph Myers
9a7c643ac2 Fix i686 build with GCC 9.
This patch fixes the glibc build for i686 with current mainline GCC,
where there are warnings about inconsistent attributes for aliases in
certain files defining libm IFUNCs.

In three of the files, the aliases were defined in terms of internal
symbols such as __sinf, and copied attributes from file-local
declarations of those functions which lacked the nothrow attribute.
Since the nothrow attribute is present on the declarations from
<math.h> (which include declarations of those __-prefixed functions),
the natural fix was to include <math.h> in those files, replacing the
local declarations.

In the other three files, a more complicated __hidden_ver1 call was
involved in the warnings.  <math.h> has not been included at this
point and, furthermore, it is included indirectly only later in the
source file after macros have been defined to remap a function name
therein.  So there isn't an obvious declaration from which to copy the
attribute and it seems simplest and safest just to add __THROW to the
hidden_ver1 calls.

Tested for i686 (build-many-glibcs.py compilers build for
x86_64-linux-gnu with GCC mainline; full testsuite run with GCC 7).

	* sysdeps/i386/i686/fpu/multiarch/e_expf.c [SHARED]: Use __THROW
	with __hidden_ver1 call.
	* sysdeps/i386/i686/fpu/multiarch/e_log2f.c [SHARED]: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/e_logf.c [SHARED]: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: Include <math.h>.
	(__cosf): Do not declare here.
	* sysdeps/i386/i686/fpu/multiarch/s_sincosf.c: Include <math.h>.
	(__sincosf): Do not declare here.
	* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: Include <math.h>.
	(__sinf): Do not declare here.
2018-11-12 18:47:05 +00:00
H.J. Lu
72771e5375 x86: Use _rdtsc intrinsic for HP_TIMING_NOW
Since _rdtsc intrinsic is supported in GCC 4.9, we can use it for
HP_TIMING_NOW.  This patch

1. Create x86 hp-timing.h to replace i686 and x86_64 hp-timing.h.
2. Move MINIMUM_ISA from init-arch.h to isa.h so that x86 hp-timing.h
can check minimum x86 ISA to decide if _rdtsc can be used.

NB: Checking if __i686__ isn't sufficient since __i686__ may not be
defined when building for i686 class processors.

	* sysdeps/i386/init-arch.h: Removed.
	* sysdeps/i386/i586/init-arch.h: Likewise.
	* sysdeps/i386/i686/init-arch.h: Likewise.
	* sysdeps/i386/i686/hp-timing.h: Likewise.
	* sysdeps/x86_64/hp-timing.h: Likewise.
	* sysdeps/i386/isa.h: New file.
	* sysdeps/i386/i586/isa.h: Likewise.
	* sysdeps/i386/i686/isa.h: Likewise.
	* sysdeps/x86_64/isa.h: Likewise.
	* sysdeps/x86/hp-timing.h: New file.
	* sysdeps/x86/init-arch.h: Include <isa.h>.
2018-10-17 15:16:45 -07:00
Joseph Myers
c52944e8cc Remove unnecessary math_private.h includes.
After my changes to move various macros, inlines and other content
from math_private.h to more specific headers, many files including
math_private.h no longer need to do so.  Furthermore, since the
optimized inlines of various functions have been moved to
include/fenv.h or replaced by use of function names GCC inlines
automatically, a missing math_private.h include where one is
appropriate will reliably cause a build failure rather than possibly
causing code to be less well optimized while still building
successfully.  Thus, this patch removes includes of math_private.h
that are now unnecessary.  In the case of two RISC-V files, the
include is replaced by one of stdbool.h because the files in question
were relying on math_private.h to get a definition of bool.

Tested for x86_64 and x86, and with build-many-glibcs.py.

	* math/fromfp.h: Do not include <math_private.h>.
	* math/s_cacosh_template.c: Likewise.
	* math/s_casin_template.c: Likewise.
	* math/s_casinh_template.c: Likewise.
	* math/s_ccos_template.c: Likewise.
	* math/s_cproj_template.c: Likewise.
	* math/s_fdim_template.c: Likewise.
	* math/s_fmaxmag_template.c: Likewise.
	* math/s_fminmag_template.c: Likewise.
	* math/s_iseqsig_template.c: Likewise.
	* math/s_ldexp_template.c: Likewise.
	* math/s_nextdown_template.c: Likewise.
	* math/w_log1p_template.c: Likewise.
	* math/w_scalbln_template.c: Likewise.
	* sysdeps/aarch64/fpu/feholdexcpt.c: Likewise.
	* sysdeps/aarch64/fpu/fesetround.c: Likewise.
	* sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise.
	* sysdeps/aarch64/fpu/ftestexcept.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrintf.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrintf.c: Likewise.
	* sysdeps/i386/fpu/s_atanl.c: Likewise.
	* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
	* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
	* sysdeps/i386/fpu/s_fdim.c: Likewise.
	* sysdeps/i386/fpu/s_logbl.c: Likewise.
	* sysdeps/i386/fpu/s_rintl.c: Likewise.
	* sysdeps/i386/fpu/s_significandl.c: Likewise.
	* sysdeps/ia64/fpu/s_matherrf.c: Likewise.
	* sysdeps/ia64/fpu/s_matherrl.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_atan.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fma.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise.
	* sysdeps/ieee754/k_standardf.c: Likewise.
	* sysdeps/ieee754/k_standardl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fma.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise.
	* sysdeps/ieee754/s_signgam.c: Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_floor.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise.
	* sysdeps/riscv/rvd/s_finite.c: Likewise.
	* sysdeps/riscv/rvd/s_fmax.c: Likewise.
	* sysdeps/riscv/rvd/s_fmin.c: Likewise.
	* sysdeps/riscv/rvd/s_fpclassify.c: Likewise.
	* sysdeps/riscv/rvd/s_isinf.c: Likewise.
	* sysdeps/riscv/rvd/s_isnan.c: Likewise.
	* sysdeps/riscv/rvd/s_issignaling.c: Likewise.
	* sysdeps/riscv/rvf/fegetround.c: Likewise.
	* sysdeps/riscv/rvf/feholdexcpt.c: Likewise.
	* sysdeps/riscv/rvf/fesetenv.c: Likewise.
	* sysdeps/riscv/rvf/fesetround.c: Likewise.
	* sysdeps/riscv/rvf/feupdateenv.c: Likewise.
	* sysdeps/riscv/rvf/fgetexcptflg.c: Likewise.
	* sysdeps/riscv/rvf/ftestexcept.c: Likewise.
	* sysdeps/riscv/rvf/s_ceilf.c: Likewise.
	* sysdeps/riscv/rvf/s_finitef.c: Likewise.
	* sysdeps/riscv/rvf/s_floorf.c: Likewise.
	* sysdeps/riscv/rvf/s_fmaxf.c: Likewise.
	* sysdeps/riscv/rvf/s_fminf.c: Likewise.
	* sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise.
	* sysdeps/riscv/rvf/s_isinff.c: Likewise.
	* sysdeps/riscv/rvf/s_isnanf.c: Likewise.
	* sysdeps/riscv/rvf/s_issignalingf.c: Likewise.
	* sysdeps/riscv/rvf/s_nearbyintf.c: Likewise.
	* sysdeps/riscv/rvf/s_roundevenf.c: Likewise.
	* sysdeps/riscv/rvf/s_roundf.c: Likewise.
	* sysdeps/riscv/rvf/s_truncf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of
	<math_private.h>.
	* sysdeps/riscv/rvf/s_rintf.c: Likewise.
2018-09-28 21:53:33 +00:00
H.J. Lu
7b1f940676 i386: Use _dl_runtime_[resolve|profile]_shstk for SHSTK [BZ #23716]
When elf_machine_runtime_setup is called to set up resolver, it should
use _dl_runtime_resolve_shstk or _dl_runtime_profile_shstk if SHSTK is
enabled by kernel.

Tested on i686 with and without --enable-cet as well as on CET emulator
with --enable-cet.

	[BZ #23716]
	* sysdeps/i386/dl-cet.c: Removed.
	* sysdeps/i386/dl-machine.h (_dl_runtime_resolve_shstk): New
	prototype.
	(_dl_runtime_profile_shstk): Likewise.
	(elf_machine_runtime_setup): Use _dl_runtime_profile_shstk or
	_dl_runtime_resolve_shstk if SHSTK is enabled by kernel.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2018-09-28 13:31:30 -07:00
Szabolcs Nagy
424c4f60ed Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent).  The __exp1 internal symbol
is no longer necessary.

There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases.  The lookup table and consts for
log are 4168 bytes.  The .rodata+.text is decreased by 37908 bytes on
aarch64.  The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.

	* NEWS: Mention pow improvements.
	* math/Makefile (type-double-routines): Add e_pow_log_data.
	* sysdeps/generic/math_private.h (__exp1): Remove.
	* sysdeps/i386/fpu/e_pow_log_data.c: New file.
	* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
	contraction.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
	(exp_inline): Remove.
	(__ieee754_exp): Only single double input is handled.
	* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
	(__pow_log_data): Define.
	* sysdeps/ieee754/dbl-64/upow.h: Remove.
	* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
	contraction.
	(CFLAGS-e_pow-fma4.c): Likewise.
2018-09-19 10:04:51 +01:00
Joseph Myers
f29b6f17e4 Use rint functions not __rint functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __rint functions to call the
corresponding rint names instead, with asm redirection to __rint when
the calls are not inlined.  The x86_64 math_private.h is removed as no
longer useful after this patch.

This patch is relative to a tree with my floor patch
<https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied,
and much the same considerations arise regarding possibly replacing an
IFUNC call with a direct inline expansion.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_rintf.c: Likewise.
	* sysdeps/alpha/fpu/s_rint.c: Likewise.
	* sysdeps/alpha/fpu/s_rintf.c: Likewise.
	* sysdeps/i386/fpu/s_rintl.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_rint.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise.
	* sysdeps/ieee754/float128/s_rintf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_rintf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise.
	* sysdeps/powerpc/fpu/s_rint.c: Likewise.
	* sysdeps/powerpc/fpu/s_rintf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_rint.c: Likewise.
	* sysdeps/riscv/rvf/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/math_private.h: Remove file.
	* math/e_scalb.c (invalid_fn): Use rint functions instead of
	__rint variants.
	* math/e_scalbf.c (invalid_fn): Likewise.
	* math/e_scalbl.c (invalid_fn): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
	* sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
2018-09-14 13:10:39 +00:00
Szabolcs Nagy
3e08ff544b Add new log2 implementation
Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c)
where the last term is approximated by a polynomial of x/c - 1, the first
order coefficient is about 1/ln2 in this case.

There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, for which the table size is doubled.

The worst case error is 0.547 ULP (0.55 without fma), the read only
global data size is 1168 bytes (2192 without fma) on aarch64.  The
non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log2 thruput: 2.00x in [0.01 11.1]
log2 latency: 2.04x in [0.01 11.1]
log2 thruput: 2.17x in [0.999 1.001]
log2 latency: 2.88x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linxu-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log2 improvements.
	* math/Makefile (type-double-routines): Add e_log2_data.
	* sysdeps/i386/fpu/e_log2_data.c: New file.
	* sysdeps/ia64/fpu/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
	* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
2018-09-12 17:36:33 +01:00
Szabolcs Nagy
f41b0a43e4 Add new log implementation
Optimized log using carefully generated lookup table with 1/c and log(c)
values for small intervalls around 1.  The log(c) is very near a double
precision value, it has about 62 bits precision.  The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial of x/c - 1.  Near 1 a single polynomial of
x - 1 is used.

There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, in which case the table size is doubled.
The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined
as an instruction.

With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma).
The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log thruput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log thruput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log improvement.
	* math/Makefile (type-double-routines): Add e_log_data.
	* sysdeps/i386/fpu/e_log_data.c: New file.
	* sysdeps/ia64/fpu/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove.
	* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
2018-09-12 17:33:30 +01:00
H.J. Lu
5a274db4ea i386: Use ENTRY and END in start.S [BZ #23606]
Wrapping the _start function with ENTRY and END to insert ENDBR32 at
function entry when CET is enabled.  Since _start now includes CFI,
without "cfi_undefined (eip)", unwinder may not terminate at _start
and we will get

Program received signal SIGSEGV, Segmentation fault.
0xf7dc661e in ?? () from /lib/libgcc_s.so.1
Missing separate debuginfos, use: dnf debuginfo-install libgcc-8.2.1-3.0.fc28.i686
(gdb) bt
 #0  0xf7dc661e in ?? () from /lib/libgcc_s.so.1
 #1  0xf7dc7c18 in _Unwind_Backtrace () from /lib/libgcc_s.so.1
 #2  0xf7f0d809 in __GI___backtrace (array=array@entry=0xffffc7d0,
    size=size@entry=20) at ../sysdeps/i386/backtrace.c:127
 #3  0x08049254 in compare (p1=p1@entry=0xffffcad0, p2=p2@entry=0xffffcad4)
    at backtrace-tst.c:12
 #4  0xf7e2a28c in msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0,
    n=n@entry=2) at msort.c:65
 #5  0xf7e29f64 in msort_with_tmp (n=2, b=0xffffcad0, p=0xffffca5c)
    at msort.c:53
 #6  msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=5)
    at msort.c:53
 #7  0xf7e29f64 in msort_with_tmp (n=5, b=0xffffcad0, p=0xffffca5c)
    at msort.c:53
 #8  msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=10)
    at msort.c:53
 #9  0xf7e29f64 in msort_with_tmp (n=10, b=0xffffcad0, p=0xffffca5c)
    at msort.c:53
 #10 msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=20)
    at msort.c:53
 #11 0xf7e2a5b6 in msort_with_tmp (n=20, b=0xffffcad0, p=0xffffca5c)
    at msort.c:297
 #12 __GI___qsort_r (b=b@entry=0xffffcad0, n=n@entry=20, s=s@entry=4,
    cmp=cmp@entry=0x8049230 <compare>, arg=arg@entry=0x0) at msort.c:297
 #13 0xf7e2a84d in __GI_qsort (b=b@entry=0xffffcad0, n=n@entry=20, s=s@entry=4,
    cmp=cmp@entry=0x8049230 <compare>) at msort.c:308
 #14 0x080490f6 in main (argc=2, argv=0xffffcbd4) at backtrace-tst.c:39

FAIL: debug/backtrace-tst

	[BZ #23606]
	* sysdeps/i386/start.S: Include <sysdep.h>
	(_start): Use ENTRY/END to insert ENDBR32 at entry when CET is
	enabled.  Add cfi_undefined (eip).

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2018-09-12 08:41:26 -07:00
Szabolcs Nagy
e70c176825 Add new exp and exp2 implementations
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2.  There are several variants, see e_exp_data.c,
they can be selected by modifying math_config.h allowing different
tradeoffs.

The default selection should be acceptable as generic libm code.
Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on
aarch64 the rodata size is 2160 bytes, shared between exp and exp2.
On aarch64 .text + .rodata size decreased by 24912 bytes.

The non-nearest rounding error is less than 1 ULP even on targets
without efficient round implementation (although the error rate is
higher in that case).  Targets with single instruction, rounding mode
independent, to nearest integer rounding and conversion can use them
by setting TOINT_INTRINSICS and adding the necessary code to their
math_private.h.

The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.

New double precision error handling code was added following the
style of the single precision error handling code.

Improvements on Cortex-A72 compared to current glibc master:
exp thruput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp thruput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 thruput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]

For small (< 1) inputs the current exp code uses a separate algorithm
so the speed up there is less.

Was tested on
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets,
only non-nearest rounding ulp errors increase and they are within
acceptable bounds (ulp updates are in separate patches).

	* NEWS: Mention exp and exp2 improvements.
	* math/Makefile (libm-support): Remove t_exp.
	(type-double-routines): Add math_err and e_exp_data.
	* sysdeps/aarch64/libm-test-ulps: Update.
	* sysdeps/arm/libm-test-ulps: Update.
	* sysdeps/i386/fpu/e_exp_data.c: New file.
	* sysdeps/i386/fpu/math_err.c: New file.
	* sysdeps/i386/fpu/t_exp.c: Remove.
	* sysdeps/ia64/fpu/e_exp_data.c: New file.
	* sysdeps/ia64/fpu/math_err.c: New file.
	* sysdeps/ia64/fpu/t_exp.c: Remove.
	* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
	* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
	* sysdeps/ieee754/dbl-64/math_config.h: New file.
	* sysdeps/ieee754/dbl-64/math_err.c: New file.
	* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
	* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
	* sysdeps/ieee754/dbl-64/uexp.h: Remove.
	* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
	* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
	* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
	* sysdeps/powerpc/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2018-09-05 16:22:00 +01:00
Paul Pluzhnikov
a6e8926f8d [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
Joseph Myers
ff6b24501f Split fenv_private.h out of math_private.h more consistently.
On some architectures, the parts of math_private.h relating to the
floating-point environment are in a separate file fenv_private.h
included from math_private.h.  As this is purely an
architecture-specific convention used by several architectures,
however, all such architectures still need their own math_private.h,
even if it has nothing to do beyond #include <fenv_private.h> and
peculiarity of including the i386 file directly instead of having a
shared file in sysdeps/x86.

This patch makes the fenv_private.h name an architecture-independent
convention in glibc.  The include of fenv_private.h from
math_private.h becomes architecture-independent (until callers are
updated to include fenv_private.h directly so the include from
math_private.h is no longer needed).  Some architecture math_private.h
headers are removed if no longer needed, or renamed to fenv_private.h
if all they define belongs in that header; architecture fenv_private.h
headers now do require #include_next <fenv_private.h>.  The i386
fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is
actually shared with x86_64.  The generic math_private.h gets a new
include of <stdbool.h>, as needed for bool in some prototypes in that
header (previously that was indirectly included via include/fenv.h,
which now only gets included too late in math_private.h, after those
prototypes).

Tested for x86_64 and x86, and tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.

	* sysdeps/aarch64/fpu/fenv_private.h: New file.  Based on ....
	* sysdeps/aarch64/fpu/math_private.h: ... this file.  All contents
	moved to fenv_private.h except for ...
	(TOINT_INTRINSICS): Kept in math_private.h.
	(roundtoint): Likewise.
	(converttoint): Likewise.
	* sysdeps/arm/fenv_private.h: Change multiple-include guard to
	[ARM_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/arm/math_private.h: Remove.
	* sysdeps/generic/fenv_private.h: New file.  Contents moved from
	....
	* sysdeps/generic/math_private.h: ... this file.  Include
	<stdbool.h>.  Do not include <fenv.h> or <get-rounding-mode.h>.
	Include <fenv_private.h>.  Remove functions and macros moved to
	fenv_private.h.
	* sysdeps/i386/fpu/math_private.h: Remove.
	* sysdeps/mips/math_private.h: Move to ....
	* sysdeps/mips/fpu/fenv_private.h: ... here.  Change
	multiple-include guard to [MIPS_FENV_PRIVATE_H].  Remove
	[__mips_hard_float] conditional.  Include next <fenv_private.h>.
	* sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include
	guard to [POWERPC_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/powerpc/fpu/math_private.h: Do not include
	<fenv_private.h>.
	* sysdeps/riscv/rvf/math_private.h: Move to ....
	* sysdeps/riscv/rvf/fenv_private.h: ... here.  Change
	multiple-include guard to [RISCV_FENV_PRIVATE_H].  Include next
	<fenv_private.h>.
	* sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard
	to [SPARC_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/sparc/fpu/math_private.h: Remove.
	* sysdeps/i386/fpu/fenv_private.h: Move to ....
	* sysdeps/x86/fpu/fenv_private.h: ... here.  Change
	multiple-include guard to [X86_FENV_PRIVATE_H].  Include next
	<fenv_private.h>.
	* sysdeps/x86_64/fpu/math_private.h: Do not include
	<sysdeps/i386/fpu/fenv_private.h>.
2018-08-28 20:48:49 +00:00
Wilco Dijkstra
ca3aac57ef Remove unused math files
Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c,
k_cos.c, k_sin.c.  After the tanf change s_rem_pio2f.c and k_rem_pio2f.c
(and the ia64, m68k and powerpc equivalents) are no longer used,
so remove them.  All e_rem_pio2.c files were already empty or commented
out, so remove them too.  Passes build-many-glibcs.

	* math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c.
	Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c.
	* sysdeps/i386/fpu/e_rem_pio2.c: Delete file.
	* sysdeps/ia64/fpu/e_rem_pio2.c: Likewise.
	* sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise.
	* sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise.
	* sysdeps/ieee754/dbl-64/k_cos.c: Likewise.
	* sysdeps/ieee754/dbl-64/k_sin.c: Likewise.
	* sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/flt-32/k_cosf.c: Likewise.
	* sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/flt-32/k_sinf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise
	* sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise
	* sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise
	* sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise.
	* sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.
2018-08-24 15:34:54 +01:00
Joseph Myers
2ce7ba7d15 Move SNAN_TESTS_* out of math-tests.h.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the SNAN_TESTS_* macros for individual types out to their own
sysdeps header (while the type-generic SNAN_TESTS wrapper for those
macros remains in math-tests.h).

Tested for x86_64 and x86, and with build-many-glibcs.py.

	* sysdeps/generic/math-tests-snan.h: New file.
	* sysdeps/generic/math-tests.h: Include <math-tests-snan.h>.
	(SNAN_TESTS_float): Do not define here.
	(SNAN_TESTS_double): Likewise.
	(SNAN_TESTS_long_double): Likewise.
	(SNAN_TESTS_float128): Likewise.
	* sysdeps/i386/fpu/math-tests-snan.h: New file.
	* sysdeps/i386/fpu/math-tests.h: Remove file.
	* sysdeps/ia64/math-tests-snan.h: New file.
	* sysdeps/ia64/math-tests.h: Remove file.
	* sysdeps/x86/math-tests.h: Likewise.
	* sysdeps/x86_64/fpu/math-tests-snan.h: New file.
2018-08-10 19:22:01 +00:00
Ilya Leoshkevich
8d997d2253 Move __fentry__ version definition to sysdeps/{i386,x86_64}
__fentry__ symbol is currently not defined for other architectures.
Attempts to introduce it cause abicheck to fail, because it will be
available since 2.29 earliest, and not 2.13, which is the case for
Intel.  With the new code, abicheck passes for i686-linux-gnu,
x86_64-linux-gnu and x86_64-linux-gnu32 triples.

ChangeLog:

	* stdlib/Versions: Remove __fentry__.
	* sysdeps/i386/Versions: Add __fentry__.
	* sysdeps/x86_64/Versions: Add __fentry__.
2018-08-10 09:07:44 +02:00
H.J. Lu
430388d5dc x86: Don't include <init-arch.h> in assembly codes
There is no need to include <init-arch.h> in assembly codes since all
x86 IFUNC selector functions are written in C.  Tested on i686 and
x86-64.  There is no code change in libc.so, ld.so and libmvec.so.

	* sysdeps/i386/i686/multiarch/bzero-ia32.S: Don't include
	<init-arch.h>.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise.
2018-08-03 08:05:00 -07:00
H.J. Lu
9aa3113a42 x86: Rename __glibc_reserved2 to ssp_base in tcbhead_t
This will be used to record the current shadow stack base for shadow
stack switching by getcontext, makecontext, setcontext and swapcontext.
If the target shadow stack base is the same as the current shadow stack
base, we unwind the shadow stack.  Otherwise it is a stack switch and
we look for a restore token to restore the target shadow stack.

	* sysdeps/i386/nptl/tcb-offsets.sym (SSP_BASE_OFFSET): New.
	* sysdeps/i386/nptl/tls.h (tcbhead_t): Replace __glibc_reserved2
	with ssp_base.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (SSP_BASE_OFFSET): New.
	* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Replace __glibc_reserved2
	with ssp_base.
2018-07-25 04:39:39 -07:00
H.J. Lu
77a8ae0948 i386: Use _CET_NOTRACK in memset-sse2-rep.S
* sysdeps/i386/i686/multiarch/memset-sse2-rep.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 08:04:12 -07:00
H.J. Lu
90d15dc577 i386: Use _CET_NOTRACK in strcat-sse2.S
* sysdeps/i386/i686/multiarch/strcat-sse2.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 08:03:15 -07:00
H.J. Lu
f1574581c7 i386: Use _CET_NOTRACK in strcpy-sse2.S
* sysdeps/i386/i686/multiarch/strcpy-sse2.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 08:02:26 -07:00
H.J. Lu
7fb613361c i386: Use _CET_NOTRACK in memcpy-ssse3.S
* sysdeps/i386/i686/multiarch/memcpy-ssse3.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 08:01:06 -07:00
H.J. Lu
0a899af097 i386: Use _CET_NOTRACK in memcpy-ssse3-rep.S
* sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
	(BRANCH_TO_JMPTBL_ENTRY_TAIL): Likewise.
2018-07-18 08:00:09 -07:00
H.J. Lu
177824e232 i386: Use _CET_NOTRACK in memcmp-sse4.S
* sysdeps/i386/i686/multiarch/memcmp-sse4.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 07:59:04 -07:00
H.J. Lu
00e7b76a8f i386: Use _CET_NOTRACK in memset-sse2.S
* sysdeps/i386/i686/multiarch/memset-sse2.S
	(BRANCH_TO_JMPTBL_ENTRY): Add _CET_NOTRACK before indirect jump
	to jump table.
2018-07-18 07:57:59 -07:00
H.J. Lu
7e119cd582 i386: Use _CET_NOTRACK in i686/memcmp.S
* sysdeps/i386/i686/memcmp.S (memcmp): Add _CET_NOTRACK before
	indirect jump to jump table.
2018-07-18 07:56:58 -07:00
H.J. Lu
be9ccd27c0 i386: Add _CET_ENDBR to indirect jump targets in add_n.S/sub_n.S
i386 add_n.S and sub_n.S use a trick to implment jump tables with LEA.
We can't use conditional branches nor normal jump tables since jump
table entries use EFLAGS set by jump table index.  This patch adds
_CET_ENDBR to indirect jump targets and adjust destination for
_CET_ENDBR.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* sysdeps/i386/add_n.S: Include <sysdep.h>, instead of
	"sysdep.h".
	(__mpn_add_n): Save and restore %ebx if IBT is enabed.  Add
	_CET_ENDBR to indirect jump targets and adjust jump destination
	for _CET_ENDBR.
	* sysdeps/i386/i686/add_n.S: Include <sysdep.h>, instead of
	"sysdep.h".
	(__mpn_add_n): Save and restore %ebx if IBT is enabed.  Add
	_CET_ENDBR to indirect jump targets and adjust jump destination
	for _CET_ENDBR.
	* sysdeps/i386/sub_n.S: Include <sysdep.h>, instead of
	"sysdep.h".
	(__mpn_sub_n): Save and restore %ebx if IBT is enabed.  Add
	_CET_ENDBR to indirect jump targets and adjust jump destination
	for _CET_ENDBR.
2018-07-17 16:11:44 -07:00
H.J. Lu
562837c002 x86: Add _CET_ENDBR to functions in dl-tlsdesc.S
Add _CET_ENDBR to functions in dl-tlsdesc.S, which are called indirectly,
to support IBT.

Tested on i686 and x86-64.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_return): Add
	_CET_ENDBR.
	(_dl_tlsdesc_undefweak): Likewise.
	(_dl_tlsdesc_dynamic): Likewise.
	(_dl_tlsdesc_resolve_abs_plus_addend): Likewise.
	(_dl_tlsdesc_resolve_rel): Likewise.
	(_dl_tlsdesc_resolve_rela): Likewise.
	(_dl_tlsdesc_resolve_hold): Likewise.
	* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_return): Likewise.
	(_dl_tlsdesc_undefweak): Likewise.
	(_dl_tlsdesc_dynamic): Likewise.
	(_dl_tlsdesc_resolve_rela): Likewise.
	(_dl_tlsdesc_resolve_hold): Likewise.
2018-07-17 16:07:17 -07:00
H.J. Lu
124bcde683 x86: Add _CET_ENDBR to functions in crti.S
Add _CET_ENDBR to functions in crti.S, which are called indirectly, to
support IBT.

Tested on i686 and x86-64.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* sysdeps/i386/crti.S (_init): Add _CET_ENDBR.
	(_fini): Likewise.
	* sysdeps/x86_64/crti.S (_init): Likewise.
	(_fini): Likewise.
2018-07-17 16:05:18 -07:00
H.J. Lu
f753fa7dea x86: Support IBT and SHSTK in Intel CET [BZ #21598]
Intel Control-flow Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-en
forcement-technology-preview.pdf

includes Indirect Branch Tracking (IBT) and Shadow Stack (SHSTK).

GNU_PROPERTY_X86_FEATURE_1_IBT is added to GNU program property to
indicate that all executable sections are compatible with IBT when
ENDBR instruction starts each valid target where an indirect branch
instruction can land.  Linker sets GNU_PROPERTY_X86_FEATURE_1_IBT on
output only if it is set on all relocatable inputs.

On an IBT capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable IBT and
lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if
GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable,
     disable IBT.
  b. Lock IBT.
3. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
  a. If legacy interwork is allowed, then mark all pages in executable
     PT_LOAD segments in legacy code page bitmap.  Failure of legacy code
     page bitmap allocation causes an error.
  b. If legacy interwork isn't allowed, it causes an error.

GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to GNU program property to
indicate that all executable sections are compatible with SHSTK where
return address popped from shadow stack always matches return address
popped from normal stack.  Linker sets GNU_PROPERTY_X86_FEATURE_1_SHSTK
on output only if it is set on all relocatable inputs.

On a SHSTK capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable
     or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
  b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object
without GNU_PROPERTY_X86_FEATURE_1_SHSTK.

To enable CET support in glibc, --enable-cet is required to configure
glibc.  When CET is enabled, both compiler and assembler must support
CET.  Otherwise, it is a configure-time error.

To support CET run-time control,

1. _dl_x86_feature_1 is added to the writable ld.so namespace to indicate
if IBT or SHSTK are enabled at run-time.  It should be initialized by
init_cpu_features.
2. For dynamic executables:
   a. A l_cet field is added to struct link_map to indicate if IBT or
      SHSTK is enabled in an ELF module.  _dl_process_pt_note or
      _rtld_process_pt_note is called to process PT_NOTE segment for
      GNU program property and set l_cet.
   b. _dl_open_check is added to check IBT and SHSTK compatibilty when
      dlopening a shared object.
3. Replace i386 _dl_runtime_resolve and _dl_runtime_profile with
_dl_runtime_resolve_shstk and _dl_runtime_profile_shstk, respectively if
SHSTK is enabled.

CET run-time control can be changed via GLIBC_TUNABLES with

$ export GLIBC_TUNABLES=glibc.tune.x86_shstk=[permissive|on|off]
$ export GLIBC_TUNABLES=glibc.tune.x86_ibt=[permissive|on|off]

1. permissive: SHSTK is disabled when dlopening a legacy ELF module.
2. on: IBT or SHSTK are always enabled, regardless if there are IBT or
SHSTK bits in GNU program property.
3. off: IBT or SHSTK are always disabled, regardless if there are IBT or
SHSTK bits in GNU program property.

<cet.h> from CET-enabled GCC is automatically included by assembly codes
to add GNU_PROPERTY_X86_FEATURE_1_IBT and GNU_PROPERTY_X86_FEATURE_1_SHSTK
to GNU program property.  _CET_ENDBR is added at the entrance of all
assembly functions whose address may be taken.  _CET_NOTRACK is used to
insert NOTRACK prefix with indirect jump table to support IBT.  It is
defined as notrack when _CET_NOTRACK is defined in <cet.h>.

	 [BZ #21598]
	* configure.ac: Add --enable-cet.
	* configure: Regenerated.
	* elf/Makefille (all-built-dso): Add a comment.
	* elf/dl-load.c (filebuf): Moved before "dynamic-link.h".
	Include <dl-prop.h>.
	(_dl_map_object_from_fd): Call _dl_process_pt_note on PT_NOTE
	segment.
	* elf/dl-open.c: Include <dl-prop.h>.
	(dl_open_worker): Call _dl_open_check.
	* elf/rtld.c: Include <dl-prop.h>.
	(dl_main): Call _rtld_process_pt_note on PT_NOTE segment.  Call
	_rtld_main_check.
	* sysdeps/generic/dl-prop.h: New file.
	* sysdeps/i386/dl-cet.c: Likewise.
	* sysdeps/unix/sysv/linux/x86/cpu-features.c: Likewise.
	* sysdeps/unix/sysv/linux/x86/dl-cet.h: Likewise.
	* sysdeps/x86/cet-tunables.h: Likewise.
	* sysdeps/x86/check-cet.awk: Likewise.
	* sysdeps/x86/configure: Likewise.
	* sysdeps/x86/configure.ac: Likewise.
	* sysdeps/x86/dl-cet.c: Likewise.
	* sysdeps/x86/dl-procruntime.c: Likewise.
	* sysdeps/x86/dl-prop.h: Likewise.
	* sysdeps/x86/libc-start.h: Likewise.
	* sysdeps/x86/link_map.h: Likewise.
	* sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Add
	_CET_ENDBR.
	(_dl_runtime_profile): Likewise.
	(_dl_runtime_resolve_shstk): New.
	(_dl_runtime_profile_shstk): Likewise.
	* sysdeps/linux/x86/Makefile (sysdep-dl-routines): Add dl-cet
	if CET is enabled.
	(CFLAGS-.o): Add -fcf-protection if CET is enabled.
	(CFLAGS-.os): Likewise.
	(CFLAGS-.op): Likewise.
	(CFLAGS-.oS): Likewise.
	(asm-CPPFLAGS): Add -fcf-protection -include cet.h if CET
	is enabled.
	(tests-special): Add $(objpfx)check-cet.out.
	(cet-built-dso): New.
	(+$(cet-built-dso:=.note)): Likewise.
	(common-generated): Add $(cet-built-dso:$(common-objpfx)%=%.note).
	($(objpfx)check-cet.out): New.
	(generated): Add check-cet.out.
	* sysdeps/x86/cpu-features.c: Include <dl-cet.h> and
	<cet-tunables.h>.
	(TUNABLE_CALLBACK (set_x86_ibt)): New prototype.
	(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
	(init_cpu_features): Call get_cet_status to check CET status
	and update dl_x86_feature_1 with CET status.  Call
	TUNABLE_CALLBACK (set_x86_ibt) and TUNABLE_CALLBACK
	(set_x86_shstk).  Disable and lock CET in libc.a.
	* sysdeps/x86/cpu-tunables.c: Include <cet-tunables.h>.
	(TUNABLE_CALLBACK (set_x86_ibt)): New function.
	(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
	* sysdeps/x86/sysdep.h (_CET_NOTRACK): New.
	(_CET_ENDBR): Define if not defined.
	(ENTRY): Add _CET_ENDBR.
	* sysdeps/x86/dl-tunables.list (glibc.tune): Add x86_ibt and
	x86_shstk.
	* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Add
	_CET_ENDBR.
	(_dl_runtime_profile): Likewise.
2018-07-16 14:08:27 -07:00
H.J. Lu
faaee1f07e x86: Support shadow stack pointer in setjmp/longjmp
Save and restore shadow stack pointer in setjmp and longjmp to support
shadow stack in Intel CET.  Use feature_1 in tcbhead_t to check if
shadow stack is enabled before saving and restoring shadow stack pointer.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* sysdeps/i386/__longjmp.S: Include <jmp_buf-ssp.h>.
	(__longjmp): Restore shadow stack pointer if shadow stack is
	enabled, SHADOW_STACK_POINTER_OFFSET is defined and __longjmp
	isn't defined for __longjmp_cancel.
	* sysdeps/i386/bsd-_setjmp.S: Include <jmp_buf-ssp.h>.
	(_setjmp): Save shadow stack pointer if shadow stack is enabled
	and SHADOW_STACK_POINTER_OFFSET is defined.
	* sysdeps/i386/bsd-setjmp.S: Include <jmp_buf-ssp.h>.
	(setjmp): Save shadow stack pointer if shadow stack is enabled
	and SHADOW_STACK_POINTER_OFFSET is defined.
	* sysdeps/i386/setjmp.S: Include <jmp_buf-ssp.h>.
	(__sigsetjmp): Save shadow stack pointer if shadow stack is
	enabled and SHADOW_STACK_POINTER_OFFSET is defined.
	* sysdeps/unix/sysv/linux/i386/____longjmp_chk.S: Include
	<jmp_buf-ssp.h>.
	(____longjmp_chk): Restore shadow stack pointer if shadow stack
	is enabled and SHADOW_STACK_POINTER_OFFSET is defined.
	* sysdeps/unix/sysv/linux/x86/Makefile (gen-as-const-headers):
	Remove jmp_buf-ssp.sym.
	* sysdeps/unix/sysv/linux/x86_64/____longjmp_chk.S: Include
	<jmp_buf-ssp.h>.
	(____longjmp_chk): Restore shadow stack pointer if shadow stack
	is enabled and SHADOW_STACK_POINTER_OFFSET is defined.
	* sysdeps/x86/Makefile (gen-as-const-headers): Add
	jmp_buf-ssp.sym.
	* sysdeps/x86/jmp_buf-ssp.sym: New dummy file.
	* sysdeps/x86_64/__longjmp.S: Include <jmp_buf-ssp.h>.
	(__longjmp): Restore shadow stack pointer if shadow stack is
	enabled, SHADOW_STACK_POINTER_OFFSET is defined and __longjmp
	isn't defined for __longjmp_cancel.
	* sysdeps/x86_64/setjmp.S: Include <jmp_buf-ssp.h>.
	(__sigsetjmp): Save shadow stack pointer if shadow stack is
	enabled and SHADOW_STACK_POINTER_OFFSET is defined.
2018-07-14 05:59:53 -07:00
H.J. Lu
ebff9c5cfa x86: Rename __glibc_reserved1 to feature_1 in tcbhead_t [BZ #22563]
feature_1 has X86_FEATURE_1_IBT and X86_FEATURE_1_SHSTK bits for CET
run-time control.

CET_ENABLED, IBT_ENABLED and SHSTK_ENABLED are defined to 1 or 0 to
indicate that if CET, IBT and SHSTK are enabled.

<tls-setup.h> is added to set up thread-local data.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	[BZ #22563]
	* nptl/pthread_create.c: Include <tls-setup.h>.
	(__pthread_create_2_1): Call tls_setup_tcbhead.
	* sysdeps/generic/tls-setup.h: New file.
	* sysdeps/x86/nptl/tls-setup.h: Likewise.
	* sysdeps/i386/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (FEATURE_1_OFFSET):
	Likewise.
	* sysdeps/i386/nptl/tls.h (tcbhead_t): Rename __glibc_reserved1
	to feature_1.
	* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Likewise.
	* sysdeps/x86/sysdep.h (X86_FEATURE_1_IBT): New.
	(X86_FEATURE_1_SHSTK): Likewise.
	(CET_ENABLED): Likewise.
	(IBT_ENABLED): Likewise.
	(SHSTK_ENABLED): Likewise.
2018-07-14 05:56:46 -07:00
Florian Weimer
f496b28e61 math: Set 387 and SSE2 rounding mode for tgamma on i386 [BZ #23253]
Previously, only the SSE2 rounding mode was set, so the assembler
implementations using 387 were not following the expecting rounding
mode.
2018-06-21 08:04:29 +02:00
H.J. Lu
0221ce2a90 i386: Change offset of __private_ss to 0x30 [BZ #23250]
sysdeps/i386/nptl/tls.h has

typedef struct
{
  void *tcb;            /* Pointer to the TCB.  Not necessarily the
                           thread descriptor used by libpthread.  */
  dtv_t *dtv;
  void *self;           /* Pointer to the thread descriptor.  */
  int multiple_threads;
  uintptr_t sysinfo;
  uintptr_t stack_guard;
  uintptr_t pointer_guard;
  int gscope_flag;
  int __glibc_reserved1;
  /* Reservation of some values for the TM ABI.  */
  void *__private_tm[4];
  /* GCC split stack support.  */
  void *__private_ss;
} tcbhead_t;

The offset of __private_ss is 0x34.  But GCC defines

/* We steal the last transactional memory word.  */
 #define TARGET_THREAD_SPLIT_STACK_OFFSET 0x30

and libgcc/config/i386/morestack.S has

	cmpl	%gs:0x30,%eax		# See if we have enough space.
	movl	%eax,%gs:0x30		# Save the new stack boundary.
	movl	%eax,%gs:0x30		# Save the new stack boundary.
	movl	%ecx,%gs:0x30		# Save new stack boundary.
	movl	%eax,%gs:0x30
	movl	%gs:0x30,%eax
	movl	%eax,%gs:0x30

Since update TARGET_THREAD_SPLIT_STACK_OFFSET changes split stack ABI,
this patch updates tcbhead_t to match GCC.

	[BZ #23250]
	[BZ #10686]
	* sysdeps/i386/nptl/tls.h (tcbhead_t): Change __private_tm[4]
	to _private_tm[3] and add __glibc_reserved2.
	Add _Static_assert of offset of __private_ss == 0x30.
	* sysdeps/x86_64/nptl/tls.h: Add _Static_assert of offset of
	__private_ss == 0x40 for ILP32 and == 0x70 for LP64.
2018-06-12 06:34:48 -07:00
Florian Weimer
e826574c98 x86: Make strncmp usable from rtld
Due to the way the conditions were written, the rtld build of strncmp
ended up with no definition of the strncmp symbol at all: The
implementations were renamed for use within an IFUNC resolver, but the
IFUNC resolver itself was missing (because rtld does not use IFUNCs).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-06-12 15:00:33 +02:00
H.J. Lu
67c0579669 Mark _init and _fini as hidden [BZ #23145]
_init and _fini are special functions provided by glibc for linker to
define DT_INIT and DT_FINI in executable and shared library.  They
should never be put in dynamic symbol table.  This patch marks them as
hidden to remove them from dynamic symbol table.

Tested with build-many-glibcs.py.

	[BZ #23145]
	* elf/Makefile (tests-special): Add $(objpfx)check-initfini.out.
	($(all-built-dso:=.dynsym): New target.
	(common-generated): Add $(all-built-dso:$(common-objpfx)%=%.dynsym).
	($(objpfx)check-initfini.out): New target.
	(generated): Add check-initfini.out.
	* scripts/check-initfini.awk: New file.
	* sysdeps/aarch64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/alpha/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/arm/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/hppa/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/i386/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/ia64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/m68k/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/microblaze/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips64/n32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips64/n64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/nios2/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/powerpc/powerpc32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/powerpc/powerpc64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/s390/s390-32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/s390/s390-64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/sh/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/sparc/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/x86_64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
2018-06-08 10:28:52 -07:00
Florian Weimer
e02c026f38 math: Update i686 ulps (--disable-multi-arch configuration)
The results are from configuring with --disable-multi-arch,  building
with “-march=x86-64 -mtune=generic -mfpmath=sse” and running the
testsuite on a Haswell-era CPU.
2018-06-01 22:37:55 +02:00
Florian Weimer
d8c1927561 math: Update i686 ulps
The results are from building with “-march=x86-64 -mtune=generic
-mfpmath=sse” and running the testsuite on a Haswell-era CPU.
2018-06-01 19:32:18 +02:00
Florian Weimer
ed0d698870 i386: Drop -mpreferred-stack-boundary=4
The flag was a left-over from when the -mpreferred-stack-boundary=2 flag
was removed in commit db290cf592.
2018-05-22 14:44:14 +02:00
H.J. Lu
0068c08588 nptl: Remove __ASSUME_PRIVATE_FUTEX
Since __ASSUME_PRIVATE_FUTEX is always defined, this patch removes the
!__ASSUME_PRIVATE_FUTEX paths.

Tested with build-many-glibcs.py.

	* nptl/allocatestack.c (allocate_stack): Remove the
	!__ASSUME_PRIVATE_FUTEX paths.
	* nptl/descr.h (header): Remove the !__ASSUME_PRIVATE_FUTEX path.
	* nptl/nptl-init.c (__pthread_initialize_minimal_internal):
	Likewise.
	* sysdeps/i386/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Removed.
	* sysdeps/powerpc/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise.
	* sysdeps/sh/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise.
	* sysdeps/i386/nptl/tls.h: (tcbhead_t): Remve the
	!__ASSUME_PRIVATE_FUTEX path.
	* sysdeps/s390/nptl/tls.h (tcbhead_t): Likewise.
	* sysdeps/sparc/nptl/tls.h (tcbhead_t): Likewise.
	* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Likewise.
	* sysdeps/unix/sysv/linux/i386/lowlevellock.S: Remove the
	!__ASSUME_PRIVATE_FUTEX macros.
	* sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/cancellation.S: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
	* sysdeps/unix/sysv/linux/kernel-features.h
	(__ASSUME_PRIVATE_FUTEX): Removed.
2018-05-17 04:25:10 -07:00
Joseph Myers
632a6cbe44 Add narrowing divide functions.
This patch adds the narrowing divide functions from TS 18661-1 to
glibc's libm: fdiv, fdivl, ddivl, f32divf64, f32divf32x, f32xdivf64
for all configurations; f32divf64x, f32divf128, f64divf64x,
f64divf128, f32xdivf64x, f32xdivf128, f64xdivf128 for configurations
with _Float64x and _Float128; __nldbl_ddivl for ldbl-opt.

The changes are mostly essentially the same as for the other narrowing
functions, so the description of those generally applies to this patch
as well.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

	* math/Makefile (libm-narrow-fns): Add div.
	(libm-test-funcs-narrow): Likewise.
	* math/Versions (GLIBC_2.28): Add narrowing divide functions.
	* math/bits/mathcalls-narrow.h (div): Use __MATHCALL_NARROW.
	* math/gen-auto-libm-tests.c (test_functions): Add div.
	* math/math-narrow.h (CHECK_NARROW_DIV): New macro.
	(NARROW_DIV_ROUND_TO_ODD): Likewise.
	(NARROW_DIV_TRIVIAL): Likewise.
	* sysdeps/ieee754/float128/float128_private.h (__fdivl): New
	macro.
	(__ddivl): Likewise.
	* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fdiv and
	ddiv.
	(CFLAGS-nldbl-ddiv.c): New variable.
	(CFLAGS-nldbl-fdiv.c): Likewise.
	* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
	__nldbl_ddivl.
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_ddivl): New
	prototype.
	* manual/arith.texi (Misc FP Arithmetic): Document fdiv, fdivl,
	ddivl, fMdivfN, fMdivfNx, fMxdivfN and fMxdivfNx.
	* math/auto-libm-test-in: Add tests of div.
	* math/auto-libm-test-out-narrow-div: New generated file.
	* math/libm-test-narrow-div.inc: New file.
	* sysdeps/i386/fpu/s_f32xdivf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_f32xdivf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fdiv.c: Likewise.
	* sysdeps/ieee754/float128/s_f32divf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64divf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64xdivf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_ddivl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_f64xdivf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_fdivl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_ddivl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fdivl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_ddivl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fdivl.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-ddiv.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-fdiv.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fdiv.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Update.
	* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-05-17 00:40:52 +00:00
Joseph Myers
69a01461ee Add narrowing multiply functions.
This patch adds the narrowing multiply functions from TS 18661-1 to
glibc's libm: fmul, fmull, dmull, f32mulf64, f32mulf32x, f32xmulf64
for all configurations; f32mulf64x, f32mulf128, f64mulf64x,
f64mulf128, f32xmulf64x, f32xmulf128, f64xmulf128 for configurations
with _Float64x and _Float128; __nldbl_dmull for ldbl-opt.

The changes are mostly essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.  f32xmulf64 for i386 cannot use precision control as used for
add and subtract, because that would result in double rounding for
subnormal results, so that uses round-to-odd with long double
intermediate result instead.  The soft-fp support involves adding a
new FP_TRUNC_COOKED since soft-fp multiplication uses cooked inputs
and outputs.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

	* math/Makefile (libm-narrow-fns): Add mul.
	(libm-test-funcs-narrow): Likewise.
	* math/Versions (GLIBC_2.28): Add narrowing multiply functions.
	* math/bits/mathcalls-narrow.h (mul): Use __MATHCALL_NARROW.
	* math/gen-auto-libm-tests.c (test_functions): Add mul.
	* math/math-narrow.h (CHECK_NARROW_MUL): New macro.
	(NARROW_MUL_ROUND_TO_ODD): Likewise.
	(NARROW_MUL_TRIVIAL): Likewise.
	* soft-fp/op-common.h (FP_TRUNC_COOKED): Likewise.
	* sysdeps/ieee754/float128/float128_private.h (__fmull): New
	macro.
	(__dmull): Likewise.
	* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fmul and
	dmul.
	(CFLAGS-nldbl-dmul.c): New variable.
	(CFLAGS-nldbl-fmul.c): Likewise.
	* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
	__nldbl_dmull.
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dmull): New
	prototype.
	* manual/arith.texi (Misc FP Arithmetic): Document fmul, fmull,
	dmull, fMmulfN, fMmulfNx, fMxmulfN and fMxmulfNx.
	* math/auto-libm-test-in: Add tests of mul.
	* math/auto-libm-test-out-narrow-mul: New generated file.
	* math/libm-test-narrow-mul.inc: New file.
	* sysdeps/i386/fpu/s_f32xmulf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_f32xmulf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fmul.c: Likewise.
	* sysdeps/ieee754/float128/s_f32mulf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64mulf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64xmulf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_dmull.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_f64xmulf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_fmull.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_dmull.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fmull.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_dmull.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmull.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-dmul.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-fmul.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_dmull.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fmul.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fmull.c: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Update.
	* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-05-16 00:05:28 +00:00
H.J. Lu
a15529fda8 i386: Replace PREINIT_FUNCTION@PLT with *%eax in call
Since we have loaded address of PREINIT_FUNCTION into %eax, we can
avoid extra branch to PLT slot.

	* sysdeps/i386/crti.S (_init): Replace PREINIT_FUNCTION@PLT
	with *%eax in call.

Acked-by: Christian Brauner (Ubuntu) <christian@brauner.io>
2018-05-14 09:24:06 -07:00
H.J. Lu
98ee36c7a4 x86: Add sysdeps/x86/ldsodefs.h
Merge sysdeps/i386/ldsodefs.h and sysdeps/x86_64/ldsodefs.h into
sysdeps/x86/ldsodefs.h.

Tested on i686 and x86-64.

	* sysdeps/i386/ldsodefs.h: Removed.
	* sysdeps/x86_64/ldsodefs.h: Moved to ...
	* sysdeps/x86/ldsodefs.h: This.
	(La_i86_regs): New.
	(La_i86_retval): Likewise.
	(ARCH_PLTENTER_MEMBERS): Add i86_gnu_pltenter.
	(ARCH_PLTEXIT_MEMBERS): i86_gnu_pltexit.

Acked-by: Christian Brauner (Ubuntu) christian@brauner.io
2018-05-14 09:19:41 -07:00
Joseph Myers
b4d5b8b021 Do not include math-barriers.h in math_private.h.
This patch continues the math_private.h cleanup by stopping
math_private.h from including math-barriers.h and making the users of
the barrier macros include the latter header directly.  No attempt is
made to remove any math_private.h includes that are now unused, except
in strtod_l.c where that is done to avoid line number changes in
assertions, so that installed stripped shared libraries can be
compared before and after the patch.  (I think the floating-point
environment support in math_private.h should also move out - some
architectures already have fenv_private.h as an architecture-internal
header included from their math_private.h - and after moving that out
might be a better time to identify unused math_private.h includes.)

Tested for x86_64 and x86, and tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.

	* sysdeps/generic/math_private.h: Do not include
	<math-barriers.h>.
	* stdlib/strtod_l.c: Include <math-barriers.h> instead of
	<math_private.h>.
	* math/fromfp.h: Include <math-barriers.h>.
	* math/math-narrow.h: Likewise.
	* math/s_nextafter.c: Likewise.
	* math/s_nexttowardf.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrintf.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrintf.c: Likewise.
	* sysdeps/i386/fpu/s_nextafterl.c: Likewise.
	* sysdeps/i386/fpu/s_nexttoward.c: Likewise.
	* sysdeps/i386/fpu/s_nexttowardf.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_atan2.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_atanh.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp2.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_j0.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_expm1.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fma.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_log1p.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise.
	* sysdeps/ieee754/flt-32/e_atanhf.c: Likewise.
	* sysdeps/ieee754/flt-32/e_j0f.c: Likewise.
	* sysdeps/ieee754/flt-32/s_expm1f.c: Likewise.
	* sysdeps/ieee754/flt-32/s_log1pf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_nextafterf.c: Likewise.
	* sysdeps/ieee754/k_standardl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/e_asinl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/e_expl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/e_powl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_nextafterl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_nexttoward.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_nexttowardf.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/e_atanhl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/e_j0l.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fma.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_nexttoward.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_nexttowardf.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_nextafterl.c: Likewise.
2018-05-11 15:11:38 +00:00
Joseph Myers
9ed2e15ff4 Move math_opt_barrier, math_force_eval to separate math-barriers.h.
This patch continues cleaning up math_private.h by moving the
math_opt_barrier and math_force_eval macros to a separate header
math-barriers.h.

At present, those macros are inside a "#ifndef math_opt_barrier" in
math_private.h to allow architectures to override them and then use
a separate math-barriers.h header, no such #ifndef or #include_next is
needed; architectures just have their own alternative version of
math-barriers.h when providing their own optimized versions that avoid
going through memory unnecessarily.  The generic math-barriers.h has a
comment added to document these two macros.

In this patch, math_private.h is made to #include <math-barriers.h>,
so files using these macros do not need updating yet.  That is because
of uses of math_force_eval in math_check_force_underflow and
math_check_force_underflow_nonneg, which are still defined in
math_private.h.  Once those are moved out to a separate header, that
separate header can be made to include <math-barriers.h>, as can the
other files directly using these barrier macros, and then the include
of <math-barriers.h> from math_private.h can be removed.

Tested for x86_64 and x86.  Also tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by this patch.

	* sysdeps/generic/math-barriers.h: New file.
	* sysdeps/generic/math_private.h [!math_opt_barrier]
	(math_opt_barrier): Move to math-barriers.h.
	[!math_opt_barrier] (math_force_eval): Likewise.
	* sysdeps/aarch64/fpu/math-barriers.h: New file.
	* sysdeps/aarch64/fpu/math_private.h (math_opt_barrier): Move to
	math-barriers.h.
	(math_force_eval): Likewise.
	* sysdeps/alpha/fpu/math-barriers.h: New file.
	* sysdeps/alpha/fpu/math_private.h (math_opt_barrier): Move to
	math-barriers.h.
	(math_force_eval): Likewise.
	* sysdeps/x86/fpu/math-barriers.h: New file.
	* sysdeps/i386/fpu/fenv_private.h (math_opt_barrier): Move to
	math-barriers.h.
	(math_force_eval): Likewise.
	* sysdeps/m68k/m680x0/fpu/math_private.h: Move to....
	* sysdeps/m68k/m680x0/fpu/math-barriers.h: ... here.  Adjust
	multiple-include guard for rename.
	* sysdeps/powerpc/fpu/math-barriers.h: New file.
	* sysdeps/powerpc/fpu/math_private.h (math_opt_barrier): Move to
	math-barriers.h.
	(math_force_eval): Likewise.
2018-05-09 19:45:47 +00:00
Joseph Myers
aaee3cd88e Move math_narrow_eval to separate math-narrow-eval.h.
This patch continues cleaning up the math_private.h header, which
contains lots of different definitions many of which are only needed
by a limited subset of files using that header (and some of which are
overridden by architectures that only want to override selected parts
of the header), by moving the math_narrow_eval macro out to a separate
math-narrow-eval.h header, only included by those files that need it.
That header is placed in include/ (since it's used in stdlib/, not
just files built in math/, but no sysdeps variants are needed at
present).

Tested for x86_64, and with build-many-glibcs.py.  (Installed stripped
shared libraries change because of line numbers in assertions in
strtod_l.c.)

	* include/math-narrow-eval.h: New file.  Contents moved from ....
	* sysdeps/generic/math_private.h: ... here.
	(math_narrow_eval): Remove macro.  Moved to math-narrow-eval.h.
	[FLT_EVAL_METHOD != 0] (excess_precision): Likewise.
	* math/s_fdim_template.c: Include <math-narrow-eval.h>.
	* stdlib/strtod_l.c: Likewise.
	* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
	* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
	* sysdeps/i386/fpu/s_fdim.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_cosh.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_j1.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_jn.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_sinh.c: Likewise.
	* sysdeps/ieee754/dbl-64/gamma_productf.c: Likewise.
	* sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_erf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_llrint.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_lrint.c: Likewise.
	* sysdeps/ieee754/flt-32/e_coshf.c: Likewise.
	* sysdeps/ieee754/flt-32/e_exp2f.c: Likewise.
	* sysdeps/ieee754/flt-32/e_expf.c: Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise.
	* sysdeps/ieee754/flt-32/e_j1f.c: Likewise.
	* sysdeps/ieee754/flt-32/e_jnf.c: Likewise.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c: Likewise.
	* sysdeps/ieee754/flt-32/e_sinhf.c: Likewise.
	* sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_erff.c: Likewise.
	* sysdeps/ieee754/flt-32/s_llrintf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_lrintf.c: Likewise.
	* sysdeps/ieee754/ldbl-96/gamma_product.c: Likewise.
2018-05-09 00:15:10 +00:00
Samuel Thibault
81b032c833 Drop fpregset unused symbol exposition
* sysdeps/arm/sys/ucontext.h: Remove fpregset struct name, unused and
	non-compliant.
	* sysdeps/i386/sys/ucontext.h: Likewise.
	* sysdeps/m68k/sys/ucontext.h: Likewise.
	* sysdeps/mips/sys/ucontext.h: Likewise.
	* sysdeps/unix/sysv/linux/hppa/sys/ucontext.h: Likewise.
2018-04-20 01:27:13 +02:00
Adhemerval Zanella
243f59e5aa Update i386 libm-test-ulps.
Updated ulps after recent commit "[PATCH 1/7] sin/cos slow paths:
avoid slow paths for small inputs"
(19a8b9a300).

	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update.
2018-04-06 17:24:15 -03:00
Maciej W. Rozycki
10a446ddcc elf: Unify symbol address run-time calculation [BZ #19818]
Wrap symbol address run-time calculation into a macro and use it
throughout, replacing inline calculations.

There are a couple of variants, most of them different in a functionally
insignificant way.  Most calculations are right following RESOLVE_MAP,
at which point either the map or the symbol returned can be checked for
validity as the macro sets either both or neither.  In some places both
the symbol and the map has to be checked however.

My initial implementation therefore always checked both, however that
resulted in code larger by as much as 0.3%, as many places know from
elsewhere that no check is needed.  I have decided the size growth was
unacceptable.

Having looked closer I realized that it's the map that is the culprit.
Therefore I have modified LOOKUP_VALUE_ADDRESS to accept an additional
boolean argument telling it to access the map without checking it for
validity.  This in turn has brought quite nice results, with new code
actually being smaller for i686, and MIPS o32, n32 and little-endian n64
targets, unchanged in size for x86-64 and, unusually, marginally larger
for big-endian MIPS n64, as follows:

i686:
   text    data     bss     dec     hex filename
 152255    4052     192  156499   26353 ld-2.27.9000-base.so
 152159    4052     192  156403   262f3 ld-2.27.9000-elf-symbol-value.so
MIPS/o32/el:
   text    data     bss     dec     hex filename
 142906    4396     260  147562   2406a ld-2.27.9000-base.so
 142890    4396     260  147546   2405a ld-2.27.9000-elf-symbol-value.so
MIPS/n32/el:
   text    data     bss     dec     hex filename
 142267    4404     260  146931   23df3 ld-2.27.9000-base.so
 142171    4404     260  146835   23d93 ld-2.27.9000-elf-symbol-value.so
MIPS/n64/el:
   text    data     bss     dec     hex filename
 149835    7376     408  157619   267b3 ld-2.27.9000-base.so
 149787    7376     408  157571   26783 ld-2.27.9000-elf-symbol-value.so
MIPS/o32/eb:
   text    data     bss     dec     hex filename
 142870    4396     260  147526   24046 ld-2.27.9000-base.so
 142854    4396     260  147510   24036 ld-2.27.9000-elf-symbol-value.so
MIPS/n32/eb:
   text    data     bss     dec     hex filename
 142019    4404     260  146683   23cfb ld-2.27.9000-base.so
 141923    4404     260  146587   23c9b ld-2.27.9000-elf-symbol-value.so
MIPS/n64/eb:
   text    data     bss     dec     hex filename
 149763    7376     408  157547   2676b ld-2.27.9000-base.so
 149779    7376     408  157563   2677b ld-2.27.9000-elf-symbol-value.so
x86-64:
   text    data     bss     dec     hex filename
 148462    6452     400  155314   25eb2 ld-2.27.9000-base.so
 148462    6452     400  155314   25eb2 ld-2.27.9000-elf-symbol-value.so

	[BZ #19818]
	* sysdeps/generic/ldsodefs.h (LOOKUP_VALUE_ADDRESS): Add `set'
	parameter.
	(SYMBOL_ADDRESS): New macro.
	[!ELF_FUNCTION_PTR_IS_SPECIAL] (DL_SYMBOL_ADDRESS): Use
	SYMBOL_ADDRESS for symbol address calculation.
	* elf/dl-runtime.c (_dl_fixup): Likewise.
	(_dl_profile_fixup): Likewise.
	* elf/dl-symaddr.c (_dl_symbol_address): Likewise.
	* elf/rtld.c (dl_main): Likewise.
	* sysdeps/aarch64/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/alpha/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/arm/dl-machine.h (elf_machine_rel): Likewise.
	(elf_machine_rela): Likewise.
	* sysdeps/hppa/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/hppa/dl-symaddr.c (_dl_symbol_address): Likewise.
	* sysdeps/i386/dl-machine.h (elf_machine_rel): Likewise.
	(elf_machine_rela): Likewise.
	* sysdeps/ia64/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/m68k/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/microblaze/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
	Likewise.
	(elf_machine_reloc): Likewise.
	(elf_machine_got_rel): Likewise.
	* sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise.
	* sysdeps/nios2/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/riscv/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/s390/s390-32/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/s390/s390-64/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/sh/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/tile/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2018-04-04 23:09:37 +01:00
Samuel Thibault
ad2b41bfd8 hurd: Bump remaining LGPL2+ htl licences to LGPL 2.1+
* htl/Makefile: Bump licence to LGPL 2.1+.
	* htl/alloca_cutoff.c: Likewise.
	* htl/cthreads-compat.c: Likewise.
	* htl/lockfile.c: Likewise.
	* htl/pt-alloc.c: Likewise.
	* htl/pt-cancel.c: Likewise.
	* htl/pt-cleanup.c: Likewise.
	* htl/pt-create.c: Likewise.
	* htl/pt-dealloc.c: Likewise.
	* htl/pt-detach.c: Likewise.
	* htl/pt-exit.c: Likewise.
	* htl/pt-getattr.c: Likewise.
	* htl/pt-initialize.c: Likewise.
	* htl/pt-internal.h: Likewise.
	* htl/pt-join.c: Likewise.
	* htl/pt-self.c: Likewise.
	* htl/pt-setcancelstate.c: Likewise.
	* htl/pt-setcanceltype.c: Likewise.
	* htl/pt-sigmask.c: Likewise.
	* htl/pt-spin-inlines.c: Likewise.
	* htl/pt-testcancel.c: Likewise.
	* htl/pt-yield.c: Likewise.
	* htl/tests/test-1.c: Likewise.
	* htl/tests/test-10.c: Likewise.
	* htl/tests/test-11.c: Likewise.
	* htl/tests/test-12.c: Likewise.
	* htl/tests/test-13.c: Likewise.
	* htl/tests/test-14.c: Likewise.
	* htl/tests/test-15.c: Likewise.
	* htl/tests/test-16.c: Likewise.
	* htl/tests/test-17.c: Likewise.
	* htl/tests/test-2.c: Likewise.
	* htl/tests/test-3.c: Likewise.
	* htl/tests/test-4.c: Likewise.
	* htl/tests/test-5.c: Likewise.
	* htl/tests/test-6.c: Likewise.
	* htl/tests/test-7.c: Likewise.
	* htl/tests/test-8.c: Likewise.
	* htl/tests/test-9.c: Likewise.
	* htl/tests/test-__pthread_destroy_specific-skip.c: Likewise.
	* sysdeps/htl/bits/cancelation.h: Likewise.
	* sysdeps/htl/bits/pthread-np.h: Likewise.
	* sysdeps/htl/bits/pthread.h: Likewise.
	* sysdeps/htl/bits/pthreadtypes.h: Likewise.
	* sysdeps/htl/bits/semaphore.h: Likewise.
	* sysdeps/htl/bits/types/__pthread_key.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_attr.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_barrier.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_barrierattr.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_cond.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_condattr.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_mutex.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_mutexattr.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_once.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_rwlock.h: Likewise.
	* sysdeps/htl/bits/types/struct___pthread_rwlockattr.h: Likewise.
	* sysdeps/htl/old_pt-atfork.c: Likewise.
	* sysdeps/htl/pt-atfork.c: Likewise.
	* sysdeps/htl/pt-attr-destroy.c: Likewise.
	* sysdeps/htl/pt-attr-getdetachstate.c: Likewise.
	* sysdeps/htl/pt-attr-getguardsize.c: Likewise.
	* sysdeps/htl/pt-attr-getinheritsched.c: Likewise.
	* sysdeps/htl/pt-attr-getschedparam.c: Likewise.
	* sysdeps/htl/pt-attr-getschedpolicy.c: Likewise.
	* sysdeps/htl/pt-attr-getscope.c: Likewise.
	* sysdeps/htl/pt-attr-getstack.c: Likewise.
	* sysdeps/htl/pt-attr-getstackaddr.c: Likewise.
	* sysdeps/htl/pt-attr-getstacksize.c: Likewise.
	* sysdeps/htl/pt-attr-init.c: Likewise.
	* sysdeps/htl/pt-attr-setdetachstate.c: Likewise.
	* sysdeps/htl/pt-attr-setguardsize.c: Likewise.
	* sysdeps/htl/pt-attr-setinheritsched.c: Likewise.
	* sysdeps/htl/pt-attr-setschedparam.c: Likewise.
	* sysdeps/htl/pt-attr-setschedpolicy.c: Likewise.
	* sysdeps/htl/pt-attr-setscope.c: Likewise.
	* sysdeps/htl/pt-attr-setstack.c: Likewise.
	* sysdeps/htl/pt-attr-setstackaddr.c: Likewise.
	* sysdeps/htl/pt-attr-setstacksize.c: Likewise.
	* sysdeps/htl/pt-attr.c: Likewise.
	* sysdeps/htl/pt-barrier-destroy.c: Likewise.
	* sysdeps/htl/pt-barrier-init.c: Likewise.
	* sysdeps/htl/pt-barrier-wait.c: Likewise.
	* sysdeps/htl/pt-barrier.c: Likewise.
	* sysdeps/htl/pt-barrierattr-destroy.c: Likewise.
	* sysdeps/htl/pt-barrierattr-getpshared.c: Likewise.
	* sysdeps/htl/pt-barrierattr-init.c: Likewise.
	* sysdeps/htl/pt-barrierattr-setpshared.c: Likewise.
	* sysdeps/htl/pt-cond-brdcast.c: Likewise.
	* sysdeps/htl/pt-cond-destroy.c: Likewise.
	* sysdeps/htl/pt-cond-init.c: Likewise.
	* sysdeps/htl/pt-cond-signal.c: Likewise.
	* sysdeps/htl/pt-cond-timedwait.c: Likewise.
	* sysdeps/htl/pt-cond-wait.c: Likewise.
	* sysdeps/htl/pt-cond.c: Likewise.
	* sysdeps/htl/pt-condattr-destroy.c: Likewise.
	* sysdeps/htl/pt-condattr-getclock.c: Likewise.
	* sysdeps/htl/pt-condattr-getpshared.c: Likewise.
	* sysdeps/htl/pt-condattr-init.c: Likewise.
	* sysdeps/htl/pt-condattr-setclock.c: Likewise.
	* sysdeps/htl/pt-condattr-setpshared.c: Likewise.
	* sysdeps/htl/pt-destroy-specific.c: Likewise.
	* sysdeps/htl/pt-equal.c: Likewise.
	* sysdeps/htl/pt-getconcurrency.c: Likewise.
	* sysdeps/htl/pt-getcpuclockid.c: Likewise.
	* sysdeps/htl/pt-getschedparam.c: Likewise.
	* sysdeps/htl/pt-getspecific.c: Likewise.
	* sysdeps/htl/pt-init-specific.c: Likewise.
	* sysdeps/htl/pt-key-create.c: Likewise.
	* sysdeps/htl/pt-key-delete.c: Likewise.
	* sysdeps/htl/pt-key.h: Likewise.
	* sysdeps/htl/pt-mutex-destroy.c: Likewise.
	* sysdeps/htl/pt-mutex-getprioceiling.c: Likewise.
	* sysdeps/htl/pt-mutex-init.c: Likewise.
	* sysdeps/htl/pt-mutex-lock.c: Likewise.
	* sysdeps/htl/pt-mutex-setprioceiling.c: Likewise.
	* sysdeps/htl/pt-mutex-timedlock.c: Likewise.
	* sysdeps/htl/pt-mutex-trylock.c: Likewise.
	* sysdeps/htl/pt-mutex-unlock.c: Likewise.
	* sysdeps/htl/pt-mutexattr-destroy.c: Likewise.
	* sysdeps/htl/pt-mutexattr-getprioceiling.c: Likewise.
	* sysdeps/htl/pt-mutexattr-getprotocol.c: Likewise.
	* sysdeps/htl/pt-mutexattr-getpshared.c: Likewise.
	* sysdeps/htl/pt-mutexattr-gettype.c: Likewise.
	* sysdeps/htl/pt-mutexattr-init.c: Likewise.
	* sysdeps/htl/pt-mutexattr-setprioceiling.c: Likewise.
	* sysdeps/htl/pt-mutexattr-setprotocol.c: Likewise.
	* sysdeps/htl/pt-mutexattr-setpshared.c: Likewise.
	* sysdeps/htl/pt-mutexattr-settype.c: Likewise.
	* sysdeps/htl/pt-mutexattr.c: Likewise.
	* sysdeps/htl/pt-once.c: Likewise.
	* sysdeps/htl/pt-rwlock-attr.c: Likewise.
	* sysdeps/htl/pt-rwlock-destroy.c: Likewise.
	* sysdeps/htl/pt-rwlock-init.c: Likewise.
	* sysdeps/htl/pt-rwlock-rdlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-timedrdlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-timedwrlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-tryrdlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-trywrlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-unlock.c: Likewise.
	* sysdeps/htl/pt-rwlock-wrlock.c: Likewise.
	* sysdeps/htl/pt-rwlockattr-destroy.c: Likewise.
	* sysdeps/htl/pt-rwlockattr-getpshared.c: Likewise.
	* sysdeps/htl/pt-rwlockattr-init.c: Likewise.
	* sysdeps/htl/pt-rwlockattr-setpshared.c: Likewise.
	* sysdeps/htl/pt-setconcurrency.c: Likewise.
	* sysdeps/htl/pt-setschedparam.c: Likewise.
	* sysdeps/htl/pt-setschedprio.c: Likewise.
	* sysdeps/htl/pt-setspecific.c: Likewise.
	* sysdeps/htl/pt-spin.c: Likewise.
	* sysdeps/htl/pt-startup.c: Likewise.
	* sysdeps/htl/pthread.h: Likewise.
	* sysdeps/htl/sem-close.c: Likewise.
	* sysdeps/htl/sem-destroy.c: Likewise.
	* sysdeps/htl/sem-getvalue.c: Likewise.
	* sysdeps/htl/sem-init.c: Likewise.
	* sysdeps/htl/sem-open.c: Likewise.
	* sysdeps/htl/sem-post.c: Likewise.
	* sysdeps/htl/sem-timedwait.c: Likewise.
	* sysdeps/htl/sem-trywait.c: Likewise.
	* sysdeps/htl/sem-unlink.c: Likewise.
	* sysdeps/htl/sem-wait.c: Likewise.
	* sysdeps/hurd/htl/pt-kill.c: Likewise.
	* sysdeps/i386/htl/pt-machdep.h: Likewise.
	* sysdeps/mach/htl/pt-block.c: Likewise.
	* sysdeps/mach/htl/pt-spin.c: Likewise.
	* sysdeps/mach/htl/pt-stack-alloc.c: Likewise.
	* sysdeps/mach/htl/pt-thread-alloc.c: Likewise.
	* sysdeps/mach/htl/pt-thread-start.c: Likewise.
	* sysdeps/mach/htl/pt-thread-terminate.c: Likewise.
	* sysdeps/mach/htl/pt-timedblock.c: Likewise.
	* sysdeps/mach/htl/pt-wakeup.c: Likewise.
	* sysdeps/mach/hurd/htl/bits/pthread-np.h: Likewise.
	* sysdeps/mach/hurd/htl/bits/types/struct___pthread_mutex.h: Likewise.
	* sysdeps/mach/hurd/htl/pt-attr-setstackaddr.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-attr-setstacksize.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-docancel.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-hurd-cond-timedwait.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-hurd-cond-wait.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-consistent.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-destroy.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-getprioceiling.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-init.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-lock.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-setprioceiling.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-timedlock.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-transfer-np.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-trylock.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex-unlock.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutex.h: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-destroy.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-getprioceiling.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-getprotocol.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-getpshared.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-getrobust.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-gettype.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-init.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-setprioceiling.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-setprotocol.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-setpshared.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-setrobust.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-mutexattr-settype.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-sigstate-destroy.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-sigstate-init.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-sigstate.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-sysdep.c: Likewise.
	* sysdeps/mach/hurd/htl/pt-sysdep.h: Likewise.
	* sysdeps/mach/hurd/i386/htl/pt-machdep.c: Likewise.
	* sysdeps/mach/hurd/i386/htl/pt-setup.c: Likewise.
2018-04-02 16:37:36 +02:00
Samuel Thibault
33574c17ee hurd: Add hurd thread library
Contributed by

Agustina Arzille <avarzille@riseup.net>
Amos Jeffries <squid3@treenet.co.nz>
David Michael <fedora.dm0@gmail.com>
Marco Gerards <marco@gnu.org>
Marcus Brinkmann <marcus@gnu.org>
Neal H. Walfield <neal@gnu.org>
Pino Toscano <toscano.pino@tiscali.it>
Richard Braun <rbraun@sceen.net>
Roland McGrath <roland@gnu.org>
Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas DiModica <ricinwich@yahoo.com>
Thomas Schwinge <tschwinge@gnu.org>

	* htl: New directory.
	* sysdeps/htl: New directory.
	* sysdeps/hurd/htl: New directory.
	* sysdeps/i386/htl: New directory.
	* sysdeps/mach/htl: New directory.
	* sysdeps/mach/hurd/htl: New directory.
	* sysdeps/mach/hurd/i386/htl: New directory.
	* nscd/Depend, resolv/Depend, rt/Depend: Add htl dependency.
	* sysdeps/mach/hurd/i386/Implies: Add mach/hurd/i386/htl imply.
	* sysdeps/mach/hurd/i386/libpthread.abilist: New file.
2018-04-02 01:44:14 +02:00
Andrew Senkevich
cd66c0e584 Fix i386 memmove issue (bug 22644).
[BZ #22644]
	* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Fixed
	branch conditions.
	* string/test-memmove.c (do_test2): New testcase.
2018-03-23 16:19:45 +01:00
Joseph Myers
8d3f9e85cf Add narrowing subtract functions.
This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.

The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

	* math/Makefile (libm-narrow-fns): Add sub.
	(libm-test-funcs-narrow): Likewise.
	* math/Versions (GLIBC_2.28): Add narrowing subtract functions.
	* math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW.
	* math/gen-auto-libm-tests.c (test_functions): Add sub.
	* math/math-narrow.h (CHECK_NARROW_SUB): New macro.
	(NARROW_SUB_ROUND_TO_ODD): Likewise.
	(NARROW_SUB_TRIVIAL): Likewise.
	* sysdeps/ieee754/float128/float128_private.h (__fsubl): New
	macro.
	(__dsubl): Likewise.
	* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and
	dsub.
	(CFLAGS-nldbl-dsub.c): New variable.
	(CFLAGS-nldbl-fsub.c): Likewise.
	* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
	__nldbl_dsubl.
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New
	prototype.
	* manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl,
	dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx.
	* math/auto-libm-test-in: Add tests of sub.
	* math/auto-libm-test-out-narrow-sub: New generated file.
	* math/libm-test-narrow-sub.inc: New file.
	* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fsub.c: Likewise.
	* sysdeps/ieee754/float128/s_f32subf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64subf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fsub.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Update.
	* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-03-20 00:34:52 +00:00
Joseph Myers
34ba96b89c Update i386 libm-test-ulps.
I found the i386 libm-test-ulps files needed updating (probably the
sqrt changes perturbed exactly when excess precision was used by the
compiler).

	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
2018-03-16 17:43:38 +00:00
Wilco Dijkstra
1294b1892e Add support for sqrt asm redirects
This patch series cleans up the many uses of  __ieee754_sqrt(f/l) in GLIBC.
The goal is to enable GCC to do the inlining, and if this fails call the
__ieee754_sqrt function.  This is done by internally declaring sqrt with asm
redirects.  The compat symbols and sqrt wrappers need to disable the redirect.
The redirect is also disabled if there are already redirects defined when
using -ffinite-math-only.

All math functions (but not math tests, non-library code and libnldbl) are
built with -fno-math-errno which means GCC will typically inline sqrt as a
single instruction.  This means targets are no longer forced to add a special
inline for sqrt.

	* include/math.h (sqrt): Declare with asm redirect.
	(sqrtf): Likewise.
	(sqrtl): Likewise.
	(sqrtf128): Likewise.
	* Makeconfig: Add -fno-math-errno for libc/libm, but build testsuite,
	nonlib and libnldbl with -fmath-errno.
	* math/w_sqrt_compat.c: Define NO_MATH_REDIRECT.
	* math/w_sqrt_template.c: Likewise.
	* math/w_sqrtf_compat.c: Likewise.
	* math/w_sqrtl_compat.c: Likewise.
	* sysdeps/i386/fpu/w_sqrt.c: Likewise.
	* sysdeps/i386/fpu/w_sqrt_compat.c: Likewise.
	* sysdeps/generic/math-type-macros-float128.h: Remove math.h and
	complex.h.
2018-03-15 19:21:35 +00:00
Samuel Thibault
a5df0318ef hurd: add gscope support
* elf/dl-support.c [!THREAD_GSCOPE_IN_TCB] (_dl_thread_gscope_count):
Define variable.
* sysdeps/generic/ldsodefs.h [!THREAD_GSCOPE_IN_TCB] (struct
rtld_global): Add _dl_thread_gscope_count member.
* sysdeps/mach/hurd/tls.h: Include <atomic.h>.
[!defined __ASSEMBLER__] (THREAD_GSCOPE_GLOBAL, THREAD_GSCOPE_SET_FLAG,
THREAD_GSCOPE_RESET_FLAG, THREAD_GSCOPE_WAIT): Define macros.
* sysdeps/generic/tls.h: Document THREAD_GSCOPE_IN_TCB.
* sysdeps/aarch64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/alpha/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/arm/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/hppa/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/i386/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/ia64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/m68k/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/microblaze/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/mips/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/nios2/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/powerpc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/riscv/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/s390/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/sh/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/sparc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/tile/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
* sysdeps/x86_64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
2018-03-11 13:06:33 +01:00
Joseph Myers
e2bcf6a855 Fix i386 fenv_private.h float128 for 32-bit --with-fpmath=sse (bug 22902).
As discussed in bug 22902, the i386 fenv_private.h implementation has
problems for float128 for the case of 32-bit glibc built with libgcc
from GCC configured using --with-fpmath=sse.

The optimized floating-point state handling in fenv_private.h needs to
know which floating-point state - x87 or SSE - is used for each
floating-point type, so that only one state needs updating / testing
for libm code using that state internally.  On 32-bit x86, the x87
rounding mode is always used for float128, but the x87 exception flags
are only used when libgcc is built using x87 floating-point
arithmetic; if libgcc is built for SSE arithmetic, the SSE exception
flags are used.

The choice of arithmetic with which libgcc is built is independent of
that with which glibc is built.  Thus, since glibc cannot tell the
choice used in libgcc, the default implementations of
libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128 (which
use the <fenv.h> functions, thus using both x87 and SSE state on
processors that have both) need to be used; this patch updates the
code accordingly.

Tested for 32-bit x86; HJ reports testing in the --with-fpmath=sse
case.

	[BZ #22902]
	* sysdeps/i386/fpu/fenv_private.h [!__x86_64__]
	(libc_feholdexcept_setroundf128): New macro.
	[!__x86_64__] (libc_feupdateenv_testf128): Likewise.
2018-02-28 21:55:51 +00:00
Wilco Dijkstra
610ee1fc93 Remove mplog and mpexp
Remove the now unused mplog and mpexp files.

	* math/Makefile: Remove mpexp.c and mplog.c
	* sysdeps/i386/fpu/mpexp.c: Delete file.
	* sysdeps/i386/fpu/mplog.c: Likewise.
	* sysdeps/ia64/fpu/mpexp.c: Likewise.
	* sysdeps/ia64/fpu/mplog.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog.
	* sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function.
	* sysdeps/ieee754/dbl-64/mpexp.c: Delete file.
	* sysdeps/ieee754/dbl-64/mplog.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/mplog.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*.
	* sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.
2018-02-15 12:41:05 +00:00
Szabolcs Nagy
de800d8305 Remove slow paths from exp
Remove the __slowexp code, so exp is no longer correctly rounded.  The
result is computed to about 70 bits precision so the worst case ulp
error is about 0.500007 in nearest rounding mode.

	* manual/probes.texi: Remove slowexp probes.
	* math/Makefile: Remove slowexp.
	* sysdeps/generic/math_private.h (__slowexp): Remove.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and
	document error bounds.
	* sysdeps/i386/fpu/slowexp.c: Remove.
	* sysdeps/ia64/fpu/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Remove.
	* sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.
2018-02-12 11:33:33 +00:00
Wilco Dijkstra
c3d466cba1 Remove slow paths from pow
Remove the slow paths from pow.  Like several other double precision math
functions, pow is exactly rounded.  This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP.  Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.

All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP.  A simple test over a few hundred million
values shows pow is 10% faster on average.  This fixes BZ #13932.

	[BZ #13932]
	* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
	* benchtests/pow-inputs: Update comment for slow path cases.
	* manual/probes.texi (slowpow_p10): Delete removed probe.
	(slowpow_p10): Likewise.
	* math/Makefile: Remove halfulp.c and slowpow.c.
	* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/generic/math_private.h (__exp1): Remove error argument.
	(__halfulp): Remove.
	(__slowpow): Remove.
	* sysdeps/i386/fpu/halfulp.c: Delete file.
	* sysdeps/i386/fpu/slowpow.c: Likewise.
	* sysdeps/ia64/fpu/halfulp.c: Likewise.
	* sysdeps/ia64/fpu/slowpow.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
	improve comments and add error analysis.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
	(power1): Remove function:
	(log1): Remove error argument, add error analysis.
	(my_log2): Remove function.
	* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
	* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
	* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
	* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
	slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
2018-02-12 10:47:09 +00:00
Joseph Myers
d8742dd82f Add narrowing add functions.
This patch adds the narrowing add functions from TS 18661-1 to glibc's
libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all
configurations; f32addf64x, f32addf128, f64addf64x, f64addf128,
f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with
_Float64x and _Float128; __nldbl_daddl for ldbl-opt.  As discussed for
the build infrastructure patch, tgmath.h support is deliberately
deferred, and FP_FAST_* macros are not applicable without optimized
function implementations.

Function implementations are added for all relevant pairs of formats
(including certain cases of a format and itself where more than one
type has that format).  The main implementations use round-to-odd, or
a trivial computation in the case where both formats are the same or
where the wider format is IBM long double (in which case we don't
attempt to be correctly rounding).  The sysdeps/ieee754/soft-fp
implementations use soft-fp, and are used automatically for
configurations without exceptions and rounding modes by virtue of
existing Implies files.  As previously discussed, optimized versions
for particular architectures are possible, but not included.

i386 gets a special version of f32xaddf64 to avoid problems with
double rounding (similar to the existing fdim version), since this
function must round just once without an intermediate rounding to long
double.  (No such special version is needed for any other function,
because the nontrivial functions use round-to-odd, which does the
intermediate computation with the rounding mode set to round-to-zero,
and double rounding is OK except in round-to-nearest mode, so is OK
for that intermediate round-to-zero computation.)  mul and div will
need slightly different special versions for i386 (using round-to-odd
on long double instead of precision control) because of the
possibility of inexact intermediate results in the subnormal range for
double.

To reduce duplication among the different function implementations,
math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD
and NARROW_ADD_TRIVIAL.

In the trivial cases and for any architecture-specific optimized
implementations, the overhead of the errno setting might be
significant, but I think that's best handled through compiler built-in
functions rather than providing separate no-errno versions in glibc
(and likewise there are no __*_finite entry points for these function
provided, __*_finite effectively being no-errno versions at present in
most cases).

Tested for x86_64 and x86, with both GCC 6 and GCC 7.  Tested for
mips64 (all three ABIs, both hard and soft float) and powerpc with GCC
7.  Tested with build-many-glibcs.py with both GCC 6 and GCC 7.

	* math/Makefile (libm-narrow-fns): Add add.
	(libm-test-funcs-narrow): Likewise.
	* math/Versions (GLIBC_2.28): Add narrowing add functions.
	* math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW .
	* math/gen-auto-libm-tests.c (test_functions): Add add.
	* math/math-narrow.h (CHECK_NARROW_ADD): New macro.
	(NARROW_ADD_ROUND_TO_ODD): Likewise.
	(NARROW_ADD_TRIVIAL): Likewise.
	* sysdeps/ieee754/float128/float128_private.h (__faddl): New
	macro.
	(__daddl): Likewise.
	* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and
	dadd.
	(CFLAGS-nldbl-dadd.c): New variable.
	(CFLAGS-nldbl-fadd.c): Likewise.
	* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
	__nldbl_daddl.
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New
	prototype.
	* manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl,
	daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx.
	* math/auto-libm-test-in: Add tests of add.
	* math/auto-libm-test-out-narrow-add: New generated file.
	* math/libm-test-narrow-add.inc: New file.
	* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fadd.c: Likewise.
	* sysdeps/ieee754/float128/s_f32addf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64addf128.c: Likewise.
	* sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_daddl.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_fadd.c: Likewise.
	* sysdeps/ieee754/soft-fp/s_faddl.c: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Update.
	* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-02-10 02:08:43 +00:00
Joseph Myers
63716ab270 Add build infrastructure for narrowing libm functions.
TS 18661-1 defines libm functions that carry out an operation (+ - * /
sqrt fma) on their arguments and return a result rounded to a
(usually) narrower type, as if the original result were computed to
infinite precision and then rounded directly to the result type
without any intermediate rounding to the argument type.  For example,
fadd, faddl and daddl for addition.  These are the last remaining TS
18661-1 functions left to be added to glibc.  TS 18661-3 extends this
to corresponding functions for _FloatN and _FloatNx types.

As functions parametrized by two rather than one varying
floating-point types, these functions require infrastructure in glibc
that was not required for previous libm functions.  This patch
provides such infrastructure - excluding test support, and actual
function implementations, which will be in subsequent patches.

Declaring the functions uses a header bits/mathcalls-narrow.h, which
is included many times, for each relevant pair of types.  This will
end up containing macro calls of the form

__MATHCALL_NARROW (__MATHCALL_NAME (add), __MATHCALL_REDIR_NAME (add), 2);

for each family of narrowing functions.  (The structure of this macro
call, with the calls to __MATHCALL_NAME and __MATHCALL_REDIR_NAME
there rather than in the definition of __MATHCALL_NARROW, arises from
the names such as "add" *not* themselves being reserved identifiers -
meaning it's necessary to avoid any indirection that would result in a
user-defined "add" macro being expanded.)  Whereas for existing
functions declaring long double functions is disabled if _LIBC in the
case where they alias double functions, to facilitate defining the
long double functions as aliases of the double ones, there is no such
logic for the narrowing functions in this patch.  Rather, the files
defining such functions are expected to use #define to hide the
original declarations of the alias names, to avoid errors about
defining aliases with incompatible types.

math/Makefile support is added for building the functions (listed in
libm-narrow-fns, currently empty) for all relevant pairs of types.  An
internal header math-narrow.h is added for macros shared between
multiple function implementations - currently a ROUND_TO_ODD macro to
facilitate writing functions using the round-to-odd implementation
approach, and alias macros to create all the required function
aliases.  libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128
are added for use when required (only for x86_64).  float128_private.h
support is added for ldbl-128 narrowing functions to be used for
_Float128.

Certain things are specifically omitted from this patch and the
immediate followups.  tgmath.h support is deferred; there remain
unresolved questions about how the type-generic macros for these
functions are supposed to work, especially in the case of arguments of
integer type.  The math.h / bits/mathcalls-narrow.h logic, and the
logic for determining what functions / aliases to define, will need
some adjustments to support the sqrt and fma functions, where
e.g. f32xsqrtf64 can just be an alias for sqrt rather than a separate
function.  TS 18661-1 defines FP_FAST_* macros but no support is
included for defining them (they won't in general be true without
architecture-specific optimized function versions).

For each of the function groups (add sub mul div sqrt fma) there are
always six functions present (e.g. fadd, faddl, daddl, f32addf64,
f32addf32x, f32xaddf64).  When _Float64x and _Float128 are supported,
there are seven more (e.g. f32addf64x, f32addf128, f64addf64x,
f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128).  In addition, in
the ldbl-opt case there are function names such as __nldbl_daddl (an
alias for f32xaddf64, which is not a reserved name in TS 18661-1, only
in TS 18661-3), for calls to daddl to be mapped to in the
-mlong-double-64 case.  (Calls to faddl just get mapped to fadd, and
for sqrt and fma there won't be __nldbl_* functions because dsqrtl and
dfmal can just be mapped to sqrt and fma with -mlong-double-64.)

While there are six or thirteen functions present in each group (plus
__nldbl_* names only as an ABI, not an API), not all are distinct;
they fall in various groups of aliases.  There are two distinct
versions built if long double has the same format as double; four if
they have distinct formats but there is no _Float64x or _Float128
support; five if long double has binary128 format; seven when
_Float128 is distinct from long double.

Architecture-specific optimized versions are possible, but not
included in my patches.  For example, IA64 generally supports
narrowing the result of most floating-point instructions; Power ISA
2.07 (POWER8) supports double values as arguments to float
instructions, with the results narrowed as expected; Power ISA 3
(POWER9) supports round-to-odd for float128 instructions, so meaning
that approach can be used without needing to set and restore the
rounding mode and test "inexact".  I intend to leave any such
optimized versions to the architecture maintainers.  Generally in such
cases it would also make sense for calls to these functions to be
expanded inline (given -fno-math-errno); I put a suggestion for TS
18661-1 built-in functions at <https://gcc.gnu.org/wiki/SummerOfCode>.

Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with further patches).

	* math/bits/mathcalls-narrow.h: New file.
	* include/bits/mathcalls-narrow.h: Likewise.
	* math/math-narrow.h: Likewise.
	* math/math.h (__MATHCALL_NARROW_ARGS_1): New macro.
	(__MATHCALL_NARROW_ARGS_2): Likewise.
	(__MATHCALL_NARROW_ARGS_3): Likewise.
	(__MATHCALL_NARROW_NORMAL): Likewise.
	(__MATHCALL_NARROW_REDIR): Likewise.
	(__MATHCALL_NARROW): Likewise.
	[__GLIBC_USE (IEC_60559_BFP_EXT)]: Repeatedly include
	<bits/mathcalls-narrow.h> with _Mret_, _Marg_ and __MATHCALL_NAME
	defined.
	[__GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
	* math/Makefile (headers): Add bits/mathcalls-narrow.h.
	(libm-narrow-fns): New variable.
	(libm-narrow-types-basic): Likewise.
	(libm-narrow-types-ldouble-yes): Likewise.
	(libm-narrow-types-float128-yes): Likewise.
	(libm-narrow-types-float128-alias-yes): Likewise.
	(libm-narrow-types): Likewise.
	(libm-routines): Add narrowing functions.
	* sysdeps/i386/fpu/fenv_private.h [__x86_64__]
	(libc_feholdexcept_setroundf128): New macro.
	[__x86_64__] (libc_feupdateenv_testf128): Likewise.
	* sysdeps/ieee754/float128/float128_private.h: Include
	<math/math-narrow.h>.
	[libc_feholdexcept_setroundf128] (libc_feholdexcept_setroundl):
	Undefine and redefine.
	[libc_feupdateenv_testf128] (libc_feupdateenv_testl): Likewise.
	(libm_alias_float_ldouble): Undefine and redefine.
	(libm_alias_double_ldouble): Likewise.
2018-02-09 21:18:52 +00:00
H.J. Lu
f886c16ca5 i386: Use __glibc_likely/__glibc_likely in dl-machine.h
The differences in elf/dl-reloc.os are

--- before    	2018-02-05 03:53:31.970492246 -0800
+++ after     	2018-02-05 03:53:49.719902340 -0800
@@ -1202,9 +1202,9 @@ _dl_relocate_object:
 	movl	-60(%ebp), %eax
 	testl	%eax, %eax
 	je	.L249
-	movl	8(%eax), %eax
-	movl	8(%ebx), %esi
-	cmpl	%esi, %eax
+	movl	8(%eax), %esi
+	movl	8(%ebx), %eax
+	cmpl	%eax, %esi
 	ja	.L284
 	jb	.L707
 .L285:
@@ -2255,7 +2255,7 @@ _dl_relocate_object:
 	cmpl	$6, %edi
 	movl	$4, %edx
 	je	.L132
-	cmpl	%ecx, %eax
+	cmpl	%eax, %ecx
 	je	.L350
 	cmpl	$7, %edi
 	je	.L419
@@ -2735,7 +2735,7 @@ _dl_relocate_object:
 	je	.L120
 .L121:
 	movl	-96(%ebp), %edx
-	movl	$640, 8(%esp)
+	movl	$639, 8(%esp)
 	leal	__PRETTY_FUNCTION__.9431@GOTOFF(%edx), %eax
 	movl	%eax, 12(%esp)
 	leal	.LC9@GOTOFF(%edx), %eax
@@ -3454,10 +3454,10 @@ _dl_relocate_object:
 	movl	-152(%ebp), %eax
 	movl	%eax, 4(%esp)
 	call	_dl_dprintf
-	movl	-60(%ebp), %eax
-	movl	8(%ebx), %esi
+	movl	8(%ebx), %eax
+	movl	-60(%ebp), %ebx
 	movl	-112(%ebp), %edx
-	movl	8(%eax), %eax
+	movl	8(%ebx), %esi
 	jmp	.L285
 .L713:
 	movl	%esi, (%esp)

	* sysdeps/i386/dl-machine.h (elf_machine_rel): Replace
	__builtin_expect with __glibc_likely and __glibc_unlikely.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
2018-02-05 06:22:40 -08:00
Carlos O'Donell
2ec0e7eade Revert Intel CET changes to __jmp_buf_tag (Bug 22743)
In commit cba595c350 and commit
f81ddabffd, ABI compatibility with
applications was broken by increasing the size of the on-stack
allocated __pthread_unwind_buf_t beyond the oringal size.
Applications only have the origianl space available for
__pthread_unwind_register, and __pthread_unwind_next to use,
any increase in the size of __pthread_unwind_buf_t causes these
functions to write beyond the original structure into other
on-stack variables leading to segmentation faults in common
applications like vlc. The only workaround is to version those
functions which operate on the old sized objects, but this must
happen in glibc 2.28.

Thank you to Andrew Senkevich, H.J. Lu, and Aurelien Jarno, for
submitting reports and tracking the issue down.

The commit reverts the above mentioned commits and testing on
x86_64 shows that the ABI compatibility is restored. A tst-cleanup1
regression test linked with an older glibc now passes when run
with the newly built glibc. Previously a tst-cleanup1 linked with
an older glibc would segfault when run with an affected glibc build.

Tested on x86_64 with no regressions.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
2018-01-25 23:43:46 -08:00
Samuel Thibault
107a35a575 i386: Regenerate libm-test-ulps for for gcc 7 on i686
* sysdeps/i386/fpu/libm-test-ulps: Regenerated for GCC 7 with
	"-O2 -march=i686".
2018-01-06 22:11:40 +01:00
Samuel Thibault
4a5ce6e908 hurd: Fix build without NO_HIDDEN
* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic) [NO_RTLD_HIDDEN]: Call
JUMPTARGET (___tls_get_addr) instead of HIDDEN_JUMPTARGET (___tls_get_addr).
* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Likewise.
2018-01-06 18:20:18 +01:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Joseph Myers
f1e005022e Revert exp reimplementation (causes test failures).
Revert:

	2017-12-19  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/x86_64/fpu/libm-test-ulps: Update.

	2017-12-19  Patrick McGehearty  <patrick.mcgehearty@oracle.com>

	* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
	<errno.h>.  Include "eexp.tbl".
	(half): New constant.
	(one): Likewise.
	(__ieee754_exp): Rewrite.
	(__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
	* sysdeps/i386/fpu/slowexp.c: Likewise.
	* sysdeps/ia64/fpu/slowexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
	* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
	comment.
	* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
	(CPPFLAGS-slowexp.c): Remove variable.
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
	(CFLAGS-slowexp-fma.c): Remove variable.
	(CFLAGS-slowexp-fma4.c): Likewise.
	(CFLAGS-slowexp-avx.c): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
	define as macro.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
	* math/Makefile (type-double-routines): Remove slowexp.
	* manual/probes.texi (slowexp_p6): Remove.
	(slowexp_p32): Likewise.
2017-12-19 18:11:37 +00:00
Patrick McGehearty
6fd0a3c6a8 Improve __ieee754_exp() performance by greater than 5x on sparc/x86.
These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf.

Typical performance gains is typically around 5x when measured on
Sparc s7 for common values between exp(1) and exp(40).

Using the glibc perf tests on sparc,
      sparc (nsec)    x86 (nsec)
      old     new     old     new
max   17629   395    5173     144
min     399    54      15      13
mean   5317   200    1349      23

The extreme max times for the old (ieee754) exp are due to the
multiprecision computation in the old algorithm when the true value is
very near 0.5 ulp away from an value representable in double
precision. The new algorithm does not take special measures for those
cases. The current glibc exp perf tests overrepresent those values.
Informal testing suggests approximately one in 200 cases might
invoke the high cost computation. The performance advantage of the new
algorithm for other values is still large but not as large as indicated
by the chart above.

Glibc correctness tests for exp() and expf() were run. Within the
test suite 3 input values were found to cause 1 bit differences (ulp)
when "FE_TONEAREST" rounding mode is set. No differences in exp() were
seen for the tested values for the other rounding modes.
Typical example:
exp(-0x1.760cd2p+0)  (-1.46113312244415283203125)
 new code:    2.31973271630014299393707e-01   0x1.db14cd799387ap-3
 old code:    2.31973271630014271638132e-01   0x1.db14cd7993879p-3
    exp    =  2.31973271630014285508337 (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp

In addition, because ieee754_exp() is used by other routines, cexp()
showed test results with very small imaginary input values where the
imaginary portion of the result was off by 3 ulp when in upward
rounding mode, but not in the other rounding modes.  For x86, tgamma
showed a few values where the ulp increased to 6 (max ulp for tgamma
is 5). Sparc tgamma did not show these failures.  I presume the tgamma
differences are due to compiler optimization differences within the
gamma function.The gamma function is known to be difficult to compute
accurately.

	* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
	<errno.h>.  Include "eexp.tbl".
	(half): New constant.
	(one): Likewise.
	(__ieee754_exp): Rewrite.
	(__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
	* sysdeps/i386/fpu/slowexp.c: Likewise.
	* sysdeps/ia64/fpu/slowexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
	* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
	comment.
	* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
	(CPPFLAGS-slowexp.c): Remove variable.
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
	(CFLAGS-slowexp-fma.c): Remove variable.
	(CFLAGS-slowexp-fma4.c): Likewise.
	(CFLAGS-slowexp-avx.c): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
	define as macro.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
	* math/Makefile (type-double-routines): Remove slowexp.
	* manual/probes.texi (slowexp_p6): Remove.
	(slowexp_p32): Likewise.
2017-12-19 17:27:31 +00:00