mirror of
https://sourceware.org/git/glibc.git
synced 2024-11-29 08:11:08 +00:00
42d3593505
1134 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Paul A. Clarke
|
102b5b0caf |
Remove duplicate inline implementation of issignalingf
Very recent commit
|
||
Alistair Francis
|
aa706e13f4 |
Split up endian.h to minimize exposure of BYTE_ORDER.
With only two exceptions (sys/types.h and sys/param.h, both of which historically might have defined BYTE_ORDER) the public headers that include <endian.h> only want to be able to test __BYTE_ORDER against __*_ENDIAN. This patch creates a new bits/endian.h that can be included by any header that wants to be able to test __BYTE_ORDER and/or __FLOAT_WORD_ORDER against the __*_ENDIAN constants, or needs __LONG_LONG_PAIR. It only defines macros in the implementation namespace. The existing bits/endian.h (which could not be included independently of endian.h, and only defines __BYTE_ORDER and maybe __FLOAT_WORD_ORDER) is renamed to bits/endianness.h. I also took the opportunity to canonicalize the form of this header, which we are stuck with having one copy of per architecture. Since they are so short, this means git doesn’t understand that they were renamed from existing headers, sigh. endian.h itself is a nonstandard header and its only remaining use from a standard header is guarded by __USE_MISC, so I dropped the __USE_MISC conditionals from around all of the public-namespace things it defines. (This means, an application that requests strict library conformance but includes endian.h will still see the definition of BYTE_ORDER.) A few changes to specific bits/endian(ness).h variants deserve mention: - sysdeps/unix/sysv/linux/ia64/bits/endian.h is moved to sysdeps/ia64/bits/endianness.h. If I remember correctly, ia64 did have selectable endianness, but we have assembly code in sysdeps/ia64 that assumes it’s little-endian, so there is no reason to treat the ia64 endianness.h as linux-specific. - The C-SKY port does not fully support big-endian mode, the compile will error out if __CSKYBE__ is defined. - The PowerPC port had extra logic in its bits/endian.h to detect a broken compiler, which strikes me as unnecessary, so I removed it. - The only files that defined __FLOAT_WORD_ORDER always defined it to the same value as __BYTE_ORDER, so I removed those definitions. 
The SH bits/endian(ness).h had comments inconsistent with the actual setting of __FLOAT_WORD_ORDER, which I also removed. - I *removed* copyright boilerplate from the few bits/endian(ness).h headers that had it; these files record a single fact in a fashion dictated by an external spec, so I do not think they are copyrightable. As long as I was changing every copy of ieee754.h in the tree, I noticed that only the MIPS variant includes float.h, because it uses LDBL_MANT_DIG to decide among three different versions of ieee854_long_double. This patch makes it not include float.h when GCC’s intrinsic __LDBL_MANT_DIG__ is available. * string/endian.h: Unconditionally define LITTLE_ENDIAN, BIG_ENDIAN, PDP_ENDIAN, and BYTE_ORDER. Condition byteswapping macros only on !__ASSEMBLER__. Move the definitions of __BIG_ENDIAN, __LITTLE_ENDIAN, __PDP_ENDIAN, __FLOAT_WORD_ORDER, and __LONG_LONG_PAIR to... * string/bits/endian.h: ...this new file, which includes the renamed header bits/endianness.h for the definition of __BYTE_ORDER and possibly __FLOAT_WORD_ORDER. * string/Makefile: Install bits/endianness.h. * include/bits/endian.h: New wrapper. * bits/endian.h: Rename to bits/endianness.h. Add multiple-include guard. Rewrite the comment explaining what the machine-specific variants of this file should do. * sysdeps/unix/sysv/linux/ia64/bits/endian.h: Move to sysdeps/ia64. * sysdeps/aarch64/bits/endian.h * sysdeps/alpha/bits/endian.h * sysdeps/arm/bits/endian.h * sysdeps/csky/bits/endian.h * sysdeps/hppa/bits/endian.h * sysdeps/ia64/bits/endian.h * sysdeps/m68k/bits/endian.h * sysdeps/microblaze/bits/endian.h * sysdeps/mips/bits/endian.h * sysdeps/nios2/bits/endian.h * sysdeps/powerpc/bits/endian.h * sysdeps/riscv/bits/endian.h * sysdeps/s390/bits/endian.h * sysdeps/sh/bits/endian.h * sysdeps/sparc/bits/endian.h * sysdeps/x86/bits/endian.h: Rename to endianness.h; canonicalize form of file; remove redundant definitions of __FLOAT_WORD_ORDER. 
* sysdeps/powerpc/bits/endianness.h: Remove logic to check for broken compilers. * ctype/ctype.h * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h * sysdeps/arm/nptl/bits/pthreadtypes-arch.h * sysdeps/csky/nptl/bits/pthreadtypes-arch.h * sysdeps/ia64/ieee754.h * sysdeps/ieee754/ieee754.h * sysdeps/ieee754/ldbl-128/ieee754.h * sysdeps/ieee754/ldbl-128ibm/ieee754.h * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h * sysdeps/mips/ieee754/ieee754.h * sysdeps/mips/nptl/bits/pthreadtypes-arch.h * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h * sysdeps/nptl/pthread.h * sysdeps/riscv/nptl/bits/pthreadtypes-arch.h * sysdeps/sh/nptl/bits/pthreadtypes-arch.h * sysdeps/sparc/sparc32/ieee754.h * sysdeps/unix/sysv/linux/generic/bits/stat.h * sysdeps/unix/sysv/linux/generic/bits/statfs.h * sysdeps/unix/sysv/linux/sys/acct.h * wctype/bits/wctype-wchar.h: Include bits/endian.h, not endian.h. * sysdeps/unix/sysv/linux/hppa/pthread.h: Don’t include endian.h. * sysdeps/mips/ieee754/ieee754.h: Use __LDBL_MANT_DIG__ in ifdefs, instead of LDBL_MANT_DIG. Only include float.h when __LDBL_MANT_DIG__ is not predefined, in which case define __LDBL_MANT_DIG__ to equal LDBL_MANT_DIG. |
||
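The commit above says that an application including the nonstandard endian.h header sees BYTE_ORDER and the *_ENDIAN constants. A minimal sketch of how application code might test them, assuming a glibc-style endian.h; the sketch_* function names are hypothetical and not part of glibc:

```c
/* Harmless on current glibc; older releases only defined BYTE_ORDER
   under __USE_MISC, which _DEFAULT_SOURCE enables.  */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte order as seen by the preprocessor.  */
int
sketch_macro_says_little_endian (void)
{
#if BYTE_ORDER == LITTLE_ENDIAN
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: byte order as observed in memory at run time.
   The lowest-addressed byte of the value 1 is 1 only on a
   little-endian machine.  */
int
sketch_memory_says_little_endian (void)
{
  const uint32_t probe = 1;
  unsigned char first_byte;
  memcpy (&first_byte, &probe, sizeof first_byte);
  return first_byte == 1;
}

/* Hypothetical self-check: both views are expected to agree.  */
void
sketch_check_byte_order (void)
{
  assert (sketch_macro_says_little_endian ()
          == sketch_memory_says_little_endian ());
}
```

Library headers, by contrast, would include bits/endian.h and test __BYTE_ORDER against the __*_ENDIAN constants, as the commit message describes.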
Paul Eggert
|
5a82c74822 |
Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S |
||
Joseph Myers
|
42760d7646 |
Make totalorder and totalordermag functions take pointer arguments.
The resolution of C floating-point Clarification Request 25 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2397.htm#dr_25> is that the totalorder and totalordermag functions should take pointer arguments, and this has been adopted in C2X (with const added; note that the integration of this change into C2X is present in the C standard git repository but postdates the most recent public PDF draft). This patch updates glibc accordingly. As a defect resolution, the API is changed unconditionally rather than supporting any sort of TS 18661-1 mode for compilation with the old version of the API. There are compat symbols for existing binaries that pass floating-point arguments directly. As a consequence of changing to pointer arguments, there are no longer type-generic macros in tgmath.h for these functions. Because of the fairly complicated logic for creating libm function aliases and determining the set of aliases to create in a given glibc configuration, rather than duplicating all that in individual source files to create the versioned and compat symbols, the source files for the various versions of totalorder functions are set up to redefine weak_alias before using libm_alias_* macros to create the symbols required. In turn, this requires creating a separate alias for each symbol version pointing to the same implementation (see binutils bug <https://sourceware.org/bugzilla/show_bug.cgi?id=23840>), which is done automatically using __COUNTER__. (As I noted in <https://sourceware.org/ml/libc-alpha/2018-10/msg00631.html>, it might well make sense for glibc's symbol versioning macros to do that alias creation with __COUNTER__ themselves, which would somewhat simplify the logic in the totalorder source files.) It is of course desirable to test the compat symbols. 
I did this with the generic libm-test machinery, but didn't wish to duplicate the actual tables of test inputs and outputs, and thought it risky to attempt to have a single object file refer to both default and compat versions of the same function in order to test them together. Thus, I created libm-test-compat_totalorder.inc and libm-test-compat_totalordermag.inc which include the generated .c files (with the processed version of those tables of inputs) from the non-compat tests, and added appropriate dependencies. I think this provides sufficient test coverage for the compat symbols without also needing to make the special ldbl-96 and ldbl-128ibm tests (of peculiarities relating to the representations of those formats that can't be covered in the generic tests) run for the compat symbols. Tests of compat symbols need to be internal tests, meaning _ISOMAC is not defined. Making some libm-test tests into internal tests showed up two other issues. GCC diagnoses duplicate macro definitions of __STDC_* macros, including __STDC_WANT_IEC_60559_TYPES_EXT__; I added an appropriate conditional and filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91451> for this issue. On ia64, include/setjmp.h ends up getting included indirectly from libm-symbols.h, resulting in conflicting definitions of the STR macro (also defined in libm-test-driver.c); I renamed the macros in include/setjmp.h. (It's arguable that we should have common internal headers used everywhere for stringizing and concatenation macros.) Tested for x86_64 and x86, and with build-many-glibcs.py. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN] (totalorder): Take pointer arguments. [__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN] (totalordermag): Likewise. * manual/arith.texi (totalorder): Likewise. (totalorderf): Likewise. (totalorderl): Likewise. (totalorderfN): Likewise. (totalorderfNx): Likewise. (totalordermag): Likewise. (totalordermagf): Likewise. 
(totalordermagl): Likewise. (totalordermagfN): Likewise. (totalordermagfNx): Likewise. * math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): Remove macro. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): Likewise. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): Likewise. * math/Versions (GLIBC_2.31): Add totalorder, totalorderf, totalorderl, totalordermag, totalordermagf, totalordermagl, totalorderf32, totalorderf64, totalorderf32x, totalordermagf32, totalordermagf64, totalordermagf32x, totalorderf64x, totalordermagf64x, totalorderf128 and totalordermagf128. * math/Makefile (libm-test-funcs-noauto): Add compat_totalorder and compat_totalordermag. (libm-test-funcs-compat): New variable. (libm-tests-compat): Likewise. (tests): Do not include compat tests. (tests-internal): Add compat tests. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalorder.o)): Depend on $(objpfx)libm-test-totalorder.c. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalordermag.o): Depend on $(objpfx)libm-test-totalordermag.c. (tgmath3-macros): Remove totalorder and totalordermag. * math/libm-test-compat_totalorder.inc: New file. * math/libm-test-compat_totalordermag.inc: Likewise. * math/libm-test-driver.c (struct test_ff_i_data): Update comment. (RUN_TEST_fpfp_b): New macro. (RUN_TEST_LOOP_fpfp_b): Likewise. * math/libm-test-totalorder.inc (totalorder_test_data): Use TEST_fpfp_b. (totalorder_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/libm-test-totalordermag.inc (totalordermag_test_data): Use TEST_fpfp_b. (totalordermag_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/gen-tgmath-tests.py (Tests.add_all_tests): Remove totalorder and totalordermag. * math/test-tgmath.c (NCALLS): Change to 132. (F(compile_test)): Do not call totalorder or totalordermag. (F(totalorder)): Remove. (F(totalordermag)): Likewise. * include/float.h (__STDC_WANT_IEC_60559_TYPES_EXT__): Do not define if [__STDC_WANT_IEC_60559_TYPES_EXT__]. 
* include/setjmp.h [!_ISOMAC] (STR_HELPER): Rename to SJSTR_HELPER. [!_ISOMAC] (STR): Rename to SJSTR. Update call to STR_HELPER. [!_ISOMAC] (TEST_SIZE): Update call to STR. [!_ISOMAC] (TEST_ALIGN): Likewise. [!_ISOMAC] (TEST_OFFSET): Likewise. * sysdeps/ieee754/dbl-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/float128/float128_private.h (__totalorder_compatl): New macro. (__totalordermag_compatl): Likewise. * sysdeps/ieee754/flt-32/s_totalorderf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/flt-32/s_totalordermagf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include <shlib-compat.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. 
* sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include <shlib-compat.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c (totalorderl): Take pointer arguments. * sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c (totalordermagl): Likewise. * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/csky/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
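To illustrate the pointer-argument interface described in the commit above, here is a minimal sketch of a totalorder-style predicate for double. It is not glibc's implementation; the name sketch_totalorder is hypothetical, and the code assumes double is the 64-bit IEEE 754 binary64 format:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (double) == sizeof (int64_t),
               "sketch assumes a 64-bit double");

/* Hypothetical sketch: returns nonzero if *x precedes or equals *y in
   the IEEE 754 total order.  Arguments are passed by pointer, matching
   the shape of the revised interface.  */
int
sketch_totalorder (const double *x, const double *y)
{
  int64_t ix, iy;
  assert (x != NULL && y != NULL);
  memcpy (&ix, x, sizeof ix);
  memcpy (&iy, y, sizeof iy);
  /* IEEE values are sign-magnitude.  Inverting the 63 magnitude bits
     of negative values makes plain signed integer comparison agree
     with the total order (so -0.0 sorts before +0.0).  */
  if (ix < 0)
    ix ^= INT64_MAX;
  if (iy < 0)
    iy ^= INT64_MAX;
  return ix <= iy;
}
```

Unlike the `<=` operator, this predicate distinguishes the two zeros: with `double nz = -0.0, pz = 0.0;`, the call `sketch_totalorder (&nz, &pz)` is true while `sketch_totalorder (&pz, &nz)` is false.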
Adhemerval Zanella
|
105f2ed368 |
math: Use wordsize-64 version for s_logb
- The resulting binary difference on 32-bit architectures is minimal. On i686-linux-gnu (with the architecture-optimized routine removed) the logb benchtests show no difference. - It helps wordsize-64 architectures that use ldbl-opt. - It simplifies the code by reducing duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Move to ... * sysdeps/ieee754/dbl-64/s_logb.c: ... here. Add workaround for powerpc32 integer 0 converting to -0. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> |
||
Adhemerval Zanella
|
a72186761b |
math: Use wordsize-64 version for finite
- math.h uses the compiler builtin with GCC 4.4 and later when built without -fsignaling-nans, and the builtin is expanded inline on all supported architectures. For example, libm contains no internal calls to finite on the architectures I checked (x86, arm, aarch64, and powerpc). - The resulting binary difference on 32-bit architectures is minimal for this non-hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It simplifies the code by reducing duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ... * sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> |
||
Adhemerval Zanella
|
a8c590f789 |
math: Use wordsize-64 version for isinf
- math.h uses the compiler builtin with GCC 4.4 and later when built without -fsignaling-nans, and the builtin is expanded inline on all supported architectures. For example, libm contains no internal calls to isinf on the architectures I checked (x86, arm, aarch64, and powerpc). - The resulting binary difference on 32-bit architectures is minimal for this non-hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It simplifies the code by reducing duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ... * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> |
||
Adhemerval Zanella
|
197dbda1a1 |
math: Use wordsize-64 version for isnan
- math.h uses the compiler builtin with GCC 4.4 and later when built without -fsignaling-nans, and the builtin is expanded inline on all supported architectures. For example, libm contains no internal calls to isnan on the architectures I checked (x86, arm, aarch64, and powerpc). - The resulting binary difference on 32-bit architectures is minimal for this non-hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It simplifies the code by reducing duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ... * sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> |
||
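The four "wordsize-64" commits above keep the variants that treat a double as one 64-bit word. A minimal sketch of that style of classification, assuming an IEEE 754 binary64 double; the sketch_* names and constants are hypothetical and this is not the code that was moved:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (double) == sizeof (uint64_t),
               "sketch assumes a 64-bit double");

#define SKETCH_ABS_MASK UINT64_C (0x7fffffffffffffff) /* all but the sign bit */
#define SKETCH_INF_BITS UINT64_C (0x7ff0000000000000) /* exponent all ones */

/* Copy the representation of X into a single 64-bit integer.  */
static uint64_t
sketch_bits (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* NaN: exponent all ones and a nonzero fraction.  */
int
sketch_isnan (double x)
{
  return (sketch_bits (x) & SKETCH_ABS_MASK) > SKETCH_INF_BITS;
}

/* Infinity: exponent all ones and a zero fraction.
   Returns 1 for +Inf, -1 for -Inf, 0 otherwise.  */
int
sketch_isinf (double x)
{
  uint64_t bits = sketch_bits (x);
  if ((bits & SKETCH_ABS_MASK) != SKETCH_INF_BITS)
    return 0;
  return (bits >> 63) ? -1 : 1;
}

/* Finite: exponent not all ones.  */
int
sketch_finite (double x)
{
  return (sketch_bits (x) & SKETCH_ABS_MASK) < SKETCH_INF_BITS;
}
```

Each predicate needs a single 64-bit load and compare, where a 32-bit-oriented variant would handle the high and low words separately.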
Maciej W. Rozycki
|
87c266d758 |
Fix -O1 compilation errors with '__ddivl' and '__fdivl' [BZ #19444]
Complementing commit |
||
Gabriel F. T. Gomes
|
f0eaf86276 |
ldbl-opt: Reuse test cases from misc/ that check long double
This patch adds test cases for the compatibility versions of the functions: err, errx, verr, verrx, warn, warnx, vwarn, vwarnx (from err.h), error, and error_at_line (from error.h), when long double has the same format as double (-mlong-double-64). Tested for powerpc, powerpc64 and powerpc64le. |
||
Gabriel F. T. Gomes
|
d11086a939 |
ldbl-opt: Add error and error_at_line (bug 23984)
On platforms where long double may have the same format as double (-mlong-double-64), error and error_at_line do not take that into account and might produce wrong output if a long double conversion is requested by the format string ('%Lf'). This patch adds compatibility functions for this situation and redirects calls via header magic. Tested for powerpc, powerpc64 and powerpc64le. |
||
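The kind of call affected by the commit above can be sketched as follows, assuming glibc's error.h (error and error_message_count are glibc interfaces; sketch_report is a hypothetical name). The '%Lf' conversion is the part that depends on the long double format in use:

```c
#include <assert.h>
#include <error.h>

/* Hypothetical helper: report a long double through error ().
   Status 0 means error () returns instead of exiting; errnum 0 means
   no strerror text is appended.  The message goes to stderr.  */
unsigned int
sketch_report (long double value)
{
  unsigned int before = error_message_count;
  error (0, 0, "value: %Lf", value);
  /* glibc increments error_message_count on every call to error ().  */
  assert (error_message_count == before + 1);
  return error_message_count;
}
```

On a platform built with -mlong-double-64, the header redirection described in the commit is what makes such a call reach a compatibility function that reads the argument as a 64-bit long double.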
Gabriel F. T. Gomes
|
90188e7d1a |
ldbl-opt: Add err, errx, verr, verrx, warn, warnx, vwarn, and vwarnx (bug 23984)
When support for long double format with 128-bits (-mlong-double-128) was added for platforms where long double had the same format as double, such as powerpc, compatibility versions for the functions listed in the commit title were missed. Since the older format of long double can still be used (with -mlong-double-64), using these functions with a format string that requests the printing of long double variables will produce wrong outputs. This patch adds the missing compatibility functions and header magic to redirect calls to them when -mlong-double-64 is in use. Tested for powerpc, powerpc64 and powerpc64le. |
||
Gabriel F. T. Gomes
|
ea2d89d01c |
ldbl-opt: Reuse argp tests that print long double
The test case tst-ldbl-argp checks that the conversion specifier '%Lf' correctly prints long double values with the default long double format for a platform. This patch reuses the test case for long double with the same format as double (-mlong-double-64). Tested for powerpc, powerpc64 and powerpc64le. |
||
Gabriel F. T. Gomes
|
6e1f6440b9 |
ldbl-opt: Add argp_error and argp_failure (bug 23983)
The functions argp_error and argp_failure are missing support for printing long double values when long double has the same format as double. This patch adds the new functions __nldbl_argp_error and __nldbl_argp_failure, as well as header magic to redirect calls to them when -mlong-double-64 is in use. Tested for powerpc, powerpc64 and powerpc64le. |
||
Joseph Myers
|
c2d8f0b704 |
Avoid "inline" after return type in function definitions.
One group of warnings seen with -Wextra is warnings for static or inline not at the start of a declaration (-Wold-style-declaration). This patch fixes various such cases for inline, ensuring it comes at the start of the declaration (after any static). A common case of the fix is "static inline <type> __always_inline"; the definition of __always_inline starts with __inline, so the natural change is to "static __always_inline <type>". Other cases of the warning may be harder to fix (one pattern is a function definition that gets rewritten to be static by an including file, "#define funcname static wrapped_funcname" or similar), but it seems worth fixing these cases with inline anyway. Tested for x86_64. * elf/dl-load.h (_dl_postprocess_loadcmd): Use __always_inline before return type, without separate inline. * elf/dl-tunables.c (maybe_enable_malloc_check): Likewise. * elf/dl-tunables.h (tunable_is_name): Likewise. * malloc/malloc.c (do_set_trim_threshold): Likewise. (do_set_top_pad): Likewise. (do_set_mmap_threshold): Likewise. (do_set_mmaps_max): Likewise. (do_set_mallopt_check): Likewise. (do_set_perturb_byte): Likewise. (do_set_arena_test): Likewise. (do_set_arena_max): Likewise. (do_set_tcache_max): Likewise. (do_set_tcache_count): Likewise. (do_set_tcache_unsorted_limit): Likewise. * nis/nis_subr.c (count_dots): Likewise. * nptl/allocatestack.c (advise_stack_range): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Likewise. (do_sin): Likewise. (reduce_sincos): Likewise. (do_sincos): Likewise. * sysdeps/unix/sysv/linux/x86/elision-conf.c (do_set_elision_enable): Likewise. (TUNABLE_CALLBACK_FNDECL): Likewise. |
||
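A minimal sketch of the declaration order the commit above prefers, assuming GCC or a compatible compiler. SKETCH_ALWAYS_INLINE is a hypothetical stand-in for glibc's __always_inline, chosen so the example does not clash with the macro from sys/cdefs.h:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in: like glibc's __always_inline, the expansion
   starts with the inline keyword.  */
#define SKETCH_ALWAYS_INLINE inline __attribute__ ((__always_inline__))

/* Preferred order: storage class, then the inline macro, then the
   return type.  Writing "static int SKETCH_ALWAYS_INLINE" would put
   inline after the type, which is the pattern the commit moves away
   from.  */
static SKETCH_ALWAYS_INLINE int
sketch_add_one (int x)
{
  return x + 1;
}

/* Hypothetical external caller so the inline helper is used.  */
int
sketch_call_add_one (int x)
{
  assert (x < INT_MAX);
  return sketch_add_one (x);
}
```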
Dmitry V. Levin
|
a1b02ae763 |
Fix a few typos in comments
Apply the following spelling fixes: $ git grep -F -l 'relevent' | xargs sed -i 's/relevent/relevant/g' $ git grep -F -l 'checked fot' | xargs sed -i 's/checked fot/checked for/g' $ git grep -F -l "could't" | xargs sed -i "s/could't/couldn't/g" $ git grep -F -l 'wheter' | grep -Fv ChangeLog.old | xargs sed -i 's/wheter/whether/g' $ git grep -F -l 'neccessary' | grep -Fv ChangeLog.old | xargs sed -i 's/neccessary/necessary/g' $ git grep -F -l 'ouput' | xargs sed -i 's/ouput/output/g' $ git grep -F -w -l 'iput' | xargs sed -i 's/iput/input/g' This is inspired by a gnulib bug report at https://lists.gnu.org/archive/html/bug-gnulib/2019-01/msg00081.html * argp/argp-help.c: Fix typo in comment. * misc/sys/cdefs.h: Likewise. * posix/regexec.c (sift_states_iter_mb): Likewise. * socket/sockatmark.c: Likewise. * socket/sys/socket.h: Likewise. * sysdeps/ia64/fpu/libm_sincos_large.S: Likewise. * sysdeps/ia64/fpu/libm_sincosl.S: Likewise. * sysdeps/ia64/fpu/s_cosl.S: Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise. * sysdeps/unix/sockatmark.c: Likewise. * time/strptime_l.c: Likewise. |
||
H.J. Lu
|
69da3c9e87 |
soft-fp: Properly check _FP_W_TYPE_SIZE [BZ #24066]
quad.h has #if _FP_W_TYPE_SIZE < 64 union _FP_UNION_Q { Use 4 _FP_W_TYPEs } #else union _FP_UNION_Q { Use 2 _FP_W_TYPEs } #endif Replace #if (2 * _FP_W_TYPE_SIZE) < _FP_FRACBITS_Q with #if _FP_W_TYPE_SIZE < 64 to check whether 4 or 2 _FP_W_TYPEs are used for IEEE quad precision. Tested with build-many-glibcs.py. [BZ #24066] * soft-fp/extenddftf2.c: Use "_FP_W_TYPE_SIZE < 64" to check if 4 _FP_W_TYPEs are used for IEEE quad precision. * soft-fp/extendhftf2.c: Likewise. * soft-fp/extendsftf2.c: Likewise. * soft-fp/extendxftf2.c: Likewise. * soft-fp/trunctfdf2.c: Likewise. * soft-fp/trunctfhf2.c: Likewise. * soft-fp/trunctfsf2.c: Likewise. * soft-fp/trunctfxf2.c: Likewise. * sysdeps/alpha/ots_cvttx.c: Likewise. * sysdeps/alpha/ots_cvtxt.c: Likewise. * sysdeps/ieee754/soft-fp/s_daddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_dmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise. * sysdeps/ieee754/soft-fp/s_faddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise. * sysdeps/sparc/sparc32/q_dtoq.c: Likewise. * sysdeps/sparc/sparc32/q_qtod.c: Likewise. * sysdeps/sparc/sparc32/q_qtos.c: Likewise. * sysdeps/sparc/sparc32/q_stoq.c: Likewise. * sysdeps/sparc/sparc64/qp_dtoq.c: Likewise. * sysdeps/sparc/sparc64/qp_qtod.c: Likewise. * sysdeps/sparc/sparc64/qp_qtos.c: Likewise. * sysdeps/sparc/sparc64/qp_stoq.c: Likewise. |
||
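The selection pattern quoted in the commit above can be sketched with stand-in names. Everything prefixed SKETCH_ or sketch_ is hypothetical; the real soft-fp macros are _FP_W_TYPE_SIZE and _FP_UNION_Q, and this sketch only mirrors the "4 words below 64 bits, otherwise 2" rule:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _FP_W_TYPE_SIZE; override with -D to try
   the 32-bit branch.  */
#ifndef SKETCH_W_TYPE_SIZE
# define SKETCH_W_TYPE_SIZE 64
#endif

#if SKETCH_W_TYPE_SIZE < 64
/* Narrow words: a 128-bit quad value spans four of them.  */
typedef uint32_t sketch_word;
# define SKETCH_QUAD_WORDS 4
#else
/* 64-bit words: a 128-bit quad value spans two of them.  */
typedef uint64_t sketch_word;
# define SKETCH_QUAD_WORDS 2
#endif

/* Hypothetical analogue of the union: raw bytes overlaid with words.  */
union sketch_quad
{
  unsigned char bytes[16];
  sketch_word words[SKETCH_QUAD_WORDS];
};

static_assert (sizeof (union sketch_quad) == 16,
               "the words must cover exactly 128 bits");
```

Code that unpacks or packs the value then has to test the same condition as the union definition, which is the consistency the commit restores.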
Martin Jansa
|
27c5e756a2 |
sysdeps/ieee754: prevent maybe-uninitialized errors with -O [BZ #19444]
With -O included in CFLAGS it fails to build with: ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_jnl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:146:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_ynl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:375:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_jn': ../sysdeps/ieee754/dbl-64/e_jn.c:113:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_yn': ../sysdeps/ieee754/dbl-64/e_jn.c:320:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os: https://sourceware.org/ml/libc-alpha/2018-09/msg00539.html [BZ #19444] * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Use __builtin_unreachable for default case in switch. (__ieee754_yn): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. |
||
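The fix described above follows a general pattern: when a switch covers every value the selector can take, marking the default case with __builtin_unreachable (a GCC and Clang builtin) tells the compiler that the variable is assigned on every path. A minimal sketch with a hypothetical function, loosely modeled on the 'temp' variable in the diagnostics and not copied from glibc:

```c
#include <assert.h>

/* Hypothetical example: n & 3 can only be 0..3, but at some
   optimization levels the compiler may not prove it and may warn that
   temp is possibly uninitialized.  */
double
sketch_combine (int n, double s, double c)
{
  double temp;
  switch (n & 3)
    {
    case 0:
      temp = c + s;
      break;
    case 1:
      temp = -c + s;
      break;
    case 2:
      temp = -c - s;
      break;
    case 3:
      temp = c - s;
      break;
    default:
      /* Never reached; documents that fact for the compiler.  */
      __builtin_unreachable ();
    }
  return temp;
}

/* Hypothetical self-check covering each case.  */
void
sketch_combine_demo (void)
{
  assert (sketch_combine (0, 1.0, 2.0) == 3.0);
  assert (sketch_combine (1, 1.0, 2.0) == -1.0);
  assert (sketch_combine (2, 1.0, 2.0) == -3.0);
  assert (sketch_combine (3, 1.0, 2.0) == 1.0);
}
```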
Zack Weinberg
|
03992356e6 |
Use C99-compliant scanf under _GNU_SOURCE with modern compilers.
The only difference between noncompliant and C99-compliant scanf is that the former accepts the archaic GNU extension '%as' (also %aS and %a[...]) meaning to allocate space for the input string with malloc. This extension conflicts with C99's use of %a as a format _type_ meaning to read a floating-point number; POSIX.1-2008 standardized equivalent functionality using the modifier letter 'm' instead (%ms, %mS, %m[...]). The extension was already disabled in most conformance modes: specifically, any mode that doesn't involve _GNU_SOURCE and _does_ involve either strict conformance to C99 or loose conformance to both C99 and POSIX.1-2001 would get the C99-compliant scanf. With compilers new enough to use -std=gnu11 instead of -std=gnu89, or equivalent, that includes the default mode. With this patch, we now provide C99-compliant scanf in all configurations except when _GNU_SOURCE is defined *and* __STDC_VERSION__ or __cplusplus (whichever is relevant) indicates C89/C++98. This leaves the old scanf available under e.g. -std=c89 -D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it was already not present under -std=gnu11 without -D_GNU_SOURCE) and from -std=gnu89 without -D_GNU_SOURCE. There needs to be an internal override so we can compile the noncompliant scanf itself. This is the same problem we had when we removed 'gets' from _GNU_SOURCE and it's dealt with the same way: there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to off under the appropriate conditions for external code, but can be overridden by individual files within stdio. We also run into problems with PLT bypass for internal uses of sscanf, because libc_hidden_proto uses __REDIRECT and so does the logic in stdio.h for choosing which implementation of scanf to use; __REDIRECT isn't transitive, so include/stdio.h needs to bridge the gap with a macro. As far as I can tell, sscanf is the only function in this family that's internally called by unrelated code. 
Finally, there are several tests in stdio-common that use the extension. bug21.c is a regression test for a crash; it still exercises the relevant code when changed to use %ms instead of %as. scanf14.c through scanf17.c are more complicated since they are actually testing the subtleties of the extension - under what circumstances is 'a' treated as a modifier letter, etc. I changed all of them to use %ms instead of %as as well, but duplicated scanf14.c and scanf16.c as scanf14a.c and scanf16a.c. These still use %as and are compiled with -std=gnu89 to access the old extension. A bunch of diagnostic overrides and manual workarounds for the old stdio.h behavior become unnecessary. Yay! * include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE parameter. Only use deprecated scanf when __USE_GNU is defined and __STDC_VERSION__ is less than 199901L or __cplusplus is less than 201103L, whichever is relevant for the language being compiled. * libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their __isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF). * wcsmbs/wchar.h, wcsmbs/bits/wchar-ldbl.h: Likewise for wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf.
* libio/iovsscanf.c * libio/fwscanf.c * libio/iovswscanf.c * libio/swscanf.c * libio/vscanf.c * libio/vwscanf.c * libio/wscanf.c * stdio-common/fscanf.c * stdio-common/scanf.c * stdio-common/vfscanf.c * stdio-common/vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-compat.c * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c * sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c: Override __GLIBC_USE_DEPRECATED_SCANF to 1. * stdio-common/sscanf.c: Likewise. Remove ldbl_hidden_def for __sscanf. * stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf. * include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf, not sscanf. [!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_sscanf with a preprocessor macro. * stdio-common/bug21.c, stdio-common/scanf14.c: Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[]; remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16.c: Likewise. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf14a.c: New copy of scanf14.c which still uses %as, %aS, %a[]. Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16a.c: New copy of scanf16.c which still uses %as, %aS, %a[]. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf15.c, stdio-common/scanf17.c: No need to override feature selection macros or provide definitions of u_char etc. * stdio-common/Makefile (tests): Add scanf14a and scanf16a. (CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove.
(CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New. Compile these files with -std=gnu89. |
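To illustrate the POSIX.1-2008 replacement that the scanf commit above relies on, here is a minimal sketch that reads one whitespace-delimited word with %ms, letting sscanf allocate the buffer. The helper name read_first_word is hypothetical and not part of glibc; the sketch assumes a C library that implements the 'm' modifier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd copy of the first
   whitespace-delimited word in INPUT, or NULL if no word was read.
   The caller is responsible for freeing the result.  */
char *
read_first_word (const char *input)
{
  char *word = NULL;

  /* %ms is the POSIX.1-2008 spelling.  The archaic GNU spelling %as
     conflicts with C99's %a floating-point conversion.  */
  if (sscanf (input, "%ms", &word) != 1)
    return NULL;
  return word;
}
```

Code written this way behaves the same whether or not the deprecated scanf is selected, which is why the tests could be converted from %as to %ms.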
||
Joseph Myers
|
04277e02d7 |
Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise. |
||
H.J. Lu
|
8700a7851b |
x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c
Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly.
On Broadwell, bench-sincosf shows:
       Before    After     Improvement
max    160.273   114.198   40%
min    6.25      5.625     11%
mean   13.0325   10.6462   22%
Vectorized sincosf_poly shows
       Before    After     Improvement
max    138.653   114.198   21%
min    5.004     5.625     -11%
mean   11.5934   10.6462   9%
Tested on x86-64 and i686 as well as with build-many-glibcs.py.
* sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ...
* sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file.
* sysdeps/x86/fpu/s_sincosf_data.c: New file.
* sysdeps/x86/fpu/sincosf_poly.h: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>. |
||
Szabolcs Nagy
|
505b5b2922 |
Fix powf overflow handling in non-nearest rounding mode [BZ #23961]
The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE.
Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp
If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but
powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode
powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode
and the previous implementation did not set errno in these cases.
The fix tries to avoid affecting the common code path or calling a function that may introduce a stack frame, so float arithmetic is used to check the rounding mode and the threshold is selected accordingly.
[BZ #23961]
* math/auto-libm-test-in: Add new test case.
* math/auto-libm-test-out-pow: Regenerated.
* sysdeps/ieee754/flt-32/e_powf.c (__powf): Fix overflow check. |
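The idea of checking the rounding mode with float arithmetic, as the powf commit above describes, can be sketched as follows. This is an illustration under stated assumptions, not the code in e_powf.c: the function name rounds_positive_up and the constant 0x1p-25f are choices made for this example.

```c
#include <stdbool.h>

/* Sketch: return true if the current rounding mode rounds positive
   results upward.  In round-to-nearest, 1 + 2^-25 lies less than half
   an ulp above 1 and rounds back to exactly 1.0f; when rounding upward
   it becomes the next float above 1.  The volatile qualifiers keep the
   compiler from folding the sum at compile time and force the result
   to be rounded to float.  */
bool
rounds_positive_up (void)
{
  volatile float tiny = 0x1p-25f;
  volatile float sum = 1.0f + tiny;
  return sum != 1.0f;
}
```

A caller could select the lower overflow threshold when this returns true. For negative results the mirrored check (-1.0f - tiny compared against -1.0f) would detect downward rounding. No library call is involved, which matches the stated goal of not introducing a stack frame.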
||
Zack Weinberg
|
35caceb145 |
Use PRINTF_LDBL_IS_DBL instead of __ldbl_is_dbl.
After all that prep work, nldbl-compat.c can now use PRINTF_LDBL_IS_DBL instead of __no_long_double to control the behavior of printf-like functions; this is the last thing we needed __no_long_double for, so it can go away entirely. Tested for powerpc and powerpc64le. |
||
Zack Weinberg
|
4e2f43f842 |
Use PRINTF_FORTIFY instead of _IO_FLAGS2_FORTIFY (bug 11319)
The _chk variants of all of the printf functions become much simpler.
This is the last thing that we needed _IO_acquire_lock_clear_flags2
for, so it can go as well. I took the opportunity to make the headers
included and the names of all local variables consistent across all the
affected files.
Since we ultimately want to get rid of __no_long_double as well, it
must be possible to get all of the nontrivial effects of the _chk
functions by calling the _internal functions with appropriate flags.
For most of the __(v)xprintf_chk functions, this is covered by
PRINTF_FORTIFY plus some up-front argument checks that can be
duplicated. However, __(v)sprintf_chk installs a custom jump table so
that it can crash instead of overflowing the output buffer. This
functionality is moved to __vsprintf_internal, which now has a
'maxlen' argument like __vsnprintf_internal; to get the unsafe
behavior of ordinary (v)sprintf, pass -1 for that argument.
obstack_printf_chk and obstack_vprintf_chk are no longer in the same
file.
As a side-effect of the unification of both fortified and non-fortified
vdprintf initialization, this patch fixes bug 11319 for __dprintf_chk
and __vdprintf_chk, which was previously fixed only for dprintf and
vdprintf by the commit
commit
|
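The 'maxlen' convention described in the PRINTF_FORTIFY commit above (a bound that makes overflow fatal, with -1 selecting the unchecked behavior) can be sketched in portable C. This is only an illustration: the demo_ names are hypothetical, and the sketch uses vsnprintf plus abort rather than the custom jump table the commit mentions.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an internal sprintf with a MAXLEN argument.
   MAXLEN == (size_t) -1 requests the unchecked behavior of ordinary
   vsprintf; any other value makes an overflow fatal.  */
int
demo_vsprintf_checked (char *buf, size_t maxlen, const char *fmt, va_list ap)
{
  if (maxlen == (size_t) -1)
    return vsprintf (buf, fmt, ap);

  int n = vsnprintf (buf, maxlen, fmt, ap);
  if (n >= 0 && (size_t) n >= maxlen)
    abort ();   /* Crash instead of overflowing the output buffer.  */
  return n;
}

/* Variadic convenience wrapper around the function above.  */
int
demo_sprintf_checked (char *buf, size_t maxlen, const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  int n = demo_vsprintf_checked (buf, maxlen, fmt, ap);
  va_end (ap);
  return n;
}
```

With one function covering both cases, a fortified entry point only has to pass the real buffer size, and the ordinary entry point passes -1.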
||
Zack Weinberg
|
124fc732c1 |
Add __vsyslog_internal, with same flags as __v*printf_internal.
__nldbl___vsyslog_chk will ultimately want to pass PRINTF_LDBL_IS_DBL down to __vfprintf_internal *as well as* possibly setting PRINTF_FORTIFY. To make that possible, we need a __vsyslog_internal that takes the same flags as printf. The code in misc/syslog.c does also get a little simpler. Tested for powerpc and powerpc64le. |
||
Zack Weinberg
|
698fb75b9f |
Add __v*printf_internal with flags arguments
There are a lot more printf variants than there are scanf variants, and the code for setting up and tearing down their custom FILE variants around the call to __vf(w)printf is more complicated and variable. Therefore, I have added _internal versions of all the v*printf variants, rather than introducing helper routines so that they can all directly call __vf(w)printf_internal, as was done with scanf. As with the scanf changes, in this patch the _internal functions still look at the environmental mode bits and all callers pass 0 for the flags parameter. Several of the affected public functions had _IO_ name aliases that were not exported (but, in one case, appeared in libio.h anyway); I was originally planning to leave them as aliases to avoid having to touch internal callers, but it turns out ldbl_*_alias only work for exported symbols, so they've all been removed instead. It also turns out there were hardly any internal callers. _IO_vsprintf and _IO_vfprintf *are* exported, so those two stick around. Summary for the changes to each of the affected symbols: _IO_vfprintf, _IO_vsprintf: All internal calls removed, thus the internal declarations, as well as uses of libc_hidden_proto and libc_hidden_def, were also removed. The external symbol is now exposed via uses of ldbl_strong_alias to __vfprintf_internal and __vsprintf_internal, respectively. _IO_vasprintf, _IO_vdprintf, _IO_vsnprintf, _IO_vfwprintf, _IO_vswprintf, _IO_obstack_vprintf, _IO_obstack_printf: All internal calls removed, thus declaration in internal headers were also removed. They were never exported, so there are no aliases tying them to the internal functions. I.e.: entirely gone. __vsnprintf: Internal calls were always preceded by macros such as #define __vsnprintf _IO_vsnprintf, and #define __vsnprintf vsnprintf The macros were removed and their uses replaced with calls to the new internal function __vsnprintf_internal. Since there were no internal calls, the internal declaration was also removed. 
The external symbol is preserved with ldbl_weak_alias to ___vsnprintf. __vfwprintf: All internal calls converted into calls to __vfwprintf_internal, thus the internal declaration was removed. The function is now a wrapper that calls __vfwprintf_internal. The external symbol is preserved. __vswprintf: Similarly, but no external symbol. __vasprintf, __vdprintf, __vfprintf, __vsprintf: New internal wrappers. Not exported. vasprintf, vdprintf, vfprintf, vsprintf, vsnprintf, vfwprintf, vswprintf, obstack_vprintf, obstack_printf: These functions used to be aliases to the respective _IO_* function, they are now aliases to their respective __* functions. Tested for powerpc and powerpc64le. |
||
Zack Weinberg
|
d91798b31a |
Use SCANF_LDBL_IS_DBL instead of __ldbl_is_dbl.
Change the callers of __vfscanf_internal and __vfwscanf_internal that want to treat 'long double' as another name for 'double' (all of which happen to be in sysdeps/ieee754/ldbl-opt/nldbl-compat.c) to communicate this via the new flags argument, instead of the per-thread variable __no_long_double and its __ldbl_is_dbl wrapper macro. Tested for powerpc and powerpc64le. |
||
Zack Weinberg
|
349718d4d7 |
Add __vfscanf_internal and __vfwscanf_internal with flags arguments.
There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used by __isoc99_ scanf variants. In this patch, the new functions honor these flag bits if they're set, but they still also look at the corresponding bits of environmental state, and callers all pass zero. The new functions do *not* have the "errp" argument possessed by _IO_vfscanf and _IO_vfwscanf. All internal callers passed NULL for that argument. External callers could theoretically exist, so I preserved wrappers, but they are flagged as compat symbols and they don't preserve the three-way distinction among types of errors that was formerly exposed. These functions probably should have been in the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just aliases for vfscanf and vfwscanf. (It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf. Please check that part of the patch very carefully, I am still not confident I understand all of the details of ldbl-opt.) This patch also introduces helper inlines in libio/strfile.h that encapsulate the process of initializing an _IO_strfile object for reading. This allows us to call __vfscanf_internal directly from sscanf, and __vfwscanf_internal directly from swscanf, without duplicating the initialization code. (Previously, they called their v-counterparts, but that won't work if we want to control *both* C99 mode and ldbl-is-dbl mode using the flags argument to __vfscanf_internal.) It's still a little awkward, especially for wide strfiles, but it's much better than what we had. Tested for powerpc and powerpc64le. |
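The pattern in the commit above, an internal worker that takes explicit mode flags while thin public wrappers supply them, can be sketched like this. All demo_ and DEMO_ names are hypothetical, and the worker parses a single number with strtod/strtold instead of implementing scanf.

```c
#include <stdlib.h>

/* Hypothetical flag bits, modelled on the ones the commit describes.  */
#define DEMO_SCANF_LDBL_IS_DBL 0x1u
#define DEMO_SCANF_ISOC99_A    0x2u   /* Unused in this sketch.  */

/* Internal worker: parse a floating-point number from INPUT and store
   it through OUT.  The caller states explicitly whether OUT points to
   a double (flag set) or a long double (flag clear).  Returns 1 on
   success, 0 if no number could be parsed.  */
int
demo_scan_float_internal (const char *input, void *out,
                          unsigned int mode_flags)
{
  char *end;

  if (mode_flags & DEMO_SCANF_LDBL_IS_DBL)
    {
      double v = strtod (input, &end);
      if (end == input)
        return 0;
      *(double *) out = v;
    }
  else
    {
      long double v = strtold (input, &end);
      if (end == input)
        return 0;
      *(long double *) out = v;
    }
  return 1;
}

/* Public-style wrappers: each one fixes the flags it needs.  */
int
demo_scan_ldbl (const char *input, long double *out)
{
  return demo_scan_float_internal (input, out, 0);
}

int
demo_nldbl_scan_ldbl (const char *input, double *out)
{
  return demo_scan_float_internal (input, out, DEMO_SCANF_LDBL_IS_DBL);
}
```

Because the mode travels as an argument, two wrappers with different modes can share the worker without touching any per-thread state.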
||
Szabolcs Nagy
|
a502c5294b |
Remove the error handling wrapper from pow
Introduce new pow symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_pow.c and enabled for targets with their own pow implementation or ifunc dispatch on __ieee754_pow by including math/w_pow.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously powl was an alias of pow, now it points to the compatibility symbol with the wrapper, because it still needs the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __pow_finite symbol is now an alias of pow. Both __pow_finite and pow set errno and are thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add pow. * math/w_pow_compat.c (__pow_compat): Change to versioned compat symbol. * math/w_pow.c: New file. * sysdeps/i386/fpu/w_pow.c: New file. * sysdeps/ia64/fpu/e_pow.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow and add necessary aliases. * sysdeps/ieee754/dbl-64/w_pow.c: New file. * sysdeps/m68k/m680x0/fpu/w_pow.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to __pow. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/w_pow.c: New file. |
||
Szabolcs Nagy
|
718d6542f2 |
Remove the error handling wrapper from log2
Introduce new log2 symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log2.c and enabled for targets with their own log2 implementation by including math/w_log2.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously log2l was an alias of log2, now it points to the compatibility symbol with the wrapper, because it still needs the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log2_finite symbol is now an alias of log2. Both __log2_finite and log2 set errno and are thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log2. * math/w_log2_compat.c (__log2_compat): Change to versioned compat symbol. * math/w_log2.c: New file. * sysdeps/i386/fpu/w_log2.c: New file. * sysdeps/ia64/fpu/e_log2.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log2.c: New file. * sysdeps/m68k/m680x0/fpu/w_log2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. |
||
Szabolcs Nagy
|
f29b7c492d |
Remove the error handling wrapper from log
Introduce new log symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log.c and enabled for targets with their own log implementation by including math/w_log.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously logl was an alias of log, now it points to the compatibility symbol with the wrapper, because it still needs the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log_finite symbol is now an alias of log. Both __log_finite and log set errno and are thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log. * math/w_log_compat.c (__log_compat): Change to versioned compat symbol. * math/w_log.c: New file. * sysdeps/i386/fpu/w_log.c: New file. * sysdeps/ia64/fpu/e_log.S: Update. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log.c: New file. * sysdeps/m68k/m680x0/fpu/w_log.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to __log. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/w_log.c: New file. |
||
Szabolcs Nagy
|
c20a10561a |
Remove the error handling wrapper from exp and exp2
Introduce new exp and exp2 symbol versions that don't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The double precision wrappers are disabled for sysdeps/ieee754/dbl-64 by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and math/w_exp2.c files use the wrapper template and can be included by targets that have their own exp and exp2 implementations or use ifunc on the glibc internal __ieee754_exp symbol. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously expl and exp2l were aliases of exp and exp2, now they point to the compatibility symbols with the wrapper, because they still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The _finite symbols are now aliases of the standard symbols (they have no performance advantage anymore). Both the standard symbols and _finite symbols set errno and are thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header (the new macro name is __exp instead of __ieee754_exp which breaks some math.h macros). Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add exp and exp2. * math/w_exp2_compat.c (__exp2_compat): Change to versioned compat symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly. * math/w_exp_compat.c (__exp_compat): Likewise. * math/w_exp.c: New file. * math/w_exp2.c: New file. * sysdeps/i386/fpu/w_exp.c: New file. * sysdeps/i386/fpu/w_exp2.c: New file. * sysdeps/ia64/fpu/e_exp.S: Add versioned symbols. * sysdeps/ia64/fpu/e_exp2.S: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp and add necessary aliases. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_exp.c: New file. * sysdeps/ieee754/dbl-64/w_exp2.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. 
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/w_exp.c: New file. |
||
Zack Weinberg
|
c75772e3f0 |
Use STRFMON_LDBL_IS_DBL instead of __ldbl_is_dbl.
On platforms where long double used to have the same format as double, but later switched to a different format (alpha, s390, sparc, and powerpc), accessing the older behavior is possible and it happens via __nldbl_* functions (not on the API, but accessible from header redirection and from compat symbols). These functions write to the global flag __ldbl_is_dbl, which tells other functions that long double variables should be handled as double. This patch takes the first step towards removing this global flag and creates __vstrfmon_l_internal, which takes an explicit flags parameter. This change arguably makes the generated code slightly worse on architectures where __ldbl_is_dbl is never true; right now, on those architectures, it's a compile-time constant; after this change, the compiler could theoretically prove that __vstrfmon_l_internal was never called with a nonzero flags argument, but it would probably need LTO to do it. This is not performance critical code and I tend to think that the maintainability benefits of removing action at a distance are worth it. However, we _could_ wrap the runtime flag check with a macro that was defined to ignore its argument and always return false on architectures where __ldbl_is_dbl is never true, if people think the codegen benefits are important. Tested for powerpc and powerpc64le. |
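The macro wrapper suggested at the end of the commit message above could look roughly like the following. This is a sketch under assumptions: DEMO_HAVE_LDBL_COMPAT, DEMO_STRFMON_LDBL_IS_DBL, demo_ldbl_is_dbl and demo_arg_size are hypothetical names invented for this example, not glibc identifiers.

```c
#include <stddef.h>

/* Hypothetical flag bit for this sketch.  */
#define DEMO_STRFMON_LDBL_IS_DBL 0x1u

#ifdef DEMO_HAVE_LDBL_COMPAT
/* Targets where long double once had the format of double: test the
   flag at run time.  */
# define demo_ldbl_is_dbl(flags) \
  (((flags) & DEMO_STRFMON_LDBL_IS_DBL) != 0)
#else
/* All other targets: discard the argument and yield a compile-time
   constant, so the compiler can delete the dead branch.  */
# define demo_ldbl_is_dbl(flags) ((void) (flags), 0)
#endif

/* Example consumer: size of the argument that a long double
   conversion would fetch under each mode.  */
size_t
demo_arg_size (unsigned int flags)
{
  return demo_ldbl_is_dbl (flags) ? sizeof (double) : sizeof (long double);
}
```

When DEMO_HAVE_LDBL_COMPAT is not defined, demo_arg_size reduces to a constant, which recovers the code generation that the commit message says is lost by moving to a runtime flag.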
||
Joseph Myers
|
a19876214a |
Fix libnldbl_nonshared.a references to internal libm symbols (bug 23735).
The redirection of built-in functions such as sqrt in include/math.h applies when the wrappers for those functions in libnldbl_nonshared.a are built, resulting in references to internal names such as __ieee754_sqrt that aren't actually exported from the shared libm. (This applies for sqrt in 2.28, also for the round-to-integer functions in current master because of my changes there.) This patch arranges for NO_MATH_REDIRECT to be used for all the affected functions, and adds a test for those functions in libnldbl_nonshared.a. (We could of course choose to obsolete libnldbl_nonshared.a and require that people building with -mlong-double-64 either include the relevant headers and have a compiler supporting asm redirection, or have some other means of achieving that redirection at compile time if not including those headers. But while we have libnldbl_nonshared.a, it seems appropriate to fix such bugs in it.) Tested for powerpc, and with build-many-glibcs.py. [BZ #23735] * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (NO_MATH_REDIRECT): Define. * sysdeps/ieee754/ldbl-opt/test-nldbl-redirect.c: New file. * sysdeps/ieee754/ldbl-opt/Makefile [$(subdir) = math] (tests): Add test-nldbl-redirect. [$(subdir) = math] (CFLAGS-test-nldbl-redirect.c): New variable. [$(subdir) = math] ($(objpfx)test-nldbl-redirect): Depend on $(objpfx)libnldbl_nonshared.a. |
||
Martin Jansa
|
4a06ceea33 |
sysdeps/ieee754/soft-fp: ignore maybe-uninitialized with -O [BZ #19444]
* with -O, -O1, -Os it fails with:
In file included from ../soft-fp/soft-fp.h:318,
from ../sysdeps/ieee754/soft-fp/s_fdiv.c:28:
../sysdeps/ieee754/soft-fp/s_fdiv.c: In function '__fdiv':
../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) | X##_f0 >> (N) \
  ^~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f1' was declared here
  FP_DECL_D (R);
  ^
../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2'
  _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
  ^
../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL'
  # define FP_DECL_D(X) _FP_DECL (2, X)
  ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D'
  FP_DECL_D (R);
  ^~~~~~~~~
../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  : (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \
  ^~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f0' was declared here
  FP_DECL_D (R);
  ^
../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2'
  _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
  ^
../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL'
  # define FP_DECL_D(X) _FP_DECL (2, X)
  ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D'
  FP_DECL_D (R);
  ^~~~~~~~~
Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os.
[BZ #19444]
* sysdeps/ieee754/soft-fp/s_fdiv.c: Include <libc-diag.h> and use DIAG_PUSH_NEEDS_COMMENT, DIAG_IGNORE_NEEDS_COMMENT and DIAG_POP_NEEDS_COMMENT to disable -Wmaybe-uninitialized. |
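Outside the glibc tree, the same kind of scoped suppression can be sketched with the plain GCC diagnostic pragmas. The example below does not use glibc's DIAG_* macros from libc-diag.h, since those are internal. The function demo_pick is hypothetical and serves only to show the push/ignore/pop bracket around the statement that would otherwise trigger the warning.

```c
/* Sketch: silence -Wmaybe-uninitialized for a single statement.
   In this small function the compiler can see that VALUE is always
   set; in macro-heavy code such as soft-fp it sometimes cannot, which
   is when a narrowly scoped suppression like this is useful.  */
int
demo_pick (int selector)
{
  int value;

  if (selector > 0)
    value = 1;
  else if (selector < 0)
    value = -1;
  else
    value = 0;

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
  int result = value * 2;
#pragma GCC diagnostic pop

  return result;
}
```

Keeping the push and pop as close as possible to the affected statement leaves the warning active for the rest of the file.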
||
Joseph Myers
|
c52944e8cc |
Remove unnecessary math_private.h includes.
After my changes to move various macros, inlines and other content from math_private.h to more specific headers, many files including math_private.h no longer need to do so. Furthermore, since the optimized inlines of various functions have been moved to include/fenv.h or replaced by use of function names GCC inlines automatically, a missing math_private.h include where one is appropriate will reliably cause a build failure rather than possibly causing code to be less well optimized while still building successfully. Thus, this patch removes includes of math_private.h that are now unnecessary. In the case of two RISC-V files, the include is replaced by one of stdbool.h because the files in question were relying on math_private.h to get a definition of bool. Tested for x86_64 and x86, and with build-many-glibcs.py. * math/fromfp.h: Do not include <math_private.h>. * math/s_cacosh_template.c: Likewise. * math/s_casin_template.c: Likewise. * math/s_casinh_template.c: Likewise. * math/s_ccos_template.c: Likewise. * math/s_cproj_template.c: Likewise. * math/s_fdim_template.c: Likewise. * math/s_fmaxmag_template.c: Likewise. * math/s_fminmag_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/s_ldexp_template.c: Likewise. * math/s_nextdown_template.c: Likewise. * math/w_log1p_template.c: Likewise. * math/w_scalbln_template.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_atanl.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/i386/fpu/s_logbl.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. 
* sysdeps/i386/fpu/s_significandl.c: Likewise. * sysdeps/ia64/fpu/s_matherrf.c: Likewise. * sysdeps/ia64/fpu/s_matherrl.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/s_signgam.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise. * sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. 
* sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of <math_private.h>. * sysdeps/riscv/rvf/s_rintf.c: Likewise. |
||
Joseph Myers
|
81dca813cc |
Use copysign functions not __copysign functions in glibc libm.
Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __copysign functions to call the corresponding copysign names instead, with asm redirection to __copysign when the calls are not inlined (all cases are inlined except for IBM long double for powerpc soft-float / e500v1). This eliminates the need for an inline function defining __copysign in terms of __builtin_copysign. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_BINARY_ARGS): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT. * sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/alpha/fpu/s_copysignf.c: Likewise. * sysdeps/ieee754/dbl-64/s_copysign.c: Likewise. * sysdeps/ieee754/float128/s_copysignf128.c: Likewise. * sysdeps/ieee754/flt-32/s_copysignf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/riscv/rvd/s_copysign.c: Likewise. * sysdeps/riscv/rvf/s_copysignf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/generic/math_private_calls.h [!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign): Do not declare and define as an inline function. * math/divtc3.c (__divtc3): Use copysign functions instead of __copysign variants. 
* math/multc3.c (__multc3): Likewise. * sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. (__ieee754_yn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise. * sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise. (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. (__ieee754_ynf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise. 
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise. |
||
Joseph Myers
|
9755bc4686 |
Use round functions not __round functions in glibc libm.
Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined. An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else). Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_roundf.c: Likewise. * sysdeps/ieee754/dbl-64/s_round.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise. * sysdeps/ieee754/float128/s_roundf128.c: Likewise. * sysdeps/ieee754/flt-32/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise. 
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl. |
||
Joseph Myers
|
7abf97bed9 |
Use trunc functions not __trunc functions in glibc libm.
Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. 
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. |
||
Szabolcs Nagy
|
d734727837 |
Fix the documentation comment of checkint in powf
checkint in powf is not supposed to be used with 0, inf or nan inputs. * sysdeps/ieee754/flt-32/e_powf.c (checkint): Fix documentation. |
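To illustrate why such a helper excludes 0, inf and nan, here is a minimal sketch of an integer-parity check on the bit pattern of a float. It is modelled on the general technique only: the names `float_bits` and `checkint_sketch` are hypothetical, and the real code in e_powf.c may differ in detail.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a float as its IEEE 754 binary32 bits.  */
uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Sketch (not the glibc source): classify the float whose bits are IY.
   Returns 0 if it is not an integer, 1 if it is an odd integer and
   2 if it is an even integer.  The sign bit is ignored.
   The result is only meaningful for nonzero finite inputs: zero is
   reported as "not an integer", while inf and nan are reported as
   "even", so callers must filter those cases out first.  */
int
checkint_sketch (uint32_t iy)
{
  int e = (iy >> 23) & 0xff;          /* biased exponent */
  if (e < 0x7f)                       /* |y| < 1 */
    return 0;
  if (e > 0x7f + 23)                  /* |y| >= 2^24: always even */
    return 2;
  uint32_t unit = 1u << (0x7f + 23 - e);  /* mantissa bit of weight 1 */
  if (iy & (unit - 1))                /* fractional bits set */
    return 0;
  if (iy & unit)                      /* lowest integer bit set */
    return 1;
  return 2;
}
```

For example, `checkint_sketch (float_bits (3.0f))` classifies 3.0 as odd, which is the case a pow implementation needs in order to pick the sign of the result for a negative base.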
||
Szabolcs Nagy
|
424c4f60ed |
Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about 1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result in two doubles, and the exp part uses the same algorithm (and lookup tables) as exp, but takes the input as two doubles and a sign (to handle negative bases with odd integer exponent). The __exp1 internal symbol is no longer necessary. There is separate code path when fma is not available but the worst case error is about 0.54 ULP in both cases. The lookup table and consts for log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1] pow latency: 1.84x in [0.01 11.1]x[0.01 11.1] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets. * NEWS: Mention pow improvements. * math/Makefile (type-double-routines): Add e_pow_log_data. * sysdeps/generic/math_private.h (__exp1): Remove. * sysdeps/i386/fpu/e_pow_log_data.c: New file. * sysdeps/ia64/fpu/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma contraction. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove. (exp_inline): Remove. (__ieee754_exp): Only single double input is handled. * sysdeps/ieee754/dbl-64/e_pow.c: Rewrite. * sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define. (__pow_log_data): Define. * sysdeps/ieee754/dbl-64/upow.h: Remove. * sysdeps/ieee754/dbl-64/upow.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma contraction. (CFLAGS-e_pow-fma4.c): Likewise. |
||
Joseph Myers
|
50bc59ca4d |
Fix ldbl-128ibm ceill, floorl inlining of ceil, floor.
The ldbl-128ibm implementations of ceill and floorl call the corresponding double functions. This patch fixes those implementations to call those functions as ceil and floor rather than as __ceil and __floor, so that the proper inlining takes place when possible, while including local asm redirections for when the functions are not inlined since NO_MATH_REDIRECT applies to the double functions as well as to the long double ones. Tested with build-many-glibcs.py for all its powerpc configurations. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c (ceil): Redirect to __ceil. (__ceill): Call ceil instead of __ceil. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c (floor): Redirect to __floor. (__floorl): Call floor instead of __floor. |
||
Joseph Myers
|
71223ef909 |
Use ceil functions not __ceil functions in glibc libm.
Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. 
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise. |
||
Joseph Myers
|
f29b6f17e4 |
Use rint functions not __rint functions in glibc libm.
Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. 
* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise. |
||
Joseph Myers
|
e44acb2063 |
Use floor functions not __floor functions in glibc libm.
Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. 
(After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. 
* sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. 
* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise. |
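The asm redirection described in this commit can be sketched with a GNU C asm label. This is not the MATH_REDIRECT macro itself: the names `demo_floor`, `demo_internal_floor` and `demo_half_floor` are hypothetical, and the sketch assumes GCC or Clang on an ELF target, where C symbol names carry no prefix.

```c
/* Out-of-line implementation, exported under an "internal" name.
   Illustration only: assumes |x| < 2^63 so the conversion is defined.  */
double
demo_internal_floor (double x)
{
  long long i = (long long) x;    /* truncates toward zero */
  double t = (double) i;
  return t > x ? t - 1.0 : t;     /* step down for negative non-integers */
}

/* Declare the "public" name, but bind it to the internal symbol.
   A call to demo_floor that the compiler does not expand inline is
   emitted as a call to demo_internal_floor, so no separate
   demo_floor symbol (and no PLT entry for one) is needed.  */
extern double demo_floor (double) __asm__ ("demo_internal_floor");

/* Library-internal caller written against the public name.  */
double
demo_half_floor (double x)
{
  return demo_floor (x / 2.0);
}
```

In glibc the public name is one the compiler recognises as a built-in, so the call may be expanded inline on architectures that support it; the asm label only matters for the calls that remain out of line.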
||
Szabolcs Nagy
|
3e08ff544b |
Add new log2 implementation
A similar algorithm to log is used: log2(2^k x) = k + log2(c) + log2(x/c) where the last term is approximated by a polynomial of x/c - 1, the first order coefficient is about 1/ln2 in this case. There is a separate code path when an fma instruction is not available for computing x/c - 1 precisely, for which the table size is doubled. The worst case error is 0.547 ULP (0.55 without fma), the read only global data size is 1168 bytes (2192 without fma) on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log2 thruput: 2.00x in [0.01 11.1] log2 latency: 2.04x in [0.01 11.1] log2 thruput: 2.17x in [0.999 1.001] log2 latency: 2.88x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linux-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log2 improvements. * math/Makefile (type-double-routines): Add e_log2_data. * sysdeps/i386/fpu/e_log2_data.c: New file. * sysdeps/ia64/fpu/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/e_log2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add. * sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove. * sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file. |
||
Szabolcs Nagy
|
f41b0a43e4 |
Add new log implementation
Optimized log using carefully generated lookup table with 1/c and log(c) values for small intervals around 1. The log(c) is very near a double precision value, it has about 62 bits precision. The algorithm is log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is approximated by a polynomial of x/c - 1. Near 1 a single polynomial of x - 1 is used. There is a separate code path when an fma instruction is not available for computing x/c - 1 precisely, in which case the table size is doubled. The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined as an instruction. With the default configuration settings the worst case error is 0.519 ULP (and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma). The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log thruput: 3.28x in [0.01 11.1] log latency: 2.23x in [0.01 11.1] log thruput: 1.56x in [0.999 1.001] log latency: 1.57x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linux-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log improvement. * math/Makefile (type-double-routines): Add e_log_data. * sysdeps/i386/fpu/e_log_data.c: New file. * sysdeps/ia64/fpu/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/e_log.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add. * sysdeps/ieee754/dbl-64/ulog.h: Remove. * sysdeps/ieee754/dbl-64/ulog.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_log_data.c: New file. |
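The decomposition log(2^k x) = k log(2) + log(c) + log(x/c) can be sketched as follows. This is a toy under stated assumptions, not the committed code: it uses a hypothetical 4-entry table (`log_tab`), a plain Taylor polynomial instead of tuned coefficients, table constants rounded to double by hand, and it only handles positive, finite, normal inputs. Its accuracy is far below the 0.519 ULP quoted above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical table: for each quarter of [1, 2), the midpoint c is
   represented by 1/c and log(c).  */
static const struct { double invc, logc; } log_tab[4] = {
  { 1.0 / 1.125, 0.1177830356563839 },   /* c = 1.125 */
  { 1.0 / 1.375, 0.318453731118535 },    /* c = 1.375 */
  { 1.0 / 1.625, 0.485507815781701 },    /* c = 1.625 */
  { 1.0 / 1.875, 0.6286086594223743 },   /* c = 1.875 */
};

/* Sketch of a table-driven natural logarithm.
   Assumes X is positive, finite and normal.  */
double
log_sketch (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);

  /* Split x = 2^k * m with m in [1, 2).  */
  int k = (int) ((bits >> 52) & 0x7ff) - 1023;
  bits = (bits & 0x000fffffffffffffULL) | 0x3ff0000000000000ULL;
  double m;
  memcpy (&m, &bits, sizeof m);

  /* Pick the table entry whose c is closest to m.  */
  int idx = (int) ((m - 1.0) * 4.0);
  double r = m * log_tab[idx].invc - 1.0;     /* r = m/c - 1, |r| < 0.12 */

  /* log(1 + r) by an 8-term Taylor polynomial in Horner form.  */
  double p = r * (1.0 + r * (-1.0 / 2 + r * (1.0 / 3 + r * (-1.0 / 4
             + r * (1.0 / 5 + r * (-1.0 / 6 + r * (1.0 / 7
             + r * (-1.0 / 8))))))));

  return k * 0.6931471805599453 + log_tab[idx].logc + p;
}
```

The committed implementation computes r with fma where available so that x/c - 1 carries no rounding error, which is why the commit doubles the table size when fma is missing; the sketch above ignores that refinement.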
||
Szabolcs Nagy
|
e70c176825 |
Add new exp and exp2 implementations
Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). * NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. 
* sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update. |
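The lookup-table idea behind the new exp2 can be sketched as 2^x = 2^k * 2^(i/N) * 2^r, with k and i integers and r small. Everything named here (`exp2_sketch`, `pow2_frac`, `TABLE_SIZE`) is hypothetical; the table size, rounding method and polynomial are simplifications and do not reproduce the variants in e_exp_data.c. The sketch assumes a moderate input whose result is a normal double.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 8

/* 2^(i/8) for i = 0..7, rounded to double.  */
static const double pow2_frac[TABLE_SIZE] = {
  1.0,
  1.0905077326652577,
  1.189207115002721,
  1.2968395546510096,
  1.4142135623730951,
  1.5422108254079407,
  1.681792830507429,
  1.8340080864093424,
};

/* Sketch of a table-driven exp2.  Assumes roughly -1000 < x < 1000 so
   that 2^k is a normal double; no overflow or underflow handling.  */
double
exp2_sketch (double x)
{
  /* n = round (x * TABLE_SIZE), so that x = n/TABLE_SIZE + r.  */
  long n = (long) (x * TABLE_SIZE + (x >= 0 ? 0.5 : -0.5));
  long i = n % TABLE_SIZE;
  if (i < 0)
    i += TABLE_SIZE;                 /* table index in 0..7 */
  long k = (n - i) / TABLE_SIZE;     /* integer power of two */
  double r = x - (double) n / TABLE_SIZE;   /* |r| <= 1/16 */

  /* 2^r = exp (r * ln 2), by a degree-4 Taylor polynomial.  */
  double t = r * 0.6931471805599453;
  double p = 1.0 + t * (1.0 + t * (0.5 + t * (1.0 / 6 + t * (1.0 / 24))));

  /* Build 2^k directly from its exponent field.  */
  uint64_t bits = (uint64_t) (k + 1023) << 52;
  double scale;
  memcpy (&scale, &bits, sizeof scale);

  return scale * pow2_frac[i] * p;
}
```

A larger table shrinks |r| and therefore the polynomial degree needed for a given accuracy, which is the size versus speed tradeoff the commit message refers to when it mentions selecting variants in math_config.h.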
||
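The exp and exp2 entry above describes a lookup table for fractional powers of 2. A minimal sketch of that idea follows. It assumes a tiny 4-entry table, a plain Taylor polynomial and the hypothetical name `exp2_sketch`; the table size, polynomial and error handling actually used in e_exp_data.c and e_exp2.c are different, and this sketch makes no claim to the ULP bounds quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_BITS 2
#define N (1 << TABLE_BITS)

/* 2^(i/N) for i = 0..N-1.  Illustrative values, not glibc's table.  */
static const double tab[N] = {
  1.0, 1.189207115002721, 1.4142135623730951, 1.681792830507429
};

static uint64_t
asuint64 (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static double
asdouble (uint64_t u)
{
  double x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Sketch: write x = e + i/N + r with |r| <= 1/(2N), then
   2^x = 2^e * 2^(i/N) * 2^r.  Assumes round-to-nearest mode.  */
static double
exp2_sketch (double x)
{
  /* Keep the result in the normal range; no overflow/underflow handling.  */
  assert (x > -1000.0 && x < 1000.0);

  /* Round x*N to the nearest integer with the 0x1.8p52 shift trick.  */
  const double shift = 0x1.8p52;
  volatile double kd = x * N + shift;  /* volatile: force rounding to double */
  int32_t n = (int32_t) (uint32_t) asuint64 (kd);
  double nd = kd - shift;

  double r = x - nd / N;               /* remainder, |r| <= 1/(2N) */
  int32_t i = n & (N - 1);             /* table index, 0..N-1 */
  int32_t e = (n - i) / N;             /* integer part of the exponent */

  /* 2^r = exp (r * ln 2), degree-7 Taylor polynomial in Horner form.  */
  double z = r * 0.6931471805599453;
  double p = 1.0;
  for (int j = 7; j >= 1; j--)
    p = 1.0 + p * z / j;

  /* Build 2^e directly from its exponent bits.  */
  double scale = asdouble ((uint64_t) (e + 1023) << 52);
  return scale * tab[i] * p;
}
```

A larger table shrinks the range of r, which is what lets a short polynomial reach a small error; that is the size versus accuracy tradeoff the commit message refers to.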
Joseph Myers
|
418d99e622 |
Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h.
<fenv_private.h> has inline versions of various <fenv.h> functions, and their __fe* variants, for systems (generally soft-float) without support for floating-point exceptions, rounding modes or both. Having these inlines in a separate header introduces a risk of a source file including <fenv.h> and compiling OK on x86_64, but failing to compile (because the feraiseexcept inline is actually a macro that discards its argument, to avoid the need for #ifdef FE_INVALID conditionals), or not being properly optimized, on systems without the exceptions and rounding modes support (when these inlines were in math_private.h, we had a few cases where this broke the build because there was no obvious reason for a file to need math_private.h and it didn't need that header on x86_64). By moving those inlines to include/fenv.h, this risk can be avoided, and fenv_private.h becomes more clearly defined as specifically the header for the internal libc_fe* and SET_RESTORE_ROUND* interfaces. This patch makes that move, removing fenv_private.h includes that are no longer needed (or replacing them by fenv.h includes in a few cases that didn't already have such an include). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/fenv_private.h [FE_ALL_EXCEPT == 0]: Move this code .... [!FE_HAVE_ROUNDING_MODES]: And this code .... * include/fenv.h [!_ISOMAC]: ... to here. * math/fraiseexcpt.c (__feraiseexcept): Undefine as macro. (feraiseexcept): Likewise. * math/fromfp.h: Do not include <fenv_private.h>. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. 
* math/w_asinl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. 
* math/w_ilogb_template.c: Include <fenv.h> instead of <fenv_private.h>. * math/w_llogb_template.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise. |
||
Joseph Myers
|
70e2ba332f |
Do not include fenv_private.h in math_private.h.
Continuing the clean-up related to the catch-all math_private.h header, this patch stops math_private.h from including fenv_private.h. Instead, fenv_private.h is included directly from those users of math_private.h that also used interfaces from fenv_private.h. No attempt is made to remove unused includes of math_private.h, but that is a natural followup. (However, since math_private.h sometimes defines optimized versions of math.h interfaces or __* variants thereof, as well as defining its own interfaces, I think it might make sense to get all those optimized versions included from include/math.h, not requiring a separate header at all, before eliminating unused math_private.h includes - that avoids a file quietly becoming less-optimized if someone adds a call to one of those interfaces without restoring a math_private.h include to that file.) There is still a pitfall that if code uses plain fe* and __fe* interfaces, but only includes fenv.h and not fenv_private.h or (before this patch) math_private.h, it will compile on platforms with exceptions and rounding modes but not get the optimized versions (and possibly not compile) on platforms without exception and rounding mode support, so making it easy to break the build for such platforms accidentally. I think it would be most natural to move the inlines / macros for fe* and __fe* in the case of no exceptions and rounding modes into include/fenv.h, so that all code including fenv.h with _ISOMAC not defined automatically gets them. Then fenv_private.h would be purely the header for the libc_fe*, SET_RESTORE_ROUND etc. internal interfaces and the risk of breaking the build on other platforms than the one you tested on because of a missing fenv_private.h include would be much reduced (and there would be some unused fenv_private.h includes to remove along with unused math_private.h includes). 
Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/math_private.h: Do not include <fenv_private.h>. * math/fromfp.h: Include <fenv_private.h>. * math/math-narrow.h: Likewise. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_ilogb_template.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_llogb_template.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. * sysdeps/ieee754/dbl-64/e_remainder.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_product.c: Likewise. 
* sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/x2y2m1.c: Likewise. * sysdeps/ieee754/float128/float128_private.h: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise. 
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1l.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. 
* sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. |
||
Wilco Dijkstra
|
ca3aac57ef |
Remove unused math files
Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c, k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c (and the ia64, m68k and powerpc equivalents) are no longer used, so remove them. All e_rem_pio2.c files were already empty or commented out, so remove them too. Passes build-many-glibcs. * math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c. Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c. * sysdeps/i386/fpu/e_rem_pio2.c: Delete file. * sysdeps/ia64/fpu/e_rem_pio2.c: Likewise. * sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise. * sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/k_cos.c: Likewise. * sysdeps/ieee754/dbl-64/k_sin.c: Likewise. * sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_cosf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise. * sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise. * sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise. * sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise. * sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise. |
||
Wilco Dijkstra
|
900fb446eb |
Speedup tanf range reduction
Speedup tanf range reduction by using the new sincosf range reduction algorithm. Overall code quality is improved due to inlining, so there is a speedup even if no range reduction is required. tanf throughput gains on Cortex-A72: * |x| < M_PI_4 : 1.1x * |x| < M_PI_2 : 1.2x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.6x * |x| < Inf : 12.1x * sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction. |
||
Wilco Dijkstra
|
126c4e3f80 |
Use generic sinf/cosf in lgammaf_r
The internal functions __kernel_sinf and __kernel_cosf are used only by lgammaf_r. Removing the internal functions and using the generic sinf and cosf is better overall. Benchmarking on Cortex-A72 shows the generic sinf and cosf are 1.4x and 2.3x faster in the range |x| < PI/4, and 0.66x and 1.1x for |x| < PI/2, so it should make lgammaf_r faster on average. GLIBC regression tests pass on AArch64. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf. * sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise. |
||
Wilco Dijkstra
|
599cf39766 |
Improve performance of sinf and cosf
The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x86_64 are updated. sinf/cosf throughput gains on Cortex-A72: * |x| < 0x1p-12 : 1.2x * |x| < M_PI_4 : 1.8x * |x| < 2 * M_PI: 1.7x * |x| < 120.0 : 2.3x * |x| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite. |
||
Wilco Dijkstra
|
ea5c662c62 |
Improve performance of sincosf
This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1 ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized; however, it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * |x| < 0x1p-12 : 1.6x * |x| < M_PI_4 : 1.7x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.8x * |x| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file. |
||
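The sincosf rewrite (ea5c662c62) picks one of three cases with approximate integer comparisons and reduces mid-sized inputs in a single step. The sketch below illustrates both ideas under stated assumptions: `classify` and `reduce_small_sketch` are hypothetical names, the reduction uses plain double arithmetic rather than whatever scaling the real reduce_small uses, and NaN/Inf handling is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t
asuint (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Top 12 bits of the float with the sign bit masked off.  Comparing these
   is a cheap, approximate comparison of magnitudes.  */
static uint32_t
abstop12 (float x)
{
  return (asuint (x) >> 20) & 0x7ff;
}

enum reduction { NO_REDUCTION, SMALL_REDUCTION, LARGE_REDUCTION };

static enum reduction
classify (float x)
{
  if (abstop12 (x) < abstop12 (0x1.921fb6p-1f))  /* roughly |x| < pi/4 */
    return NO_REDUCTION;
  if (abstop12 (x) < abstop12 (120.0f))          /* roughly |x| < 120 */
    return SMALL_REDUCTION;
  return LARGE_REDUCTION;
}

/* Single-step reduction: x = n * pi/2 + r with |r| <= pi/4 (approximately).
   Only meant for the SMALL_REDUCTION range.  */
static double
reduce_small_sketch (float x, int *np)
{
  assert (x > -120.0f && x < 120.0f);
  const double two_over_pi = 0x1.45f306dc9c883p-1;
  const double pi_over_2 = 0x1.921fb54442d18p+0;
  double xd = x;
  double t = xd * two_over_pi;
  int n = (int) (t < 0 ? t - 0.5 : t + 0.5);     /* round to nearest */
  *np = n;
  return xd - n * pi_over_2;
}
```

Because only the top 12 bits are compared, inputs just below a threshold may take the next case; that is harmless as long as each case is accurate slightly beyond its nominal range.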
Szabolcs Nagy
|
43cfdf8f48 |
Clean up converttoint handling and document the semantics
This patch currently only affects aarch64. The roundtoint and converttoint internal functions are only called with small values, so a 32-bit result is enough for converttoint, and since it is a signed int conversion the return type is changed to int32_t. The original idea was to help the compiler keep the result in uint64_t, so that it is clear that no sign extension is needed and there is no accidental undefined or implementation-defined signed int arithmetic. But it turns out gcc does a good job with inlining, so changing the type has no overhead and the semantics of the conversion are less surprising this way. Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top bits were never usable and the existing code ensures that only the bottom 32 bits of the conversion result are used. On aarch64 the neon intrinsics (which round ties to even) are changed to round and lround (which round ties away from zero); this does not affect the results in a significant way, but is more portable (it relies on round and lround being inlined, which works with -fno-math-errno). The TOINT_SHIFT and TOINT_RINT macros were removed, keeping separate code paths only for TOINT_INTRINSICS and !TOINT_INTRINSICS. * sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round. (converttoint): Use lround. * sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and document the semantics when TOINT_INTRINSICS is set. (converttoint): Likewise. (TOINT_RINT): Remove. (TOINT_SHIFT): Remove. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code path. |
||
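The converttoint clean-up (43cfdf8f48) mentions the asuint64 (x + 0x1.8p52) style conversion and notes that only the bottom 32 bits of its result are usable. A minimal sketch of why, assuming the default round-to-nearest mode and the hypothetical name `converttoint_shift`:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t
asuint64 (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Adding 0x1.8p52 = 2^52 + 2^51 pushes x into a binade where one ulp is
   exactly 1, so the addition itself rounds x to an integer n and the
   mantissa field holds 2^51 + n.  The bits above bit 31 therefore contain
   the 2^51 offset and the exponent, and only the low 32 bits equal n
   (in two's complement).  */
static int32_t
converttoint_shift (double x)
{
  assert (x > -0x1p30 && x < 0x1p30);   /* small values only */
  const double shift = 0x1.8p52;
  volatile double t = x + shift;        /* volatile: force rounding to double */
  return (int32_t) (uint32_t) asuint64 (t);
}
```

With this trick a tie such as 2.5 goes to the even neighbour 2, whereas lround (2.5) gives 3; this is the ties-to-even versus ties-away difference the commit message says does not affect results significantly. The final unsigned-to-signed cast relies on the modular behaviour that gcc and clang document.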
Gabriel F. T. Gomes
|
b7b88cea41 |
ldbl-128ibm-compat: Add printf_size
Since the addition of the _Float128 API, strfromf128 and printf_size use __printf_fp to print _Float128 values. This is achieved by setting the 'is_binary128' member of the 'printf_info' structure to one. Now that the format of long double on powerpc64le is getting a third option, this mechanism is reused for long double values that have binary128 format (i.e.: when -mabi=ieeelongdouble). This patch adds __printf_sizeieee128 as an exported symbol, but doesn't provide redirections from printf_size, yet. All redirections will be installed in a future commit, once all other functions that print or read long double values with binary128 format are ready. In __printf_fp, when 'is_binary128' is one, the floating-point argument is treated as if it was of _Float128 type, regardless of the value of 'is_long_double', thus __printf_sizeieee128 sets 'is_binary128' to the same value of 'is_long_double'. Otherwise, double values would not be printed correctly. Tested for powerpc64le. |
||
Szabolcs Nagy
|
2b445206a1 |
Use uint32_t sign in single precision math error handling functions
Ideally sign should be bool, but sometimes (e.g. in powf) it's more efficient to pass a non-zero value than 1 to indicate that the sign should be set. The fixed size int is less ambiguous than unsigned long. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Use uint32_t. (exp2f_inline): Likewise. * sysdeps/ieee754/flt-32/math_config.h (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise. * sysdeps/ieee754/flt-32/math_errf.c (xflowf): Likewise. (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise. |
||
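The uint32_t sign change (2b445206a1) lets callers pass any non-zero value to request a negative result. The sketch below shows the pattern with hypothetical `*_sketch` names; the constants and the errno handling are assumptions for illustration and are not copied from math_errf.c.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdint.h>

/* Produce an overflowing or underflowing product with the requested sign.
   Any non-zero `sign` means negative, so a caller can pass a masked sign
   bit such as (bits & 0x80000000) without normalising it to 0 or 1.  */
static float
xflowf_sketch (uint32_t sign, float y)
{
  volatile float r = (sign ? -y : y) * y;  /* volatile: round to float */
  errno = ERANGE;
  return r;
}

static float
oflowf_sketch (uint32_t sign)
{
  return xflowf_sketch (sign, 0x1p97f);    /* 2^194 overflows float */
}

static float
uflowf_sketch (uint32_t sign)
{
  return xflowf_sketch (sign, 0x1p-95f);   /* 2^-190 underflows to zero */
}
```

Computing the result as a real floating-point product, rather than returning a constant, also raises the matching overflow or underflow exception flag as a side effect.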
Rajalakshmi Srinivasaraghavan
|
86a0f56158 |
ldbl-128ibm-compat: Introduce ieee128 symbols
This patch adds __*ieee128 symbols for strfrom, strtold, strtold_l, wcstold and wcstold_l functions. Redirection from *l to *ieee128 will be handled in a separate patch once we start building these new files. 2018-06-28 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/ieee754/ldbl-128ibm-compat/Versions: Add __strfromieee128, __strtoieee128, __strtoieee128_l, __wcstoieee128 and __wcstoieee128_l. * sysdeps/ieee754/ldbl-128ibm-compat/strfromf128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/strtof128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/strtof128_l.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/wcstof128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/wcstof128_l.c: New file. |
||
Tulio Magno Quites Machado Filho
|
209ae17c60 |
ldbl-128ibm-compat: Create libm-alias-float128.h
Add a new libm-alias-float128.h in order to provide the __*ieee128 aliases for the existing *f128 that do not have a globally exported symbol. * sysdeps/ieee754/ldbl-128ibm-compat/Versions: New file. * sysdeps/ieee754/ldbl-128ibm-compat/libm-alias-float128.h: New file. |
||
Tulio Magno Quites Machado Filho
|
5e79e0292b |
Add a generic significand implementation
Create a template for significand. * math/Makefile (libm-calls): Move s_significandF to... (gen-libm-calls): ... here. * math/s_significand_template.c: New file. * math/s_significand.c: Removed. * math/s_significandf.c: Removed. * math/s_significandl.c: Removed. * sysdeps/ieee754/ldbl-opt/s_significand.c: Removed. * sysdeps/ieee754/ldbl-opt/s_significandl.c: Removed. Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> |
||
Joseph Myers
|
ca121b117f |
Fix ldbl-96 fma (Inf, Inf, finite) (bug 23272).
As reported in bug 23272, the ldbl-96 implementation of fma (fma for double, in terms of ldbl-96 as the internal arithmetic type, as used on 32-bit x86) is missing some of the special-case handling for non-finite arguments, resulting in incorrect NaN results when the first two arguments are infinities, the third is finite and so the infinities go through the logic for finite arguments. This patch fixes it by handling all cases of non-finite arguments up front, with additional fma tests for the problem cases being added to the testsuite. Tested for x86_64 and x86. [BZ #23272] * sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Start by handling all cases of non-finite arguments. * math/libm-test-fma.inc (fma_test_data): Add more tests. |
||
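The fix for bug 23272 (ca121b117f) handles every non-finite argument before the finite-argument logic runs. A minimal sketch of such a guard follows, assuming the hypothetical name `fma_guard_sketch`; the final return is only a placeholder for the real exactly-rounded computation and is not itself a fused operation.

```c
#include <assert.h>
#include <math.h>

static double
fma_guard_sketch (double x, double y, double z)
{
  /* If x or y is Inf or NaN, ordinary arithmetic already yields the right
     special result, e.g. Inf * Inf + finite = Inf and Inf * 0 = NaN.  */
  if (!isfinite (x) || !isfinite (y))
    return x * y + z;

  /* x and y are finite but z is not: the answer is z (or NaN if z is NaN),
     even when x * y would overflow if computed on its own.  */
  if (!isfinite (z))
    return (z + x) + y;

  /* Placeholder for the finite-argument path of a real implementation.  */
  return x * y + z;
}
```

The second check matters for inputs like (1e308, 10.0, -Inf): the exact product is finite, so the fused result is -Inf, whereas an unguarded x * y + z in plain double arithmetic would overflow to +Inf first and then produce NaN.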
Joseph Myers
|
3d6302a546 |
Fix i686-linux-gnu build with GCC mainline.
Building with recent GCC mainline for i686-linux-gnu is failing with: ../sysdeps/ieee754/flt-32/k_rem_pio2f.c: In function '__kernel_rem_pio2f': ../sysdeps/ieee754/flt-32/k_rem_pio2f.c:186:28: error: 'fq[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized] fv = math_narrow_eval (fq[0]-fv); ^ and ../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function '__kernel_rem_pio2': ../sysdeps/ieee754/dbl-64/k_rem_pio2.c:333:32: error: 'fq[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized] fv = math_narrow_eval (fq[0] - fv); ^ These are similar to -Warray-bounds cases for which the DIAG_* macros are already used in those files: the array element is in fact always initialized, but the reasoning that it is depends on another array not having been all zero at an earlier point, which depends on the functions not being called with zero arguments. Thus, this patch uses DIAG_* to disable -Wmaybe-uninitialized for this code. (The warning may be i686-specific because of math_narrow_eval somehow perturbing what the compiler does with this code enough to cause the warning. I don't know why it doesn't appear for i686-gnu.) Tested with build-many-glibcs.py that this fixes the i686 build in this configuration. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Ignore -Wmaybe-uninitialized around access to fq[0]. * sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f): Likewise. |
||
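The i686 build fix (3d6302a546) silences a false -Wmaybe-uninitialized around one array access using glibc's DIAG_* macros. The sketch below shows the underlying GCC pragma pattern directly rather than those macros; `first_nonzero` is a hypothetical function whose array element is always initialized for valid inputs, although a compiler may not be able to prove that.

```c
#include <assert.h>

/* Return the first non-zero element of q[0..n-1].  The caller guarantees
   that at least one element is non-zero, so fq[0] is always written.  */
static double
first_nonzero (const double *q, int n)
{
  double fq[4];
  int filled = 0;

  for (int i = 0; i < n && filled < 4; i++)
    if (q[i] != 0.0)
      fq[filled++] = q[i];
  assert (filled > 0);

  /* Suppress the warning for this one access only, then restore the
     previous diagnostic state.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
  double v = fq[0];
#pragma GCC diagnostic pop

  return v;
}
```

Keeping the push/pop pair tight around the single access leaves the warning active for the rest of the function, which is the same scoping the commit message describes for the fq[0] accesses.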
Joseph Myers
|
632a6cbe44 |
Add narrowing divide functions.
This patch adds the narrowing divide functions from TS 18661-1 to glibc's libm: fdiv, fdivl, ddivl, f32divf64, f32divf32x, f32xdivf64 for all configurations; f32divf64x, f32divf128, f64divf64x, f64divf128, f32xdivf64x, f32xdivf128, f64xdivf128 for configurations with _Float64x and _Float128; __nldbl_ddivl for ldbl-opt. The changes are mostly essentially the same as for the other narrowing functions, so the description of those generally applies to this patch as well. Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft float) and powerpc, and with build-many-glibcs.py. * math/Makefile (libm-narrow-fns): Add div. (libm-test-funcs-narrow): Likewise. * math/Versions (GLIBC_2.28): Add narrowing divide functions. * math/bits/mathcalls-narrow.h (div): Use __MATHCALL_NARROW. * math/gen-auto-libm-tests.c (test_functions): Add div. * math/math-narrow.h (CHECK_NARROW_DIV): New macro. (NARROW_DIV_ROUND_TO_ODD): Likewise. (NARROW_DIV_TRIVIAL): Likewise. * sysdeps/ieee754/float128/float128_private.h (__fdivl): New macro. (__ddivl): Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fdiv and ddiv. (CFLAGS-nldbl-ddiv.c): New variable. (CFLAGS-nldbl-fdiv.c): Likewise. * sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add __nldbl_ddivl. * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_ddivl): New prototype. * manual/arith.texi (Misc FP Arithmetic): Document fdiv, fdivl, ddivl, fMdivfN, fMdivfNx, fMxdivfN and fMxdivfNx. * math/auto-libm-test-in: Add tests of div. * math/auto-libm-test-out-narrow-div: New generated file. * math/libm-test-narrow-div.inc: New file. * sysdeps/i386/fpu/s_f32xdivf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_f32xdivf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_fdiv.c: Likewise. * sysdeps/ieee754/float128/s_f32divf128.c: Likewise. * sysdeps/ieee754/float128/s_f64divf128.c: Likewise. * sysdeps/ieee754/float128/s_f64xdivf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ddivl.c: Likewise. 
* sysdeps/ieee754/ldbl-128/s_f64xdivf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fdivl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ddivl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fdivl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_ddivl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fdivl.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-ddiv.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-fdiv.c: Likewise. * sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fdiv.c: Likewise. * sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/mach/hurd/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
Florian Weimer
|
9761bf4dfa |
math: Merge strtod_nan_*.h into math-type-macros-*.h
This change will eventually make it possible to compile stdlib/strtod_nan_main.c as part of math/s_nan_template.c. |
||
Joseph Myers
|
69a01461ee |
Add narrowing multiply functions.
This patch adds the narrowing multiply functions from TS 18661-1 to glibc's libm: fmul, fmull, dmull, f32mulf64, f32mulf32x, f32xmulf64 for all configurations; f32mulf64x, f32mulf128, f64mulf64x, f64mulf128, f32xmulf64x, f32xmulf128, f64xmulf128 for configurations with _Float64x and _Float128; __nldbl_dmull for ldbl-opt. The changes are mostly essentially the same as for the narrowing add functions, so the description of those generally applies to this patch as well. f32xmulf64 for i386 cannot use precision control as used for add and subtract, because that would result in double rounding for subnormal results, so that uses round-to-odd with long double intermediate result instead. The soft-fp support involves adding a new FP_TRUNC_COOKED since soft-fp multiplication uses cooked inputs and outputs. Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft float) and powerpc, and with build-many-glibcs.py. * math/Makefile (libm-narrow-fns): Add mul. (libm-test-funcs-narrow): Likewise. * math/Versions (GLIBC_2.28): Add narrowing multiply functions. * math/bits/mathcalls-narrow.h (mul): Use __MATHCALL_NARROW. * math/gen-auto-libm-tests.c (test_functions): Add mul. * math/math-narrow.h (CHECK_NARROW_MUL): New macro. (NARROW_MUL_ROUND_TO_ODD): Likewise. (NARROW_MUL_TRIVIAL): Likewise. * soft-fp/op-common.h (FP_TRUNC_COOKED): Likewise. * sysdeps/ieee754/float128/float128_private.h (__fmull): New macro. (__dmull): Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fmul and dmul. (CFLAGS-nldbl-dmul.c): New variable. (CFLAGS-nldbl-fmul.c): Likewise. * sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add __nldbl_dmull. * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dmull): New prototype. * manual/arith.texi (Misc FP Arithmetic): Document fmul, fmull, dmull, fMmulfN, fMmulfNx, fMxmulfN and fMxmulfNx. * math/auto-libm-test-in: Add tests of mul. * math/auto-libm-test-out-narrow-mul: New generated file. * math/libm-test-narrow-mul.inc: New file. 
* sysdeps/i386/fpu/s_f32xmulf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_f32xmulf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmul.c: Likewise. * sysdeps/ieee754/float128/s_f32mulf128.c: Likewise. * sysdeps/ieee754/float128/s_f64mulf128.c: Likewise. * sysdeps/ieee754/float128/s_f64xmulf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_dmull.c: Likewise. * sysdeps/ieee754/ldbl-128/s_f64xmulf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmull.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_dmull.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmull.c: Likewise. * sysdeps/ieee754/ldbl-96/s_dmull.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmull.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-dmul.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-fmul.c: Likewise. * sysdeps/ieee754/soft-fp/s_dmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_fmul.c: Likewise. * sysdeps/ieee754/soft-fp/s_fmull.c: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/mach/hurd/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
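The narrowing multiply commit above relies on round-to-odd to avoid double rounding when a wide intermediate result is narrowed a second time. A minimal sketch of that idea on plain integers follows. It is not glibc's `NARROW_MUL_ROUND_TO_ODD` macro; the helper names `round_nearest_even` and `round_to_odd` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: drop the low `bits` bits of v, rounding to
   nearest with ties going to the even quotient.  */
static uint64_t round_nearest_even(uint64_t v, unsigned bits)
{
  if (bits == 0)
    return v;
  uint64_t q = v >> bits;
  uint64_t rem = v & ((UINT64_C(1) << bits) - 1);
  uint64_t half = UINT64_C(1) << (bits - 1);
  if (rem > half || (rem == half && (q & 1)))
    q++;
  return q;
}

/* Hypothetical helper: drop the low `bits` bits of v by truncation,
   then set the lowest kept bit if anything nonzero was discarded.
   The "sticky" low bit stops a later rounding from seeing a false tie.  */
static uint64_t round_to_odd(uint64_t v, unsigned bits)
{
  if (bits == 0)
    return v;
  uint64_t q = v >> bits;
  uint64_t rem = v & ((UINT64_C(1) << bits) - 1);
  return q | (rem != 0);
}
```

Take `0x281` as a value with 8 fractional bits (just above 2.5). Rounding it directly to an integer gives 3. Rounding to nearest in two steps of 4 bits each gives 2, because the first step produces an exact tie. Using `round_to_odd` for the first step and `round_nearest_even` for the second gives 3 again.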
Joseph Myers
|
b4d5b8b021 |
Do not include math-barriers.h in math_private.h.
This patch continues the math_private.h cleanup by stopping math_private.h from including math-barriers.h and making the users of the barrier macros include the latter header directly. No attempt is made to remove any math_private.h includes that are now unused, except in strtod_l.c where that is done to avoid line number changes in assertions, so that installed stripped shared libraries can be compared before and after the patch. (I think the floating-point environment support in math_private.h should also move out - some architectures already have fenv_private.h as an architecture-internal header included from their math_private.h - and after moving that out might be a better time to identify unused math_private.h includes.) Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/math_private.h: Do not include <math-barriers.h>. * stdlib/strtod_l.c: Include <math-barriers.h> instead of <math_private.h>. * math/fromfp.h: Include <math-barriers.h>. * math/math-narrow.h: Likewise. * math/s_nextafter.c: Likewise. * math/s_nexttowardf.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_nextafterl.c: Likewise. * sysdeps/i386/fpu/s_nexttoward.c: Likewise. * sysdeps/i386/fpu/s_nexttowardf.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_j0.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. * sysdeps/ieee754/dbl-64/s_expm1.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. 
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c: Likewise. * sysdeps/ieee754/flt-32/e_j0f.c: Likewise. * sysdeps/ieee754/flt-32/s_expm1f.c: Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/flt-32/s_nextafterf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nextafterl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_j0l.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_nextafterl.c: Likewise. |
||
Joseph Myers
|
8f5b00d375 |
Move math_check_force_underflow macros to separate math-underflow.h.
This patch continues cleaning up math_private.h by moving the math_check_force_underflow set of macros to a separate header math-underflow.h. This header is included by the files that need it rather than from math_private.h. Moving these macros to a separate file removes the math_private.h uses of macros from float.h, so the inclusion of float.h in math_private.h is also removed; files that were depending on that inclusion are fixed to include float.h directly. The inclusion of math-barriers.h from math_private.h will be removed in a separate patch. Tested for x86_64 and x86. Also tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * math/math-underflow.h: New file. * sysdeps/generic/math_private.h: Do not include <float.h>. (fabs_tg): Remove macro. Moved to math-underflow.h. (min_of_type_f): Likewise. (min_of_type_): Likewise. (min_of_type_l): Likewise. (min_of_type_f128): Likewise. (min_of_type): Likewise. (math_check_force_underflow): Likewise. (math_check_force_underflow_nonneg): Likewise. (math_check_force_underflow_complex): Likewise. * math/e_exp2_template.c: Include <math-underflow.h>. * math/k_casinh_template.c: Likewise. * math/s_catan_template.c: Likewise. * math/s_catanh_template.c: Likewise. * math/s_ccosh_template.c: Likewise. * math/s_cexp_template.c: Likewise. * math/s_clog10_template.c: Likewise. * math/s_clog_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_csqrt_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * sysdeps/ieee754/dbl-64/e_asin.c: Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_hypot.c: Likewise. * sysdeps/ieee754/dbl-64/e_j1.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. 
* sysdeps/ieee754/dbl-64/e_sinh.c: Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_erf.c: Likewise. * sysdeps/ieee754/dbl-64/s_expm1.c: Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/s_tanh.c: Likewise. * sysdeps/ieee754/flt-32/e_asinf.c: Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/ieee754/flt-32/k_tanf.c: Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c: Likewise. * sysdeps/ieee754/flt-32/s_atanf.c: Likewise. * sysdeps/ieee754/flt-32/s_erff.c: Likewise. * sysdeps/ieee754/flt-32/s_expm1f.c: Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c: Likewise. * sysdeps/ieee754/flt-32/s_tanhf.c: Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c: Likewise. * sysdeps/ieee754/ldbl-128/s_log1pl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Likewise. 
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-96/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-96/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_tanhl.c: Likewise. * sysdeps/powerpc/fpu/e_hypot.c: Likewise. * sysdeps/x86/fpu/powl_helper.c: Likewise. * sysdeps/ieee754/dbl-64/s_nextup.c: Include <float.h>. * sysdeps/ieee754/flt-32/s_nextupf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nextupl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nextupl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nextupl.c: Likewise. |
||
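For readers unfamiliar with the math_check_force_underflow family moved by the commit above, the sketch below shows the general shape such a check can take for `double`. It is an assumption-laden illustration, not the text of the glibc macros: the function name `check_force_underflow` and its return value are hypothetical, and the real macros are type-generic.

```c
#include <float.h>

/* Hypothetical helper: if x is smaller in magnitude than the least
   normal double, perform a multiplication whose result is tiny and
   (for nonzero x) inexact, which is intended to raise the underflow
   exception.  Returns 1 if that branch was taken, 0 otherwise.  */
static int check_force_underflow(double x)
{
  double ax = x < 0 ? -x : x;
  if (ax < DBL_MIN)
    {
      /* The volatile store is meant to keep the compiler from
         discarding the otherwise unused product.  */
      volatile double forced = x * x;
      (void) forced;
      return 1;
    }
  return 0;
}
```

A result exactly equal to `DBL_MIN` is normal and does not take the branch. NaN inputs compare false and are also left alone.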
Joseph Myers
|
aaee3cd88e |
Move math_narrow_eval to separate math-narrow-eval.h.
This patch continues cleaning up the math_private.h header, which contains lots of different definitions many of which are only needed by a limited subset of files using that header (and some of which are overridden by architectures that only want to override selected parts of the header), by moving the math_narrow_eval macro out to a separate math-narrow-eval.h header, only included by those files that need it. That header is placed in include/ (since it's used in stdlib/, not just files built in math/, but no sysdeps variants are needed at present). Tested for x86_64, and with build-many-glibcs.py. (Installed stripped shared libraries change because of line numbers in assertions in strtod_l.c.) * include/math-narrow-eval.h: New file. Contents moved from .... * sysdeps/generic/math_private.h: ... here. (math_narrow_eval): Remove macro. Moved to math-narrow-eval.h. [FLT_EVAL_METHOD != 0] (excess_precision): Likewise. * math/s_fdim_template.c: Include <math-narrow-eval.h>. * stdlib/strtod_l.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/ieee754/dbl-64/e_cosh.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_j1.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_sinh.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_productf.c: Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_erf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/flt-32/e_coshf.c: Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c: Likewise. * sysdeps/ieee754/flt-32/e_expf.c: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. 
* sysdeps/ieee754/flt-32/e_lgammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_erff.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_product.c: Likewise. |
||
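The math_narrow_eval macro moved by the commit above exists because some targets evaluate `float` and `double` expressions with excess precision (`FLT_EVAL_METHOD != 0`). A rough portable sketch of the idea for `double` is given below. The name `narrow_eval_double` is hypothetical, and forcing the value through a `volatile` object is an assumption made for portability, not necessarily how glibc implements it.

```c
#include <float.h>

/* Hypothetical helper: return x with any excess range and precision
   removed, so that the result is a true double.  */
static double narrow_eval_double(double x)
{
#if defined FLT_EVAL_METHOD && FLT_EVAL_METHOD == 0
  /* Expressions are already evaluated in their declared type.  */
  return x;
#else
  /* Storing to a volatile double forces the value to be written
     out in double format before it is read back.  */
  volatile double tmp = x;
  return tmp;
#endif
}
```

Values that are already exactly representable as `double` pass through unchanged.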
Patrick McGehearty
|
a14d8acd32 |
Improves __ieee754_exp(x) performance by 18-37% when |x| < 1.0397
Adds a fast path to e_exp.c when |x| < 1.03972053527832. When values are tested in isolation, reduction in execution time is: aarch 30%, sparc 18%, x86 37%. When comparing benchtests/bench.out which includes values outside that range, the gains are: aarch 8%, sparc 5%, x86 9%. make check is clean (no increase in ulp for any math test). Testing 20M values for each rounding mode in that range shows approximately one in 200 values is off by 1 ulp. No value tested for exp(x) changed by 2 or more ulp. No observed change in performance or accuracy for x outside fast path range. These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. |
||
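The accuracy figures in the commit above are stated in ulps. One common way to count how many representable doubles separate two results is to compare their bit patterns. The sketch below shows that technique; it is not taken from the glibc test suite, the names `ordered_bits` and `ulp_distance` are hypothetical, and it assumes IEEE 754 binary64 `double` and finite, non-NaN inputs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a double to an integer such that the
   integer ordering matches the floating-point ordering.  Negative
   values have the sign bit set, so they are mirrored around zero.  */
static int64_t ordered_bits(double x)
{
  int64_t i;
  memcpy(&i, &x, sizeof i);
  return i < 0 ? INT64_MIN - i : i;
}

/* Hypothetical helper: number of representable doubles between a
   and b (0 when they are equal, including +0.0 versus -0.0).  */
static uint64_t ulp_distance(double a, double b)
{
  int64_t ia = ordered_bits(a);
  int64_t ib = ordered_bits(b);
  return ia > ib ? (uint64_t) ia - (uint64_t) ib
                 : (uint64_t) ib - (uint64_t) ia;
}
```

Comparing a new implementation against a reference over many inputs and tallying the distances gives statistics of the kind quoted in the commit message.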
Wilco Dijkstra
|
e88ecbbfe8 |
[PATCH 7/7] sin/cos slow paths: refactor sincos implementation
Refactor the sincos implementation - rather than rely on odd partial inlining of preprocessed portions from sin and cos, explicitly write out the cases. This makes sincos much easier to maintain and provides an additional 16-20% speedup between 0 and 2^27. The overall speedup of sincos is 48% over this range. Between 0 and PI it is 66% faster. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (__sincos): Refactor using the same logic as sin and cos. |
||
Wilco Dijkstra
|
aef3e2558a |
[PATCH 6/7] sin/cos slow paths: refactor duplicated code into dosin
Refactor duplicated code into do_sin. Since all calls to do_sin use copysign to set the sign of the result, move it inside do_sin. Small inputs use a separate polynomial, so move this into do_sin as well (the check is based on the more conservative case when doing large range reduction, but could be relaxed). * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small inputs. Return correct sign. (do_sincos): Remove small input check before do_sin, let do_sin set the sign. (__sin): Likewise. (__cos): Likewise. |
||
Wilco Dijkstra
|
72f6e9a3e3 |
[PATCH 5/7] sin/cos slow paths: remove unused slowpath functions
Remove all unused slowpath functions. * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove. (do_cos_slow): Likewise. (do_sin_slow): Likewise. (reduce_and_compute): Likewise. (slow): Likewise. (slow1): Likewise. (slow2): Likewise. (sloww): Likewise. (sloww1): Likewise. (sloww2): Likewise. (bslow): Likewise. (bslow1): Likewise. (bslow2): Likewise. (cslow2): Likewise. |
||
Wilco Dijkstra
|
649095838b |
[PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction
For huge inputs use the improved do_sincos function as well. Now no cases use the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it. * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter. (do_cos): Remove corp parameter and calculations. (do_sin): Likewise. (do_sincos): Remove cor variable. (__sin): Use do_sincos for huge inputs. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. (reduce_and_compute_sincos): Remove unused function. |
||
Wilco Dijkstra
|
d9469deb14 |
[PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction
This patch improves the accuracy of the range reduction. When the input is large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not enough. Improve range reduction accuracy to 136 bits. As a result the special checks for results close to zero can be removed. The ULP of the polynomials is at worst 0.55ULP, so there is no reason for the slow functions, and they can be removed. * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to reduce_sincos, improve accuracy to 136 bits. (do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions. (__sin): Use improved reduction and simplified do_sincos calculation. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. |
||
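The range reduction discussed in the commit above subtracts a multiple of PI/2 from the input using more bits of PI than fit in one double. The sketch below shows the basic two-part technique only. It is not the code in s_sin.c: it carries far fewer bits than the 136 described, it is only meant for inputs of modest size, and the names `reduce_pio2`, `pio2_hi` and `pio2_lo` are hypothetical. The constants are the split of PI/2 used in fdlibm-style reductions.

```c
/* 2/PI, and PI/2 split into a short high part plus a tail.  The high
   part has few enough significant bits that n * pio2_hi is exact for
   moderately sized integers n.  */
static const double two_over_pi = 6.36619772367581382433e-01;
static const double pio2_hi = 1.57079632673412561417e+00;
static const double pio2_lo = 6.07710050650619224932e-11;

/* Hypothetical helper: return r = x - n*PI/2 with n the integer
   nearest to x*2/PI, and store n mod 4 in *quadrant.  */
static double reduce_pio2(double x, int *quadrant)
{
  double t = x * two_over_pi;
  long n = (long) (t < 0 ? t - 0.5 : t + 0.5);
  double dn = (double) n;
  /* Subtract the high part first (exact product), then the tail.  */
  double r = (x - dn * pio2_hi) - dn * pio2_lo;
  *quadrant = (int) ((unsigned long) n & 3u);
  return r;
}
```

For `x = 3.0` this yields quadrant 2 and a remainder close to `3 - PI`. The quadrant then selects which of sin or cos (and which sign) to evaluate on the reduced argument.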
Wilco Dijkstra
|
7a5640f23a |
[PATCH 2/7] sin/cos slow paths: remove large range reduction
This patch removes the large range reduction code and defers to the huge range reduction code. The first level range reducer supports inputs up to 2^27, which is way too large given that inputs for sin/cos are typically small (< 10), and optimizing for a smaller range would give a significant speedup. Input values above 2^27 are practically never used, so there is no reason for supporting range reduction between 2^27 and 2^48. Removing it significantly simplifies code and enables further speedups. There is about a 2.3x slowdown in this range due to __branred being extremely slow (a better algorithm could easily more than double performance). * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function. (do_sincos_2): Likewise. (__sin): Remove middle range reduction case. (__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range reduction case. |
||
Wilco Dijkstra
|
19a8b9a300 |
[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
This series of patches removes the slow paths from sin, cos and sincos. Besides greatly simplifying the implementation, the new version is also much faster for inputs up to PI (41% faster) and for large inputs needing range reduction (27% faster). ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most of the range with mpsin and mpcos. The number of incorrectly rounded results (i.e. ULP > 0.5) is at most ~2750 per million inputs between 0.125 and 0.5; the average is ~850 per million between 0 and PI. Tested on AArch64 and x86_64 with no regressions. The first patch removes the slow paths for the cases where the input is small and doesn't require range reduction. Update ULP tables for sin, cos and sincos on AArch64 and x86_64. * sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small inputs. (__cos): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos. |
||
Joseph Myers
|
8d3f9e85cf |
Add narrowing subtract functions.
This patch adds the narrowing subtract functions from TS 18661-1 to glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64 for all configurations; f32subf64x, f32subf128, f64subf64x, f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt. The changes are essentially the same as for the narrowing add functions, so the description of those generally applies to this patch as well. Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft float) and powerpc, and with build-many-glibcs.py. * math/Makefile (libm-narrow-fns): Add sub. (libm-test-funcs-narrow): Likewise. * math/Versions (GLIBC_2.28): Add narrowing subtract functions. * math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW. * math/gen-auto-libm-tests.c (test_functions): Add sub. * math/math-narrow.h (CHECK_NARROW_SUB): New macro. (NARROW_SUB_ROUND_TO_ODD): Likewise. (NARROW_SUB_TRIVIAL): Likewise. * sysdeps/ieee754/float128/float128_private.h (__fsubl): New macro. (__dsubl): Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and dsub. (CFLAGS-nldbl-dsub.c): New variable. (CFLAGS-nldbl-fsub.c): Likewise. * sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add __nldbl_dsubl. * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New prototype. * manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl, dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx. * math/auto-libm-test-in: Add tests of sub. * math/auto-libm-test-out-narrow-sub: New generated file. * math/libm-test-narrow-sub.inc: New file. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_fsub.c: Likewise. * sysdeps/ieee754/float128/s_f32subf128.c: Likewise. * sysdeps/ieee754/float128/s_f64subf128.c: Likewise. * sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise. 
* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise. * sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fsub.c: Likewise. * sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/mach/hurd/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
Wilco Dijkstra
|
f67a8147b0 |
Rename all __ieee754_sqrt(f/l) calls to sqrt(f/l)
Use sqrt(f/l) to enable inlining by GCC - if inlining doesn't happen, the asm redirect ensures we will still call __ieee754_sqrt(f/l). * sysdeps/ieee754/dbl-64/e_acosh.c (__ieee754_acosh): Use sqrt. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Likewise. * sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Likewise. * sysdeps/ieee754/dbl-64/e_j0.c (__ieee754_j0): Likewise. * sysdeps/ieee754/dbl-64/e_j1.c (__ieee754_j1): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/e_acosh.c (__ieee754_acosh): Likewise. * sysdeps/ieee754/flt-32/e_acosf.c (__ieee754_acosf): Likewise. * sysdeps/ieee754/flt-32/e_acoshf.c (__ieee754_acoshf): Likewise. * sysdeps/ieee754/flt-32/e_asinf.c (__ieee754_asinf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/flt-32/e_hypotf.c (__ieee754_hypotf): Likewise. * sysdeps/ieee754/flt-32/e_j0f.c (__ieee754_j0f): Likewise. * sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_j1f): Likewise. * sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/ldbl-128/e_acoshl.c (__ieee754_acoshl): Use sqrtl. * sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-128/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Likewise. 
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c (__ieee754_j1l): Likewise * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Use sqrtl. * sysdeps/ieee754/ldbl-96/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-96/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/m68k/m680x0/fpu/e_pow.c (__ieee754_pow): Likewise. * sysdeps/powerpc/fpu/e_hypot.c (__ieee754_hypot): Likewise. * sysdeps/powerpc/fpu/e_hypotf.c (__ieee754_hypotf): Likewise. |
||
Zack Weinberg
|
d3da750d01 |
nldbl-compat.c: Include math.h before nldbl-compat.h.
Jeff Law noticed that native PowerPC builds were broken by my having made math_ldbl_opt.h not include math.h. nldbl-compat.c formerly got math.h via libioP.h and math_ldbl_opt.h, *without* __NO_LONG_DOUBLE_MATH; after my change it got it via nldbl-compat.h *with* __NO_LONG_DOUBLE_MATH, but __NO_LONG_DOUBLE_MATH mode is forbidden on hosts that define __HAVE_DISTINCT_FLOAT128, so the build breaks. This is the quick fix. * sysdeps/ieee754/ldbl-opt/nldbl-compat.c: Include math.h before nldbl-compat.h. |
||
Zack Weinberg
|
0d13dfa17b |
Don't include math.h/math_private.h in math_ldbl_opt.h.
The sysdeps/ieee754/ldbl-opt version of math_ldbl_opt.h includes math.h and math_private.h, despite not having any need for those headers itself; the sysdeps/generic version doesn't. About 20 files are relying on math_ldbl_opt.h to include math.h and/or math_private.h for them, even though none of them is necessarily used on a platform that needs ldbl-opt support. * sysdeps/ieee754/ldbl-opt/math_ldbl_opt.h: Don't include math.h or math_private.h. * sysdeps/alpha/fpu/s_isnan.c * sysdeps/ieee754/ldbl-128ibm/s_ceill.c * sysdeps/ieee754/ldbl-128ibm/s_floorl.c * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c * sysdeps/ieee754/ldbl-128ibm/s_rintl.c * sysdeps/ieee754/ldbl-128ibm/s_roundl.c * sysdeps/ieee754/ldbl-128ibm/s_truncl.c * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot.c * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf.c: * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot.c * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf.c: Include math_private.h. * sysdeps/ieee754/ldbl-64-128/s_finitel.c * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c * sysdeps/ieee754/ldbl-64-128/s_isinfl.c * sysdeps/ieee754/ldbl-64-128/s_isnanl.c * sysdeps/ieee754/ldbl-64-128/s_signbitl.c * sysdeps/powerpc/power7/fpu/s_logb.c: Include math.h and math_private.h. |
||
Zack Weinberg
|
9964a14579 |
Mechanically remove _IO_ name aliases for types and constants.
This patch mechanically removes all remaining uses, and the definitions, of the following libio name aliases: name replaced with ---- ------------- _IO_FILE FILE _IO_fpos_t __fpos_t _IO_fpos64_t __fpos64_t _IO_size_t size_t _IO_ssize_t ssize_t or __ssize_t _IO_off_t off_t _IO_off64_t off64_t _IO_pid_t pid_t _IO_uid_t uid_t _IO_wint_t wint_t _IO_va_list va_list or __gnuc_va_list _IO_BUFSIZ BUFSIZ _IO_cookie_io_functions_t cookie_io_functions_t __io_read_fn cookie_read_function_t __io_write_fn cookie_write_function_t __io_seek_fn cookie_seek_function_t __io_close_fn cookie_close_function_t I used __fpos_t and __fpos64_t instead of fpos_t and fpos64_t because the definitions of fpos_t and fpos64_t depend on the largefile mode. I used __ssize_t and __gnuc_va_list in a handful of headers where namespace cleanliness might be relevant even though they're internal-use-only. In all other cases, I used the public-namespace name. There are a tiny handful of places where I left a use of 'struct _IO_FILE' alone, because it was being used together with 'struct _IO_FILE_plus' or 'struct _IO_FILE_complete' in the same arithmetic expression. Because this patch was almost entirely done with search and replace, I may have introduced indentation botches. I did proofread the diff, but I may have missed something. The ChangeLog below calls out all of the places where this was not a pure search-and-replace change. Installed stripped libraries and executables are unchanged by this patch, except that some assertions in vfscanf.c change line numbers. * libio/libio.h (_IO_FILE): Delete; all uses changed to FILE. (_IO_fpos_t): Delete; all uses changed to __fpos_t. (_IO_fpos64_t): Delete; all uses changed to __fpos64_t. (_IO_size_t): Delete; all uses changed to size_t. (_IO_ssize_t): Delete; all uses changed to ssize_t or __ssize_t. (_IO_off_t): Delete; all uses changed to off_t. (_IO_off64_t): Delete; all uses changed to off64_t. (_IO_pid_t): Delete; all uses changed to pid_t. 
(_IO_uid_t): Delete; all uses changed to uid_t. (_IO_wint_t): Delete; all uses changed to wint_t. (_IO_va_list): Delete; all uses changed to va_list or __gnuc_va_list. (_IO_BUFSIZ): Delete; all uses changed to BUFSIZ. (_IO_cookie_io_functions_t): Delete; all uses changed to cookie_io_functions_t. (__io_read_fn): Delete; all uses changed to cookie_read_function_t. (__io_write_fn): Delete; all uses changed to cookie_write_function_t. (__io_seek_fn): Delete; all uses changed to cookie_seek_function_t. (__io_close_fn): Delete: all uses changed to cookie_close_function_t. * libio/iofopncook.c: Remove unnecessary forward declarations. * libio/iolibio.h: Correct outdated commentary. * malloc/malloc.c (__malloc_stats): Remove unnecessary casts. * stdio-common/fxprintf.c (__fxprintf_nocancel): Remove unnecessary casts. * stdio-common/getline.c: Use _IO_getdelim directly. Don't redefine ssize_t. * stdio-common/printf_fp.c, stdio_common/printf_fphex.c * stdio-common/printf_size.c: Don't redefine size_t or FILE. Remove outdated comments. * stdio-common/vfscanf.c: Don't redefine va_list. |
||
Wilco Dijkstra
|
610ee1fc93 |
Remove mplog and mpexp
Remove the now unused mplog and mpexp files. * math/Makefile: Remove mpexp.c and mplog.c * sysdeps/i386/fpu/mpexp.c: Delete file. * sysdeps/i386/fpu/mplog.c: Likewise. * sysdeps/ia64/fpu/mpexp.c: Likewise. * sysdeps/ia64/fpu/mplog.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog. * sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function. * sysdeps/ieee754/dbl-64/mpexp.c: Delete file. * sysdeps/ieee754/dbl-64/mplog.c: Likewise. * sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/mplog.c: Likewise. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise. |
||
Szabolcs Nagy
|
de800d8305 |
Remove slow paths from exp
Remove the __slowexp code, so exp is no longer correctly rounded. The result is computed to about 70 bits precision so the worst case ulp error is about 0.500007 in nearest rounding mode. * manual/probes.texi: Remove slowexp probes. * math/Makefile: Remove slowexp. * sysdeps/generic/math_private.h (__slowexp): Remove. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and document error bounds. * sysdeps/i386/fpu/slowexp.c: Remove. * sysdeps/ia64/fpu/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove. * sysdeps/m68k/m680x0/fpu/slowexp.c: Remove. * sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove. |
||
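One way to read the 0.500007 ulp figure quoted in de800d8305 above. This is an interpretation, not something the commit states: it assumes "about 70 bits precision" means a relative error near 2^-70 before the final rounding, and that a double carries 53 significand bits.

```latex
\[
\text{worst-case error}
\;\lesssim\;
\underbrace{\tfrac{1}{2}}_{\text{final rounding to double}}
+ \underbrace{2^{\,53-70}}_{\text{pre-rounding error, in ulps}}
= 0.5 + 2^{-17}
\approx 0.5000076\ \text{ulp}
\]
```

Under those assumptions the bound agrees with the "about 0.500007" in the commit message to the precision quoted there.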
Wilco Dijkstra
|
c3d466cba1 |
Remove slow paths from pow
Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function: (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. 
* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise. |
||
Joseph Myers
|
d8742dd82f |
Add narrowing add functions.
This patch adds the narrowing add functions from TS 18661-1 to glibc's libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all configurations; f32addf64x, f32addf128, f64addf64x, f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with _Float64x and _Float128; __nldbl_daddl for ldbl-opt. As discussed for the build infrastructure patch, tgmath.h support is deliberately deferred, and FP_FAST_* macros are not applicable without optimized function implementations. Function implementations are added for all relevant pairs of formats (including certain cases of a format and itself where more than one type has that format). The main implementations use round-to-odd, or a trivial computation in the case where both formats are the same or where the wider format is IBM long double (in which case we don't attempt to be correctly rounding). The sysdeps/ieee754/soft-fp implementations use soft-fp, and are used automatically for configurations without exceptions and rounding modes by virtue of existing Implies files. As previously discussed, optimized versions for particular architectures are possible, but not included. i386 gets a special version of f32xaddf64 to avoid problems with double rounding (similar to the existing fdim version), since this function must round just once without an intermediate rounding to long double. (No such special version is needed for any other function, because the nontrivial functions use round-to-odd, which does the intermediate computation with the rounding mode set to round-to-zero, and double rounding is OK except in round-to-nearest mode, so is OK for that intermediate round-to-zero computation.) mul and div will need slightly different special versions for i386 (using round-to-odd on long double instead of precision control) because of the possibility of inexact intermediate results in the subnormal range for double. 
To reduce duplication among the different function implementations, math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD and NARROW_ADD_TRIVIAL. In the trivial cases and for any architecture-specific optimized implementations, the overhead of the errno setting might be significant, but I think that's best handled through compiler built-in functions rather than providing separate no-errno versions in glibc (and likewise there are no __*_finite entry points for these function provided, __*_finite effectively being no-errno versions at present in most cases). Tested for x86_64 and x86, with both GCC 6 and GCC 7. Tested for mips64 (all three ABIs, both hard and soft float) and powerpc with GCC 7. Tested with build-many-glibcs.py with both GCC 6 and GCC 7. * math/Makefile (libm-narrow-fns): Add add. (libm-test-funcs-narrow): Likewise. * math/Versions (GLIBC_2.28): Add narrowing add functions. * math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW . * math/gen-auto-libm-tests.c (test_functions): Add add. * math/math-narrow.h (CHECK_NARROW_ADD): New macro. (NARROW_ADD_ROUND_TO_ODD): Likewise. (NARROW_ADD_TRIVIAL): Likewise. * sysdeps/ieee754/float128/float128_private.h (__faddl): New macro. (__daddl): Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and dadd. (CFLAGS-nldbl-dadd.c): New variable. (CFLAGS-nldbl-fadd.c): Likewise. * sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add __nldbl_daddl. * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New prototype. * manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl, daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx. * math/auto-libm-test-in: Add tests of add. * math/auto-libm-test-out-narrow-add: New generated file. * math/libm-test-narrow-add.inc: New file. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise. * sysdeps/ieee754/dbl-64/s_fadd.c: Likewise. * sysdeps/ieee754/float128/s_f32addf128.c: Likewise. 
* sysdeps/ieee754/float128/s_f64addf128.c: Likewise. * sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise. * sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise. * sysdeps/ieee754/soft-fp/s_daddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fadd.c: Likewise. * sysdeps/ieee754/soft-fp/s_faddl.c: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/mach/hurd/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
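The double-rounding problem that d8742dd82f above works around can be sketched in portable C. The helper names below are hypothetical and this is not glibc's code: the commit describes round-to-odd done with the rounding mode set to round-to-zero, whereas this sketch reaches the same round-to-odd intermediate through a TwoSum error term. It assumes finite inputs, no overflow, the default round-to-nearest mode, and double arithmetic without excess precision.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a glibc function: add two doubles and round
   the exact sum to float only once.  Assumes finite inputs, no
   overflow, round-to-nearest mode, and no excess precision.  */
static float
narrow_add_sketch (double a, double b)
{
  double s = a + b;                        /* Sum rounded to nearest double.  */
  double bv = s - a;
  double err = (a - (s - bv)) + (b - bv);  /* TwoSum: exact sum is s + err.  */

  if (err != 0.0)
    {
      uint64_t bits;
      memcpy (&bits, &s, sizeof bits);
      if ((bits & 1) == 0)
        {
          /* s has an even significand, so the odd neighbour on the side
             of the exact sum is the round-to-odd result.  */
          int away_from_zero = (err > 0.0) == (s > 0.0);
          if (away_from_zero)
            bits++;
          else
            bits--;
          memcpy (&s, &bits, sizeof s);
        }
    }
  /* Double has more than two extra bits over float, so rounding the
     round-to-odd value to float matches rounding the exact sum.  */
  return (float) s;
}

/* The naive version rounds twice: first to double, then to float.  */
static float
naive_add (double a, double b)
{
  return (float) (a + b);
}
```

With a = 0x1.000001p0 and b = 0x1p-80 the exact sum lies just above the midpoint between two adjacent floats. The naive version rounds the sum to the midpoint first and then ties to even, giving 1.0f, while the sketch returns 0x1.000002p0f.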
Joseph Myers
|
8e554659ad |
Add test infrastructure for narrowing libm functions.
This patch continues preparations for adding TS 18661-1 narrowing libm functions by adding the required testsuite infrastructure to test such functions through the libm-test infrastructure. That infrastructure is based around testing for a single type, FLOAT. For the narrowing functions, FLOAT, the "main" type for testing, is the function return type; the argument type is ARG_FLOAT. This is consistent with how the code built once for each type, libm-test-support.c, depends on FLOAT for such things as calculating ulps errors in results but can already handle different argument types (pointers, integers, long double for nexttoward). Makefile machinery is added to handle building tests for all pairs of types for which there are narrowing functions (as with non-narrowing functions, aliases are tested just the same as the functions they alias). gen-auto-libm-tests gains a --narrow option for building outputs for narrowing functions (so narrowing sqrt and fma will share the same inputs as non-narrowing, but gen-auto-libm-tests will be run with and without that option to generate different output files). In the narrowing case, the auto-libm-test-out-narrow-* files include annotations for each test about what properties ARG_FLOAT must have to be able to represent all the inputs for that test; those annotations result in calls to the TEST_COND_arg_fmt macro. gen-libm-test.pl has some minor updates to handle narrowing tests (for example, arguments in such tests must be surrounded by ARG_LIT calls instead of LIT calls). Various new macros are added to the C test support code (for example, sNaN initializers need to be properly typed, so arg_snan_value is added; other such arg_* macros are added as it seems cleanest to do so, though some are not strictly required). 
Special-casing of the ibm128 format to allow for its limitations is adjusted to handle it as the argument format as well as as the result format; thus, the tests of the new functions allow nonzero ulps only in the case where ibm128 is the argument format, as otherwise the functions correspond to fully-defined IEEE operations. The ulps in question appear as e.g. 'Function: "add_ldouble"' in libm-test-ulps (with 1ulp errors then listed for double and float for that function in powerpc); no support is added to generate corresponding faddl / daddl ulps listings in the ulps table in the manual. For the previous patch, I noted the need to avoid spurious macro expansions of identifiers such as "add". A test test-narrow-macros.c is added to verify such macro expansions are successfully avoided, and there is also a -mlong-double-64 version of that test for ldbl-opt. This test is set up to cover the full set of relevant identifiers from the start rather than adding functions one at a time as each function group is added. Tested for x86_64 (this patch in isolation, as well as testing for various configurations in conjunction with the actual addition of "add" functions). * math/Makefile (test-type-pairs): New variable. (test-type-pairs-f64xf128-yes): Likewise. (tests): Add test-narrow-macros. (libm-test-funcs-narrow): New variable. (libm-test-c-narrow): Likewise. (generated): Add $(libm-test-c-narrow). (libm-tests-base-narrow): New variable. (libm-tests-narrow): Likewise. (libm-tests): Add $(libm-tests-narrow). (libm-tests-for-type): Handle $(libm-tests-narrow). (libm-test-c-narrow-obj): New variable. ($(libm-test-c-narrow-obj)): New rule. ($(foreach t,$(libm-tests-narrow),$(objpfx)$(t).c)): Likewise. ($(foreach f,$(libm-test-funcs-narrow),$(objpfx)$(o)-$(f).o)): Use $(o-iterator) to set dependencies and CFLAGS. * math/gen-auto-libm-tests.c: Document use for narrowing functions. (output_for_one_input_case): Take argument NARROW. (generate_output): Likewise. 
Update call to output_for_one_input_case. (main): Take --narrow option. Update call to generate_output. * math/gen-libm-test.pl (_apply_lit): Take macro name as argument. (apply_lit): Update call to _apply_lit. (apply_arglit): New function. (parse_args): Handle "a" arguments. (parse_auto_input): Handle format names using ":". * math/README.libm-test: Document "a" parameter type. * math/libm-test-support.h (ARG_TYPE_MIN): New macro. (ARG_TYPE_TRUE_MIN): Likewise. (ARG_TYPE_MAX): Likewise. (ARG_MIN_EXP): Likewise. (ARG_MAX_EXP): Likewise. (ARG_MANT_DIG): Likewise. (TEST_COND_arg_ibm128): Likewise. (TEST_COND_ibm128_libgcc): Define conditional on [ARG_FLOAT]. (TEST_COND_arg_fmt): New macro. (init_max_error): Update prototype. * math/libm-test-support.c (test_ibm128): New variable. (init_max_error): Take argument testing_ibm128 and set test_ibm128 instead of using [TEST_COND_ibm128] conditional. (test_exceptions): Use test_ibm128 instead of TEST_COND_ibm128. * math/libm-test-driver.c (STR_ARG_FLOAT): New macro. [TEST_NARROW] (TEST_MSG): New definition. (arg_plus_zero): New macro. (arg_minus_zero): Likewise. (arg_plus_infty): Likewise. (arg_minus_infty): Likewise. (arg_qnan_value_pl): Likewise. (arg_qnan_value): Likewise. (arg_snan_value_pl): Likewise. (arg_snan_value): Likewise. (arg_max_value): Likewise. (arg_min_value): Likewise. (arg_min_subnorm_value): Likewise. [ARG_FLOAT] (struct test_aa_f_data): New struct type. (RUN_TEST_LOOP_aa_f): New macro. (TEST_SUFF): New macro. (TEST_SUFF_STR): Likewise. [!TEST_MATHVEC] (VEC_SUFF): Don't define. (TEST_COND_any_ibm128): New macro. (START): Use TEST_SUFF and TEST_SUFF_STR in initializer for this_func. Update call to init_max_error. * math/test-double.h (FUNC_NARROW_PREFIX): New macro. * math/test-float.h (FUNC_NARROW_PREFIX): Likewise. * math/test-float128.h (FUNC_NARROW_PREFIX): Likewise. * math/test-float32.h (FUNC_NARROW_PREFIX): Likewise. * math/test-float32x.h (FUNC_NARROW_PREFIX): Likewise. 
* math/test-float64.h (FUNC_NARROW_PREFIX): Likewise. * math/test-float64x.h (FUNC_NARROW_PREFIX): Likewise. * math/test-math-scalar.h (TEST_NARROW): Likewise. * math/test-math-vector.h (TEST_NARROW): Likewise. * math/test-arg-double.h: New file. * math/test-arg-float128.h: Likewise. * math/test-arg-float32x.h: Likewise. * math/test-arg-float64.h: Likewise. * math/test-arg-float64x.h: Likewise. * math/test-arg-ldouble.h: Likewise. * math/test-math-narrow.h: Likewise. * math/test-narrow-macros.c: Likewise. * sysdeps/ieee754/ldbl-opt/test-narrow-macros-ldbl-64.c: Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (tests): Add test-narrow-macros-ldbl-64. (CFLAGS-test-narrow-macros-ldbl-64.c): New variable. |
||
Joseph Myers
|
63716ab270 |
Add build infrastructure for narrowing libm functions.
TS 18661-1 defines libm functions that carry out an operation (+ - * / sqrt fma) on their arguments and return a result rounded to a (usually) narrower type, as if the original result were computed to infinite precision and then rounded directly to the result type without any intermediate rounding to the argument type. For example, fadd, faddl and daddl for addition. These are the last remaining TS 18661-1 functions left to be added to glibc. TS 18661-3 extends this to corresponding functions for _FloatN and _FloatNx types. As functions parametrized by two rather than one varying floating-point types, these functions require infrastructure in glibc that was not required for previous libm functions. This patch provides such infrastructure - excluding test support, and actual function implementations, which will be in subsequent patches. Declaring the functions uses a header bits/mathcalls-narrow.h, which is included many times, for each relevant pair of types. This will end up containing macro calls of the form __MATHCALL_NARROW (__MATHCALL_NAME (add), __MATHCALL_REDIR_NAME (add), 2); for each family of narrowing functions. (The structure of this macro call, with the calls to __MATHCALL_NAME and __MATHCALL_REDIR_NAME there rather than in the definition of __MATHCALL_NARROW, arises from the names such as "add" *not* themselves being reserved identifiers - meaning it's necessary to avoid any indirection that would result in a user-defined "add" macro being expanded.) Whereas for existing functions declaring long double functions is disabled if _LIBC in the case where they alias double functions, to facilitate defining the long double functions as aliases of the double ones, there is no such logic for the narrowing functions in this patch. Rather, the files defining such functions are expected to use #define to hide the original declarations of the alias names, to avoid errors about defining aliases with incompatible types. 
math/Makefile support is added for building the functions (listed in libm-narrow-fns, currently empty) for all relevant pairs of types. An internal header math-narrow.h is added for macros shared between multiple function implementations - currently a ROUND_TO_ODD macro to facilitate writing functions using the round-to-odd implementation approach, and alias macros to create all the required function aliases. libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128 are added for use when required (only for x86_64). float128_private.h support is added for ldbl-128 narrowing functions to be used for _Float128. Certain things are specifically omitted from this patch and the immediate followups. tgmath.h support is deferred; there remain unresolved questions about how the type-generic macros for these functions are supposed to work, especially in the case of arguments of integer type. The math.h / bits/mathcalls-narrow.h logic, and the logic for determining what functions / aliases to define, will need some adjustments to support the sqrt and fma functions, where e.g. f32xsqrtf64 can just be an alias for sqrt rather than a separate function. TS 18661-1 defines FP_FAST_* macros but no support is included for defining them (they won't in general be true without architecture-specific optimized function versions). For each of the function groups (add sub mul div sqrt fma) there are always six functions present (e.g. fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64). When _Float64x and _Float128 are supported, there are seven more (e.g. f32addf64x, f32addf128, f64addf64x, f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128). In addition, in the ldbl-opt case there are function names such as __nldbl_daddl (an alias for f32xaddf64, which is not a reserved name in TS 18661-1, only in TS 18661-3), for calls to daddl to be mapped to in the -mlong-double-64 case. 
(Calls to faddl just get mapped to fadd, and for sqrt and fma there won't be __nldbl_* functions because dsqrtl and dfmal can just be mapped to sqrt and fma with -mlong-double-64.) While there are six or thirteen functions present in each group (plus __nldbl_* names only as an ABI, not an API), not all are distinct; they fall in various groups of aliases. There are two distinct versions built if long double has the same format as double; four if they have distinct formats but there is no _Float64x or _Float128 support; five if long double has binary128 format; seven when _Float128 is distinct from long double. Architecture-specific optimized versions are possible, but not included in my patches. For example, IA64 generally supports narrowing the result of most floating-point instructions; Power ISA 2.07 (POWER8) supports double values as arguments to float instructions, with the results narrowed as expected; Power ISA 3 (POWER9) supports round-to-odd for float128 instructions, so meaning that approach can be used without needing to set and restore the rounding mode and test "inexact". I intend to leave any such optimized versions to the architecture maintainers. Generally in such cases it would also make sense for calls to these functions to be expanded inline (given -fno-math-errno); I put a suggestion for TS 18661-1 built-in functions at <https://gcc.gnu.org/wiki/SummerOfCode>. Tested for x86_64 (this patch in isolation, as well as testing for various configurations in conjunction with further patches). * math/bits/mathcalls-narrow.h: New file. * include/bits/mathcalls-narrow.h: Likewise. * math/math-narrow.h: Likewise. * math/math.h (__MATHCALL_NARROW_ARGS_1): New macro. (__MATHCALL_NARROW_ARGS_2): Likewise. (__MATHCALL_NARROW_ARGS_3): Likewise. (__MATHCALL_NARROW_NORMAL): Likewise. (__MATHCALL_NARROW_REDIR): Likewise. (__MATHCALL_NARROW): Likewise. 
[__GLIBC_USE (IEC_60559_BFP_EXT)]: Repeatedly include <bits/mathcalls-narrow.h> with _Mret_, _Marg_ and __MATHCALL_NAME defined. [__GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise. * math/Makefile (headers): Add bits/mathcalls-narrow.h. (libm-narrow-fns): New variable. (libm-narrow-types-basic): Likewise. (libm-narrow-types-ldouble-yes): Likewise. (libm-narrow-types-float128-yes): Likewise. (libm-narrow-types-float128-alias-yes): Likewise. (libm-narrow-types): Likewise. (libm-routines): Add narrowing functions. * sysdeps/i386/fpu/fenv_private.h [__x86_64__] (libc_feholdexcept_setroundf128): New macro. [__x86_64__] (libc_feupdateenv_testf128): Likewise. * sysdeps/ieee754/float128/float128_private.h: Include <math/math-narrow.h>. [libc_feholdexcept_setroundf128] (libc_feholdexcept_setroundl): Undefine and redefine. [libc_feupdateenv_testf128] (libc_feupdateenv_testl): Likewise. (libm_alias_float_ldouble): Undefine and redefine. (libm_alias_double_ldouble): Likewise. |
||
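The macro-expansion hazard described in 63716ab270 above can be reproduced with a few lines of preprocessor code. The macros below are simplified, hypothetical stand-ins, not the glibc definitions: they only show why a name such as "add", which user code may legally define as a macro, must reach the token-pasting operator without passing through an extra macro layer.

```c
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_ (x)

/* Hypothetical stand-ins, not the glibc macros.  */
/* NAME is an operand of ##, so it is pasted without being expanded.  */
#define NAME_DIRECT(name) f ## name
/* NAME is an ordinary argument here, so it is expanded before being
   passed on to NAME_DIRECT.  */
#define NAME_INDIRECT(name) NAME_DIRECT (name)

/* A user program may do this, because "add" is not a reserved name.  */
#define add user_macro

static const char *
direct_name (void)
{
  return STR (NAME_DIRECT (add));
}

static const char *
indirect_name (void)
{
  return STR (NAME_INDIRECT (add));
}
```

Here direct_name returns "fadd", while indirect_name returns "fuser_macro" because the extra layer lets the user's macro expand first.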
Zack Weinberg
|
63fb8f9aa9 |
Post-cleanup 2: minimize _G_config.h.
Nearly everything in _G_config.h is either junk or more appropriately defined elsewhere: * _G_fpos_t, _G_fpos64_t, and _G_BUFSIZ are already completely unused. * All remaining uses of _G_va_list have been changed to __gnuc_va_list. * The definition of _G_HAVE_ST_BLKSIZE/_IO_HAVE_ST_BLKSIZE has been inlined into its sole use. * The complete definition of _G_iconv_t has been moved to libio.h and renamed _IO_iconv_t (all actual users used that name). * _G_IO_IO_FILE_VERSION is vestigial; some code cares whether _IO_stdin_used exists, but nothing looks at its value. I've preserved the value as a hardwired constant in csu/init.c. This means csu/init.c no longer needs to include anything. * Many of the headers included by _G_config.h were already being included directly by either libio.h or stdio.h; the remaining ones were moved to libio.h. * _G_HAVE_MREMAP is still relevant, because mremap genuinely is a Linux extension; it's not in POSIX and as far as I can tell it's not available on the Hurd either. I also preserved _G_HAVE_MMAP, since it's conceivable someone would want to port glibc to a MMU-less, mmap-less environment in the future. Both are now always defined to 1/0 as is the current convention, instead of the older 1/undef convention. These are the only symbols still defined in _G_config.h. * The actual inclusion of _G_config.h moves from libio.h to libioP.h, as this is where a potential override of _G_HAVE_MMAP happens. * The #ifdef logic in libioP.h controlling _IO_JUMPS_OFFSET has been simplified. After this patch, the only surviving _G_ symbols are the struct tag names _G_fpos_t and _G_fpos64_t, which are preserved for the sake of C++ mangled names in applications, and _G_HAVE_MMAP and _G_HAVE_MREMAP, which do not seem worth renaming. Installed stripped libraries are unchanged by this patch. * bits/_G_config.h: Move back to sysdeps/generic/_G_config.h. Delete all contents except for definitions of _G_HAVE_MMAP and _G_HAVE_MREMAP. 
Add commentary explaining those two symbols. * sysdeps/unix/sysv/linux/bits/_G_config.h: Move back to sysdeps/unix/sysv/linux/_G_config.h. Make same content change as above. * libio/libio.h: Don't include bits/_G_config.h here. Include stddef.h with __need_wchar_t defined. Include bits/types/__mbstate_t.h, bits/types/wint_t.h, and gconv.h. Define _IO_iconv_t here, directly. Don't define _IO_HAVE_ST_BLKSIZE. * libio/libioP.h: Include _G_config.h here. Move include of shlib-compat.h up with rest of includes. Simplify conditionals controlling definition of _IO_JUMPS_OFFSET. * csu/init.c: Remove always-true #if around entire file. Don't include stdio.h. Set _IO_stdin_used to hardwired constant 0x20001, and update commentary. * include/stdio.h, sysdeps/ieee754/ldbl-opt/nldbl-compat.h: Replace all uses of _G_va_list with __gnuc_va_list. * libio/filedoalloc.c: Use #if defined _STATBUF_ST_BLKSIZE instead of #if _IO_HAVE_ST_BLKSIZE. * libio/fileops.c: Test _G_HAVE_MREMAP with #if, not #ifdef. * libio/iofdopen.c, libio/iofopen.c: Test _G_HAVE_MMAP with #if, not #ifdef. |
||
Wilco Dijkstra
|
b7c83ca30e |
Remove slow paths from log
Remove the slow paths from log. Like several other double precision math functions, log is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. Interestingly, removing the slow paths makes hardly any difference in practice: the worst case error is still ~0.502ULP, and exp(log(x)) shows identical results before/after on many millions of random cases. All GLIBC math tests pass on AArch64 and x64 with no change in ULP error. A simple test over a few hundred million values shows log is now 18% faster on average. * manual/probes.texi (slowlog): Delete documentation of removed probe. (slowlog_inexact): Likewise. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Remove slow paths. * sysdeps/ieee754/dbl-64/ulog.h: Remove unused declarations. |
||
Joseph Myers
|
6f9a3dd8b8 |
Move LDBL_CLASSIFY_COMPAT to its own header.
The general rule in glibc is that it's better for a macro to be always defined, and tested with #if, than for it to be tested with #ifdef, because the latter is prone to typos in the macro name as well as to the header with the macro accidentally not being included in a file testing it. (Testing with an "if" statement is even better, in those cases where it's possible to do things that way, as it then means both cases in the code get checked for syntax in glibc builds with either value of the condition.) math_private.h has several different groups of macros, meaning that architectures wanting to override some of them need to define those then include the generic version, which then defines macros if not already defined. It's hard to avoid that arrangement completely, but various cases can be improved by splitting out macros or groups of macros into separate files. This patch splits out the LDBL_CLASSIFY_COMPAT macro into a separate ldbl-classify-compat.h header. This macro is tested with #ifdef; this patch changes it to testing with #if, with a default definition to 0 in the generic header and then architecture-specific headers defining it to 1. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/ldbl-classify-compat.h: New file. * sysdeps/arm/ldbl-classify-compat.h: Likewise. * sysdeps/m68k/coldfire/ldbl-classify-compat.h: Likewise. * sysdeps/microblaze/ldbl-classify-compat.h: Likewise. * sysdeps/mips/ldbl-classify-compat.h: Likewise. * sysdeps/nios2/ldbl-classify-compat.h: Likewise. * sysdeps/sh/ldbl-classify-compat.h: Likewise. * sysdeps/ieee754/dbl-64/s_finite.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/s_isinf.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/s_isnan.c: Include <ldbl-classify-compat.h>. 
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/arm/math_private.h (LDBL_CLASSIFY_COMPAT): Remove macro. * sysdeps/mips/math_private.h (LDBL_CLASSIFY_COMPAT): Likewise. * sysdeps/m68k/coldfire/math_private.h: Remove file. * sysdeps/microblaze/math_private.h: Likewise. * sysdeps/nios2/math_private.h: Likewise. * sysdeps/sh/math_private.h: Likewise. |
||
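The "always defined, tested with #if" rule from 6f9a3dd8b8 above can be shown in miniature. EXAMPLE_COMPAT and both functions are hypothetical names for illustration; in glibc the default would live in a generic header and the value 1 in architecture-specific headers, which this single-file sketch collapses into one definition.

```c
/* Hypothetical macro standing in for the generic-header default.  An
   architecture that needs the compat code would define it to 1.  */
#define EXAMPLE_COMPAT 0

/* Tested with #if, not #ifdef: a misspelled macro name here can be
   diagnosed by the compiler (for example by GCC's -Wundef), whereas
   #ifdef on a misspelled name silently takes the "undefined" branch.  */
#if EXAMPLE_COMPAT
static int
compat_alias_built (void)
{
  return 1;
}
#else
static int
compat_alias_built (void)
{
  return 0;
}
#endif

/* Tested with a plain "if": both branches are parsed and type-checked
   whatever the macro's value, and the dead one is optimized away.  */
static const char *
compat_description (void)
{
  if (EXAMPLE_COMPAT)
    return "compat";
  else
    return "no compat";
}
```

The plain "if" form is only usable when both branches are valid code in every configuration, which is why the commit message calls it better "in those cases where it's possible".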
Carlos O'Donell
|
f1d7368196 |
Fix -Os log1p, log1pf build (bug 21314).
As reported in bug 21314, building log1p and log1pf fails with -Os because of a spurious -Wmaybe-uninitialized warning (reported there for GCC 5 for MIPS, I see it also with GCC 7 for x86_64). This patch, based on the patches in the bug, fixes this using the DIAG_* macros. Tested for x86_64 with -Os that this eliminates those warnings and so allows the build to progress further. 2018-02-01 Carlos O'Donell <carlos@redhat.com> Ramin Seyed-Moussavi <lordrasmus@gmail.com> Joseph Myers <joseph@codesourcery.com> [BZ #21314] * sysdeps/ieee754/dbl-64/s_log1p.c: Include <libc-diag.h>. (__log1p): Disable -Wmaybe-uninitialized for -Os around computation using c. * sysdeps/ieee754/flt-32/s_log1pf.c: Include <libc-diag.h>. (__log1pf): Disable -Wmaybe-uninitialized for -Os around computation using c. |
||
Joseph Myers
|
b303185df9 |
Fix ldbl-128ibm log1pl (-qNaN) spurious "invalid" exception (bug 22693).
The ldbl-128ibm implementation of log1pl does ordered comparisons on a negative qNaN argument, resulting in spurious "invalid" exceptions (for soft-float powerpc; hard-float only avoids this because of GCC bug 58684 meaning ordered comparison instructions never get generated). This patch fixes this by arranging for the test for NaN or infinity arguments to handle negative arguments as well. Tested for powerpc (soft float). [BZ #22693] * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Handle negative arguments in test for NaN or infinity argument. |
||
Joseph Myers
|
1272748886 |
Fix ldbl-128ibm lrintl, lroundl missing "invalid" exceptions (bug 22690).
The ldbl-128ibm implementations of lrintl and lroundl are missing "invalid" exceptions for certain overflow cases when compiled with GCC 8. The cause of this is after-the-fact integer overflow checks that fail when the compiler optimizes on the basis of integer overflow being undefined; GCC 8 must be able to detect new cases of undefinedness here. Failure: lrint (-0x80000001p0): Exception "Invalid operation" not set Failure: lrint_downward (-0x80000001p0): Exception "Invalid operation" not set Failure: lrint_towardzero (-0x80000001p0): Exception "Invalid operation" not set Failure: lrint_upward (-0x80000001p0): Exception "Invalid operation" not set Failure: lround (-0x80000001p0): Exception "Invalid operation" not set Failure: lround_downward (-0x80000001p0): Exception "Invalid operation" not set Failure: lround_towardzero (-0x80000001p0): Exception "Invalid operation" not set Failure: lround_upward (-0x80000001p0): Exception "Invalid operation" not set (Tested that these failures occur before the patch for powerpc soft-float, but the issue applies in principle for hard-float as well, whether or not the particular optimizations in fact occur there at present.) This patch fixes the bug by ensuring the additions / subtractions in question cast arguments to unsigned long int, or use 1UL as a constant argument, so that the arithmetic occurs in an unsigned type with the result then converted back to a signed type. Tested for powerpc (soft-float). [BZ #22690] * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c (__lrintl): Use unsigned long int for arguments of possibly overflowing addition or subtraction. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c (__lroundl): Likewise. |
||
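The unsigned-arithmetic technique described in this commit can be sketched as follows. This is an illustrative sketch only, not the code in s_lrintl.c or s_lroundl.c; the names `add_wrapping` and `add_overflows` are hypothetical, and the conversion back to `long` relies on the implementation-defined wraparound that GCC documents.

```c
#include <limits.h>

/* Hypothetical helper, not taken from glibc.  Adds B to A in an unsigned
   type, where wraparound is well defined, then converts the result back.
   Converting an out-of-range value to long is implementation-defined
   (GCC wraps modulo 2^N), not undefined.  */
long
add_wrapping (long a, long b)
{
  return (long) ((unsigned long) a + (unsigned long) b);
}

/* Reports whether A + B would overflow, without ever performing a signed
   addition that overflows.  Because no undefined behaviour occurs, the
   compiler cannot optimize the check away.  */
int
add_overflows (long a, long b)
{
  long sum = add_wrapping (a, b);
  /* Overflow happened iff both operands share a sign that the sum lacks.  */
  return ((a >= 0) == (b >= 0)) && ((sum >= 0) != (a >= 0));
}
```

A check written as `sum = a + b; if (sum < a) ...` on signed operands is the kind of after-the-fact test the commit message warns about, since the compiler may assume the signed addition never overflows.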
Joseph Myers
|
688903eb3e |
Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise. |
||
Bernd Edlinger
|
648615e13f |
Avoid signed shift overflow in pow (bug 21309).
As noted in bug 21309, dbl-64/e_pow.c contains signed int shifts that, although the shift count is in the range [0, 31], shift bits into and beyond the sign bit and so are undefined in ISO C. Although this is defined in GNU C, this patch from the bug cleans up the code to avoid those shifts. Tested for x86_64. [BZ #21309] * sysdeps/ieee754/dbl-64/e_pow.c (checkint): Make m and n unsigned. |
||
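The effect of making the shifted operands unsigned can be sketched as below. The helper name `shift_left_u32` is hypothetical and this is not the code of `checkint` in e_pow.c; it only shows why the unsigned form is defined in ISO C.

```c
#include <stdint.h>

/* Hypothetical helper.  Shifting an unsigned value discards any bits moved
   into or past bit 31, which ISO C defines as reduction modulo 2^32.  The
   same shift on a signed int that moves a 1 into or beyond the sign bit
   is undefined in ISO C.  */
uint32_t
shift_left_u32 (uint32_t value, unsigned int count)
{
  /* Precondition assumed by this sketch: count is in the range [0, 31].  */
  return value << count;
}
```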
Joseph Myers
|
f1e005022e |
Revert exp reimplementation (causes test failures).
Revert: 2017-12-19 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86_64/fpu/libm-test-ulps: Update. 2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com> * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise. |
||
Patrick McGehearty
|
6fd0a3c6a8 |
Improve __ieee754_exp() performance by greater than 5x on sparc/x86.
These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. Performance gains are typically around 5x when measured on Sparc s7 for common values between exp(1) and exp(40). Using the glibc perf tests on sparc and x86 (times in nsec):

        sparc old   sparc new   x86 old   x86 new
max     17629       395         5173      144
min     399         54          15        13
mean    5317        200         1349      23

The extreme max times for the old (ieee754) exp are due to the multiprecision computation in the old algorithm when the true value is very near 0.5 ulp away from a value representable in double precision. The new algorithm does not take special measures for those cases. The current glibc exp perf tests overrepresent those values. Informal testing suggests approximately one in 200 cases might invoke the high cost computation. The performance advantage of the new algorithm for other values is still large but not as large as indicated by the chart above.

Glibc correctness tests for exp() and expf() were run. Within the test suite 3 input values were found to cause 1 bit differences (ulp) when "FE_TONEAREST" rounding mode is set. No differences in exp() were seen for the tested values for the other rounding modes. Typical example: exp(-0x1.760cd2p+0) (-1.46113312244415283203125)

new code: 2.31973271630014299393707e-01 0x1.db14cd799387ap-3
old code: 2.31973271630014271638132e-01 0x1.db14cd7993879p-3
exp = 2.31973271630014285508337 (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp

In addition, because ieee754_exp() is used by other routines, cexp() showed test results with very small imaginary input values where the imaginary portion of the result was off by 3 ulp when in upward rounding mode, but not in the other rounding modes. For x86, tgamma showed a few values where the ulp increased to 6 (max ulp for tgamma is 5). Sparc tgamma did not show these failures.
I presume the tgamma differences are due to compiler optimization differences within the gamma function. The gamma function is known to be difficult to compute accurately. * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise. |
||
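The "1 bit difference" quoted in this commit for exp(-0x1.760cd2p+0) can be checked from the two hexadecimal results it lists. The sketch below is not part of glibc's test suite; `ulp_distance` is a hypothetical helper, and it assumes IEEE 754 binary64 doubles that are finite and of the same sign.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper.  For two finite doubles of the same sign, adjacent
   representable values have adjacent bit patterns, so the difference of
   the patterns is the distance in units in the last place.  */
int64_t
ulp_distance (double a, double b)
{
  int64_t ia, ib;
  /* memcpy avoids the aliasing problems of a pointer cast.  */
  memcpy (&ia, &a, sizeof ia);
  memcpy (&ib, &b, sizeof ib);
  return ia > ib ? ia - ib : ib - ia;
}
```

Applied to the values from the message, `ulp_distance (0x1.db14cd799387ap-3, 0x1.db14cd7993879p-3)` gives a distance of one ulp, consistent with the old result being 0.49 ulp below and the new result 0.51 ulp above the exact value.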
Rajalakshmi Srinivasaraghavan
|
984ae9967b |
New generic sincosf
This implementation is based on generic s_sinf.c and s_cosf.c. Tested on s390x, powerpc64le and powerpc32. |
||
Joseph Myers
|
6f7c009282 |
Add sysdeps/ieee754/soft-fp.
The default sysdeps/ieee754 fma implementations rely on exceptions and rounding modes to achieve correct results through internal use of round-to-odd. Thus, glibc configurations without support for exceptions and rounding modes instead need to use implementations of fma based on soft-fp. At present, this is achieved via having implementation files in soft-fp/ that are #included by sysdeps files for each glibc configuration that needs them. In general this means such a configuration has its own s_fma.c and s_fmaf.c. TS 18661-1 adds functions that do an operation (+ - * / sqrt fma) on arguments wider than the return type, with a single rounding of the infinite-precision result to that return type. These are also naturally implemented using round-to-odd on platforms with hardware support for rounding modes and exceptions but lacking hardware support for these narrowing operations themselves. (Platforms that have direct hardware support for such narrowing operations include at least ia64, and Power ISA 2.07 or later, which I think means POWER8 or later.) So adding the remaining TS 18661-1 functions would mean at least six narrowing function implementations (fadd fsub fmul fdiv ffma fsqrt), with aliases for other types and further implementations in some configurations, that need to be overridden for configurations lacking hardware exceptions and rounding modes. Requiring all such configurations (currently seven of them) to have their own source files for all those functions seems undesirable. Thus, this patch adds a directory sysdeps/ieee754/soft-fp to contain libm function implementations based on soft-fp. This directory is then used via Implies from all the configurations that need it, so no more files need adding to every such configuration when adding more functions with soft-fp implementations. 
A configuration can still selectively #include a particular file from this directory if desired; thus, the MIPS #include of the fmal implementation is retained, since that's appropriate even for hard float (because long double is always implemented in software for MIPS64, so the soft-fp implementation of fmal is better than the ldbl-128 one). This also provides additional motivation for my recent patch removing --with-fp / --without-fp: previously there was no need for correct use of --without-fp for no-FPU ARM or SH3, and now that we have autodetection, nofpu/ sysdeps directories can be used by this patch for those configurations without imposing any new requirements on how glibc is configured. (The mips64/*/fpu/s_fma.c files added by this patch are needed to keep the dbl-64 version of fma for double, rather than the ldbl-128 one, used in that case.) Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * soft-fp/fmadf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fma.c: ... here. * soft-fp/fmasf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fmaf.c: ... here. * soft-fp/fmatf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fmal.c: ... here. * sysdeps/ieee754/soft-fp/Makefile: New file. * sysdeps/arm/preconfigure.ac: Define with_fp_cond. * sysdeps/arm/preconfigure: Regenerated. * sysdeps/arm/nofpu/Implies: New file. * sysdeps/arm/s_fma.c: Remove file. * sysdeps/arm/s_fmaf.c: Likewise. * sysdeps/m68k/coldfire/nofpu/Implies: New file. * sysdeps/m68k/coldfire/nofpu/s_fma.c: Remove file. * sysdeps/m68k/coldfire/nofpu/s_fmaf.c: Likewise. * sysdeps/microblaze/Implies: Add ieee754/soft-fp. * sysdeps/microblaze/s_fma.c: Remove file. * sysdeps/microblaze/s_fmaf.c: Likewise. * sysdeps/mips/mips32/nofpu/Implies: New file. * sysdeps/mips/mips64/n32/fpu/s_fma.c: Likewise. * sysdeps/mips/mips64/n32/nofpu/Implies: Likewise. * sysdeps/mips/mips64/n64/fpu/s_fma.c: Likewise. * sysdeps/mips/mips64/n64/nofpu/Implies: Likewise. 
* sysdeps/mips/ieee754/s_fma.c: Remove file. * sysdeps/mips/ieee754/s_fmaf.c: Likewise. * sysdeps/mips/ieee754/s_fmal.c: Update include for move of fmal implementation. * sysdeps/nios2/Implies: Add ieee754/soft-fp. * sysdeps/nios2/s_fma.c: Remove file. * sysdeps/nios2/s_fmaf.c: Likewise. * sysdeps/sh/nofpu/Implies: New file. * sysdeps/sh/s_fma.c: Remove file. * sysdeps/sh/s_fmaf.c: Likewise. * sysdeps/tile/Implies: Add ieee754/soft-fp. * sysdeps/tile/s_fma.c: Remove file. * sysdeps/tile/s_fmaf.c: Likewise. |
||
Paul Clarke
|
f4b2aea6e1 |
New generic cosf
The same logic used in s_cosf.S version for x86 and powerpc is used to create a generic s_cosf.c, so there is no performance improvement in x86_64 and powerpc64. * sysdeps/ieee754/flt-32/s_cosf.c: New implementation. |
||
Joseph Myers
|
1dbe6f64ab |
Don't make local variables static in ldbl-96 j1l.
The ldbl-96 implementation of j1l has some function-local variables that are declared static for no apparent reason (this dates back to the first addition of that file). Any vaguely recent compiler, probably including any that are supported for building glibc, optimizes away the "static" here, as the values of the variables on entry to the function are dead. So there is not actually a user-visible bug here at present (but with any compilers that didn't optimize away the static at all, possibly building with less or no optimization, so that the function stored intermediate values to and then loaded them from the variables, there would have been a thread-safety issue). But the "static" clearly doesn't belong there and might potentially make things unsafe were compilation without optimization to be supported in future, so this patch removes it. Tested for x86_64. * sysdeps/ieee754/ldbl-96/e_j1l.c (qone): Don't make local variables static. |
||
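The thread-safety concern behind this commit can be illustrated with a small function whose temporaries are plain automatic variables. This is a hypothetical example, not the body of `qone` in e_j1l.c; the coefficients are made up for illustration.

```c
/* Hypothetical rational evaluation.  The temporaries z, r and s are
   automatic, so every call (and every thread) gets its own copies.
   Declaring them "static" would make all callers share one set of
   objects; if the compiler actually stored to and reloaded from them,
   concurrent calls could observe each other's intermediate values.  */
double
demo_ratio (double x)
{
  double z, r, s;
  z = 1.0 / (x * x);
  r = z * (1.0 + z * 2.0);
  s = 1.0 + z * 3.0;
  return r / s;
}
```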
Joseph Myers
|
53994f1263 |
Make some ldbl-128, ldbl-128ibm arrays const.
I noticed that an x86_64 build of libm unexpectedly contained more non-constant data than an older version (before _Float128 support) did. The problem is non-const arrays in the ldbl-128 j0l and j1l implementations; this patch makes those arrays, and the corresponding ldbl-128ibm ones, const. Tested for x86_64, and tested compilation for powerpc with build-many-glibcs.py. * sysdeps/ieee754/ldbl-128/e_j0l.c (Y0_2N): Make const. (Y0_2D): Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c (Y0_2N): Likewise. (Y0_2D): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j0l.c (Y0_2N): Likewise. (Y0_2D): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c (Y0_2N): Likewise. (Y0_2D): Likewise. |
||
Adhemerval Zanella
|
94d80dfc73 |
math: Use sign as double for reduced case in sinf
This patch avoids an extra floating-point to integer conversion in the reduced internal function for generic sinf by defining the sign as double instead of integer. There is not much difference on Haswell with GCC 7.2.1 (before / after): min 9.11 / 9.108, mean 21.982 / 21.9224. However H.J. Lu reported gains on Skylake: Before: "sinf": { "": { "duration": 3.4044e+10, "iterations": 1.9942e+09, "max": 141.106, "min": 7.704, "mean": 17.0715 } } After: "sinf": { "": { "duration": 3.40665e+10, "iterations": 2.03199e+09, "max": 95.994, "min": 7.704, "mean": 16.765 } } Checked on x86_64-linux-gnu. * sysdeps/ieee754/flt-32/s_sinf.c (ones): Define as double. (reduced): Use ones as double instead of integer. |
||
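The idea of storing the sign as a double can be sketched as follows. The table name `demo_ones`, the function `apply_sign`, and the indexing by the low bit of `n` are assumptions made for illustration; the actual layout in s_sinf.c is not shown in this log.

```c
/* Hypothetical sign table held as double.  With an integer table the
   selected sign would first have to be converted to floating point
   before the multiplication; a double table avoids that conversion.  */
static const double demo_ones[] = { 1.0, -1.0 };

/* Applies the sign selected by the low bit of N to VALUE.  */
double
apply_sign (double value, unsigned int n)
{
  return demo_ones[n & 1] * value;
}
```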
Szabolcs Nagy
|
00d54af7c8 |
[PATCH] fix sinf(NAN)
sinf(NAN) should not signal the invalid floating-point exception, so use isless instead of < where NAN is compared. This makes the sinf tests pass on aarch64. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Use isless. |
||
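The distinction this fix relies on can be sketched with the standard `isless` macro from `<math.h>`. The function name `below_bound_quiet` is hypothetical and the sketch is not the code in s_sinf.c.

```c
#include <math.h>

/* Hypothetical range check.  isless is a quiet comparison: when either
   operand is a quiet NaN it yields 0 without raising the "invalid"
   exception.  The < operator is a signaling comparison and may raise
   "invalid" for the same operands.  */
int
below_bound_quiet (float x, float bound)
{
  return isless (x, bound) != 0;
}
```

For a NaN argument the function returns 0, so a caller falls through to whatever path handles NaN and infinity, with no exception raised on the way.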
Joseph Myers
|
f2d64d621e |
Support _Float64, _Float32x in libm_alias_double.
This patch makes the libm_alias_double macros support creating _Float64 and _Float32x aliases, in preparation for enabling glibc support for those types. Tested for x86_64; also tested with build-many-glibcs.py in conjunction with other _Float64 / _Float32x changes. * sysdeps/generic/libm-alias-double.h: Include <bits/floatn.h>. (libm_alias_double_other_r_f64): New macro. (libm_alias_double_other_r_f32x): Likewise. (libm_alias_double_other_r): Use libm_alias_double_other_r_f64 and libm_alias_double_other_r_f32x. (libm_alias_double_r): Use semicolon before call to libm_alias_double_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-double.h: Include <bits/floatn.h>. (libm_alias_double_other_r_f64): New macro. (libm_alias_double_other_r_f32x): Likewise. (libm_alias_double_other_r): Use libm_alias_double_other_r_f64 and libm_alias_double_other_r_f32x. |
||
H.J. Lu
|
91c318e7b9 |
s_sinf.c: Replace floor with simple casts
Since s_sinf.c either assigns the return value of floor to integer or passes double converted from integer to floor, this patch replaces floor with simple casts. Also since long == int for 32-bit targets, we can use int instead of long to avoid 64-bit integers on 64-bit targets. On Skylake, bench-sinf reports performance improvement: Before After Improvement max 130.566 129.564 30% min 7.704 7.706 0% mean 21.8188 19.1363 30% * sysdeps/ieee754/flt-32/s_sinf.c (reduced): Replace long with int. (SINF_FUNC): Likewise. Replace floor with simple casts. |
||
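Why a cast can stand in for floor here can be sketched as below, under the assumption (not stated in this log) that the value being converted is non-negative and fits in `int`. The name `floor_nonneg` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper.  A cast to int truncates toward zero, which equals
   floor (x) for any non-negative X whose integer part fits in int.  For
   negative non-integers the two differ, so the precondition matters.  */
int
floor_nonneg (double x)
{
  assert (x >= 0.0 && x < 2147483648.0);
  return (int) x;
}
```

In the other direction, a double produced by converting an `int` is already an integer, so calling floor on it returns it unchanged and the call can be dropped.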
Joseph Myers
|
73895b499b |
Use __floor not floor in sinf.
The new sinf implementation introduced localplt failures for all platforms where the compiler did not inline the calls to floor (converted to trunc by machine-independent optimizations). This patch changes the calls to use __floor as normal in libm. We can't use the public function names floor / floorf / floorl / floorf128 in libm code in the absence of appropriate asms to redirect floor/trunc calls, if not inlined, to use the internal names instead (while avoiding breaking code building the floor functions themselves) - while having such asms and then calling the public functions unconditionally would be desirable for optimization (few architectures have __floor inlines in math_private.h, and once the built-in function is used you don't need them), using __floor is the minimum safe fix for the present test regressions. Tested with build-many-glibcs.py that this fixes the localplt test failure for arm-linux-gnueabi. * sysdeps/ieee754/flt-32/s_sinf.c (SINF_FUNC): Use __floor instead of floor. |
||
Rajalakshmi Srinivasaraghavan
|
7863a71181 |
New generic sinf
This implementation is based on optimized sinf assembly versions of x86_64 and powerpc. |
||
Joseph Myers
|
d812486444 |
Support ldbl-opt libm_alias_double use from .S files.
This patch makes the ldbl-opt libm_alias_double implementation support use from .S sources, by adding a semicolon after its use of weak_alias. Tested (compilation only) with build-many-glibcs.py for alpha-linux-gnu, in conjunction with a patch introducing uses of libm_alias_double in alpha .S files. * sysdeps/ieee754/ldbl-opt/libm-alias-double.h (libm_alias_double_r): Add semicolon after weak_alias call. |
||
Joseph Myers
|
a23aa5b727 |
Add _Float64x function aliases.
This patch continues filling out TS 18661-3 support by adding *f64x function aliases on platforms with _Float64x support. (It so happens the set of such platforms is exactly the same as the set of platforms with _Float128 support, although on x86_64, x86 and ia64 the _Float64x format is Intel extended rather than binary128.) The API provided corresponds exactly to that provided for _Float128, mostly coming from TS 18661-3. As these functions always alias those for another type (long double, _Float128 or both), __* function names are not provided, as in other cases of alias types. Given the preparation done in previous patches, this one just enables the feature via Makeconfig and bits/floatn.h, adds symbol versions, and updates documentation and ABI baselines. The symbol versions are present unconditionally as GLIBC_2.27 in the relevant Versions files, as it's OK for those to specify versions for functions that may not be present in some configurations; no additional complexity is needed unless in future some configuration gains support for this type that didn't have such support in 2.27. The Makeconfig additions for ia64 and x86 aren't strictly needed, as those configurations also get float64x-alias-fcts definitions from sysdeps/ieee754/float128/Makeconfig, but still seem appropriate given that _Float64x is not _Float128 for those configurations. A libm-test-ulps update for x86 is included. This is because bits/mathinline.h does not have _Float64x support added and for two functions the use of out-of-line functions results in increased ulps (ifloat64x shares ulps with ildouble / ifloat128 as appropriate). Given that we'd like generally to eliminate bits/mathinline.h optimizations, preferring to have such optimizations in GCC instead, it seems reasonable not to add such support there for new types. 
GCC support for _FloatN / _FloatNx built-in functions is limited, but has been improved in GCC 8, and at some point I hope the full set of libm built-in functions in GCC, and other optimizations with per-floating-type aspects, will be enabled for all _FloatN / _FloatNx types. Tested for x86_64 and x86, and with build-many-glibcs.py, with both GCC 6 and GCC 7. * sysdeps/ia64/Makeconfig (float64x-alias-fcts): New variable. * sysdeps/ieee754/float128/Makeconfig (float64x-alias-fcts): Likewise. * sysdeps/ieee754/ldbl-128/Makeconfig (float64x-alias-fcts): Likewise. * sysdeps/x86/Makeconfig: New file. * bits/floatn-common.h (__HAVE_FLOAT64X): Remove macro. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * bits/floatn.h (__HAVE_FLOAT64X): New macro. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/ia64/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/mips/ieee754/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/powerpc/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/x86/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * manual/math.texi (Mathematics): Document support for _Float64x. * math/Versions (GLIBC_2.27): Add _Float64x functions. * stdlib/Versions (GLIBC_2.27): Likewise. * wcsmbs/Versions (GLIBC_2.27): Likewise. * sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. 
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. |
||
Joseph Myers
|
de61465c04 |
Use libm_alias_float128 more in sysdeps/ieee754/float128.
This patch uses libm_alias_float128 in place of weak_alias more in sysdeps/ieee754/float128, in preparation for defining _Float64x aliases when appropriate. Tested for x86_64, and for powerpc64le (compilation only) with build-many-glibcs.py in conjunction with _Float64x support patches. * sysdeps/ieee754/float128/s_fromfpf128.c (fromfpf128): Define using libm_alias_float128. * sysdeps/ieee754/float128/s_fromfpxf128.c (fromfpxf128): Likewise. * sysdeps/ieee754/float128/s_setpayloadf128.c (setpayloadf128): Likewise. * sysdeps/ieee754/float128/s_setpayloadsigf128.c (setpayloadsigf128): Likewise. * sysdeps/ieee754/float128/s_ufromfpf128.c (ufromfpf128): Likewise. * sysdeps/ieee754/float128/s_ufromfpxf128.c (ufromfpxf128): Likewise. |
||
Joseph Myers
|
6e70d156c7 |
Support _Float64x in libm_alias macros.
This patch adds support for libm_alias_ldouble and libm_alias_float128 to create *f64x function aliases when appropriate. Making such aliases work for functions defined in assembly sources requires adding some semicolons after weak_alias calls in alias macro definitions. For C, semicolons are already present in the macros called when required, but a GNU C extension allows excess semicolons at file scope in a source file (and glibc already uses this), so it is OK to have extra semicolons present in the macro definitions. For assembly sources, making multiple alias macro calls from a single macro expansion means there are no newlines between the calls, so an explicit separator is needed. If hppa were to have .S sources in libm, a more complicated approach would be needed that used ASM_LINE_SEP when building assembly sources but not for C, but right now there are no such sources so just using a semicolon (as already present unconditionally in some such macro expansions) suffices. Tested for x86_64, including in conjunction with _Float64x support patches. * sysdeps/generic/libm-alias-float128.h: Include <bits/floatn.h>. (libm_alias_float128_other_r): If [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE], define f64x alias. (libm_alias_float128_r): Add semicolon after weak_alias call. * sysdeps/generic/libm-alias-ldouble.h (libm_alias_ldouble_other_r_f128): New macro. (libm_alias_ldouble_other_r_f64x): Likewise. (libm_alias_ldouble_other_r): Use libm_alias_ldouble_other_r_f128 and libm_alias_ldouble_other_r_f64x. (libm_alias_ldouble_r): Add semicolon after weak_alias call. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h (libm_alias_ldouble_other_r_f128): New macro. (libm_alias_ldouble_other_r_f64x): Likewise. (libm_alias_ldouble_other_r): Use libm_alias_ldouble_other_r_f128 and libm_alias_ldouble_other_r_f64x. |
||
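The point about separators between alias macro calls can be sketched in C as follows. The macro `demo_weak_alias` and the function names are hypothetical stand-ins, not glibc's `weak_alias` or `libm_alias_ldouble`; the sketch assumes GCC or Clang on an ELF target, where the `alias` attribute is available.

```c
/* Hypothetical alias macro.  The trailing semicolon is part of the
   expansion, so several calls can follow one another on a single line
   (as happens when they all come from one outer macro expansion).  */
#define demo_weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));

double
demo_square (double x)
{
  return x * x;
}

/* Two aliases produced with no newline and no explicit separator
   between the calls; each expansion ends its own declaration.  */
demo_weak_alias (demo_square, demo_squarel) demo_weak_alias (demo_square, demo_squaref64x)
```

In an assembly source the same concern applies to the statement separator, which is why the commit message singles out hppa and `ASM_LINE_SEP` as a case that would need a different approach.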
Joseph Myers
|
df2806cdb5 |
Support strfromf64x alias.
This patch adds support for defining strfromf64x as a function alias (of strfroml or strfromf128, as appropriate) when _Float64x is supported. Tested for x86_64, including in conjunction with _Float64x support patches, and also tested build for other configurations (in conjunction with _Float64x support patches) with build-many-glibcs.py to cover the various different files needing updating to define these aliases. * stdlib/strfroml.c: Always include <stdlib.h>. [__HAVE_FLOAT64X_LONG_DOUBLE] (strfromf64x): Define and later undefine as macro and define as weak alias. * sysdeps/ieee754/float128/strfromf128.c: Include <bits/floatn.h>. [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE]: Include <stdlib.h>. [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE] (strfromf64x): Define and later undefine as macro and define as weak alias. |
||
Joseph Myers
|
0df4fe3557 |
Support strtof64x, wcstof64x aliases.
This patch adds support for defining strtof64x, strtof64x_l, wcstof64x and wcstof64x_l function aliases when _Float64x is supported. Tested for x86_64, including in conjunction with _Float64x support patches, and also tested build for other configurations (in conjunction with _Float64x support patches) with build-many-glibcs.py to cover the various different files needing updating to define these aliases. * stdlib/strtold.c [__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/float128/strtof128.c: Include <bits/floatn.h>. [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/float128/strtof128_l.c [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-128/strtold_l.c [__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-64-128/strtold_l.c [__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. 
* sysdeps/ieee754/ldbl-96/strtold_l.c [__HAVE_FLOAT64X_LONG_DOUBLE] (strtof64x_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT64X_LONG_DOUBLE] (wcstof64x_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. |
||
Joseph Myers
|
015c6dc288 |
Support bits/floatn.h inclusion from .S files.
Further _FloatN / _FloatNx type alias support will involve making architecture-specific .S files use the common macros for libm function aliases. Making them use those macros will also serve to simplify existing code for aliases / symbol versions in various cases, similar to such simplifications for ldbl-opt code. The libm-alias-*.h files sometimes need to include <bits/floatn.h> to determine which aliases they should define. At present, this does not work for inclusion from .S files because <bits/floatn.h> can define typedefs for old compilers. This patch changes all the <bits/floatn.h> and <bits/floatn-common.h> headers to include __ASSEMBLER__ conditionals. Those conditionals disable everything related to C syntax in the __ASSEMBLER__ case, not just the problem typedefs, as that seemed cleanest. The __HAVE_* definitions remain in the __ASSEMBLER__ case, as those provide information that is required to define the correct set of aliases. Tested with build-many-glibcs.py for a representative set of configurations (x86_64-linux-gnu i686-linux-gnu ia64-linux-gnu powerpc64le-linux-gnu mips64-linux-gnu-n64 sparc64-linux-gnu) with GCC 6. Also tested with GCC 6 for i686-linux-gnu in conjunction with changes to use alias macros in .S files. * bits/floatn-common.h [!__ASSEMBLER__]: Disable everything related to C syntax instead of availability and properties of types. * bits/floatn.h [!__ASSEMBLER__]: Likewise. * sysdeps/ia64/bits/floatn.h [!__ASSEMBLER__]: Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h [!__ASSEMBLER__]: Likewise. * sysdeps/mips/ieee754/bits/floatn.h [!__ASSEMBLER__]: Likewise. * sysdeps/powerpc/bits/floatn.h [!__ASSEMBLER__]: Likewise. * sysdeps/x86/bits/floatn.h [!__ASSEMBLER__]: Likewise. |
||
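The shape of such a conditional can be sketched with a small header fragment. `DEMO_HAVE_FLOAT64X` and `demo_float64x_t` are hypothetical names chosen for illustration and do not reproduce the contents of bits/floatn.h; `__ASSEMBLER__` is the macro GCC predefines when preprocessing assembly sources.

```c
/* Hypothetical header fragment.  The feature macro stays visible in every
   case, because both C and .S sources need it to decide which aliases to
   define.  */
#define DEMO_HAVE_FLOAT64X 1

#ifndef __ASSEMBLER__
/* C-only declarations: skipped when the header is included from a .S
   file, where a typedef would be a syntax error for the assembler.  */
typedef long double demo_float64x_t;
#endif
```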
Joseph Myers
|
797ba44ba2 |
Add bits/floatn.h defines for more _FloatN / _FloatNx types.
The bits/floatn.h header currently only has defines relating to _Float128. This patch adds defines relating to other _FloatN / _FloatNx types. The approach taken is to add defines for all _FloatN / _FloatNx types known to GCC, and to put them in a common bits/floatn-common.h header included at the end of all the individual bits/floatn.h headers. If in future some defines become different for different glibc configurations, they will move out into the separate bits/floatn.h headers. Some defines are expected always to be the same across glibc ports. Corresponding defines are nevertheless put in this header. The intent is that where there are conditionals (in headers or in non-installed files) that can just repeat the same or nearly the same logic for each floating-point type, they should do so, even if in fact the cases for some types could be unconditionally present or absent because the same conditionals are true or false for all glibc configurations. This should make the glibc code with such conditionals easier to read, because the reader can just see that the same conditionals are repeated for each type, rather than seeing different conditionals for different types and needing to reason, at each location with such differences, why those differences are indeed correct there. (Cases involving per-format rather than per-type logic are more likely still to need differences in how they handle different types.) Having such defines and conditionals also helps in incremental preparation for adding _Float32 / _Float64 / _Float32x / _Float64x function aliases. I intend subsequent patches to add such conditionals corresponding to those already present for _Float128, as well as making more architecture-specific function implementations use common macros to define aliases in preparation for adding such _FloatN / _FloatNx aliases. Tested for x86_64. * bits/floatn-common.h: New file. * math/Makefile (headers): Add bits/floatn-common.h. 
* bits/floatn.h: Include <bits/floatn-common.h>. * sysdeps/ia64/bits/floatn.h: Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise. * sysdeps/mips/ieee754/bits/floatn.h: Likewise. * sysdeps/powerpc/bits/floatn.h: Likewise. * sysdeps/x86/bits/floatn.h: Likewise. |
||
Joseph Myers
|
81325b12b1 |
Add _Float128 function aliases.
This patch adds support for *f128 function aliases on platforms where long double has the binary128 format (and thus GCC 7 provides the _Float128 type with the same ABI as long double but as a distinct type in terms of C type compatibility). This is the same API as provided in glibc 2.26 for powerpc64le / x86_64 / x86 / ia64 where _Float128 has a different format from long double, with the bulk of the API coming from TS 18661-3. All the functions alias the corresponding long double functions, and __* function names are not provided since those are only needed once for each floating-point format, not more than once for different types with the same format (so for example, -ffinite-math-only maps foof128 to __fool_finite, while type-generic macros end up calling e.g. __issignalingl for _Float128 arguments on such platforms). The preparation for this feature was done in previous patches, so this one just needs to add the relevant makefile and header definitions, and update macro definitions of libm_alias_ldouble_other_r, to turn on the feature, and update documentation and ABI baselines. Tested (a) for x86_64, (b) for aarch64, (c) with build-many-glibcs.py with both GCC 6 and GCC 7. * sysdeps/ieee754/ldbl-128/Makeconfig: New file. * sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise. * sysdeps/ieee754/ldbl-128/float128-abi.h: Likewise. * sysdeps/generic/libm-alias-ldouble.h: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (libm_alias_ldouble_other_r): Also create _Float128 alias. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (libm_alias_ldouble_other_r): Also create _Float128 alias. * manual/math.texi (Mathematics): Document additional architecture support for _Float128. * sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise. 
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. |
||
Joseph Myers
|
c38a4bfd59 |
Move some float128 symbol version definitions.
With support for _Float128 functions on platforms where that type has the same ABI as long double, as well as on platforms where it is ABI-distinct, those functions will need to be exported from glibc's shared libraries at appropriate symbol versions in each case. This patch avoids duplication of lists of symbols to export by moving the symbols other than __* to math/Versions and stdlib/Versions. There, they are conditional on <float128-abi.h> defining FLOAT128_VERSION and a default version of that header is added that does not define that macro. Enabling the float128 function aliases will then include adding a sysdeps/ieee754/ldbl-128/float128-abi.h that defines FLOAT128_VERSION to GLIBC_2.27. Symbols __* remain in sysdeps/ieee754/float128/Versions; those symbols should be present only once per floating-point format, not once per type. Note that if any platforms currently lacking support for a type with binary128 format get glibc support for such a type in future (whether only as _Float128, or also as a new long double format), and new libm functions (present for all types) have been added by then, additional macros will be needed to allow such functions to get a version of the form "GLIBC_2.28 if the platform had _Float128 support by then, or the later version at which that platform had _Float128 support added". This is not however a preexisting condition, but would have applied equally to the existing support for _Float128 as an ABI-distinct type. New all-type libm functions should just be added to the appropriate symbol version (currently GLIBC_2.27) for all types, with such special-case handling for _Float128 versions (and _Float64x as well in future) waiting until someone actually wants to add support for _Float128 to an existing platform after a release in which that platform and a post-2.26 libm function had support but that platform lacked _Float128 support. 
Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested in conjunction with the remaining changes to enable float128 aliases. * sysdeps/generic/float128-abi.h: New file. * sysdeps/ieee754/float128/Versions (FLOAT128_VERSION): Move non-__prefixed symbols to .... * math/Versions: ... here. Include <float128-abi.h>. * stdlib/Versions ... and here. Include <float128-abi.h> |
||
Joseph Myers
|
02010e79ce |
Support strtof128 etc. aliases.
This patch adds support for building strtof128, wcstof128, strtof128_l and wcstof128_l as aliases, in the case of __HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested together with changes to enable float128 aliases. * stdlib/strtold.c: Include <bits/floatn.h> [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-128/strtold_l.c [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-64-128/strtold_l.c: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. |
||
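The condition `__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128` that guards the strtof128 aliases in 02010e79ce can also be inspected from user code. The following is a minimal sketch, assuming a glibc (2.26 or later) whose `<stdlib.h>` pulls in `<bits/floatn.h>`; the helper name `demo_float128_mode` is hypothetical and not part of glibc.

```c
#include <stdlib.h>   /* assumed to include <bits/floatn.h> on glibc 2.26+ */

/* Hypothetical helper: reports how _Float128 relates to long double
   on the platform this file is compiled for.  */
const char *
demo_float128_mode (void)
{
#if !defined __HAVE_FLOAT128 || !__HAVE_FLOAT128
  return "no _Float128 support";
#elif defined __HAVE_DISTINCT_FLOAT128 && __HAVE_DISTINCT_FLOAT128
  return "distinct format: separate *f128 implementations";
#else
  return "same format as long double: *f128 names can be aliases";
#endif
}
```

Only the last branch corresponds to the case this commit handles, where strtof128 and friends are built as weak aliases of the long double functions.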
Joseph Myers
|
f8718a9e16 |
Use libm_alias_ldouble_other in ldbl-64-128/s_nextafterl.c.
This patch makes ldbl-64-128/s_nextafterl.c restore the default weak_alias definition and use libm_alias_ldouble_other (having undefined and redefined weak_alias for the include of ldbl-128/s_nextafterl.c, so the libm_alias_ldouble use in the latter file is ineffective). Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested together with changes to enable float128 aliases. * sysdeps/ieee754/ldbl-64-128/s_nextafterl.c (weak_alias): Undefine and restore default definition. Use libm_alias_ldouble_other. |
||
Joseph Myers
|
1def91b304 |
Fix ldbl-opt/w_lgamma_compatl.c libm_alias_ldouble_other usage.
Testing with changes to enable _Float128 function aliases shows that the libm_alias_ldouble_other usage in ldbl-opt/w_lgamma_compatl.c does not in fact work. Furthermore, it is unnecessary; the relevant aliases get created through w_lgammal_compat2.c. This patch removes the problem code. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. Also tested in conjunction with patches to enable _Float128 function aliases. * sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA]: Remove conditional code. |
||
Joseph Myers
|
7d25d410c2 |
Fix ldbl-opt/s_clog10l.c libm_alias_ldouble_other usage.
Testing with changes to enable _Float128 function aliases shows that the libm_alias_ldouble_other usage in ldbl-opt/s_clog10l.c does not in fact work, because __clog10l is defined with long_double_symbol rather than as a normal C alias. This patch fixes this by renaming the __clog10l__internal alias (not strictly necessary, but avoids a hack with "__clog10l_interna" / "__clog10l__interna" as first argument to libm_alias_ldouble_other) and using the renamed alias when calling libm_alias_ldouble_other. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. Also tested in conjunction with patches to enable _Float128 function aliases. * sysdeps/ieee754/ldbl-opt/s_clog10l.c (__clog10l__internal): Rename to __clog10_internal_l. (__clog10_internal_l): Define aliases using libm_alias_ldouble_other instead of using libm_alias_ldouble_other with __clog10. |
||
Joseph Myers
|
0ff64d3a18 |
Use generic alias macros in ldbl-opt.
This patch fixes ldbl-opt code to use generic libm alias macros in preparation for getting _FloatN / _FloatNx aliases where appropriate. Four functions are affected, that undefine and redefine alias macros before including the implementations they wrap in such a way that _FloatN / _FloatNx aliases would not appear. s_clog10l.c undefines and redefines declare_mgen_alias, so just needs a libm_alias_ldouble_other call added. w_exp10l_compat.c undefines and redefines weak_alias, but in fact does not need to do so, since math/w_exp10l_compat.c uses libm_alias_ldouble and does not use weak_alias other than through that, so the undefines and redefines of weak_alias are removed. w_lgamma_compatl.c and w_remainderl_compat.c are made to use libm_alias_ldouble_other in conjunction with restoring the original definition of weak_alias so this is effective. Tested with build-many-glibcs.py. Installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-opt/s_clog10l.c: Use libm_alias_ldouble_other. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (weak_alias): Do not undefine and redefine. [LIBM_SVID_COMPAT && !LONG_DOUBLE_COMPAT (libm, GLIBC_2_1)] (exp10l): Do not define here. * sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA] (weak_alias): Undefine and redefine. [BUILD_LGAMMA]: Use libm_alias_ldouble_other. * sysdeps/ieee754/ldbl-opt/w_remainderl_compat.c [LIBM_SVID_COMPAT] (weak_alias): Undefine and redefine here. [LIBM_SVID_COMPAT]: Use libm_alias_ldouble_other. |
||
Joseph Myers
|
24b6515d87 |
Add libm_alias_*_other_r macros.
Some libm functions are unable to use the generic alias macros such as libm_alias_double because they have special symbol versioning requirements for the main float, double or long double public names. To facilitate adding _FloatN / _FloatNx function aliases in future, it's still desirable to have generic macros those functions can use as far as possible. This patch adds macros such as libm_alias_double_other, which only define names for _FloatN / _FloatNx aliases, not for float / double / long double. At present, all these new macros do nothing, but they are called in the appropriate places in macros such as libm_alias_double. This patch also arranges for lgamma implementations, and the recently added optimized float function implementations, to use the new macros to make them ready for addition of _FloatN / _FloatNx aliases. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. (libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/generic/libm-alias-float.h (libm_alias_float_other_r): New macro. (libm_alias_float_other): Likewise. (libm_alias_float_r): Use libm_alias_float_other_r. * sysdeps/generic/libm-alias-float128.h (libm_alias_float128_other_r): New macro. (libm_alias_float128_other): Likewise. (libm_alias_float128_r): Use libm_alias_float128_other_r. * sysdeps/generic/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. (libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. (libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. 
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * math/w_lgamma_main.c: Include <libm-alias-double.h>. [!USE_AS_COMPAT]: Use libm_alias_double_other. * math/w_lgammaf_main.c: Include <libm-alias-float.h>. [!USE_AS_COMPAT]: Use libm_alias_float_other. * math/w_lgammal_main.c: Include <libm-alias-ldouble.h>. [!USE_AS_COMPAT]: Use libm_alias_ldouble_other. * math/w_exp2f.c: Use libm_alias_float_other. * math/w_expf.c: Likewise. * math/w_log2f.c: Likewise. * math/w_logf.c: Likewise. * math/w_powf.c: Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c: Include <libm-alias-float.h>. [!__exp2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_expf.c: Include <libm-alias-float.h>. [!__expf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_log2f.c: Include <libm-alias-float.h>. [!__log2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_logf.c: Include <libm-alias-float.h>. [!__logf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_powf.c: Include <libm-alias-float.h>. [!__powf]: Use libm_alias_float_other. |
||
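The split between a full alias macro and an "_other" macro described in 24b6515d87 can be sketched as follows. This is an illustration only, assuming GCC or Clang on an ELF target; every `demo_*` name is hypothetical, and glibc's real macros (weak_alias, libm_alias_double and so on) differ in detail.

```c
/* Hypothetical stand-in for glibc's internal weak_alias macro.
   Requires the GNU "alias" attribute and an ELF target.  */
#define demo_weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));

/* "_other" variant: the place where _FloatN / _FloatNx names would be
   defined.  As in the commit, it expands to nothing for now.  */
#define demo_alias_double_other(from, to)

/* Full variant: defines the public double name, then calls the
   "_other" variant so future aliases appear automatically.  */
#define demo_alias_double(from, to) \
  demo_weak_alias (from, to)        \
  demo_alias_double_other (from, to)

/* Implementation under an internal name.  */
double
demo_internal_twice (double x)
{
  return 2.0 * x;
}

/* Public name created through the generic macro.  */
demo_alias_double (demo_internal_twice, demo_twice)
```

A function with special versioning for its public name would define that name by hand and call only `demo_alias_double_other`, which is the situation the commit message describes for lgamma and the optimized float functions.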
Joseph Myers
|
a8dce6197a |
Use generic macros for lgamma_r function aliases.
Continuing the use of generic macros for defining libm function aliases, in preparation for adding more _FloatN / _FloatNx function names, this patch makes the lgamma_r functions use such macros. declare_mgen_alias_r becomes a standard macro in math-type-macros.h instead of being locally defined in w_lgamma_r_template.c. This in turn must be defined by each math-type-macros-<type>.h. Rather than providing an unused default in math-type-macros.h, that header is made to give an error if math-type-macros-<type>.h failed to define declare_mgen_alias or declare_mgen_alias_r. The compat lgamma_r wrappers are updated similarly. The ldbl-opt versions are removed as no longer needed. Tested for x86_64, and with build-many-glibcs.py. Installed stripped shared libraries are unchanged except for powerpc64le (where the usual issue applies that an ldbl-opt long double function previously used long_double_symbol unconditionally and now the symbol versions on powerpc64le mean weak_alias is used instead, resulting in the same symbol versions in the final shared library but still enough difference in the input objects for that library not to be byte-identical). * sysdeps/generic/math-type-macros.h [!declare_mgen_alias]: Give error. Remove default definition of declare_mgen_alias. [!declare_mgen_alias_r]: Likewise. * sysdeps/generic/math-type-macros-double.h [!declare_mgen_alias_r] (declare_mgen_alias_r): New macro. * sysdeps/generic/math-type-macros-float.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * sysdeps/generic/math-type-macros-float128.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * sysdeps/generic/math-type-macros-ldouble.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * math/w_lgamma_r_template.c (declare_mgen_alias_r_x): Remove macro. (declare_mgen_alias_r_s): Likewise. (declare_mgen_alias_r): Likewise. * math/w_lgamma_r_compat.c: Include <libm-alias-double.h>. (lgamma_r): Define using libm_alias_double_r. 
* math/w_lgammaf_r_compat.c: Include <libm-alias-float.h>. (lgammaf_r): Define using libm_alias_float_r. * math/w_lgammal_r_compat.c: Include <libm-alias-ldouble.h>. (lgammal_r): Define using libm_alias_ldouble_r. * sysdeps/ieee754/ldbl-opt/w_lgamma_r_compat.c: Remove file. * sysdeps/ieee754/ldbl-opt/w_lgammal_r_compat.c: Likewise. |
||
Joseph Myers
|
c7509db215 |
Remove ldbl-opt w_scalbln.c.
The ldbl-opt version of w_scalbln.c is not in fact needed; it handles compat symbol versions for libc, but this file isn't built for libc, only for libm. This patch removes this file. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-opt/w_scalbln.c: Remove file. |
||
Joseph Myers
|
f85a176f3f |
Use libm_alias_double in ldbl-128, ldbl-96 fma.
This patch makes the ldbl-128 and ldbl-96 implementations of fma use libm_alias_double. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/ldbl-128/s_fma.c: Include <libm-alias-double.h>. [!__fma] (fma): Define using libm_alias_double. * sysdeps/ieee754/ldbl-96/s_fma.c: Include <libm-alias-double.h>. [!__fma] (fma): Define using libm_alias_double. |
||
Joseph Myers
|
fd3b4e7c8a |
Use libm_alias_ldouble for ldbl-128 functions.
This patch makes ldbl-128 functions use libm_alias_ldouble to define function aliases. float128_private.h is updated accordingly. Most of the ldbl-64-128 wrappers are removed as no longer needed with this change (leaving those that involve versioning for functions in libc or that shouldn't be exported from libm for _Float128 / _Float64x types with the same format as long double). Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/float128/float128_private.h: Include <libm-alias-ldouble.h> and <libm-alias-float128.h>. (libm_alias_ldouble_r): Undefine and redefine. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <libm-alias-ldouble.h>. (asinhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_atanl.c: Include <libm-alias-ldouble.h>. (atanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_cbrtl.c: Include <libm-alias-ldouble.h>. (cbrtl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ceill.c: Include <libm-alias-ldouble.h>. (ceill): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_cosl.c: Include <libm-alias-ldouble.h>. (cosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_erfl.c: Include <libm-alias-ldouble.h>. (erfl): Define using libm_alias_ldouble. (erfcl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c: Include <libm-alias-ldouble.h>. (expm1l): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fabsl.c: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_floorl.c: Include <libm-alias-ldouble.h>. (floorl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fmal.c: Include <libm-alias-ldouble.h>. (fmal): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_frexpl.c: Include <libm-alias-ldouble.h>. 
(frexpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fromfpl.c (fromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fromfpl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-128/s_fromfpxl.c (fromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_getpayloadl.c: Include <libm-alias-ldouble.h>. (getpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Include <libm-alias-ldouble.h>. (llroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_logbl.c: Include <libm-alias-ldouble.h>. (logbl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Include <libm-alias-ldouble.h>. (lrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Include <libm-alias-ldouble.h>. (lroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_modfl.c: Include <libm-alias-ldouble.h>. (modfl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Include <libm-alias-ldouble.h>. (nearbyintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nextafterl.c: Include <libm-alias-ldouble.h>. (nextafterl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nextupl.c: Include <libm-alias-ldouble.h>. (nextupl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_remquol.c: Include <libm-alias-ldouble.h>. (remquol): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_rintl.c: Include <libm-alias-ldouble.h>. (rintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_roundevenl.c: Include <libm-alias-ldouble.h>. (roundevenl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_roundl.c: Include <libm-alias-ldouble.h>. (roundl): Define using libm_alias_ldouble. 
* sysdeps/ieee754/ldbl-128/s_setpayloadl.c (setpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_setpayloadl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-128/s_setpayloadsigl.c (setpayloadsigl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <libm-alias-ldouble.h>. (sincosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_sinl.c: Include <libm-alias-ldouble.h>. (sinl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <libm-alias-ldouble.h>. (tanhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_tanl.c: Include <libm-alias-ldouble.h>. (tanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <libm-alias-ldouble.h>. (totalorderl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <libm-alias-ldouble.h>. (totalordermagl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_truncl.c: Include <libm-alias-ldouble.h>. (truncl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ufromfpl.c (ufromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ufromfpxl.c (ufromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (copysignl): Define with long_double_symbol only if [IS_IN (libc)]. * sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (frexpl): Define with long_double_symbol only if [IS_IN (libc)]. * sysdeps/ieee754/ldbl-64-128/s_modfl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (modfl): Define with long_double_symbol only if [IS_IN (libc)]. 
* sysdeps/ieee754/ldbl-64-128/s_asinhl.c: Remove file. * sysdeps/ieee754/ldbl-64-128/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_cosl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_expm1l.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fabsl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_logbl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_remquol.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_sinl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_tanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_truncl.c: Likewise. |
||
Joseph Myers
|
6dff198369 |
Remove redundant ldbl-64-128 files.
Various source files in ldbl-64-128 are redundant, because they wrap files that no longer provide public symbols that need special versioning (those symbols having moved to separate errno-setting wrappers), or, in the case of w_scalblnl.c, because the type-generic template now does everything required (it deals with symbol versioning for use in libm, and this file is never built for libc anyway - the compat scalbln* symbols in libc, as opposed to scalbn*, are only for i386 and m68k and are aliases to the corresponding scalbn* symbols). This patch removes those redundant files. Tested with build-many-glibcs.py (for all ldbl-64-128 configurations) that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-64-128/e_ilogbl.c: Remove file. * sysdeps/ieee754/ldbl-64-128/s_log1pl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_scalblnl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise. |
||
Joseph Myers
|
86f9568af6 |
Use libm_alias_ldouble for ldbl-96 functions.
This patch makes ldbl-96 functions use libm_alias_ldouble to define function aliases. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <libm-alias-ldouble.h>. (asinhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Include <libm-alias-ldouble.h>. (cbrtl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_cosl.c: Include <libm-alias-ldouble.h>. (cosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_erfl.c: Include <libm-alias-ldouble.h>. (erfl): Define using libm_alias_ldouble. (erfcl): Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Include <libm-alias-ldouble.h>. (fmal): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_frexpl.c: Include <libm-alias-ldouble.h>. (frexpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_fromfpl.c (fromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_fromfpl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-96/s_fromfpxl.c (fromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_getpayloadl.c: Include <libm-alias-ldouble.h>. (getpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Include <libm-alias-ldouble.h>. (llroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Include <libm-alias-ldouble.h>. (lrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Include <libm-alias-ldouble.h>. (lroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_modfl.c: Include <libm-alias-ldouble.h>. (modfl): Define using libm_alias_ldouble. 
* sysdeps/ieee754/ldbl-96/s_nextupl.c: Include <libm-alias-ldouble.h>. (nextupl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_remquol.c: Include <libm-alias-ldouble.h>. (remquol): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_roundevenl.c: Include <libm-alias-ldouble.h>. (roundevenl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_roundl.c: Include <libm-alias-ldouble.h>. (roundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_setpayloadl.c (setpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_setpayloadl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-96/s_setpayloadsigl.c: Include <libm-alias-ldouble.h>. (setpayloadsigl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <libm-alias-ldouble.h>. (sincosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_sinl.c: Include <libm-alias-ldouble.h>. (sinl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <libm-alias-ldouble.h>. (tanhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_tanl.c: Include <libm-alias-ldouble.h>. (tanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <libm-alias-ldouble.h>. (totalorderl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <libm-alias-ldouble.h>. (totalordermagl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_ufromfpl.c (ufromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_ufromfpxl.c (ufromfpxl): Define using libm_alias_ldouble. |
||
Joseph Myers
|
7e16a5d1d1 |
Use libm_alias_double for dbl-64 fma.
This patch makes dbl-64 fma use libm_alias_double. The ldbl-opt version is removed. The sparc32 version no longer needs to handle compat symbols, while alpha needs a new wrapper to avoid getting the ldbl-128 version (where ldbl-opt is earlier in the list of sysdeps directories, so previously fma came from there). Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/dbl-64/s_fma.c: Include <libm-alias-double.h>. (fma): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_fma.c: Remove file. * sysdeps/sparc/sparc32/fpu/s_fma.c: Do not include <math_ldbl_opt.h>. (fmal): Do not define as compat symbol here. * sysdeps/alpha/fpu/s_fma.c: New file. |
||
Szabolcs Nagy
|
86c27ade1e |
[BZ #22244] Fix yn(n,0) without SVID wrapper
Without the SVID compat wrapper, yn(n,0) and ynf(n,0) do not raise the divide-by-zero exception and may return inf with the wrong sign for n < 0. [BZ #22244] * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_yn): Fix x == 0 case. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_ynf): Likewise. |
||
Szabolcs Nagy
|
8f8f8ef7ab |
[BZ #22243] fix log2(0) and log10(0) in downward rounding
On 64-bit targets, if the SVID compat wrapper is suppressed (e.g. static linking), then log2(0) and log10(0) returned inf instead of -inf. [BZ #22243] * sysdeps/ieee754/dbl-64/wordsize-64/e_log10.c (__ieee754_log10): Use fabs. * sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c (__ieee754_log2): Likewise. |
||
Joseph Myers
|
d8f619b393 |
Use libm_alias_double for dbl-64 modf.
This patch makes dbl-64 modf use libm_alias_double. Both the dbl-64 and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version is changed to define the libc compat symbol only. Because of multiarch wrappers, the changed implementations are made not to define aliases at all if __modf is defined as a macro, as with other functions, so avoiding duplicate compat symbols while allowing those wrappers to be simplified. Tested for x86_64, and verified with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/dbl-64/s_modf.c: Include <libm-alias-double.h>. (modf): Define using libm_alias_double, only if [!__modf]. * sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c: Include <libm-alias-double.h>. (modf): Define using libm_alias_double, only if [!__modf]. * sysdeps/ieee754/ldbl-opt/s_modf.c (modfl): Only define libc compat symbol here. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-ppc32.c (weak_alias): Do not undefine and redefine. (strong_alias): Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c (weak_alias): Likewise. (strong_alias): Likewise. |
||
Joseph Myers
|
4699cb8b5f |
Use libm_alias_double for dbl-64 logb.
This patch makes dbl-64 logb use libm_alias_double. Both the dbl-64 and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version is removed. Because of multiarch wrappers, the changed implementations are made not to define aliases at all if __logb is defined as a macro, as with other functions, so avoiding duplicate compat symbols while allowing those wrappers to be simplified. Tested for x86_64, and verified with build-many-glibcs.py that installed stripped shared libraries are unchanged (except on alpha where changes from using the wordsize-64 version are expected). * sysdeps/ieee754/dbl-64/s_logb.c: Include <libm-alias-double.h>. (logb): Define using libm_alias_double, only if [!__logb]. * sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Include <libm-alias-double.h>. (logb): Define using libm_alias_double, only if [!__logb]. * sysdeps/ieee754/ldbl-opt/s_logb.c: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-ppc32.c (weak_alias): Do not undefine and redefine. (strong_alias): Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c (weak_alias): Likewise. (strong_alias): Likewise. |
||
Joseph Myers
|
7f1cbdf8ed |
Use libm_alias_float for dbl-64 fmaf.
This patch makes the implementation of fmaf in the dbl-64 directory use libm_alias_float. Tested for x86_64, and verified with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/dbl-64/s_fmaf.c: Include <libm-alias-float.h>. [!__fmaf] (fmaf): Define using libm_alias_float. |
||
Joseph Myers
|
39793865ec |
Use libm_alias_double for dbl-64 frexp.
This patch makes dbl-64 frexp use libm_alias_double. Both the dbl-64 and dbl-64/wordsize-64 versions are changed; the ldbl-opt version is made to define only the libc frexpl compat symbol, now the generic code handles the libm compat symbol automatically. Tested for x86_64, and verified with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/dbl-64/s_frexp.c: Include <libm-alias-double.h>. (frexp): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c: Include <libm-alias-double.h>. (frexp): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_frexp.c (frexpl): Only define libc compat symbol here. |
||
Gabriel F. T. Gomes
|
aa0235dfde |
Add C++ versions of iscanonical for ldbl-96 and ldbl-128ibm (bug 22235)
All representations of floating-point numbers in types with IEC 60559 binary exchange format are canonical. On the other hand, types with IEC 60559 extended formats, such as those implemented under ldbl-96 and ldbl-128ibm, contain representations that are not canonical. TS 18661-1 introduced the type-generic macro iscanonical, which returns whether a floating-point value is canonical or not. In Glibc, this type-generic macro is implemented using the macro __MATH_TG, which, when support for float128 is enabled, relies on __builtin_types_compatible_p to select between floating-point types. However, this use of iscanonical breaks C++ applications, because the builtin is only available in C mode. This patch provides a C++ implementation of iscanonical that relies on function overloading, rather than builtins, to select between floating-point types. Unlike the C++ implementations for iszero and issignaling, this implementation ignores __NO_LONG_DOUBLE_MATH. The double type always matches IEC 60559 double format, which is always canonical. Thus, when double and long double are the same (__NO_LONG_DOUBLE_MATH), iscanonical always returns 1 and is not implemented with __MATH_TG. Tested for powerpc64, powerpc64le and x86_64. [BZ #22235] * math/math.h: Trivial fix for unbalanced parentheses in comment. * math/Makefile [CXX] (tests): Add test-math-iscanonical.cc. (CFLAGS-test-math-iscanonical.cc): New variable. * math/test-math-iscanonical.cc: New file. * sysdeps/ieee754/ldbl-96/bits/iscanonical.h (iscanonical): Provide a C++ implementation based on function overloading, rather than using __MATH_TG, which uses C-only builtins. * sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h (iscanonical): Likewise. * sysdeps/powerpc/powerpc64le/Makefile (CFLAGS-test-math-iscanonical.cc): New variable. |
||
Joseph Myers
|
a1132b5e56 |
Use libm_alias_double for more dbl-64 functions.
This patch makes more dbl-64 functions use libm_alias_double to define function aliases. Specifically, it makes the change for functions with dbl-64/wordsize-64 versions, changing both the dbl-64 and dbl-64/wordsize-64 versions and removing the ldbl-opt wrappers. Functions are excluded from this patch if there are complications because of versions of those functions also present in libc, or architecture-specific wrappers round these files. Tested for x86_64, and with build-many-glibcs.py. Installed stripped shared libraries are unchanged except for alpha (where increased use of dbl-64/wordsize-64 files, where previously ldbl-opt files that wrapped dbl-64 files were used, was expected to result in different, better code). * sysdeps/ieee754/dbl-64/s_ceil.c: Include <libm-alias-double.h>. (ceil): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_floor.c: Include <libm-alias-double.h>. (floor): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_llround.c: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_lround.c: Include <libm-alias-double.h>. (lround): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Include <libm-alias-double.h>. (nearbyint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_remquo.c: Include <libm-alias-double.h>. (remquo): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_rint.c: Include <libm-alias-double.h>. (rint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_round.c: Include <libm-alias-double.h>. (round): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_trunc.c: Include <libm-alias-double.h>. (trunc): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Include <libm-alias-double.h>. (ceil): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Include <libm-alias-double.h>. (floor): Define using libm_alias_double. 
* sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. [_LP64] (lround): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Include <libm-alias-double.h>. [!_LP64] (lround): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Include <libm-alias-double.h>. (nearbyint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c: Include <libm-alias-double.h>. (remquo): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Include <libm-alias-double.h>. (rint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Include <libm-alias-double.h>. (round): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Include <libm-alias-double.h>. (trunc): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_ceil.c: Remove file. * sysdeps/ieee754/ldbl-opt/s_floor.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_llround.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_lround.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_nearbyint.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_remquo.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_rint.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_round.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_trunc.c: Likewise. |
||
Joseph Myers
|
38722448c6 |
Use libm_alias_double for dbl-64 atan, tan.
This patch makes the dbl-64 atan and tan implementations use libm_alias_double, removing the corresponding ldbl-opt wrappers. Tested for x86_64, and with build-many-glibcs.py. Installed stripped shared libraries are unchanged on non-ldbl-opt platforms. For ldbl-opt configurations, the patch has the effect of causing compat_symbol to define atanl and tanl in terms of __atan and __tan instead of in terms of atan and tan, which is enough to change the installed stripped libm.so. * sysdeps/ieee754/dbl-64/s_atan.c: Include <libm-alias-double.h>. (atan): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_tan.c: Include <libm-alias-double.h>. (tan): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_atan.c: Remove file. * sysdeps/ieee754/ldbl-opt/s_tan.c: Likewise. |
||
Joseph Myers
|
527cd19c3d |
Make dbl-64 atan and tan into weak aliases.
This patch converts the dbl-64 implementations of atan and tan into weak aliases of __atan and __tan, in preparation for making them use libm_alias_double. Consequent changes are made to the x86_64 multiarch versions wrapping round them (with the dbl-64 functions, like other such functions, being made not to define their aliases at all if __atan or __tan are defined as macros by an including file). Tested for x86_64, and with build-many-glibcs.py. * sysdeps/ieee754/dbl-64/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. Do not define any aliases if [__atan]. [NO_LONG_DOUBLE] (__atanl): Define as strong alias of __atan. [NO_LONG_DOUBLE] (atanl): Define as weak alias of __atanl. * sysdeps/ieee754/dbl-64/s_tan.c (tan): Rename to __tan and define as weak alias of __tan. Do not define any aliases if [__tan]. [NO_LONG_DOUBLE] (__tanl): Define as strong alias of __tan. [NO_LONG_DOUBLE] (tanl): Define as weak alias of __tanl. * sysdeps/x86_64/fpu/multiarch/s_atan-avx.c (atan): Rename to __atan. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma4.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. * sysdeps/x86_64/fpu/multiarch/s_tan-avx.c (tan): Rename to __tan. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma4.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c (tan): Rename to __tan and define as weak alias of __tan. |
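The renaming described in this commit follows a common pattern: the code is defined under an internal name and the public name is attached to it as a weak alias. A minimal sketch of that pattern, assuming GCC or Clang on an ELF target, could look like the following. The names SKETCH_WEAK_ALIAS, impl_halve and halve are hypothetical stand-ins for illustration and are not glibc identifiers.

```c
/* Sketch of the weak-alias pattern (GCC/Clang, ELF targets only).
   SKETCH_WEAK_ALIAS, impl_halve and halve are hypothetical names.  */
#define SKETCH_WEAK_ALIAS(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)))

/* The implementation lives under an internal name, in the way the
   commit moves atan to __atan.  */
double
impl_halve (double x)
{
  return x / 2.0;
}

/* The public name is a weak alias for the same code, so another
   definition of halve elsewhere could take precedence at link time.  */
SKETCH_WEAK_ALIAS (impl_halve, halve);

/* Returns 1 when both names resolve to the same function.  */
int
alias_matches (void)
{
  return halve == impl_halve;
}
```

Calling halve (8.0) and impl_halve (8.0) runs the same machine code; only the symbol binding differs.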
||
Szabolcs Nagy
|
bd4430c2a6 |
Do not wrap logf, log2f and powf
The new generic logf, log2f and powf code no longer needs wrappers because it sets errno inline, so the wrappers are used only on targets that need them. * sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise. * sysdeps/ieee754/flt-32/w_log2f.c: New file. * sysdeps/ieee754/flt-32/w_logf.c: New file. * sysdeps/ieee754/flt-32/w_powf.c: New file. * sysdeps/i386/fpu/w_log2f.c: New file. * sysdeps/i386/fpu/w_logf.c: New file. * sysdeps/i386/fpu/w_powf.c: New file. * sysdeps/m68k/m680x0/fpu/w_log2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_logf.c: New file. * sysdeps/m68k/m680x0/fpu/w_powf.c: New file. |
||
Szabolcs Nagy
|
f7a0b063e7 |
Do not wrap expf and exp2f
The new generic expf and exp2f code no longer needs wrappers because it sets errno inline, so the wrappers are used only on targets that need them. (If the wrapper is needed, the top-level wrapper code is included; otherwise an empty w_exp*f.c is used to suppress the wrapper.) A powerpc64 expf implementation includes the expf C code directly, which needed some changes. * sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise. * sysdeps/ieee754/flt-32/w_exp2f.c: New file. * sysdeps/ieee754/flt-32/w_expf.c: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for the new expf code. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_expf.c: New file. * sysdeps/i386/fpu/w_exp2f.c: New file. * sysdeps/i386/fpu/w_expf.c: New file. * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file. * sysdeps/x86_64/fpu/w_expf.c: New file. |
||
H.J. Lu
|
c26dd7c600 |
Mark ____wcsto*_l_internal functions with attribute_hidden [BZ #18822]
Mark ____wcsto*_l_internal functions with attribute_hidden to allow direct access to them within libc.so and libc.a without using GOT nor PLT. [BZ #18822] * include/wchar.h (____wcstof_l_internal): New prototype. (____wcstod_l_internal): Likewise. (____wcstold_l_internal): Likewise. (____wcstol_l_internal): Likewise. (____wcstoul_l_internal): Likewise. (____wcstoll_l_internal): Likewise. (____wcstoull_l_internal): Likewise. (____wcstof128_l_internal): Likewise. * sysdeps/ieee754/float128/wcstof128.c (____wcstof128_l_internal): Removed. * sysdeps/ieee754/float128/wcstof128_l.c (____wcstof128_l_internal): Likewise. * wcsmbs/wcstod.c (____wcstod_l_internal): Likewise. * wcsmbs/wcstod_l.c (____wcstod_l_internal): Likewise. * wcsmbs/wcstof.c (____wcstof_l_internal): Likewise. * wcsmbs/wcstof_l.c (____wcstof_l_internal): Likewise. * wcsmbs/wcstol_l.c (____wcstol_l_internal): Likewise. * wcsmbs/wcstold.c (____wcstold_l_internal): Likewise. * wcsmbs/wcstold_l.c (____wcstold_l_internal): Likewise. * wcsmbs/wcstoll_l.c (____wcstoll_l_internal): Likewise. * wcsmbs/wcstoul_l.c (____wcstoul_l_internal): Likewise. * wcsmbs/wcstoull_l.c (____wcstoull_l_internal): Likewise. |
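As an illustration of the idea behind attribute_hidden, the sketch below marks an internal helper with hidden ELF visibility so that calls from within the same shared object can bind directly. It assumes GCC or Clang; SKETCH_HIDDEN, parse_digits_internal and parse_digits are hypothetical names chosen for the example.

```c
/* Sketch of hidden visibility for an internal helper (GCC/Clang).
   SKETCH_HIDDEN and both function names are hypothetical.  */
#define SKETCH_HIDDEN __attribute__ ((visibility ("hidden")))

/* Prototype as it might appear in an internal header: the symbol is
   not exported from the shared object, so calls to it from inside
   the object do not need to go through the GOT or PLT.  */
extern long parse_digits_internal (const char *s) SKETCH_HIDDEN;

/* Parses a run of leading decimal digits; returns 0 if there are none.  */
long
parse_digits_internal (const char *s)
{
  long value = 0;
  while (*s >= '0' && *s <= '9')
    value = value * 10 + (*s++ - '0');
  return value;
}

/* Public entry point with default visibility; it forwards to the
   hidden helper.  */
long
parse_digits (const char *s)
{
  return parse_digits_internal (s);
}
```

Placing the attribute on a single shared prototype, rather than repeating it in each source file, mirrors the move of the declarations into include/wchar.h listed above.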
||
Joseph Myers
|
1e2bffd05c |
Use libm_alias_double for some dbl-64 functions.
Continuing the move of libm aliases to common macros that can create _FloatN / _FloatNx aliases in future, this patch converts some dbl-64 functions to using libm_alias_double, thereby eliminating the need for some ldbl-opt wrappers. This patch deliberately limits which functions are converted so that it can be verified by comparison of stripped binaries. Specifically, atan and tan are excluded because they first need converting to being weak aliases; fma is omitted as it has additional complications with versions in other directories (removing the ldbl-opt version can e.g. cause the ldbl-128 version to be used instead of dbl-64); and functions that have both dbl-64/wordsize-64 and ldbl-opt versions are excluded because ldbl-opt currently always wraps dbl-64 function versions, so changing those will result in platforms using both ldbl-opt and dbl-64/wordsize-64 (i.e. alpha) starting to use the dbl-64/wordsize-64 versions of those functions (which is good, as an optimization, but still best separated from the present patch to get better validation). Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/dbl-64/s_asinh.c: Include <libm-alias-double.h>. (asinh): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_cbrt.c: Include <libm-alias-double.h>. (cbrt): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_copysign.c: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_erf.c: Include <libm-alias-double.h>. (erf): Define using libm_alias_double. (erfc): Likewise. * sysdeps/ieee754/dbl-64/s_expm1.c: Include <libm-alias-double.h>. (expm1): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_fabs.c: Include <libm-alias-double.h>. (fabs): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_fromfp.c (fromfp): Define using libm_alias_double. 
* sysdeps/ieee754/dbl-64/s_fromfp_main.c: Include <libm-alias-double.h>. * sysdeps/ieee754/dbl-64/s_fromfpx.c (fromfpx): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_getpayload.c: Include <libm-alias-double.h>. (getpayload): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_llrint.c: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_lrint.c: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_nextup.c: Include <libm-alias-double.h>. (nextup): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_roundeven.c: Include <libm-alias-double.h>. (roundeven): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_setpayload.c (setpayload): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_setpayload_main.c: Include <libm-alias-double.h>. * sysdeps/ieee754/dbl-64/s_setpayloadsig.c (setpayloadsig): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_sin.c: Include <libm-alias-double.h>. (cos): Define using libm_alias_double. (sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Include <libm-alias-double.h>. (sincos): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_tanh.c: Include <libm-alias-double.h>. (tanh): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_totalorder.c: Include <libm-alias-double.h>. (totalorder): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_totalordermag.c: Include <libm-alias-double.h>. (totalordermag): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_ufromfp.c (ufromfp): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/s_ufromfpx.c (ufromfpx): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_getpayload.c: Include <libm-alias-double.h>. (getpayload): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c: Include <libm-alias-double.h>. (roundeven): Define using libm_alias_double. 
* sysdeps/ieee754/dbl-64/wordsize-64/s_setpayload_main.c: Include <libm-alias-double.h>. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include <libm-alias-double.h>. (totalorder): Define using libm_alias_double. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include <libm-alias-double.h>. (totalordermag): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_copysign.c (copysignl): Only define libc compat symbol here. * sysdeps/ieee754/ldbl-opt/s_asinh.c: Remove file. * sysdeps/ieee754/ldbl-opt/s_cbrt.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_erf.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_expm1.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_fabs.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_llrint.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_lrint.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_sin.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_sincos.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_tanh.c: Likewise. |
||
Wilco Dijkstra
|
bd8d53bb33 |
Use fabs(f/l) rather than __fabs
A few math functions still use __fabs(f/l) rather than fabs, which means they won't be inlined. Rename them so they are inlined. Also add -fno-builtin-fabsl to nofpu powerpc makefile to work around BZ #29253. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Use fabs rather than __fabs. * sysdeps/ieee754/dbl-64/e_log10.c (__ieee754_log10): Likewise. * sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use fabsf rather than __fabsf. * sysdeps/ieee754/flt-32/e_log10f.c (__ieee754_log10f): Likewise. * sysdeps/ieee754/flt-32/e_log2f.c (__ieee754_log2f): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Use fabsl rather than __fabsl. * sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Likewise. * sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Use fabsl rather than __fabsl. * sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/powerpc/nofpu/Makefile: Add -fno-builtin-fabsl for BZ #29253. |
||
Szabolcs Nagy
|
4ea49f4c08 |
New generic powf
Without wrapper on aarch64: powf reciprocal-throughput is 4.2x faster and powf latency is 2.6x faster; worst-case error: old 1.11 ulp, new 0.82 ulp; aarch64 .text size: -780 bytes; aarch64 .rodata size: +144 bytes. powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms that are used in exp2f and log2f, except that the log2f polynomial is larger for extra precision and its output (and the exp2f input) may be scaled by a power of 2 (POWF_SCALE) to simplify the argument reduction step of exp2 (possible when an efficient round and convert-to-int operation is available). The special-case handling tries to minimize the checks in the hot path. When the input of exp2_inline is checked, integer arithmetic is used, as that was faster on the tested aarch64 cores. * math/Makefile (type-float-routines): Add e_powf_log2_data. * sysdeps/ieee754/flt-32/e_powf.c: New implementation. * sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__powf_log2_data): Define. (issignalingf_inline): Likewise. (POWF_LOG2_TABLE_BITS): Likewise. (POWF_LOG2_POLY_ORDER): Likewise. (POWF_SCALE_BITS): Likewise. (POWF_SCALE): Likewise. * sysdeps/i386/fpu/e_powf_log2_data.c: New file. * sysdeps/ia64/fpu/e_powf_log2_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file. |
||
Szabolcs Nagy
|
875c76c704 |
New generic log2f
Similar to the new logf: double-precision arithmetic and a small lookup table are used. The argument reduction step is the same as in the new logf. Without wrapper on aarch64: log2f reciprocal-throughput is 2.3x faster and log2f latency is 2.1x faster; worst-case error: old 1.72 ulp, new 0.75 ulp; aarch64 .text size: -252 bytes; aarch64 .rodata size: +244 bytes. * math/Makefile (type-float-routines): Add e_log2f_data. * sysdeps/ieee754/flt-32/e_log2f.c: New implementation. * sysdeps/ieee754/flt-32/e_log2f_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define. (LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define. * sysdeps/i386/fpu/e_log2f_data.c: New file. * sysdeps/ia64/fpu/e_log2f_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file. |
||
Szabolcs Nagy
|
bf27d3973d |
New generic logf
Without wrapper on aarch64: logf reciprocal-throughput is 2.2x faster and logf latency is 1.9x faster; worst-case error: old 0.89 ulp, new 0.82 ulp; aarch64 .text size: -356 bytes; aarch64 .rodata size: +240 bytes. Uses double-precision arithmetic and a lookup table to allow a smaller polynomial and to avoid division. Data is in a separate translation unit with a fixed layout to prevent the compiler from generating suboptimal literal access. Errors are handled inline according to POSIX rules, but this patch keeps the wrapper with SVID-compatible error handling. Needs a libm-test-ulps adjustment for clogf in non-nearest rounding modes. * math/Makefile (type-float-routines): Add e_logf_data. * sysdeps/ieee754/flt-32/e_logf.c: New implementation. * sysdeps/ieee754/flt-32/e_logf_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__logf_data): Define. (LOGF_TABLE_BITS, LOGF_POLY_ORDER): Define. * sysdeps/i386/fpu/e_logf_data.c: New file. * sysdeps/ia64/fpu/e_logf_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_logf_data.c: New file. |