glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-30 08:40:07 +00:00

Author	SHA1	Message	Date
Wilco Dijkstra	220622dde5	Add libm_alias_finite for _finite symbols This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:04 -03:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Gabriel F. T. Gomes	9ae967bf45	ldbl-128ibm-compat: Do not mix -mabi=longdouble and -mlong-double-128 Some compiler versions, e.g. GCC 7, complain when -mlong-double-128 is used together with -mabi=ibmlongdouble or -mabi=ieeelongdouble, producing the following error message: cc1: error: ‘-mabi=ibmlongdouble’ requires ‘-mlong-double-128’ This patch removes -mlong-double-128 from the compilation lines that explicitly request -mabi=longdouble. Tested for powerpc64le. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho	5d73c96f64	ldbl-128ibm-compat: Compiler flags for stdio functions Some of the files that provide stdio.h and wchar.h functions have a filename prefixed with 'io', such as 'iovsprintf.c'. On platforms that imply ldbl-128ibm-compat, these files must be compiled with the flag -mabi=ibmlongdouble. This patch adds this flag to their compilation. Notice that this is not required for the other files that provide similar functions, because filenames that are not prefixed with 'io' have ldbl-128ibm-compat counterparts in the Makefile, which already adds -mabi=ibmlongdouble to them. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho	1ef9b6e0bf	Do not redirect calls to __GI_* symbols, when redirecting to ieee128 On platforms where long double has IEEE binary128 format as a third option (initially, only powerpc64le), many exported functions are redirected to their __ieee128 equivalents. This redirection is provided by installed headers such as stdio-ldbl.h, and is supposed to work correctly with user code. However, during the build of glibc, similar redirections are employed, in internal headers, such as include/stdio.h, in order to avoid extra PLT entries. These redirections conflict with the redirections to __*ieee128, and must be avoided during the build. This patch protects the second redirections with a test for __LONG_DOUBLE_USES_FLOAT128, a new macro that is defined to 1 when functions that deal with long double typed values reuses the _Float128 implementation (this is currently only true for powerpc64le). Tested for powerpc64le, x86_64, and with build-many-glibcs.py. Co-authored-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2019-12-27 15:02:10 -03:00
Gabriel F. T. Gomes	f8cd102081	Avoid compat symbols for totalorder in powerpc64le IEEE long double On powerpc64le, the libm_alias_float128_other_r_ldbl macro is used to create an alias between totalorderf128 and __totalorderlieee128, as well as between the totalordermagf128 and __totalordermaglieee128. However, the totalorder* and totalordermag* functions changed their parameter type since commit ID `42760d7646` and got compat symbols for their old versions. With this change, the aforementioned macro would create two conflicting aliases for __totalorderlieee128 and __totalordermaglieee128. This patch avoids the creation of the alias between the IEEE long double symbols (__totalorderl*ieee128) and the compat symbols, because the IEEE long double functions have never been exported thus don't need such compat symbol. Tested for powerpc64le. Reviewed-by: Joseph Myers <joseph@codesourcery.com>	2019-12-23 16:32:20 -03:00
Gabriel F. T. Gomes	3021e78178	ldbl-128ibm-compat: Add cvt functions This patch adds IEEE long double versions of qcvt* functions for powerpc64le. Unlike all other long double to/from string conversion functions, these do not rely on internal functions that can take floating-point numbers with different formats and act on them accordingly, instead, the related files are rebuilt with the -mabi=ieeelongdouble compiler flag set. Having -mabi=ieeelongdouble passed to the compiler causes the object files to be marked with a .gnu_attribute that is incompatible with the .gnu_attribute in files built with -mabi=ibmlongdouble (the default). The difference causes error messages similar to the following: ld: libc_pic.a(s_isinfl.os) uses IBM long double, libc_pic.a(ieee128-qefgcvt_r.os) uses IEEE long double. collect2: error: ld returned 1 exit status make[2]: *** [../Makerules:649: libc_pic.os] Error 1 Although this warning is useful in other situations, the library actually needs to have functions with different long double formats, so .gnu_attribute generation is explicitly disabled for these files with the use of -mno-gnu-attribute. Tested for powerpc64le on the branch that actually enables the sysdeps/ieee754/ldbl-128ibm-compat for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-23 16:32:20 -03:00
Gabriel F. T. Gomes	f1a0eb5b67	ldbl-128ibm-compat: Add ISO C99 versions of scanf functions In the format string for *scanf functions, the '%as', '%aS', and '%a[]' modifiers behave differently depending on ISO C99 compatibility. When _GNU_SOURCE is defined and -std=c89 is passed to the compiler, these functions behave like ascanf, and the modifiers allocate memory for the output. Otherwise, the ISO C99 compliant version of these functions is used, and the modifiers consume a floating-point argument. This patch adds the IEEE binary128 variant of ISO C99 compliant functions for the third long double format on powerpc64le. Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-13 18:02:09 -03:00
Gabriel F. T. Gomes	348787f069	ldbl-128ibm-compat: Fix selection of GNU and ISO C99 scanf Since commit commit `03992356e6` Author: Zack Weinberg <zackw@panix.com> Date: Sat Feb 10 11:58:35 2018 -0500 Use C99-compliant scanf under _GNU_SOURCE with modern compilers. the selection of the GNU versions of scanf functions requires both _GNU_SOURCE and -std=c89. This patch changes the tests in ldbl-128ibm-compat so that they actually test the GNU versions (without this change, the redirection to the ISO C99 version always happens, so GNU versions of the new implementation (e.g. __scanfieee128) were left untested). Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-13 18:01:25 -03:00
Stefan Liebler	1902d5d5ff	Adjust s_copysignl.c regarding code style. This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:21 +01:00
Stefan Liebler	171d23d7cb	Adjust s_ceilf.c and s_ceill.c regarding code style. This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:21 +01:00
Stefan Liebler	d3a0409ab6	Adjust s_floorf.c and s_floorl.c regarding code style. This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:20 +01:00
Stefan Liebler	99b39a83e7	Adjust s_rintf.c and s_rintl.c regarding code style. This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:20 +01:00
Stefan Liebler	6a3866dae9	Adjust s_nearbyintf.c and s_nearbyintl.c regarding code style. This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:19 +01:00
Stefan Liebler	f818afdd3b	Use GCC builtins for copysign functions if desired. This patch is always using the corresponding GCC builtin for copysignf, copysign, and is using the builtin for copysignl, copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. Altough the long double version is enabled by default we still need the macro and the alternative implementation as the _Float128 version of the builtin is not available with all supported GCC versions. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:19 +01:00
Stefan Liebler	f82996f815	Use GCC builtins for round functions if desired. This patch is using the corresponding GCC builtin for roundf, round, roundl and roundf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:18 +01:00
Stefan Liebler	1ac9c1cf87	Use GCC builtins for trunc functions if desired. This patch is using the corresponding GCC builtin for truncf, trunc, truncl and truncf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:17 +01:00
Stefan Liebler	62560ee840	Use GCC builtins for ceil functions if desired. This patch is using the corresponding GCC builtin for ceilf, ceil, ceill and ceilf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:17 +01:00
Stefan Liebler	6c1b6a5e8c	Use GCC builtins for floor functions if desired. This patch is using the corresponding GCC builtin for floorf, floor, floorl and floorf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:16 +01:00
Stefan Liebler	a2a9b00429	Use GCC builtins for rint functions if desired. This patch is using the corresponding GCC builtin for rintf, rint, rintl and rintf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:16 +01:00
Stefan Liebler	ae3577f607	Use GCC builtins for nearbyint functions if desired. This patch is using the corresponding GCC builtin for nearbyintf, nearbyint, nearbintl and nearbyintf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:15 +01:00
Stefan Liebler	36e9acbd5c	Always use wordsize-64 version of s_round.c. This patch replaces s_round.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:14 +01:00
Stefan Liebler	1c94bf0f0a	Always use wordsize-64 version of s_trunc.c. This patch replaces s_trunc.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:14 +01:00
Stefan Liebler	9f234eafe8	Always use wordsize-64 version of s_ceil.c. This patch replaces s_ceil.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:13 +01:00
Stefan Liebler	95b0c2c431	Always use wordsize-64 version of s_floor.c. This patch replaces s_floor.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	ab48bdd098	Always use wordsize-64 version of s_rint.c. This patch replaces s_rint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	af123aa950	Always use wordsize-64 version of s_nearbyint.c. This patch replaces s_nearbyint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:11 +01:00
Gabriel F. T. Gomes	39c977b23e	ldbl-128ibm-compat: Add tests for strfroml, strtold, and wcstold Since the commit commit `86a0f56158` Author: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> Date: Thu Jun 28 13:57:50 2018 +0530 ldbl-128ibm-compat: Introduce ieee128 symbols IEEE long double versions of strfroml, strtold, and wcstold have been prepared, but not exposed (which will only happen when the full support for IEEE long double is complete). This patch adds tests for these functions in both IBM and IEEE long double mode. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-03 13:30:42 -03:00
Gabriel F. T. Gomes	80a19b003e	ldbl-128ibm-compat: Add tests for strfmon and strfmon_l This patch adds elementary tests to check that strfmon and strfmon_l correctly evaluate long double values with IBM Extended Precision and IEEE binary128 format. Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-03 13:30:42 -03:00
Rajalakshmi Srinivasaraghavan	66fa30828a	ldbl-128ibm-compat: Add strfmon_l with IEEE long double format Similarly to what has been done for printf-like functions, more specifically to the internal implementation in __vfprintf_internal, this patch extends __vstrfmon_l_internal to deal with long double values with binary128 format (as a third format option and reusing the float128 implementation). Tested for powerpc64le, powerpc64, x86_64, and with build-many-glibcs. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-03 13:30:37 -03:00
Gabriel F. T. Gomes	5d39f37b26	ldbl-128ibm-compat: Replace http with https in new files Several commits to the ldbl-128ibm-compat directory added new files where the URL in the copyright notice pointed to an http, rather than to an https, address. This happened because I copied the notices before commit ID `5a82c74822`. This trivial patch fixes this issue.	2019-12-03 13:00:57 -03:00
Gabriel F. T. Gomes	381b76d7a3	ldbl-128ibm-compat: Add syslog functions Similarly to __vfprintf_internal and __vfscanf_internal, the internal implementation of syslog functions (__vsyslog_internal) takes a 'mode_flags' parameter used to select the format of long double parameters. This patch adds variants of the syslog functions that set 'mode_flags' to PRINTF_LDBL_USES_FLOAT128, thus enabling the correct printing of long double values on powerpc64le, when long double has IEEE binary128 format (-mabi=ieeelongdouble). Tested for powerpc64le. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Joseph Myers <joseph@codesourcery.com> Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	590ef889bc	ldbl-128ibm-compat: Add obstack printing functions Similarly to the functions from the printf family, this patch adds implementations for __obstack_printf* functions that set the 'mode_flags' parameter to PRINTF_LDBL_USES_FLOAT128, before making calls to __vfprintf_internal (indirectly through __obstack_vprintf_internal). Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	ff3cb5accb	ldbl-128ibm-compat: Reuse tests for err.h and error.h functions Commit IDs `9771e6cb51` and `7597b0c7f7` added tests for the functions from err.h and error.h that can take long double parameters. Afterwards, commit ID `f0eaf86276` reused them on architectures that changed the long double format from the same as double to something else (i.e.: architectures that imply ldbl-opt). This patch reuses it again for IEEE long double on powerpc64le. Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	9f25935dda	ldbl-128ibm-compat: Add error.h functions Use the recently added, internal functions, __error_at_line_internal and __error_internal, to provide error.h functions that can take long double arguments with IEEE binary128 format on platforms where long double can also take double format or some non-IEEE format (currently, this means powerpc64le). Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	a23ed31463	ldbl-128ibm-compat: Add err.h functions Use the recently added, internal functions, __vwarnx_internal and __vwarn_internal, to provide err.h functions that can take long double arguments with IEEE binary128 format on platforms where long double can also take double format or some non-IEEE format (currently, this means powerpc64le). Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	77607e7d44	ldbl-128ibm-compat: Add argp_error and argp_failure Use the recently added, internal functions, __argp_error_internal and __argp_failure_internal, to provide argp_error and argp_failure that can take long double arguments with IEEE binary128 format on platforms where long double can also take double format or some non-IEEE format (currently, this means powerpc64le). Tested for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-27 15:52:41 -03:00
Gabriel F. T. Gomes	b370c5f014	ldbl-128ibm-compat: Add wide character scanning functions Similarly to what was done for regular character scanning functions, this patch uses the new mode mask, SCANF_LDBL_USES_FLOAT128, in the 'mode' argument of the wide characters scanning function, __vfwscanf_internal (which is also extended to support scanning floating-point values with IEEE binary128, by redirecting calls to __wcstold_internal to __wcstof128_internal). Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:13:20 -03:00
Gabriel F. T. Gomes	a5b15bdec8	ldbl-128ibm-compat: Add regular character scanning functions The 'mode' argument to __vfscanf_internal allows the selection of the long double format for all long double arguments requested by the format string. Currently, there are two possibilities: long double with the same format as double or long double as something else. The 'something else' format varies between architectures, and on powerpc64le, it means IBM Extended Precision format. In preparation for the third option of long double format on powerpc64le, this patch uses the new mode mask, SCANF_LDBL_USES_FLOAT128, which tells __vfscanf_internal to call __strtof128_internal, instead of __strtold_internal, and save the output into a _Float128 variable. Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:13:01 -03:00
Gabriel F. T. Gomes	c2f959ed5f	ldbl-128ibm-compat: Test positional arguments The format string can request positional parameters, instead of relying on the order in which they appear as arguments. Since this has an effect on how the type of each argument is determined, this patch extends the test cases to use positional parameters with mixed double and long double types, to verify that the IEEE long double implementations of *printf work correctly in this scenario. Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:12:54 -03:00
Gabriel F. T. Gomes	5bbbd5ae05	ldbl-128ibm-compat: Test double values A single format string can take double and long double parameters at the same time. Internally, these parameters are routed to the same function, which correctly reads them and calls the underlying functions responsible for the actual conversion to string. This patch adds a new case to test this scenario. Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:12:37 -03:00
Gabriel F. T. Gomes	329037cead	ldbl-128ibm-compat: Add wide character, fortified printing functions Similarly to what was done for the regular character, fortified printing functions, this patch combines the mode masks PRINTF_LDBL_USES_FLOAT128 and PRINTF_FORTIFY to provide wide character versions of fortified printf functions. It also adds two flavors of test cases: one that explicitly calls the fortified functions, and another that reuses the non-fortified test, but defining _FORTIFY_SOURCE as 2. The first guarantees that the implementations are actually being tested (independently of what's in bits/wchar2.h), whereas the second guarantees that the redirections calls the correct function in the IBM and IEEE long double cases. Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:12:27 -03:00
Gabriel F. T. Gomes	5aa64dbc29	ldbl-128ibm-compat: Add regular character, fortified printing functions Since the introduction of internal functions with explicit flags for the printf family of functions, the 'mode' parameter can be used to select which format long double parameters have (with the mode flags: PRINTF_LDBL_IS_DBL and PRINTF_LDBL_USES_FLOAT128), as well as to select whether to check for overflows (mode flag: PRINTF_FORTIFY). This patch combines PRINTF_LDBL_USES_FLOAT128 and PRINTF_FORTIFY to provide the IEEE binary128 version of printf-like function for platforms where long double can take this format, in addition to the double format and to some non-ieee format (currently, this means powerpc64le). There are two flavors of test cases provided with this patch: one that explicitly calls the fortified functions, for instance __asprintf_chk, and another that reuses the non-fortified test, but defining _FORTIFY_SOURCE as 2. The first guarantees that the implementations are actually being tested (in bits/stdio2.h, vprintf gets redirected to __vfprintf_chk, which would leave __vprintf_chk untested), whereas the second guarantees that the redirections calls the correct function in the IBM and IEEE long double cases. Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:11:49 -03:00
Gabriel F. T. Gomes	1771a5cf0e	ldbl-128ibm-compat: Add wide character printing functions Similarly to what was done for regular character printing functions, this patch uses the new mode mask, PRINTF_LDBL_USES_FLOAT128, in the 'mode' argument of the wide characters printing function, __vfwprintf_internal (which is also extended to support printing floating-point values with IEEE binary128, by saving floating-point values into variables of type __float128 and adjusting the parameters to __printf_fp and __printf_fphex as if it was a call from a wide-character version of strfromf128 (even though such version does not exist)). Tested for powerpc64le. Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:11:38 -03:00
Gabriel F. T. Gomes	421a1d34bf	ldbl-128ibm-compat: Add regular character printing functions The 'mode' argument to __vfprintf_internal allows the selection of the long double format for all long double arguments requested by the format string. Currently, there are two possibilities: long double with the same format as double or long double as something else. The 'something else' format varies between architectures, and on powerpc64le, it means IBM Extended Precision format. In preparation for the third option of long double format on powerpc64le, this patch uses the new mode mask, PRINTF_LDBL_USES_FLOAT128, which tells __vfprintf_internal to save the floating-point values into variables of type __float128 and adjusts the parameters to __printf_fp and __printf_fphex as if it was a call from strfromf128. Many files from the stdio-common, wcsmbs, argp, misc, and libio directories will have IEEE binary128 counterparts. Setting the correct compiler options to these files (original and counterparts) would produce a large amount of repetitive Makefile rules. To avoid this repetition, this patch adds a Makefile routine that iterates over the files adding or removing the appropriate flags. Tested for powerpc64le. Reviewed-By: Florian Weimer <fweimer@redhat.com> Reviewed-By: Joseph Myers <joseph@codesourcery.com> Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>	2019-11-22 18:10:52 -03:00
Paul A. Clarke	102b5b0caf	Remove duplicate inline implementation of issignalingf Very recent commit `854e91bf6b` enabled inline of issignalingf() in general (__issignalingf in include/math.h). There is another implementation for an inline use of issignalingf (issignalingf_inline in sysdeps/ieee754/flt-32/math_config.h) which could instead make use of the new enablement. Replace the use of issignalingf_inline with __issignaling. Using issignaling (instead of __issignalingf) will allow future enhancements to the type-generic implementation, issignaling, to be automatically adopted. The implementations are slightly different, and compile to slightly different code, but I measured no significant performance difference. The second implementation was brought to my attention by: Suggested-by: Joseph Myers <joseph@codesourcery.com> Reviewed-by: Joseph Myers <joseph@codesourcery.com>	2019-11-22 11:37:40 -06:00
Alistair Francis	aa706e13f4	Split up endian.h to minimize exposure of BYTE_ORDER. With only two exceptions (sys/types.h and sys/param.h, both of which historically might have defined BYTE_ORDER) the public headers that include <endian.h> only want to be able to test __BYTE_ORDER against ___ENDIAN. This patch creates a new bits/endian.h that can be included by any header that wants to be able to test __BYTE_ORDER and/or __FLOAT_WORD_ORDER against the ___ENDIAN constants, or needs __LONG_LONG_PAIR. It only defines macros in the implementation namespace. The existing bits/endian.h (which could not be included independently of endian.h, and only defines __BYTE_ORDER and maybe __FLOAT_WORD_ORDER) is renamed to bits/endianness.h. I also took the opportunity to canonicalize the form of this header, which we are stuck with having one copy of per architecture. Since they are so short, this means git doesn’t understand that they were renamed from existing headers, sigh. endian.h itself is a nonstandard header and its only remaining use from a standard header is guarded by __USE_MISC, so I dropped the __USE_MISC conditionals from around all of the public-namespace things it defines. (This means, an application that requests strict library conformance but includes endian.h will still see the definition of BYTE_ORDER.) A few changes to specific bits/endian(ness).h variants deserve mention: - sysdeps/unix/sysv/linux/ia64/bits/endian.h is moved to sysdeps/ia64/bits/endianness.h. If I remember correctly, ia64 did have selectable endianness, but we have assembly code in sysdeps/ia64 that assumes it’s little-endian, so there is no reason to treat the ia64 endianness.h as linux-specific. - The C-SKY port does not fully support big-endian mode, the compile will error out if __CSKYBE__ is defined. - The PowerPC port had extra logic in its bits/endian.h to detect a broken compiler, which strikes me as unnecessary, so I removed it. - The only files that defined __FLOAT_WORD_ORDER always defined it to the same value as __BYTE_ORDER, so I removed those definitions. The SH bits/endian(ness).h had comments inconsistent with the actual setting of __FLOAT_WORD_ORDER, which I also removed. - I removed copyright boilerplate from the few bits/endian(ness).h headers that had it; these files record a single fact in a fashion dictated by an external spec, so I do not think they are copyrightable. As long as I was changing every copy of ieee754.h in the tree, I noticed that only the MIPS variant includes float.h, because it uses LDBL_MANT_DIG to decide among three different versions of ieee854_long_double. This patch makes it not include float.h when GCC’s intrinsic __LDBL_MANT_DIG__ is available. * string/endian.h: Unconditionally define LITTLE_ENDIAN, BIG_ENDIAN, PDP_ENDIAN, and BYTE_ORDER. Condition byteswapping macros only on !__ASSEMBLER__. Move the definitions of __BIG_ENDIAN, __LITTLE_ENDIAN, __PDP_ENDIAN, __FLOAT_WORD_ORDER, and __LONG_LONG_PAIR to... * string/bits/endian.h: ...this new file, which includes the renamed header bits/endianness.h for the definition of __BYTE_ORDER and possibly __FLOAT_WORD_ORDER. * string/Makefile: Install bits/endianness.h. * include/bits/endian.h: New wrapper. * bits/endian.h: Rename to bits/endianness.h. Add multiple-include guard. Rewrite the comment explaining what the machine-specific variants of this file should do. * sysdeps/unix/sysv/linux/ia64/bits/endian.h: Move to sysdeps/ia64. * sysdeps/aarch64/bits/endian.h * sysdeps/alpha/bits/endian.h * sysdeps/arm/bits/endian.h * sysdeps/csky/bits/endian.h * sysdeps/hppa/bits/endian.h * sysdeps/ia64/bits/endian.h * sysdeps/m68k/bits/endian.h * sysdeps/microblaze/bits/endian.h * sysdeps/mips/bits/endian.h * sysdeps/nios2/bits/endian.h * sysdeps/powerpc/bits/endian.h * sysdeps/riscv/bits/endian.h * sysdeps/s390/bits/endian.h * sysdeps/sh/bits/endian.h * sysdeps/sparc/bits/endian.h * sysdeps/x86/bits/endian.h: Rename to endianness.h; canonicalize form of file; remove redundant definitions of __FLOAT_WORD_ORDER. * sysdeps/powerpc/bits/endianness.h: Remove logic to check for broken compilers. * ctype/ctype.h * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h * sysdeps/arm/nptl/bits/pthreadtypes-arch.h * sysdeps/csky/nptl/bits/pthreadtypes-arch.h * sysdeps/ia64/ieee754.h * sysdeps/ieee754/ieee754.h * sysdeps/ieee754/ldbl-128/ieee754.h * sysdeps/ieee754/ldbl-128ibm/ieee754.h * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h * sysdeps/mips/ieee754/ieee754.h * sysdeps/mips/nptl/bits/pthreadtypes-arch.h * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h * sysdeps/nptl/pthread.h * sysdeps/riscv/nptl/bits/pthreadtypes-arch.h * sysdeps/sh/nptl/bits/pthreadtypes-arch.h * sysdeps/sparc/sparc32/ieee754.h * sysdeps/unix/sysv/linux/generic/bits/stat.h * sysdeps/unix/sysv/linux/generic/bits/statfs.h * sysdeps/unix/sysv/linux/sys/acct.h * wctype/bits/wctype-wchar.h: Include bits/endian.h, not endian.h. * sysdeps/unix/sysv/linux/hppa/pthread.h: Don’t include endian.h. * sysdeps/mips/ieee754/ieee754.h: Use __LDBL_MANT_DIG__ in ifdefs, instead of LDBL_MANT_DIG. Only include float.h when __LDBL_MANT_DIG__ is not predefined, in which case define __LDBL_MANT_DIG__ to equal LDBL_MANT_DIG.	2019-10-01 14:54:46 -07:00
Paul Eggert	5a82c74822	Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http\|ftp)(://(.\.)?(gnu\|fsf\|sourceware)\.org($\|[^.]\|\.[^a-z])),https\2,g s,(http\|ftp)(://(.\.)?)sources\.redhat\.com($\|[^.]\|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '.po' \ ! -name 'ChangeLog' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: * error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: * error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S	2019-09-07 02:43:31 -07:00
Joseph Myers	42760d7646	Make totalorder and totalordermag functions take pointer arguments. The resolution of C floating-point Clarification Request 25 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2397.htm#dr_25> is that the totalorder and totalordermag functions should take pointer arguments, and this has been adopted in C2X (with const added; note that the integration of this change into C2X is present in the C standard git repository but postdates the most recent public PDF draft). This patch updates glibc accordingly. As a defect resolution, the API is changed unconditionally rather than supporting any sort of TS 18661-1 mode for compilation with the old version of the API. There are compat symbols for existing binaries that pass floating-point arguments directly. As a consequence of changing to pointer arguments, there are no longer type-generic macros in tgmath.h for these functions. Because of the fairly complicated logic for creating libm function aliases and determining the set of aliases to create in a given glibc configuration, rather than duplicating all that in individual source files to create the versioned and compat symbols, the source files for the various versions of totalorder functions are set up to redefine weak_alias before using libm_alias_* macros to create the symbols required. In turn, this requires creating a separate alias for each symbol version pointing to the same implementation (see binutils bug <https://sourceware.org/bugzilla/show_bug.cgi?id=23840>), which is done automatically using __COUNTER__. (As I noted in <https://sourceware.org/ml/libc-alpha/2018-10/msg00631.html>, it might well make sense for glibc's symbol versioning macros to do that alias creation with __COUNTER__ themselves, which would somewhat simplify the logic in the totalorder source files.) It is of course desirable to test the compat symbols. I did this with the generic libm-test machinery, but didn't wish to duplicate the actual tables of test inputs and outputs, and thought it risky to attempt to have a single object file refer to both default and compat versions of the same function in order to test them together. Thus, I created libm-test-compat_totalorder.inc and libm-test-compat_totalordermag.inc which include the generated .c files (with the processed version of those tables of inputs) from the non-compat tests, and added appropriate dependencies. I think this provides sufficient test coverage for the compat symbols without also needing to make the special ldbl-96 and ldbl-128ibm tests (of peculiarities relating to the representations of those formats that can't be covered in the generic tests) run for the compat symbols. Tests of compat symbols need to be internal tests, meaning _ISOMAC is not defined. Making some libm-test tests into internal tests showed up two other issues. GCC diagnoses duplicate macro definitions of __STDC_* macros, including __STDC_WANT_IEC_60559_TYPES_EXT__; I added an appropriate conditional and filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91451> for this issue. On ia64, include/setjmp.h ends up getting included indirectly from libm-symbols.h, resulting in conflicting definitions of the STR macro (also defined in libm-test-driver.c); I renamed the macros in include/setjmp.h. (It's arguable that we should have common internal headers used everywhere for stringizing and concatenation macros.) Tested for x86_64 and x86, and with build-many-glibcs.py. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT) \|\| __MATH_DECLARING_FLOATN] (totalorder): Take pointer arguments. [__GLIBC_USE (IEC_60559_BFP_EXT) \|\| __MATH_DECLARING_FLOATN] (totalordermag): Likewise. * manual/arith.texi (totalorder): Likewise. (totalorderf): Likewise. (totalorderl): Likewise. (totalorderfN): Likewise. (totalorderfNx): Likewise. (totalordermag): Likewise. (totalordermagf): Likewise. (totalordermagl): Likewise. (totalordermagfN): Likewise. (totalordermagfNx): Likewise. * math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): Remove macro. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): Likewise. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): Likewise. * math/Versions (GLIBC_2.31): Add totalorder, totalorderf, totalorderl, totalordermag, totalordermagf, totalordermagl, totalorderf32, totalorderf64, totalorderf32x, totalordermagf32, totalordermagf64, totalordermagf32x, totalorderf64x, totalordermagf64x, totalorderf128 and totalordermagf128. * math/Makefile (libm-test-funcs-noauto): Add compat_totalorder and compat_totalordermag. (libm-test-funcs-compat): New variable. (libm-tests-compat): Likewise. (tests): Do not include compat tests. (tests-internal): Add compat tests. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalorder.o)): Depend on $(objpfx)libm-test-totalorder.c. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalordermag.o): Depend on $(objpfx)libm-test-totalordermag.c. (tgmath3-macros): Remove totalorder and totalordermag. * math/libm-test-compat_totalorder.inc: New file. * math/libm-test-compat_totalordermag.inc: Likewise. * math/libm-test-driver.c (struct test_ff_i_data): Update comment. (RUN_TEST_fpfp_b): New macro. (RUN_TEST_LOOP_fpfp_b): Likewise. * math/libm-test-totalorder.inc (totalorder_test_data): Use TEST_fpfp_b. (totalorder_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/libm-test-totalordermag.inc (totalordermag_test_data): Use TEST_fpfp_b. (totalordermag_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/gen-tgmath-tests.py (Tests.add_all_tests): Remove totalorder and totalordermag. * math/test-tgmath.c (NCALLS): Change to 132. (F(compile_test)): Do not call totalorder or totalordermag. (F(totalorder)): Remove. (F(totalordermag)): Likewise. * include/float.h (__STDC_WANT_IEC_60559_TYPES_EXT__): Do not define if [__STDC_WANT_IEC_60559_TYPES_EXT__]. * include/setjmp.h [!_ISOMAC] (STR_HELPER): Rename to SJSTR_HELPER. [!_ISOMAC] (STR): Rename to SJSTR. Update call to STR_HELPER. [!_ISOMAC] (TEST_SIZE): Update call to STR. [!_ISOMAC] (TEST_ALIGN): Likewise. [!_ISOMAC] (TEST_OFFSET): Likewise. * sysdeps/ieee754/dbl-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/float128/float128_private.h (__totalorder_compatl): New macro. (__totalordermag_compatl): Likewise. * sysdeps/ieee754/flt-32/s_totalorderf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/flt-32/s_totalordermagf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include <shlib-compat.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include <shlib-compat.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c (totalorderl): Take pointer arguments. * sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c (totalordermagl): Likewise. * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/csky/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.	2019-08-15 15:18:34 +00:00
Adhemerval Zanella	105f2ed368	math: Use wordsize-64 version for s_logb - The resulting binary difference on 32 bits architecture is minimum. On i686-linux-gnu (with architecture optimization routine removed) there is no different using logb benchtests - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Move to ... * sysdeps/ieee754/dbl-64/s_logb.c: ... here. Add work around for powerpc32 integer 0 converting to -0. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-07-08 17:22:22 -03:00
Adhemerval Zanella	a72186761b	math: Use wordsize-64 version for finite - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra finite call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ... * sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	a8c590f789	math: Use wordsize-64 version for isinf - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra isinf call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ... * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	197dbda1a1	math: Use wordsize-64 version for isnan - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra isnan call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ... * sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:18 -03:00
Maciej W. Rozycki	87c266d758	Fix -O1 compilation errors with `__ddivl' and` __fdivl' [BZ #19444 ] Complementing commit `4a06ceea33` ("sysdeps/ieee754/soft-fp: ignore maybe-uninitialized with -O [BZ #19444]") and commit `27c5e756a2` ("sysdeps/ieee754: prevent maybe-uninitialized errors with -O [BZ #19444]") also fix compilation errors observed at -O1 in `__ddivl' and `__fdivl' with GCC 9 and RISC-V targets: In file included from ../soft-fp/soft-fp.h:318, from ../sysdeps/ieee754/soft-fp/s_fdivl.c:27: ../sysdeps/ieee754/soft-fp/s_fdivl.c: In function '__fdivl': ../soft-fp/op-2.h:108:9: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized] 108 \| : (X##_f1 << (2_FP_W_TYPE_SIZE - (N)))) \ \| ^ ../sysdeps/ieee754/soft-fp/s_fdivl.c:37:14: note: 'R_f1' was declared here 37 \| FP_DECL_Q (R); \| ^ ../soft-fp/op-common.h:39:3: note: in expansion of macro '_FP_FRAC_DECL_2' 39 \| _FP_FRAC_DECL_##wc (X) \| ^~~~~~~~~~~~~~ ../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL' 226 \| # define FP_DECL_Q(X) _FP_DECL (2, X) \| ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdivl.c:37:3: note: in expansion of macro 'FP_DECL_Q' 37 \| FP_DECL_Q (R); \| ^~~~~~~~~ ../soft-fp/op-2.h:109:8: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized] 109 \| \| X##_f0) != 0)); \ \| ^ ../sysdeps/ieee754/soft-fp/s_fdivl.c:37:14: note: 'R_f0' was declared here 37 \| FP_DECL_Q (R); \| ^ ../soft-fp/op-common.h:39:3: note: in expansion of macro '_FP_FRAC_DECL_2' 39 \| _FP_FRAC_DECL_##wc (X) \| ^~~~~~~~~~~~~~ ../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL' 226 \| # define FP_DECL_Q(X) _FP_DECL (2, X) \| ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdivl.c:37:3: note: in expansion of macro 'FP_DECL_Q' 37 \| FP_DECL_Q (R); \| ^~~~~~~~~ In file included from ../soft-fp/soft-fp.h:318, from ../sysdeps/ieee754/soft-fp/s_ddivl.c:31: ../sysdeps/ieee754/soft-fp/s_ddivl.c: In function '__ddivl': ../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized] 98 \| X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) \| X##_f0 >> (N) \ \| ^~ ../sysdeps/ieee754/soft-fp/s_ddivl.c:41:14: note: 'R_f1' was declared here 41 \| FP_DECL_Q (R); \| ^ ../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2' 37 \| _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT \| ^ ../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL' 226 \| # define FP_DECL_Q(X) _FP_DECL (2, X) \| ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_ddivl.c:41:3: note: in expansion of macro 'FP_DECL_Q' 41 \| FP_DECL_Q (R); \| ^~~~~~~~~ ../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized] 101 \| : (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \ \| ^~ ../sysdeps/ieee754/soft-fp/s_ddivl.c:41:14: note: 'R_f0' was declared here 41 \| FP_DECL_Q (R); \| ^ ../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2' 37 \| _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT \| ^ ../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL' 226 \| # define FP_DECL_Q(X) _FP_DECL (2, X) \| ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_ddivl.c:41:3: note: in expansion of macro 'FP_DECL_Q' 41 \| FP_DECL_Q (R); \| ^~~~~~~~~ cc1: all warnings being treated as errors make[2]: [.../sysd-rules:587: .../math/s_fdivl.o] Error 1 make[2]: * Waiting for unfinished jobs.... cc1: all warnings being treated as errors make[2]: *** [.../sysd-rules:587: .../math/s_ddivl.o] Error 1 This comes from cases in _FP_DIV that return a result described as FP_CLS_ZERO or FP_CLS_INF and do not initialize the fractional part, which is then operated on unconditionally in FP_TRUNC_COOKED before being ignored by _FP_PACK_CANONICAL. Clearly at this optimization level GCC cannot guarantee to be able to determine that the fractional part is ultimately unused, so ignore the error as with the earlier commits referred, letting compilation proceed. [BZ #19444] * sysdeps/ieee754/soft-fp/s_ddivl.c (__ddivl): Ignore errors from `-Wmaybe-uninitialized'. * sysdeps/ieee754/soft-fp/s_fdivl.c (__fdivl): Likewise.	2019-04-30 02:24:49 +01:00
Gabriel F. T. Gomes	f0eaf86276	ldbl-opt: Reuse test cases from misc/ that check long double This patch adds test cases for the compatibility versions of the functions: err, errx, verr, verrx, warn, warnx, vwarn, vwarnx (from err.h), error, and error_at_line (from error.h), when long double has the same format as double (-mlong-double-64). Tested for powerpc, powerpc64 and powerpc64le.	2019-03-01 15:32:49 -03:00
Gabriel F. T. Gomes	d11086a939	ldbl-opt: Add error and error_at_line (bug 23984) On platforms where long double may have the same format as double (-mlong-double-64), error and error_at_line do not take that into account and might produce wrong output if a long double conversion is requested by the format string ('%Lf'). This patch adds compatibility functions for this situation and redirects calls via header magic. Tested for powerpc, powerpc64 and powerpc64le.	2019-03-01 15:26:36 -03:00
Gabriel F. T. Gomes	90188e7d1a	ldbl-opt: Add err, errx, verr, verrx, warn, warnx, vwarn, and vwarnx (bug 23984) When support for long double format with 128-bits (-mlong-double-128) was added for platforms where long double had the same format as double, such as powerpc, compatibility versions for the functions listed in the commit title were missed. Since the older format of long double can still be used (with -mlong-double-64), using these functions with a format string that requests the printing of long double variables will produce wrong outputs. This patch adds the missing compatibility functions and header magic to redirect calls to them when -mlong-double-64 is in use. Tested for powerpc, powerpc64 and powerpc64le.	2019-03-01 15:24:51 -03:00
Gabriel F. T. Gomes	ea2d89d01c	ldbl-opt: Reuse argp tests that print long double The test case tst-ldbl-argp checks that the conversion specifier '%Lf' correctly prints long double values with the default long double format for a platform. This patch reuses the test case for long double with the same format as double (-mlong-double-64). Tested for powerpc, powerpc64 and powerpc64le.	2019-03-01 15:23:16 -03:00
Gabriel F. T. Gomes	6e1f6440b9	ldbl-opt: Add argp_error and argp_failure (bug 23983) The functions argp_error and argp_failure are missing support for printing long double values when long double has the same format as double. This patch adds the new functions __nldbl_argp_error and __nldbl_argp_failure, as well as header magic to redirect calls to them when -mlong-double-64 is in use. Tested for powerpc, powerpc64 and powerpc64le.	2019-03-01 15:21:32 -03:00
Joseph Myers	c2d8f0b704	Avoid "inline" after return type in function definitions. One group of warnings seen with -Wextra is warnings for static or inline not at the start of a declaration (-Wold-style-declaration). This patch fixes various such cases for inline, ensuring it comes at the start of the declaration (after any static). A common case of the fix is "static inline <type> __always_inline"; the definition of __always_inline starts with __inline, so the natural change is to "static __always_inline <type>". Other cases of the warning may be harder to fix (one pattern is a function definition that gets rewritten to be static by an including file, "#define funcname static wrapped_funcname" or similar), but it seems worth fixing these cases with inline anyway. Tested for x86_64. * elf/dl-load.h (_dl_postprocess_loadcmd): Use __always_inline before return type, without separate inline. * elf/dl-tunables.c (maybe_enable_malloc_check): Likewise. * elf/dl-tunables.h (tunable_is_name): Likewise. * malloc/malloc.c (do_set_trim_threshold): Likewise. (do_set_top_pad): Likewise. (do_set_mmap_threshold): Likewise. (do_set_mmaps_max): Likewise. (do_set_mallopt_check): Likewise. (do_set_perturb_byte): Likewise. (do_set_arena_test): Likewise. (do_set_arena_max): Likewise. (do_set_tcache_max): Likewise. (do_set_tcache_count): Likewise. (do_set_tcache_unsorted_limit): Likewise. * nis/nis_subr.c (count_dots): Likewise. * nptl/allocatestack.c (advise_stack_range): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Likewise. (do_sin): Likewise. (reduce_sincos): Likewise. (do_sincos): Likewise. * sysdeps/unix/sysv/linux/x86/elision-conf.c (do_set_elision_enable): Likewise. (TUNABLE_CALLBACK_FNDECL): Likewise.	2019-02-06 17:16:43 +00:00
Dmitry V. Levin	a1b02ae763	Fix a few typos in comments Apply the following spelling fixes: $ git grep -F -l 'relevent' \| xargs sed -i 's/relevent/relevant/g' $ git grep -F -l 'checked fot' \| xargs sed -i 's/checked fot/checked for/g' $ git grep -F -l "could't" \| xargs sed -i "s/could't/couldn't/g" $ git grep -F -l 'wheter' \| grep -Fv ChangeLog.old \| xargs sed -i 's/wheter/whether/g' $ git grep -F -l 'neccessary' \| grep -Fv ChangeLog.old \| xargs sed -i 's/neccessary/necessary/g' $ git grep -F -l 'ouput' \| xargs sed -i 's/ouput/output/g' $ git grep -F -w -l 'iput' \| xargs sed -i 's/iput/input/g' This is inspired by a gnulib bug report at https://lists.gnu.org/archive/html/bug-gnulib/2019-01/msg00081.html * argp/argp-help.c: Fix typo in comment. * misc/sys/cdefs.h: Likewise. * posix/regexec.c (sift_states_iter_mb): Likewise. * socket/sockatmark.c: Likewise. * socket/sys/socket.h: Likewise. * sysdeps/ia64/fpu/libm_sincos_large.S: Likewise. * sysdeps/ia64/fpu/libm_sincosl.S: Likewise. * sysdeps/ia64/fpu/s_cosl.S: Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise. * sysdeps/unix/sockatmark.c: Likewise. * time/strptime_l.c: Likewise.	2019-01-12 13:44:51 +00:00
H.J. Lu	69da3c9e87	soft-fp: Properly check _FP_W_TYPE_SIZE [BZ #24066 ] quad.h have #if _FP_W_TYPE_SIZE < 64 union _FP_UNION_Q { Use 4 _FP_W_TYPEs } #else union _FP_UNION_Q { Use 2 _FP_W_TYPEs } #endif Replace #if (2 * _FP_W_TYPE_SIZE) < _FP_FRACBITS_Q with #if _FP_W_TYPE_SIZE < 64 to check whether 4 or 2 _FP_W_TYPEs are used for IEEE quad precision. Tested with build-many-glibcs.py. [BZ #24066] * soft-fp/extenddftf2.c: Use "_FP_W_TYPE_SIZE < 64" to check if 4_FP_W_TYPEs are used for IEEE quad precision. * soft-fp/extendhftf2.c: Likewise. * soft-fp/extendsftf2.c: Likewise. * soft-fp/extendxftf2.c: Likewise. * soft-fp/trunctfdf2.c: Likewise. * soft-fp/trunctfhf2.c: Likewise. * soft-fp/trunctfsf2.c: Likewise. * soft-fp/trunctfxf2.c: Likewise. * sysdeps/alpha/ots_cvttx.c: Likewise. * sysdeps/alpha/ots_cvtxt.c: Likewise. * sysdeps/ieee754/soft-fp/s_daddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_dmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise. * sysdeps/ieee754/soft-fp/s_faddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise. * sysdeps/sparc/sparc32/q_dtoq.c: Likewise. * sysdeps/sparc/sparc32/q_qtod.c: Likewise. * sysdeps/sparc/sparc32/q_qtos.c: Likewise. * sysdeps/sparc/sparc32/q_stoq.c: Likewise. * sysdeps/sparc/sparc64/qp_dtoq.c: Likewise. * sysdeps/sparc/sparc64/qp_qtod.c: Likewise. * sysdeps/sparc/sparc64/qp_qtos.c: Likewise. * sysdeps/sparc/sparc64/qp_stoq.c: Likewise.	2019-01-07 09:04:39 -08:00
Martin Jansa	27c5e756a2	sysdeps/ieee754: prevent maybe-uninitialized errors with -O [BZ #19444 ] With -O included in CFLAGS it fails to build with: ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_jnl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:146:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_ynl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:375:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_jn': ../sysdeps/ieee754/dbl-64/e_jn.c:113:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_yn': ../sysdeps/ieee754/dbl-64/e_jn.c:320:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os: https://sourceware.org/ml/libc-alpha/2018-09/msg00539.html [BZ #19444] * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Use __builtin_unreachable for default case in switch. (__ieee754_yn): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise.	2019-01-04 16:17:48 +00:00
Zack Weinberg	03992356e6	Use C99-compliant scanf under _GNU_SOURCE with modern compilers. The only difference between noncompliant and C99-compliant scanf is that the former accepts the archaic GNU extension '%as' (also %aS and %a[...]) meaning to allocate space for the input string with malloc. This extension conflicts with C99's use of %a as a format _type_ meaning to read a floating-point number; POSIX.1-2008 standardized equivalent functionality using the modifier letter 'm' instead (%ms, %mS, %m[...]). The extension was already disabled in most conformance modes: specifically, any mode that doesn't involve _GNU_SOURCE and _does_ involve either strict conformance to C99 or loose conformance to both C99 and POSIX.1-2001 would get the C99-compliant scanf. With compilers new enough to use -std=gnu11 instead of -std=gnu89, or equivalent, that includes the default mode. With this patch, we now provide C99-compliant scanf in all configurations except when _GNU_SOURCE is defined and __STDC_VERSION__ or __cplusplus (whichever is relevant) indicates C89/C++98. This leaves the old scanf available under e.g. -std=c89 -D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it was already not present under -std=gnu11 without -D_GNU_SOURCE) and from -std=gnu89 without -D_GNU_SOURCE. There needs to be an internal override so we can compile the noncompliant scanf itself. This is the same problem we had when we removed 'gets' from _GNU_SOURCE and it's dealt with the same way: there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to off under the appropriate conditions for external code, but can be overridden by individual files within stdio. We also run into problems with PLT bypass for internal uses of sscanf, because libc_hidden_proto uses __REDIRECT and so does the logic in stdio.h for choosing which implementation of scanf to use; __REDIRECT isn't transitive, so include/stdio.h needs to bridge the gap with a macro. As far as I can tell, sscanf is the only function in this family that's internally called by unrelated code. Finally, there are several tests in stdio-common that use the extension. bug21.c is a regression test for a crash; it still exercises the relevant code when changed to use %ms instead of %as. scanf14.c through scanf17.c are more complicated since they are actually testing the subtleties of the extension - under what circumstances is 'a' treated as a modifier letter, etc. I changed all of them to use %ms instead of %as as well, but duplicated scanf14.c and scanf16.c as scanf14a.c and scanf16a.c. These still use %as and are compiled with -std=gnu89 to access the old extension. A bunch of diagnostic overrides and manual workarounds for the old stdio.h behavior become unnecessary. Yay! * include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE parameter. Only use deprecated scanf when __USE_GNU is defined and __STDC_VERSION__ is less than 199901L or __cplusplus is less than 201103L, whichever is relevant for the language being compiled. * libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their __isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF). * wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf. * libio/iovsscanf.c * libio/fwscanf.c * libio/iovswscanf.c * libio/swscanf.c * libio/vscanf.c * libio/vwscanf.c * libio/wscanf.c * stdio-common/fscanf.c * stdio-common/scanf.c * stdio-common/vfscanf.c * stdio-common/vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-compat.c * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c * sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c: Override __GLIBC_USE_DEPRECATED_SCANF to 1. * stdio-common/sscanf.c: Likewise. Remove ldbl_hidden_def for __sscanf. * stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf. * include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf, not sscanf. [!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_scanf with a preprocessor macro. * stdio-common/bug21.c, stdio-common/scanf14.c: Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[]; remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16.c: Likewise. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf14a.c: New copy of scanf14.c which still uses %as, %aS, %a[]. Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16a.c: New copy of scanf16.c which still uses %as, %aS, %a[]. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf15.c, stdio-common/scanf17.c: No need to override feature selection macros or provide definitions of u_char etc. * stdio-common/Makefile (tests): Add scanf14a and scanf16a. (CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove. (CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New. Compile these files with -std=gnu89.	2019-01-03 11:12:39 -05:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
H.J. Lu	8700a7851b	x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly. On Broadwell, bench-sincosf shows: Before After Improvement max 160.273 114.198 40% min 6.25 5.625 11% mean 13.0325 10.6462 22% Vectorized sincosf_poly shows Before After Improvement max 138.653 114.198 21% min 5.004 5.625 -11% mean 11.5934 10.6462 9% Tested on x86-64 and i686 as well as with build-many-glibcs.py. * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ... * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file. * sysdeps/x86/fpu/s_sincosf_data.c: New file. * sysdeps/x86/fpu/sincosf_poly.h: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.	2018-12-26 06:56:13 -08:00
Szabolcs Nagy	505b5b2922	Fix powf overflow handling in non-nearest rounding mode [BZ #23961 ] The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE. Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode and the previous implementation did not set errno in these cases. The fix tries to avoid affecting the common code path or calling a function that may introduce a stack frame, so float arithmetics is used to check the rounding mode and the threshold is selected accordingly. [BZ #23961] * math/auto-libm-test-in: Add new test case. * math/auto-libm-test-out-pow: Regenerated. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Fix overflow check.	2018-12-11 10:01:43 +00:00
Zack Weinberg	35caceb145	Use PRINTF_LDBL_IS_DBL instead of __ldbl_is_dbl. After all that prep work, nldbl-compat.c can now use PRINTF_LDBL_IS_DBL instead of __no_long_double to control the behavior of printf-like functions; this is the last thing we needed __no_long_double for, so it can go away entirely. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	4e2f43f842	Use PRINTF_FORTIFY instead of _IO_FLAGS2_FORTIFY (bug 11319) The _chk variants of all of the printf functions become much simpler. This is the last thing that we needed _IO_acquire_lock_clear_flags2 for, so it can go as well. I took the opportunity to make the headers included and the names of all local variables consistent across all the affected files. Since we ultimately want to get rid of __no_long_double as well, it must be possible to get all of the nontrivial effects of the _chk functions by calling the _internal functions with appropriate flags. For most of the __(v)xprintf_chk functions, this is covered by PRINTF_FORTIFY plus some up-front argument checks that can be duplicated. However, __(v)sprintf_chk installs a custom jump table so that it can crash instead of overflowing the output buffer. This functionality is moved to __vsprintf_internal, which now has a 'maxlen' argument like __vsnprintf_internal; to get the unsafe behavior of ordinary (v)sprintf, pass -1 for that argument. obstack_printf_chk and obstack_vprintf_chk are no longer in the same file. As a side-effect of the unification of both fortified and non-fortified vdprintf initialization, this patch fixes bug 11319 for __dprintf_chk and __vdprintf_chk, which was previously fixed only for dprintf and vdprintf by the commit commit `7ca890b88e` Author: Ulrich Drepper <drepper@redhat.com> Date: Wed Feb 24 16:07:57 2010 -0800 Fix reporting of I/O errors in *dprintf functions. This patch adds a test case to avoid regressions. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	124fc732c1	Add __vsyslog_internal, with same flags as __vprintf_internal. __nldbl___vsyslog_chk will ultimately want to pass PRINTF_LDBL_IS_DBL down to __vfprintf_internal as well as* possibly setting PRINTF_FORTIFY. To make that possible, we need a __vsyslog_internal that takes the same flags as printf. The code in misc/syslog.c does also get a little simpler. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	698fb75b9f	Add __vprintf_internal with flags arguments There are a lot more printf variants than there are scanf variants, and the code for setting up and tearing down their custom FILE variants around the call to __vf(w)printf is more complicated and variable. Therefore, I have added _internal versions of all the vprintf variants, rather than introducing helper routines so that they can all directly call __vf(w)printf_internal, as was done with scanf. As with the scanf changes, in this patch the _internal functions still look at the environmental mode bits and all callers pass 0 for the flags parameter. Several of the affected public functions had _IO_ name aliases that were not exported (but, in one case, appeared in libio.h anyway); I was originally planning to leave them as aliases to avoid having to touch internal callers, but it turns out ldbl__alias only work for exported symbols, so they've all been removed instead. It also turns out there were hardly any internal callers. _IO_vsprintf and _IO_vfprintf are* exported, so those two stick around. Summary for the changes to each of the affected symbols: _IO_vfprintf, _IO_vsprintf: All internal calls removed, thus the internal declarations, as well as uses of libc_hidden_proto and libc_hidden_def, were also removed. The external symbol is now exposed via uses of ldbl_strong_alias to __vfprintf_internal and __vsprintf_internal, respectively. _IO_vasprintf, _IO_vdprintf, _IO_vsnprintf, _IO_vfwprintf, _IO_vswprintf, _IO_obstack_vprintf, _IO_obstack_printf: All internal calls removed, thus declaration in internal headers were also removed. They were never exported, so there are no aliases tying them to the internal functions. I.e.: entirely gone. __vsnprintf: Internal calls were always preceded by macros such as #define __vsnprintf _IO_vsnprintf, and #define __vsnprintf vsnprintf The macros were removed and their uses replaced with calls to the new internal function __vsnprintf_internal. Since there were no internal calls, the internal declaration was also removed. The external symbol is preserved with ldbl_weak_alias to ___vsnprintf. __vfwprintf: All internal calls converted into calls to __vfwprintf_internal, thus the internal declaration was removed. The function is now a wrapper that calls __vfwprintf_internal. The external symbol is preserved. __vswprintf: Similarly, but no external symbol. __vasprintf, __vdprintf, __vfprintf, __vsprintf: New internal wrappers. Not exported. vasprintf, vdprintf, vfprintf, vsprintf, vsnprintf, vfwprintf, vswprintf, obstack_vprintf, obstack_printf: These functions used to be aliases to the respective _IO_* function, they are now aliases to their respective __* functions. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Zack Weinberg	d91798b31a	Use SCANF_LDBL_IS_DBL instead of __ldbl_is_dbl. Change the callers of __vfscanf_internal and __vfwscanf_internal that want to treat 'long double' as another name for 'double' (all of which happen to be in sysdeps/ieee754/ldbl-opt/nldbl-compat.c) to communicate this via the new flags argument, instead of the per-thread variable __no_long_double and its __ldbl_is_dbl wrapper macro. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Zack Weinberg	349718d4d7	Add __vfscanf_internal and __vfwscanf_internal with flags arguments. There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used by __isoc99_ scanf variants. In this patch, the new functions honor these flag bits if they're set, but they still also look at the corresponding bits of environmental state, and callers all pass zero. The new functions do not have the "errp" argument possessed by _IO_vfscanf and _IO_vfwscanf. All internal callers passed NULL for that argument. External callers could theoretically exist, so I preserved wrappers, but they are flagged as compat symbols and they don't preserve the three-way distinction among types of errors that was formerly exposed. These functions probably should have been in the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just aliases for vfscanf and vfwscanf. (It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf. Please check that part of the patch very carefully, I am still not confident I understand all of the details of ldbl-opt.) This patch also introduces helper inlines in libio/strfile.h that encapsulate the process of initializing an _IO_strfile object for reading. This allows us to call __vfscanf_internal directly from sscanf, and __vfwscanf_internal directly from swscanf, without duplicating the initialization code. (Previously, they called their v-counterparts, but that won't work if we want to control both C99 mode and ldbl-is-dbl mode using the flags argument to__vfscanf_internal.) It's still a little awkward, especially for wide strfiles, but it's much better than what we had. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Szabolcs Nagy	a502c5294b	Remove the error handling wrapper from pow Introduce new pow symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_pow.c and enabled for targets with their own pow implementation or ifunc dispatch on __ieee754_pow by including math/w_pow.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously powl was an alias of pow, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __pow_finite symbol is now an alias of pow. Both __pow_finite and pow set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add pow. * math/w_pow_compat.c (__pow_compat): Change to versioned compat symbol. * math/w_pow.c: New file. * sysdeps/i386/fpu/w_pow.c: New file. * sysdeps/ia64/fpu/e_pow.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow and add necessary aliases. * sysdeps/ieee754/dbl-64/w_pow.c: New file. * sysdeps/m68k/m680x0/fpu/w_pow.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to __pow. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.	2018-11-21 09:58:36 +00:00
Szabolcs Nagy	718d6542f2	Remove the error handling wrapper from log2 Introduce new log2 symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log2.c and enabled for targets with their own log2 implementation by including math/w_log2.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously log2l was an alias of log2, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log2_finite symbol is now an alias of log2. Both __log2_finite and log2 set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log2. * math/w_log2_compat.c (__log2_compat): Change to versioned compat symbol. * math/w_log2.c: New file. * sysdeps/i386/fpu/w_log2.c: New file. * sysdeps/ia64/fpu/e_log2.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log2.c: New file. * sysdeps/m68k/m680x0/fpu/w_log2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.	2018-11-21 09:57:21 +00:00
Szabolcs Nagy	f29b7c492d	Remove the error handling wrapper from log Introduce new log symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log.c and enabled for targets with their own log implementation by including math/w_log.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously logl was an alias of log, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log_finite symbol is now an alias of log. Both __log_finite and log set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log. * math/w_log_compat.c (__log_compat): Change to versioned compat symbol. * math/w_log.c: New file. * sysdeps/i386/fpu/w_log.c: New file. * sysdeps/ia64/fpu/e_log.S: Update. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log.c: New file. * sysdeps/m68k/m680x0/fpu/w_log.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to __log. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/w_log.c: New file.	2018-11-21 09:56:27 +00:00
Szabolcs Nagy	c20a10561a	Remove the error handling wrapper from exp and exp2 Introduce new exp and exp2 symbol version that don't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The double precision wrappers are disabled for sysdeps/ieee754/dbl-64 by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and math/w_exp2.c files use the wrapper template and can be included by targets that have their own exp and exp2 implementations or use ifunc on the glibc internal __ieee754_exp symbol. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously expl and exp2l were aliases of exp and exp2, now they point to the compatibility symbols with the wrapper, because they still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The _finite symbols are now aliases of the standard symbols (they have no performance advantage anymore). Both the standard symbols and _finite symbols set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header (the new macro name is __exp instead of __ieee754_exp which breaks some math.h macros). Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add exp and exp2. * math/w_exp2_compat.c (__exp2_compat): Change to versioned compat symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly. * math/w_exp_compat.c (__exp_compat): Likewise. * math/w_exp.c: New file. * math/w_exp2.c: New file. * sysdeps/i386/fpu/w_exp.c: New file. * sysdeps/i386/fpu/w_exp2.c: New file. * sysdeps/ia64/fpu/e_exp.S: Add versioned symbols. * sysdeps/ia64/fpu/e_exp2.S: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp and add necessary aliases. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_exp.c: New file. * sysdeps/ieee754/dbl-64/w_exp2.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.	2018-11-21 09:55:02 +00:00
Zack Weinberg	c75772e3f0	Use STRFMON_LDBL_IS_DBL instead of __ldbl_is_dbl. On platforms where long double used to have the same format as double, but later switched to a different format (alpha, s390, sparc, and powerpc), accessing the older behavior is possible and it happens via __nldbl_* functions (not on the API, but accessible from header redirection and from compat symbols). These functions write to the global flag __ldbl_is_dbl, which tells other functions that long double variables should be handled as double. This patch takes the first step towards removing this global flag and creates __vstrfmon_l_internal, which takes an explicit flags parameter. This change arguably makes the generated code slightly worse on architectures where __ldbl_is_dbl is never true; right now, on those architectures, it's a compile-time constant; after this change, the compiler could theoretically prove that __vstrfmon_l_internal was never called with a nonzero flags argument, but it would probably need LTO to do it. This is not performance critical code and I tend to think that the maintainability benefits of removing action at a distance are worth it. However, we _could_ wrap the runtime flag check with a macro that was defined to ignore its argument and always return false on architectures where __ldbl_is_dbl is never true, if people think the codegen benefits are important. Tested for powerpc and powerpc64le.	2018-11-16 09:21:14 -02:00
Joseph Myers	a19876214a	Fix libnldbl_nonshared.a references to internal libm symbols (bug 23735). The redirection of built-in functions such as sqrt in include/math.h applies when the wrappers for those functions in libnldbl_nonshared.a are built, resulting in references to internal names such as __ieee754_sqrt that aren't actually exported from the shared libm. (This applies for sqrt in 2.28, also for the round-to-integer functions in current master because of my changes there.) This patch arranges for NO_MATH_REDIRECT to be used for all the affected functions, and adds a test for those functions in libnldbl_nonshared.a. (We could of course choose to obsolete libnldbl_nonshared.a and require that people building with -mlong-double-64 either include the relevant headers and have a compiler supporting asm redirection, or have some other means of achieving that redirection at compile time if not including those headers. But while we have libnldbl_nonshared.a, it seems appropriate to fix such bugs in it.) Tested for powerpc, and with build-many-glibcs.py. [BZ #23735] * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (NO_MATH_REDIRECT): Define. * sysdeps/ieee754/ldbl-opt/test-nldbl-redirect.c: New file. * sysdeps/ieee754/ldbl-opt/Makefile [$(subdir) = math] (tests): Add test-nldbl-redirect. [$(subdir) = math] (CFLAGS-test-nldbl-redirect.c): New variable. [$(subdir) = math] ($(objpfx)test-nldbl-redirect): Depend on $(objpfx)libnldbl_nonshared.a.	2018-10-04 12:16:05 +00:00
Martin Jansa	4a06ceea33	sysdeps/ieee754/soft-fp: ignore maybe-uninitialized with -O [BZ #19444 ] * with -O, -O1, -Os it fails with: In file included from ../soft-fp/soft-fp.h:318, from ../sysdeps/ieee754/soft-fp/s_fdiv.c:28: ../sysdeps/ieee754/soft-fp/s_fdiv.c: In function '__fdiv': ../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized] X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) \| X##_f0 >> (N) \ ^~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f1' was declared here FP_DECL_D (R); ^ ../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2' _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT ^ ../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL' # define FP_DECL_D(X) _FP_DECL (2, X) ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D' FP_DECL_D (R); ^~~~~~~~~ ../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized] : (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \ ^~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f0' was declared here FP_DECL_D (R); ^ ../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2' _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT ^ ../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL' # define FP_DECL_D(X) _FP_DECL (2, X) ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D' FP_DECL_D (R); ^~~~~~~~~ Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os. [BZ #19444] * sysdeps/ieee754/soft-fp/s_fdiv.c: Include <libc-diag.h> and use DIAG_PUSH_NEEDS_COMMENT, DIAG_IGNORE_NEEDS_COMMENT and DIAG_POP_NEEDS_COMMENT to disable -Wmaybe-uninitialized.	2018-10-02 15:40:57 +00:00
Joseph Myers	c52944e8cc	Remove unnecessary math_private.h includes. After my changes to move various macros, inlines and other content from math_private.h to more specific headers, many files including math_private.h no longer need to do so. Furthermore, since the optimized inlines of various functions have been moved to include/fenv.h or replaced by use of function names GCC inlines automatically, a missing math_private.h include where one is appropriate will reliably cause a build failure rather than possibly causing code to be less well optimized while still building successfully. Thus, this patch removes includes of math_private.h that are now unnecessary. In the case of two RISC-V files, the include is replaced by one of stdbool.h because the files in question were relying on math_private.h to get a definition of bool. Tested for x86_64 and x86, and with build-many-glibcs.py. * math/fromfp.h: Do not include <math_private.h>. * math/s_cacosh_template.c: Likewise. * math/s_casin_template.c: Likewise. * math/s_casinh_template.c: Likewise. * math/s_ccos_template.c: Likewise. * math/s_cproj_template.c: Likewise. * math/s_fdim_template.c: Likewise. * math/s_fmaxmag_template.c: Likewise. * math/s_fminmag_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/s_ldexp_template.c: Likewise. * math/s_nextdown_template.c: Likewise. * math/w_log1p_template.c: Likewise. * math/w_scalbln_template.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_atanl.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/i386/fpu/s_logbl.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/i386/fpu/s_significandl.c: Likewise. * sysdeps/ia64/fpu/s_matherrf.c: Likewise. * sysdeps/ia64/fpu/s_matherrl.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/s_signgam.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise. * sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of <math_private.h>. * sysdeps/riscv/rvf/s_rintf.c: Likewise.	2018-09-28 21:53:33 +00:00
Joseph Myers	81dca813cc	Use copysign functions not __copysign functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __copysign functions to call the corresponding copysign names instead, with asm redirection to __copysign when the calls are not inlined (all cases are inlined except for IBM long double for powerpc soft-float / e500v1). This eliminates the need for an inline function defining __copysign in terms of __builtin_copysign. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_BINARY_ARGS): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT. * sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/alpha/fpu/s_copysignf.c: Likewise. * sysdeps/ieee754/dbl-64/s_copysign.c: Likewise. * sysdeps/ieee754/float128/s_copysignf128.c: Likewise. * sysdeps/ieee754/flt-32/s_copysignf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/riscv/rvd/s_copysign.c: Likewise. * sysdeps/riscv/rvf/s_copysignf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/generic/math_private_calls.h [!__MATH_DECLARING_LONG_DOUBLE \|\| !NO_LONG_DOUBLE] (__copysign): Do not declare and define as an inline function. * math/divtc3.c (__divtc3): Use copysign functions instead of __copysign variants. * math/multc3.c (__multc3): Likewise. * sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. (__ieee754_yn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise. * sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise. (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. (__ieee754_ynf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl) * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-27 20:04:48 +00:00
Joseph Myers	9755bc4686	Use round functions not __round functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined. An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else). Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_roundf.c: Likewise. * sysdeps/ieee754/dbl-64/s_round.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise. * sysdeps/ieee754/float128/s_roundf128.c: Likewise. * sysdeps/ieee754/flt-32/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl.	2018-09-27 12:35:23 +00:00
Joseph Myers	7abf97bed9	Use trunc functions not __trunc functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.	2018-09-20 21:11:10 +00:00
Szabolcs Nagy	d734727837	Fix the documentation comment of checkint in powf checkint in powf is not supposed to be used with 0, inf or nan inputs. * sysdeps/ieee754/flt-32/e_powf.c (checkint): Fix documentation.	2018-09-19 10:13:20 +01:00
Szabolcs Nagy	424c4f60ed	Add new pow implementation The algorithm is exp(y * log(x)), where log(x) is computed with about 1.32^-68 relative error (1.52^-68 without fma), returning the result in two doubles, and the exp part uses the same algorithm (and lookup tables) as exp, but takes the input as two doubles and a sign (to handle negative bases with odd integer exponent). The __exp1 internal symbol is no longer necessary. There is separate code path when fma is not available but the worst case error is about 0.54 ULP in both cases. The lookup table and consts for log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1] pow latency: 1.84x in [0.01 11.1]x[0.01 11.1] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets. * NEWS: Mention pow improvements. * math/Makefile (type-double-routines): Add e_pow_log_data. * sysdeps/generic/math_private.h (__exp1): Remove. * sysdeps/i386/fpu/e_pow_log_data.c: New file. * sysdeps/ia64/fpu/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma contraction. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove. (exp_inline): Remove. (__ieee754_exp): Only single double input is handled. * sysdeps/ieee754/dbl-64/e_pow.c: Rewrite. * sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define. (__pow_log_data): Define. * sysdeps/ieee754/dbl-64/upow.h: Remove. * sysdeps/ieee754/dbl-64/upow.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma contraction. (CFLAGS-e_pow-fma4.c): Likewise.	2018-09-19 10:04:51 +01:00
Joseph Myers	50bc59ca4d	Fix ldbl-128ibm ceill, floorl inlining of ceil, floor. The ldbl-128ibm implementations of ceill and floorl call the corresponding double functions. This patch fixes those implementations to call those functions as ceil and floor rather than as __ceil and __floor, so that the proper inlining takes place when possible, while including local asm redirections for when the functions are not inlined since NO_MATH_REDIRECT applies to the double functions as well as to the long double ones. Tested with build-many-glibcs.py for all its powerpc configurations. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c (ceil): Redirect to __ceil. (__ceill): Call ceil instead of __ceil. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c (floor): Redirect to __floor. (__floorl): Call floor instead of __floor.	2018-09-18 13:24:14 +00:00
Joseph Myers	71223ef909	Use ceil functions not __ceil functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-17 20:42:06 +00:00
Joseph Myers	f29b6f17e4	Use rint functions not __rint functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.	2018-09-14 13:10:39 +00:00
Joseph Myers	e44acb2063	Use floor functions not __floor functions in glibc libm. Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-14 13:09:01 +00:00
Szabolcs Nagy	3e08ff544b	Add new log2 implementation Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c) where the last term is approximated by a polynomial of x/c - 1, the first order coefficient is about 1/ln2 in this case. There is separate code path when fma instruction is not available for computing x/c - 1 precisely, for which the table size is doubled. The worst case error is 0.547 ULP (0.55 without fma), the read only global data size is 1168 bytes (2192 without fma) on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log2 thruput: 2.00x in [0.01 11.1] log2 latency: 2.04x in [0.01 11.1] log2 thruput: 2.17x in [0.999 1.001] log2 latency: 2.88x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linxu-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log2 improvements. * math/Makefile (type-double-routines): Add e_log2_data. * sysdeps/i386/fpu/e_log2_data.c: New file. * sysdeps/ia64/fpu/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/e_log2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add. * sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove. * sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.	2018-09-12 17:36:33 +01:00
Szabolcs Nagy	f41b0a43e4	Add new log implementation Optimized log using carefully generated lookup table with 1/c and log(c) values for small intervalls around 1. The log(c) is very near a double precision value, it has about 62 bits precision. The algorithm is log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is approximated by a polynomial of x/c - 1. Near 1 a single polynomial of x - 1 is used. There is separate code path when fma instruction is not available for computing x/c - 1 precisely, in which case the table size is doubled. The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined as an instruction. With the default configuration settings the worst case error is 0.519 ULP (and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma). The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log thruput: 3.28x in [0.01 11.1] log latency: 2.23x in [0.01 11.1] log thruput: 1.56x in [0.999 1.001] log latency: 1.57x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linux-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log improvement. * math/Makefile (type-double-routines): Add e_log_data. * sysdeps/i386/fpu/e_log_data.c: New file. * sysdeps/ia64/fpu/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/e_log.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add. * sysdeps/ieee754/dbl-64/ulog.h: Remove. * sysdeps/ieee754/dbl-64/ulog.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.	2018-09-12 17:33:30 +01:00
Szabolcs Nagy	e70c176825	Add new exp and exp2 implementations Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). * NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. * sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2018-09-05 16:22:00 +01:00
Joseph Myers	418d99e622	Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. <fenv_private.h> has inline versions of various <fenv.h> functions, and their __fe* variants, for systems (generally soft-float) without support for floating-point exceptions, rounding modes or both. Having these inlines in a separate header introduces a risk of a source file including <fenv.h> and compiling OK on x86_64, but failing to compile (because the feraiseexcept inline is actually a macro that discards its argument, to avoid the need for #ifdef FE_INVALID conditionals), or not being properly optimized, on systems without the exceptions and rounding modes support (when these inlines were in math_private.h, we had a few cases where this broke the build because there was no obvious reason for a file to need math_private.h and it didn't need that header on x86_64). By moving those inlines to include/fenv.h, this risk can be avoided, and fenv_private.h becomes more clearly defined as specifically the header for the internal libc_fe* and SET_RESTORE_ROUND* interfaces. This patch makes that move, removing fenv_private.h includes that are no longer needed (or replacing them by fenv.h includes in a few cases that didn't already have such an include). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/fenv_private.h [FE_ALL_EXCEPT == 0]: Move this code .... [!FE_HAVE_ROUNDING_MODES]: And this code .... * include/fenv.h [!_ISOMAC]: ... to here. * math/fraiseexcpt.c (__feraiseexcept): Undefine as macro. (feraiseexcept): Likewise. * math/fromfp.h: Do not include <fenv_private.h>. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * math/w_ilogb_template.c: Include <fenv.h> instead of <fenv_private.h>. * math/w_llogb_template.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise.	2018-09-04 19:52:06 +00:00
Joseph Myers	70e2ba332f	Do not include fenv_private.h in math_private.h. Continuing the clean-up related to the catch-all math_private.h header, this patch stops math_private.h from including fenv_private.h. Instead, fenv_private.h is included directly from those users of math_private.h that also used interfaces from fenv_private.h. No attempt is made to remove unused includes of math_private.h, but that is a natural followup. (However, since math_private.h sometimes defines optimized versions of math.h interfaces or __* variants thereof, as well as defining its own interfaces, I think it might make sense to get all those optimized versions included from include/math.h, not requiring a separate header at all, before eliminating unused math_private.h includes - that avoids a file quietly becoming less-optimized if someone adds a call to one of those interfaces without restoring a math_private.h include to that file.) There is still a pitfall that if code uses plain fe* and __fe* interfaces, but only includes fenv.h and not fenv_private.h or (before this patch) math_private.h, it will compile on platforms with exceptions and rounding modes but not get the optimized versions (and possibly not compile) on platforms without exception and rounding mode support, so making it easy to break the build for such platforms accidentally. I think it would be most natural to move the inlines / macros for fe* and __fe* in the case of no exceptions and rounding modes into include/fenv.h, so that all code including fenv.h with _ISOMAC not defined automatically gets them. Then fenv_private.h would be purely the header for the libc_fe, SET_RESTORE_ROUND etc. internal interfaces and the risk of breaking the build on other platforms than the one you tested on because of a missing fenv_private.h include would be much reduced (and there would be some unused fenv_private.h includes to remove along with unused math_private.h includes). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. sysdeps/generic/math_private.h: Do not include <fenv_private.h>. * math/fromfp.h: Include <fenv_private.h>. * math/math-narrow.h: Likewise. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_ilogb_template.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_llogb_template.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. * sysdeps/ieee754/dbl-64/e_remainder.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_product.c: Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/x2y2m1.c: Likewise. * sysdeps/ieee754/float128/float128_private.h: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1l.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise.	2018-09-03 21:09:04 +00:00
Wilco Dijkstra	ca3aac57ef	Remove unused math files Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c, k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c (and the ia64, m68k and powerpc equivalents) are no longer used, so remove them. All e_rem_pio2.c files were already empty or commented out, so remove them too. Passes build-many-glibcs. * math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c. Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c. * sysdeps/i386/fpu/e_rem_pio2.c: Delete file. * sysdeps/ia64/fpu/e_rem_pio2.c: Likewise. * sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise. * sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/k_cos.c: Likewise. * sysdeps/ieee754/dbl-64/k_sin.c: Likewise. * sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_cosf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise * sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise * sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise * sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise. * sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.	2018-08-24 15:34:54 +01:00
Wilco Dijkstra	900fb446eb	Speedup tanf range reduction Speedup tanf range reduction by using the new sincosf range reduction algorithm. Overall code quality is improved due to inlining, so there is a speedup even if no range reduction is required. tanf throughput gains on Cortex-A72: * \|x\| < M_PI_4 : 1.1x * \|x\| < M_PI_2 : 1.2x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.6x * \|x\| < Inf : 12.1x * sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.	2018-08-23 12:38:16 +01:00
Wilco Dijkstra	126c4e3f80	Use generic sinf/cosf in lgammaf_r The internal functions __kernel_sinf and __kernel_cosf are used only by lgammaf_r. Removing the internal functions and using the generic sinf and cosf is better overall. Benchmarking on Cortex-A72 shows the generic sinf and cosf are 1.4x and 2.3x faster in the range \|x\| < PI/4, and 0.66x and 1.1x for \|x\| < PI/2, so it should make lgammaf_r faster on average. GLIBC regression tests pass on AArch64. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf. * sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.	2018-08-15 16:01:21 +01:00
Wilco Dijkstra	599cf39766	Improve performance of sinf and cosf The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.2x * \|x\| < M_PI_4 : 1.8x * \|x\| < 2 * M_PI: 1.7x * \|x\| < 120.0 : 2.3x * \|x\| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.	2018-08-14 10:45:59 +01:00
Wilco Dijkstra	ea5c662c62	Improve performance of sincosf This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.6x * \|x\| < M_PI_4 : 1.7x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.8x * \|x\| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.	2018-08-10 17:34:39 +01:00

1 2 3 4 5 ...

1129 Commits