glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-29 08:11:08 +00:00

Author	SHA1	Message	Date
Joseph Myers	7ec903e028	Implement C23 exp2m1, exp10m1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and exp10(x)-1, like expm1). As with other such functions, these use type-generic templates that could be replaced with faster and more accurate type-specific implementations in future. Test inputs are copied from those for expm1, plus some additions close to the overflow threshold (copied from exp2 and exp10) and also some near the underflow threshold. exp2m1 has the unusual property of having an input (M_MAX_EXP) where whether the function overflows (under IEEE semantics) depends on the rounding mode. Although these could reasonably be XFAILed in the testsuite (as we do in some cases for arguments very close to a function's overflow threshold when an error of a few ulps in the implementation can result in the implementation not agreeing with an ideal one on whether overflow takes place - the testsuite isn't smart enough to handle this automatically), since these functions aren't required to be correctly rounding, I made the implementation check for and handle this case specially. The Makefile ordering expected by lint-makefiles for the new functions is a bit peculiar, but I implemented it in this patch so that the test passes; I don't know why log2 also needed moving in one Makefile variable setting when it didn't in my previous patches, but the failure showed a different place was expected for that function as well. The powerpc64le IFUNC setup seems not to be as self-contained as one might hope; it shouldn't be necessary to add IFUNCs for new functions such as these simply to get them building, but without setting up IFUNCs for the new functions, there were undefined references to __GI___expm1f128 (that IFUNC machinery results in no such function being defined, but doesn't stop include/math.h from doing the redirection resulting in the exp2m1f128 and exp10m1f128 implementations expecting to call it). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 16:31:49 +00:00
Joseph Myers	55eb99e9a9	Implement C23 log10p1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for base-10 logarithms). This is directly analogous to the log2p1 implementation (except that whereas log2p1 has a smaller underflow range than log1p, log10p1 has a larger underflow range). The test inputs are copied from those for log1p and log2p1, plus a few more inputs in that wider underflow range. Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:48:13 +00:00
Joseph Myers	bb014f50c4	Implement C23 logp1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch does mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are not updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:47:09 +00:00
Joseph Myers	79c52daf47	Implement C23 log2p1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for base-2 logarithms). This illustrates the intended structure of implementations of all these function families: define them initially with a type-generic template implementation. If someone wishes to add type-specific implementations, it is likely such implementations can be both faster and more accurate than the type-generic one and can then override it for types for which they are implemented (adding benchmarks would be desirable in such cases to demonstrate that a new implementation is indeed faster). The test inputs are copied from those for log1p. Note that these changes make gen-auto-libm-tests depend on MPFR 4.2 (or later). The bulk of the changes are fairly generic for any such new function. (sysdeps/powerpc/nofpu/Makefile only needs changing for those type-generic templates that use fabs.) Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-05-20 13:41:39 +00:00
H.J. Lu	9e1f4aef86	x86-64: Exclude FMA4 IFUNC functions for -mapxf When -mapxf is used to build glibc, the resulting glibc will never run on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used. This requires GCC which defines __APX_F__ for -mapxf with commit: 1df56719bd8 x86: Define __APX_F__ for -mapxf Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-04-06 05:03:55 -07:00
Adhemerval Zanella	44ccc2465c	math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603) The implementations of trunc functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	932544efa4	math: x86 floor traps when FE_INEXACT is enabled (BZ 31601) The implementations of floor functions using x87 floating point (i386 and 86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	637bfc392f	math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600) The implementations of ceil functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Sunil K Pandey	9f78a7c1d0	x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch When glibc is built with ISA level 3 or higher by default, the resulting glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by default. When glibc is built with ISA level 2 enabled by default, only keep SSE4.1 variant. Fixes BZ 31335. NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind doesn't support AVX512 instructions: https://bugs.kde.org/show_bug.cgi?id=383010 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-25 13:20:51 -08:00
H.J. Lu	ef7f4b1fef	Apply the Makefile sorting fix Apply the Makefile sorting fix generated by sort-makefile-lines.py.	2024-02-15 11:19:56 -08:00
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
Bruno Haible	787282dede	x86: Do not raises floating-point exception traps on fesetexceptflag (BZ 30990) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. When we need to clear a flag, we need to do so in both units, due to the way fetestexcept is implemented. When we need to set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
H.J. Lu	a8ecb126d4	x86_64: Add log1p with FMA On Skylake, it changes log1p bench performance by: Before After Improvement max 63.349 58.347 8% min 4.448 5.651 -30% mean 12.0674 10.336 14% The minimum code path is if (hx < 0x3FDA827A) /* x < 0.41422 / { if (__glibc_unlikely (ax >= 0x3ff00000)) / x <= -1.0 / { ... } if (__glibc_unlikely (ax < 0x3e200000)) / \|x\| < 2*-29 / { math_force_eval (two54 + x); /* raise inexact / if (ax < 0x3c900000) / \|x\| < 2*-54 / { ... } else return x - x * x * 0.5; FMA and non-FMA code sequences look similar. Non-FMA version is slightly faster. Since log1p is called by asinh and atanh, it improves asinh performance by: Before After Improvement max 75.645 63.135 16% min 10.074 10.071 0% mean 15.9483 14.9089 6% and improves atanh performance by: Before After Improvement max 91.768 75.081 18% min 15.548 13.883 10% mean 18.3713 16.8011 8%	2023-08-21 10:44:26 -07:00
H.J. Lu	1b214630ce	x86_64: Add expm1 with FMA On Skylake, it improves expm1 bench performance by: Before After Improvement max 70.204 68.054 3% min 20.709 16.2 22% mean 22.1221 16.7367 24% NB: Add extern long double __expm1l (long double); extern long double __expm1f128 (long double); for __typeof (__expm1l) and __typeof (__expm1f128) when __expm1 is defined since __expm1 may be expanded in their declarations which causes the build failure.	2023-08-14 08:14:19 -07:00
H.J. Lu	f6b10ed8e9	x86_64: Add log2 with FMA On Skylake, it improves log2 bench performance by: Before After Improvement max 208.779 63.827 69% min 9.977 6.55 34% mean 10.366 6.8191 34%	2023-08-11 07:49:45 -07:00
H.J. Lu	881546979d	x86_64: Sort fpu/multiarch/Makefile Sort Makefile variables using scripts/sort-makefile-lines.py. No code generation changes observed in libm. No regressions on x86_64.	2023-08-10 11:23:25 -07:00
Andreas K. Hüttel	6d457ff36a	Update x86_64 libm-test-ulps (x32 ABI) Based on feedback by Mike Gilbert <floppym@gentoo.org> Linux-6.1.38-dist x86_64 AMD Phenom-tm- II X6 1055T Processor -march=amdfam10 failures occur for x32 ABI Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2023-07-19 16:56:54 +02:00
Paul Pluzhnikov	1e9d5987fd	Fix misspellings in sysdeps/x86_64 -- BZ 25337. Applying this commit results in bit-identical rebuild of libc.so.6 math/libm.so.6 elf/ld-linux-x86-64.so.2 mathvec/libmvec.so.1 Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-05-23 10:25:11 +00:00
Paul Pluzhnikov	1d2971b525	Fix misspellings in sysdeps/x86_64/fpu/multiarch -- BZ 25337. Applying this commit results in a bit-identical rebuild of mathvec/libmvec.so.1 (which is the only binary that gets rebuilt). Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-05-23 03:28:58 +00:00
Joe Ramsay	cd94326a13	Enable libmvec support for AArch64 This patch enables libmvec on AArch64. The proposed change is mainly implementing build infrastructure to add the new routines to ABI, tests and benchmarks. I have demonstrated how this all fits together by adding implementations for vector cos, in both single and double precision, targeting both Advanced SIMD and SVE. The implementations of the routines themselves are just loops over the scalar routine from libm for now, as we are more concerned with getting the plumbing right at this point. We plan to contribute vector routines from the Arm Optimized Routines repo that are compliant with requirements described in the libmvec wiki. Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising the minimum GCC by such a big jump, we allow users to disable libmvec if their compiler is too old. Note that at this point users have to manually call the vector math functions. This seems to be acceptable to some downstream users. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-05-03 12:09:49 +01:00
Florian Weimer	5d1ccdda7b	x86_64: Fix asm constraints in feraiseexcept (bug 30305) The divss instruction clobbers its first argument, and the constraints need to reflect that. Fortunately, with GCC 12, generated code does not actually change, so there is no externally visible bug. Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-04-03 18:40:52 +02:00
Joe Ramsay	e4d336f1ac	benchtests: Move libmvec benchtest inputs to benchtests directory This allows other targets to use the same inputs for their own libmvec microbenchmarks without having to duplicate them in their own subdirectory. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-03-27 17:04:03 +01:00
H.J. Lu	04a558e669	x86_64: Update libm test ulps Update libm test ulps for commit `3efbf11fdf` Author: Paul Zimmermann <Paul.Zimmermann@inria.fr> Date: Tue Feb 14 11:24:59 2023 +0100 update auto-libm-test-out-hypot Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-02-27 08:39:32 -08:00
Joseph Myers	6d7e8eda9b	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
Florian Weimer	e88b9f0e5c	stdio-common: Convert vfprintf and related functions to buffers vfprintf is entangled with vfwprintf (of course), __printf_fp, __printf_fphex, __vstrfmon_l_internal, and the strfrom family of functions. The latter use the internal snprintf functionality, so vsnprintf is converted as well. The simples conversion is __printf_fphex, followed by __vstrfmon_l_internal and __printf_fp, and finally __vfprintf_internal and __vfwprintf_internal. __vsnprintf_internal and strfrom* are mostly consuming the new interfaces, so they are comparatively simple. __printf_fp is a public symbol, so the FILE -based interface had to preserved. The __printf_fp rewrite does not change the actual binary-to-decimal conversion algorithm, and digits are still not emitted directly to the target buffer. However, the staging buffer now uses bytes instead of wide characters, and one buffer copy is eliminated. The changes are at least performance-neutral in my testing. Floating point printing and snprintf improved measurably, so that this Lua script for i=1,5000000 do print(i, i math.pi) end runs about 5% faster for me. To preserve fprintf performance for a simple "%d" format, this commit has some logic changes under LABEL (unsigned_number) to avoid additional function calls. There are certainly some very easy performance improvements here: binary, octal and hexadecimal formatting can easily avoid the temporary work buffer (the number of digits can be computed ahead-of-time using one of the __builtin_clz* built-ins). Decimal formatting can use a specialized version of _itoa_word for base 10. The existing (inconsistent) width handling between strfmon and printf is preserved here. __print_fp_buffer_1 would have to use __translated_number_width to achieve ISO conformance for printf. Test expectations in libio/tst-vtables-common.c are adjusted because the internal staging buffer merges all virtual function calls into one. In general, stack buffer usage is greatly reduced, particularly for unbuffered input streams. __printf_fp can still use a large buffer in binary128 mode for %g, though. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-12-19 18:56:54 +01:00
Noah Goldstein	f704192911	x86/fpu: Factor out shared avx2/avx512 code in svml_{s\|d}_wrapper_impl.h Code is exactly the same for the two so better to only maintain one version. All math and mathvec tests pass on x86.	2022-11-27 20:22:49 -08:00
Noah Goldstein	72f6a5a0ed	x86/fpu: Cleanup code in svml_{s\|d}_wrapper_impl.h 1. Remove unnecessary spills. 2. Fix some small nit missed optimizations. All math and mathvec tests pass on x86.	2022-11-27 20:22:49 -08:00
Noah Goldstein	d371be4b11	x86/fpu: Reformat svml_{s\|d}_wrapper_impl.h Just reformat with the style convention used in other x86 assembler files. This doesn't change libm.so or libmvec.so.	2022-11-27 20:22:49 -08:00
Noah Goldstein	95177b78ff	x86/fpu: Fix misspelled evex512 section in variety of svml files ``` .section .text.evex512, "ax", @progbits ``` With misspelled as: ``` .section .text.exex512, "ax", @progbits ```	2022-11-27 20:22:49 -08:00
Noah Goldstein	e1d082d9de	x86/fpu: Add missing ISA sections to variety of svml files Many sse4/avx2/avx512 files where just in .text.	2022-11-27 20:22:49 -08:00
Adhemerval Zanella	114e299ca6	x86: Remove .tfloat usage Some compiler does not support it (such as clang integrated assembler) neither gcc emits it.	2022-10-03 14:03:21 -03:00
Noah Goldstein	3079f652d7	x86: Replace all sse instructions with vex equivilent in avx+ files Most of these don't really matter as there was no dirty upper state but we should generally avoid stray sse when its not needed. The one case that really matters is in svml_d_tanh4_core_avx2.S: blendvps %xmm0, %xmm8, %xmm7 When there was a dirty upper state. Tested on x86_64-linux	2022-06-22 19:42:17 -07:00
Noah Goldstein	cffb9414c5	x86: Optimize svml_s_tanhf4_core_sse4.S Optimizations are: 1. Reduce code size (-112 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata size (-4k+ rodata is shared with avx2). Result is roughly a 15-16% speedup: Function, New Time, Old Time, New / Old _ZGVbN4v_tanhf, 3.158, 3.749, 0.842	2022-06-09 12:51:25 -07:00
Noah Goldstein	bcc41f66a4	x86: Optimize svml_s_tanhf8_core_avx2.S Optimizations are: 1. Reduce code size (-81 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata size (-32 bytes). Result is roughly a 17-18% speedup: Function, New Time, Old Time, New / Old _ZGVdN8v_tanhf, 1.977, 2.402, 0.823	2022-06-09 12:51:22 -07:00
Noah Goldstein	3a49ce8799	x86: Add data file that can be shared by tanhf-avx2 and tanhf-sse4 tanhf-avx2 and tanhf-sse4 use the same data tables so we can save over 4kb using a shared datatable. This does increase the memory footprint of the sse4 version (as now all the targets are 32 bytes instead of 16), generally it seems worth the code size save. NB: This patch doesn't do anything itself, it is setup for future patches.	2022-06-09 12:51:15 -07:00
Noah Goldstein	e560b3c2d2	x86: Optimize svml_s_tanhf16_core_avx512.S Optimizations are: 1. Reduce code size (-67 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Reduce rodata usage (-448 bytes). Result is roughly a 14% speedup: Function, New Time, Old Time, New / Old _ZGVeN16v_tanhf, 0.649, 0.752, 0.863	2022-06-09 12:51:12 -07:00
Noah Goldstein	fe1915d4f6	x86: Improve svml_s_atanhf4_core_sse4.S Improvements are: 1. Reduce code size (-62 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Reduce rodata usage (-16 bytes). The throughput improvement is not significant as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVbN4v_atanhf, 8.821, 8.903, 0.991	2022-06-09 12:51:09 -07:00
Noah Goldstein	65897e9916	x86: Improve svml_s_atanhf8_core_avx2.S Improvements are: 1. Reduce code size (-60 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Prefer registers which get short instruction encoding. 5. Shrink rodata usage (-32 bytes). The throughput improvement is not that significant (3-5%) as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVdN8v_atanhf, 2.799, 2.923, 0.958	2022-06-09 12:51:04 -07:00
Noah Goldstein	73bae395cf	x86: Improve svml_s_atanhf16_core_avx512.S Improvements are: 1. Reduce code size (-64 bytes). 2. Remove redundant move instructions. 3. Slightly improve instruction selection/scheduling where possible. 4. Reduce rodata size ([-128, -188] bytes). The throughput improvement is not significant as the port 0 bottleneck is unavoidable. Function, New Time, Old Time, New / Old _ZGVeN16v_atanhf, 1.39, 1.408, 0.987	2022-06-09 12:50:58 -07:00
Andreas Schwab	dc1e5eeb25	x86_64: Optimize sincos where sin/cos is optimized (bug 29193) The compiler may substitute calls to sin or cos with calls to sincos, thus we should have the same optimized implementations for sincos. The optimized implementations may produce results that differ, that also makes sure that the sincos call aggrees with the sin and cos calls.	2022-06-01 10:29:52 +02:00
Adhemerval Zanella	efeb2bd1ab	math: Add math-use-builtins-fabs (BZ#29027) Both float, double, and _Float128 are assumed to be supported (float and double already only uses builtins). Only long double is parametrized due GCC bug 29253 which prevents its usage on powerpc. It allows to remove i686, ia64, x86_64, powerpc, and sparc arch specific implementation. On ia64 it also fixes the sNAN handling: math/test-float64x-fabs math/test-ldouble-fabs Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.	2022-05-23 17:49:18 -03:00
Siddhesh Poyarekar	5b5b1012d5	benchtests: Better libmvec integration Improve libmvec benchmark integration so that in future other architectures may be able to run their libmvec benchmarks as well. This now allows libmvec benchmarks to be run with `make BENCHSET=bench-math`. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-04-29 11:48:18 +05:30
Siddhesh Poyarekar	944afe6d95	benchtests: Add UNSUPPORTED benchmark status The libmvec benchmarks print a message indicating that a certain CPU feature is unsupported and exit prematurelyi, which breaks the JSON in bench.out. Handle this more elegantly in the bench makefile target by adding support for an UNSUPPORTED exit status (77) so that bench.out continues to have output for valid tests. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-04-29 11:48:16 +05:30
Adhemerval Zanella	13d45cf9a7	x86: Remove fcopysign{f} implementation The builtin used by generic code generates similar code. Checked on x86_64-linux-gnu and i686-linux-gnu.	2022-04-07 12:17:15 -03:00
Adhemerval Zanella	cbc2c56bab	benchtests: Only build libmvec benchmarks iff $(build-mathvec) is set Checked on x86_64-linux-gnu.	2022-04-05 12:01:10 -03:00
Adhemerval Zanella	7eed708edf	x86: Remove fabs{f} implementation For x86_64 is the same as the generic implementation, while for i686 the builtin generates the same code.	2022-04-04 16:23:11 -03:00
Sunil K Pandey	6de743a4e3	x86_64: Fix svml_d_tanh8_core_avx512.S code formatting This commit contains following formatting changes 1. Instructions proceeded by a tab. 2. Instruction less than 8 characters in length have a tab between it and the first operand. 3. Instruction greater than 7 characters in length have a space between it and the first operand. 4. Tabs after `#define`d names and their value. 5. 8 space at the beginning of line replaced by tab. 6. Indent comments with code. 7. Remove redundent .text section. 8. 1 space between line content and line comment. 9. Space after all commas. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-03-07 21:44:09 -08:00
Sunil K Pandey	28ba5ee77f	x86_64: Fix svml_d_tanh4_core_avx2.S code formatting This commit contains following formatting changes 1. Instructions proceeded by a tab. 2. Instruction less than 8 characters in length have a tab between it and the first operand. 3. Instruction greater than 7 characters in length have a space between it and the first operand. 4. Tabs after `#define`d names and their value. 5. 8 space at the beginning of line replaced by tab. 6. Indent comments with code. 7. Remove redundent .text section. 8. 1 space between line content and line comment. 9. Space after all commas. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-03-07 21:44:09 -08:00
Sunil K Pandey	06c7208f27	x86_64: Fix svml_d_tanh2_core_sse4.S code formatting This commit contains following formatting changes 1. Instructions proceeded by a tab. 2. Instruction less than 8 characters in length have a tab between it and the first operand. 3. Instruction greater than 7 characters in length have a space between it and the first operand. 4. Tabs after `#define`d names and their value. 5. 8 space at the beginning of line replaced by tab. 6. Indent comments with code. 7. Remove redundent .text section. 8. 1 space between line content and line comment. 9. Space after all commas. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-03-07 21:44:09 -08:00
Sunil K Pandey	2c632117bf	x86_64: Fix svml_s_tanhf8_core_avx2.S code formatting This commit contains following formatting changes 1. Instructions proceeded by a tab. 2. Instruction less than 8 characters in length have a tab between it and the first operand. 3. Instruction greater than 7 characters in length have a space between it and the first operand. 4. Tabs after `#define`d names and their value. 5. 8 space at the beginning of line replaced by tab. 6. Indent comments with code. 7. Remove redundent .text section. 8. 1 space between line content and line comment. 9. Space after all commas. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-03-07 21:44:09 -08:00

1 2 3 4 5 ...

739 Commits