glibc/sysdeps/ieee754
Adhemerval Zanella 3323476641 i686: Use generic sinf implementation for SSE2 version
Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X);
the generic algorithm shows slightly better performance for
the 'workload-huge.wrf' input set.

* s_sinf-sse2.S:
  "sinf": {
   "": {
    "duration": 3.72405e+09,
    "iterations": 2.38374e+08,
    "max": 63.973,
    "min": 11.211,
    "mean": 15.6227
   },
   "workload-random.wrf": {
    "duration": 3.76923e+09,
    "iterations": 8.4e+07,
    "reciprocal-throughput": 17.6355,
    "latency": 72.108,
    "max-throughput": 5.67037e+07,
    "min-throughput": 1.38681e+07
   },
   "workload-huge.wrf": {
    "duration": 3.76943e+09,
    "iterations": 6e+07,
    "reciprocal-throughput": 29.3493,
    "latency": 96.2985,
    "max-throughput": 3.40724e+07,
    "min-throughput": 1.03844e+07
   }
  }

* generic s_sinf.c:
  "sinf": {
   "": {
    "duration": 3.70989e+09,
    "iterations": 2.18025e+08,
    "max": 69.782,
    "min": 11.1,
    "mean": 17.0159
   },
   "workload-random.wrf": {
    "duration": 3.77213e+09,
    "iterations": 9.6e+07,
    "reciprocal-throughput": 17.5402,
    "latency": 61.0459,
    "max-throughput": 5.70119e+07,
    "min-throughput": 1.63811e+07
   },
   "workload-huge.wrf": {
    "duration": 3.81576e+09,
    "iterations": 5.6e+07,
    "reciprocal-throughput": 38.2111,
    "latency": 98.0659,
    "max-throughput": 2.61704e+07,
    "min-throughput": 1.01972e+07
   }
  }
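
The two dumps are easier to compare side by side as relative changes. The following is a minimal sketch, not part of the commit or of the glibc benchtests: the dictionary layout, the helper names `relative_change` and `compare`, and the "(default inputs)" label are all hypothetical, and it assumes that lower values are better for "mean", "reciprocal-throughput" and "latency". The figures are copied from the output above.

```python
# Figures copied from the two benchmark dumps above (time per call;
# assumed here to be "lower is better").
SSE2 = {
    "": {"mean": 15.6227},
    "workload-random.wrf": {"reciprocal-throughput": 17.6355, "latency": 72.108},
    "workload-huge.wrf": {"reciprocal-throughput": 29.3493, "latency": 96.2985},
}
GENERIC = {
    "": {"mean": 17.0159},
    "workload-random.wrf": {"reciprocal-throughput": 17.5402, "latency": 61.0459},
    "workload-huge.wrf": {"reciprocal-throughput": 38.2111, "latency": 98.0659},
}


def relative_change(old, new):
    """Percent change from old to new; negative means new is smaller."""
    return (new - old) / old * 100.0


def compare(baseline, candidate):
    """Return (variant, metric, old, new, percent change) for each metric."""
    rows = []
    for variant, metrics in baseline.items():
        for metric, old in metrics.items():
            new = candidate[variant][metric]
            # The unnamed variant "" is the default input set; the label
            # used for it here is a hypothetical display name.
            label = variant or "(default inputs)"
            rows.append((label, metric, old, new, relative_change(old, new)))
    return rows


if __name__ == "__main__":
    for variant, metric, old, new, pct in compare(SSE2, GENERIC):
        print(f"{variant:22} {metric:22} {old:9.4f} -> {new:9.4f} ({pct:+.1f}%)")
```

Running the script prints one line per metric, with the SSE2 assembly version as the baseline and the generic C version as the candidate.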

Checked on i686-linux-gnu.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
dbl-64 x86_64: Optimize sincos where sin/cos is optimized (bug 29193) 2022-06-01 10:29:52 +02:00
float128 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
flt-32 i686: Use generic sinf implementation for SSE2 version 2022-06-01 10:47:44 -03:00
ldbl-64-128 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ldbl-96 math: Add math-use-builtins-fabs (BZ#29027) 2022-05-23 17:49:18 -03:00
ldbl-128 math: Add math-use-builtins-fabs (BZ#29027) 2022-05-23 17:49:18 -03:00
ldbl-128ibm math: Add math-use-builtins-fabs (BZ#29027) 2022-05-23 17:49:18 -03:00
ldbl-128ibm-compat Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ldbl-opt Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
soft-fp Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ieee754.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
k_standard.c Use copysign functions not __copysign functions in glibc libm. 2018-09-27 20:04:48 +00:00
k_standardf.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
k_standardl.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
libm-alias-finite.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Makefile Avoid -Wno-write-strings for k_standard.c. 2015-02-26 22:50:54 +00:00
s_lib_version.c Simplify math-svid-compat code. 2017-08-28 15:19:52 +00:00
s_matherr.c Obsolete matherr, _LIB_VERSION, libieee.a. 2017-08-21 17:45:10 +00:00
s_signgam.c Remove unnecessary math_private.h includes. 2018-09-28 21:53:33 +00:00