Commit Graph

5 Commits

Author SHA1 Message Date
Joe Ramsay
e302e10213 aarch64/fpu: Sync libmvec routines from 2.39 and before with AOR
This includes a fix for big-endian in AdvSIMD log, some cosmetic
changes, and numerous small optimisations mainly around inlining and
using indexed variants of MLA intrinsics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-02-26 09:45:50 -03:00
Paul Eggert
dff8da6b3e Update copyright dates with scripts/update-copyrights
2024-01-01 10:53:40 -08:00
Joe Ramsay
cc0d77ba94 aarch64: Add half-width versions of AdvSIMD f32 libmvec routines
Compilers may emit calls to 'half-width' routines (two-lane
single-precision variants). These have been added in the form of
wrappers around the full-width versions, where the low half of the
vector is simply duplicated. This will perform poorly when one lane
triggers the special-case handler, as there will be a redundant call
to the scalar version, however this is expected to be rare at Ofast.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-12-20 08:41:25 +00:00
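
The wrapper scheme described in cc0d77ba94 can be sketched in portable C. This is an illustration only, not the glibc source: the struct types `f32x4` and `f32x2` and the functions `full_width_op` and `half_width_op` are hypothetical stand-ins for the AdvSIMD vector types and libmvec routines, and the cubic polynomial is a placeholder for the real computation.

```c
#include <stddef.h>

/* Hypothetical stand-ins for four-lane and two-lane float vectors.
   Plain structs are used so the sketch builds on any target.  */
typedef struct { float lane[4]; } f32x4;
typedef struct { float lane[2]; } f32x2;

/* Hypothetical full-width routine: applies the same operation to each
   of the four lanes.  The polynomial x - x^3/6 is only a placeholder.  */
f32x4 full_width_op (f32x4 x)
{
  f32x4 r;
  for (size_t i = 0; i < 4; i++)
    r.lane[i] = x.lane[i] - x.lane[i] * x.lane[i] * x.lane[i] / 6.0f;
  return r;
}

/* Half-width wrapper: duplicate the two input lanes into the upper
   half, call the full-width routine, and keep only the low half.  */
f32x2 half_width_op (f32x2 x)
{
  f32x4 wide = { { x.lane[0], x.lane[1], x.lane[0], x.lane[1] } };
  f32x4 res = full_width_op (wide);
  f32x2 out = { { res.lane[0], res.lane[1] } };
  return out;
}
```

Because the upper two lanes only repeat the lower two, any input that needs the special-case path is handled twice inside the full-width call, which is the redundancy the commit message mentions.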
Wilco Dijkstra
6b695e5c62 AArch64: Remove -0.0 check from vector sin
Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf,
improving performance.  Passes regress.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-09-26 13:40:07 +01:00
Joe Ramsay
3bb1af2051 aarch64: Add vector implementations of sin routines
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.

As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30 09:04:16 +01:00
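
The two techniques named in 3bb1af2051, reading a data table through a barrier and keeping the special-case handler out of line, can be sketched in GNU C as follows. Everything here is an assumption made for illustration: the macro bodies, the `approx_data` table, its coefficients and bound, and the functions `special_case` and `approx` are hypothetical and are not copied from glibc or Arm Optimized Routines. The sketch relies on GCC/Clang extensions (statement expressions, `__typeof`, inline asm).

```c
/* Keep the rarely taken handler as a real call rather than inlined code.  */
#define NOINLINE __attribute__ ((noinline))

/* An empty asm statement that claims to modify the pointer.  The compiler
   can no longer see which object it points to, so it loads constants from
   the table instead of folding them into the code.  */
#define ptr_barrier(ptr)            \
  ({                                \
    __typeof (ptr) p_ = (ptr);      \
    __asm ("" : "+r"(p_));          \
    p_;                             \
  })

/* Hypothetical table: two polynomial coefficients and a fast-path bound.  */
static const struct approx_data
{
  float c1, c3, bound;
} approx_data = { 1.0f, -1.0f / 6.0f, 1.0f };

/* Placeholder fallback for inputs outside the fast-path range.  */
NOINLINE float special_case (float x)
{
  return x > 0.0f ? 1.0f : -1.0f;
}

/* Fast path evaluates c1*x + c3*x^3 using constants read via the barrier.  */
float approx (float x)
{
  const struct approx_data *d = ptr_barrier (&approx_data);
  if (x > d->bound || x < -d->bound)
    return special_case (x);
  return x * (d->c1 + d->c3 * x * x);
}
```

In this sketch the fast path stays small and branch-light, while the out-of-range case pays for an ordinary function call only when it is actually taken.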