glibc/sysdeps/x86_64/wcscat.S
Noah Goldstein 52cf11004e x86: Add avx2 optimized functions for the wchar_t strcpy family
Implemented:
    wcscat-avx2  (+ 744 bytes)
    wcscpy-avx2  (+ 539 bytes)
    wcpcpy-avx2  (+ 577 bytes)
    wcsncpy-avx2 (+1108 bytes)
    wcpncpy-avx2 (+1214 bytes)
    wcsncat-avx2 (+1085 bytes)

Performance Changes:
    Times are from N = 10 runs of the benchmark suite and are reported
    as the geometric mean of all ratios of New Implementation / Best
    Old Implementation. The Best Old Implementation is the existing
    implementation for the highest ISA level.

    wcscat-avx2     -> 0.975
    wcscpy-avx2     -> 0.591
    wcpcpy-avx2     -> 0.698
    wcsncpy-avx2    -> 0.730
    wcpncpy-avx2    -> 0.711
    wcsncat-avx2    -> 0.954

Code Size Changes:
    This change increases the size of libc.so by ~5.5kb. For
    reference, the patch optimizing the normal strcpy family functions
    decreases libc.so by ~5.2kb.

Full check passes on x86-64 and build succeeds for all ISA levels w/
and w/o multiarch.
2022-11-08 19:22:33 -08:00


/* ISA level static dispatch for wcscat .S files.
   Copyright (C) 2022 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

/* wcscat non-multiarch build is split into two files,
   wcscat-generic.c and wcscat.S.  The wcscat-generic.c build is
   for ISA level <= 2 (this file only provides an implementation
   for ISA level >= 3) and just uses multiarch/wcscat-generic.c.
   This must be split into two files because we cannot include C
   code from assembly or vice versa.  */

#include <isa-level.h>
#if MINIMUM_X86_ISA_LEVEL >= 3
# define WCSCAT __wcscat
# define DEFAULT_IMPL_V4 "multiarch/wcscat-evex.S"
# define DEFAULT_IMPL_V3 "multiarch/wcscat-avx2.S"
/* isa-default-impl.h expects DEFAULT_IMPL_V1 to be defined but it
should never be used from here. */
# define DEFAULT_IMPL_V1 "ERROR -- Invalid ISA IMPL"
# include "isa-default-impl.h"
weak_alias (__wcscat, wcscat)
libc_hidden_def (__wcscat)
#endif
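
The static dispatch above can be illustrated with a small C sketch. This is not the contents of isa-default-impl.h; it only mimics the idea that a build-time ISA level selects one implementation file. The names SKETCH_MIN_ISA_LEVEL, SKETCH_IMPL_V3, SKETCH_IMPL_V4, SKETCH_SELECTED and sketch_selected_impl are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for MINIMUM_X86_ISA_LEVEL.  In the real build
   the level comes from <isa-level.h>; here it is hard-coded.  Change
   it to 4 to see the EVEX file selected instead.  */
#define SKETCH_MIN_ISA_LEVEL 3

/* Same file names as the DEFAULT_IMPL_V4 / DEFAULT_IMPL_V3 defines.  */
#define SKETCH_IMPL_V4 "multiarch/wcscat-evex.S"
#define SKETCH_IMPL_V3 "multiarch/wcscat-avx2.S"

/* Pick the implementation for the highest level the build guarantees.
   Below level 3 the assembly file contributes nothing and the generic
   C file is expected to provide wcscat (assumption of this sketch).  */
#if SKETCH_MIN_ISA_LEVEL >= 4
# define SKETCH_SELECTED SKETCH_IMPL_V4
#elif SKETCH_MIN_ISA_LEVEL == 3
# define SKETCH_SELECTED SKETCH_IMPL_V3
#else
# define SKETCH_SELECTED "wcscat-generic.c"
#endif

/* Return the name of the file that the sketch would pull in.  The
   real header would #include the selected file rather than return
   its name.  */
const char *
sketch_selected_impl (void)
{
  return SKETCH_SELECTED;
}
```

With the level set to 3 as written, sketch_selected_impl () returns the AVX2 file name. Because the choice is made entirely by the preprocessor, no runtime CPU feature check is involved, which is the point of a non-multiarch build.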