Commit Graph

4 Commits

Author SHA1 Message Date
Noah Goldstein
4fc321dc58 x86: Fix backwards Prefer_No_VZEROUPPER check in ifunc-evex.h
Add third argument to X86_ISA_CPU_FEATURES_ARCH_P macro so the runtime
CPU_FEATURES_ARCH_P check can be inverted if the
MINIMUM_X86_ISA_LEVEL is not high enough to constantly evaluate
the check.

Use this new macro to correct the backwards check in ifunc-evex.h
2022-06-27 08:35:51 -07:00
Noah Goldstein
3edda6a0f0 x86: Add support for compiling {raw|w}memchr with high ISA level
1. Refactor files so that all implementations for in the multiarch
   directory.
    - Essentially moved sse2 {raw|w}memchr.S implementation to
      multiarch/{raw|w}memchr-sse2.S

    - The non-multiarch {raw|w}memchr.S file now only includes one of
      the implementations in the multiarch directory based on the
      compiled ISA level (only used for non-multiarch builds.
      Otherwise we go through the ifunc selector).

2. Add ISA level build guards to different implementations.
    - I.e memchr-avx2.S which is ISA level 3 will only build if
      compiled ISA level <= 3. Otherwise there is no reason to include
      it as we will always use one of the ISA level 4
      implementations (memchr-evex{-rtm}.S).

3. Add new multiarch/rtld-{raw}memchr.S that just include the
   non-multiarch {raw}memchr.S which will in turn select the best
   implementation based on the compiled ISA level.

4. Refactor the ifunc selector and ifunc implementation list to use
   the ISA level aware wrapper macros that allow functions below the
   compiled ISA level (with a guranteed replacement) to be skipped.
    - Guranteed replacement essentially means that for any ISA level
      build there must be a function that the baseline of the ISA
      supports. So for {raw|w}memchr.S since there is not ISA level 2
      function, the ISA level 2 build still includes the ISA level
      1 (sse2) function. Once we reach the ISA level 3 build, however,
      {raw|w}memchr-avx2{-rtm}.S will always be sufficient so the ISA
      level 1 implementation ({raw|w}memchr-sse2.S) will not be built.

Tested with and without multiarch on x86_64 for ISA levels:
{generic, x86-64-v2, x86-64-v3, x86-64-v4}

And m32 with and without multiarch.
2022-06-22 19:41:35 -07:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Noah Goldstein
104c7b1967 x86: Add EVEX optimized memchr family not safe for RTM
No bug.

This commit adds a new implementation for EVEX memchr that is not safe
for RTM because it uses vzeroupper. The benefit is that by using
ymm0-ymm15 it can use vpcmpeq and vpternlogd in the 4x loop which is
faster than the RTM safe version which cannot use vpcmpeq because
there is no EVEX encoding for the instruction. All parts of the
implementation aside from the 4x loop are the same for the two
versions and the optimization is only relevant for large sizes.

Tigerlake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 9.2   , 9.04  , no-RTM  , 0.16
512   , 7     , 224   , 9.19  , 8.98  , no-RTM  , 0.21
2048  , 0     , 256   , 10.74 , 10.54 , no-RTM  , 0.2
2048  , 0     , 512   , 14.81 , 14.87 , RTM     , 0.06
2048  , 0     , 1024  , 22.97 , 22.57 , no-RTM  , 0.4
2048  , 0     , 2048  , 37.49 , 34.51 , no-RTM  , 2.98   <--

Icelake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 7.6   , 7.3   , no-RTM  , 0.3
512   , 7     , 224   , 7.63  , 7.27  , no-RTM  , 0.36
2048  , 0     , 256   , 8.48  , 8.38  , no-RTM  , 0.1
2048  , 0     , 512   , 11.57 , 11.42 , no-RTM  , 0.15
2048  , 0     , 1024  , 17.92 , 17.38 , no-RTM  , 0.54
2048  , 0     , 2048  , 30.37 , 27.34 , no-RTM  , 3.03   <--

test-memchr, test-wmemchr, and test-rawmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-08 16:26:30 -04:00