Commit Graph

40556 Commits

Author SHA1 Message Date
DJ Delorie
d846c28389 build-many-glibcs: Check for required system tools
Notes for future devs:

* Add tools as you find they're needed, with version 0,0
* Bump version when you find an old tool that doesn't work
* Don't add a version just because you know it works

Co-authored-by: Lukasz Majewski <lukma@denx.de>
Co-authored-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
2023-10-09 17:42:25 -04:00
Noah Goldstein
a3c50bf46a x86: Prepare strrchr-evex and strrchr-evex512 for AVX10
This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.

The motivation is `strrchr-evex` needed to be refactored to not use
64-bit masked registers in preperation for AVX10.

Once vec-width masked register combining was removed, the EVEX and
EVEX512 implementations can easily be implemented in the same file
without any major overhead.

The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Although, note there are some
regressions in the test suite and it may be many of the cases that
make the total-geomean of improvement/regression across bench-strrchr
are cold. The point of the performance measurement is to show there
are no major regressions, but the primary motivation is preperation
for AVX10.

Benchmarks where taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html

EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87

Full check passes on x86.
2023-10-06 00:18:55 -05:00
Joe Ramsay
5a4b6f8e4b aarch64: Optimise vecmath logs
* Transpose table layout for improved memory access
* Use half-vector special comparisons for AdvSIMD
* Improve register use near special-case branches
  - Due to the presence of a function call, return value would get
    mov-d out of x0 in order to facilitate PCS. By moving the final
    computation after the branch this can be avoided

Also change SVE routines to use overloaded intrinsics for readability.
2023-10-05 16:54:16 +01:00
Joe Ramsay
480a0dfe1a aarch64: Cosmetic change in SVE exp routines
Use overloaded intrinsics for readability. Codegen does not
change, however while we're bringing the routines up-to-date with
recent improvements to other routines in AOR it is worth copying
this change over as well.
2023-10-05 16:54:00 +01:00
Joe Ramsay
9180160e08 aarch64: Optimize SVE cos & cosf
Saves a mov by ensuring return value does not need to be moved out of
the way before special-case branch. Also change to use overloaded
intrinsics.
2023-10-05 16:53:38 +01:00
Joe Ramsay
8014d1e832 aarch64: Improve vecmath sin routines
* Update ULP comment reflecting a new observed max in [-pi/2, pi/2]
* Use the same polynomial in AdvSIMD and SVE, rather than FTRIG instructions
* Improve register use near special-case branch

Also use overloaded intrinsics for SVE.
2023-10-05 16:53:06 +01:00
Joe Simmons-Talbott
820948edd9 nss: Get rid of alloca usage in makedb's write_output.
Replace alloca usage with a scratch_buffer.

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2023-10-04 18:18:02 +00:00
Adhemerval Zanella
be7a5468d4 debug: Add regression tests for BZ 30932
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-10-04 08:07:43 -03:00
Volker Weißmann
7bb8045ec0 Fix FORTIFY_SOURCE false positive
When -D_FORTIFY_SOURCE=2 was given during compilation,
sprintf and similar functions will check if their
first argument is in read-only memory and exit with
*** %n in writable segment detected ***
otherwise. To check if the memory is read-only, glibc
reads frpm the file "/proc/self/maps". If opening this
file fails due to too many open files (EMFILE), glibc
will now ignore this error.

Fixes [BZ #30932]

Signed-off-by: Volker Weißmann <volker.weissmann@gmx.de>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-10-04 08:07:43 -03:00
Arjun Shankar
751850cf5a nss: Rearrange and sort Makefile variables
Rearrange lists of routines, tests, etc. into one-per-line in
nss/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-10-04 12:09:18 +02:00
Arjun Shankar
b6b8a88cf5 inet: Rearrange and sort Makefile variables
Rearrange lists of routines, tests, etc. into one-per-line in
inet/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-10-04 12:08:54 +02:00
Szabolcs Nagy
0a520f28ff Fix off-by-one OOB write in iconv/tst-iconv-mt
The iconv buffer sizes must not include the \0 string terminator.
And the output termination with *outbufpos = '\0' was OOB.

Consistently use non-null-terminated buffer sizes.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-04 08:33:20 +01:00
Siddhesh Poyarekar
1056e5b4c3 tunables: Terminate if end of input is reached (CVE-2023-4911)
The string parsing routine may end up writing beyond bounds of tunestr
if the input tunable string is malformed, of the form name=name=val.
This gets processed twice, first as name=name=val and next as name=val,
resulting in tunestr being name=name=val:name=val, thus overflowing
tunestr.

Terminate the parsing loop at the first instance itself so that tunestr
does not overflow.

This also fixes up tst-env-setuid-tunables to actually handle failures
correct and add new tests to validate the fix for this CVE.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-10-02 15:35:29 -04:00
Siddhesh Poyarekar
0d5f9ea97f Propagate GLIBC_TUNABLES in setxid binaries
GLIBC_TUNABLES scrubbing happens earlier than envvar scrubbing and some
tunables are required to propagate past setxid boundary, like their
env_alias.  Rely on tunable scrubbing to clean out GLIBC_TUNABLES like
before, restoring behaviour in glibc 2.37 and earlier.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-10-02 15:35:05 -04:00
Kir Kolyshkin
9e4e896f0f Linux: add ST_NOSYMFOLLOW
Linux v5.10 added a mount option MS_NOSYMFOLLOW, which was added to
glibc in commit 0ca21427d9.

Add the corresponding statfs/statvfs flag bit, ST_NOSYMFOLLOW.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-02 10:54:27 -03:00
Adhemerval Zanella
40c0add7d4 resolve: Remove __res_context_query alloca usage
The bufsize on current Linux build is:

   size_t bufsize = (type == 439963904 ? 2 : 1) * (12 + 4 + 255 + 1);

So with upper bound as 544 (2 * (12 + 4 + 255 + 1)).  However, it might
increase to 2 * PACKETSIZE later with malloc.  The default scratch_buffer
should fullfill the most usual allocation requirement.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Joe Simmons-Talbott <josimmon@redhat.com>
2023-10-02 10:54:27 -03:00
Joe Simmons-Talbott
08e9a60a1a mips: dl-machine-reject-phdr: Get rid of alloca.
Read directly into the mips_abiflags struct rather than reading the
entire segment and using alloca when the passed buffer is not big enough.

Checked with build-many-glibcs.py on mips-linux-gnu

Tested-by: Ying Huang <ying.huang@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-02 12:55:27 +00:00
Noah Goldstein
d90b43a4ed x86: Add support for AVX10 preset and vec size in cpu-features
This commit add support for the new AVX10 cpu features:
https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf

We add checks for:
    - `AVX10`: Check if AVX10 is present.
    - `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support.

`make check` passes and cpuid output was checked against GNR/DMR on an
emulator.
2023-09-29 14:18:42 -05:00
Arjun Shankar
5f913506f4 resolv: Fix a comment typo in __resolv_conf_load
The file being referred to is host.conf, not hosts.conf.
2023-09-29 11:39:27 +02:00
Arjun Shankar
99b4327a55 Remove unused -DRESOLVER getaddrinfo build flag
getaddrinfo doesn't look for any RESOLVER defines for conditional
compilation.  Therefore, remove the unnecessary -DRESOLVER build flag in
getaddrinfo's CFLAGS.

Checked on x86_64 for code generation changes; none found.
2023-09-29 11:21:04 +02:00
Joseph Myers
cdbf8229bb C2x scanf %wN, %wfN support
ISO C2x defines scanf length modifiers wN (for intN_t / int_leastN_t /
uintN_t / uint_leastN_t) and wfN (for int_fastN_t / uint_fastN_t).
Add support for those length modifiers, similar to the printf support
previously added.

Tested for x86_64 and x86.
2023-09-28 17:28:15 +00:00
Adhemerval Zanella
aea4ddb871 test-container: Use nftw instead of rm -rf
If the binary to run is 'env', test-containers skips it and adds
any required environment variable on the process envs variables.
This simplifies the required code to spawn new process (no need
to build an env-like program).

However, this is an issue for recursive_remove if there is any
LD_PRELOAD, since test-container will not prepend the loader command
along with required paths.  If the required preloaded library can
not be loaded by the system glibc, the 'post-clean rsync' will
eventually fail.

One example is if system glibc does not support DT_RELR and the
built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
with:

../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)

Instead trying to figure out the required loader arguments on how
to spawn the 'rm -rf', replace the command with a nftw call.

Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
2023-09-28 09:41:05 -03:00
Samuel Thibault
29d4591b07 hurd: Drop REG_GSFS and REG_ESDS from x86_64's ucontext
These are useless on x86_64, and __NGREG was actually wrong with them.
2023-09-28 00:10:13 +02:00
Qingqing Li
964d15a007 elf: Fix compile error with -DNDEBUG [BZ #18755]
Compilation fails when building with -DNDEBUG after commit a3189f66a5.
Here is the error:

dl-close.c: In function ‘_dl_close_worker’:
dl-close.c:140:22: error: unused variable ‘nloaded’ [-Werror=unused-variable]
  140 |   const unsigned int nloaded = ns->_ns_nloaded;

Add __attribute_maybe_unused__ for‘nloaded’to fix it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 17:03:34 -03:00
Ying Huang
a6e8ceb3bb MIPS: Add relocation types
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 13:57:11 -03:00
Ying Huang
f34dc13ad6 MIPS: Add new section type SHT_MIPS_ABIFLAGS
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 13:57:08 -03:00
Ying Huang
c07ae7cca4 MIPS: Add ELF file header flags
Now binutils use some E_MIPS_* macros and EF_MIPS_* macros, it is
difficult to decide which style macro we should use when we want
to add new ELF file header flags.
IRIX used to use EF_MIPS_* macros and in elf/elf.h there also has
comments "The following are unofficial names and should not be used".
So we should use EF_MIPS_* to keep same style with the beginning.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 13:57:06 -03:00
Manjunath Matti
4eac1825ed fegetenv_and_set_rn now uses the builtins provided by GCC.
On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the
prologue get/save/set rounding mode operations for POWER9 and
later by using 'mffscrn' where possible, this was introduced by
commit f1c56cdff0.

GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn
which now returns the FPSCR fields in a double. This feature is
available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro
is defined.
GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds
this feature.

Changes are done to use __builtin_set_fpscr_rn instead of mffscrn
or mffscrni in __fe_mffscrn(rn).

Suggested-by: Carl Love <cel@us.ibm.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 13:55:36 -03:00
Adhemerval Zanella
551101e824 io: Do not implement fstat with fstatat
AT_EMPTY_PATH is a requirement to implement fstat over fstatat,
however it does not prevent the kernel to read the path argument.
It is not an issue, but on x86-64 with SMAP-capable CPUs the kernel is
forced to perform expensive user memory access.  After that regular
lookup is performed which adds even more overhead.

Instead, issue the fstat syscall directly on LFS fstat implementation
(32 bit architectures will still continue to use statx, which is
required to have 64 bit time_t support).  it should be even a
small performance gain on non x86_64, since there is no need
to handle the path argument.

Checked on x86_64-linux-gnu.
2023-09-27 09:30:24 -03:00
Xi Ruoyao
64b1a44183 libio: Add nonnull attribute for most FILE * arguments in stdio.h
During the review of a GCC analyzer test case, we found most stdio
functions accepting a FILE * argument expect it to be nonnull and just
segfault when the argument is NULL.  Add nonnull attribute for them.

fflush and fflush_unlocked are well defined when __stream is NULL so
they are not touched.

For fputs, fgets, fread, fwrite, fprintf, vfprintf, and their unlocked
version, if __stream is empty but there is nothing to read or write,
they did not segfault.  But the standard disallow __stream to be empty
here, so nonnull attribute is also added for them.  Note that this may
blow up some old code already subtly broken.

Also add __nonnull for _chk variants and __fortify_function versions for
them.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-26 20:11:38 -04:00
Wilco Dijkstra
6b695e5c62 AArch64: Remove -0.0 check from vector sin
Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf,
improving performance.  Passes regress.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-09-26 13:40:07 +01:00
Siddhesh Poyarekar
fd134feba3 Document CVE-2023-4806 and CVE-2023-5156 in NEWS
These are tracked in BZ #30884 and BZ #30843.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-26 07:38:07 -04:00
Florian Weimer
f563971b5b elf: Add dummy declaration of _dl_audit_objclose for !SHARED
This allows us to avoid some #ifdef SHARED conditionals.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-26 11:40:12 +02:00
Romain Geissler
ec6b95c330 Fix leak in getaddrinfo introduced by the fix for CVE-2023-4806 [BZ #30843]
This patch fixes a very recently added leak in getaddrinfo.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-25 01:21:51 +01:00
Joe Simmons-Talbott
5d00c201b9 elf: dl-lookup: Remove unused alloca.h include 2023-09-21 14:08:20 +00:00
Mike FABIAN
d2d797a49b Remove unused localedata/th_TH.in 2023-09-21 10:34:35 +02:00
Mike FABIAN
aceda10bd5 Adapt collation in th_TH locale to use the iso14651_t1_common file and sync the collation with CLDR
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).

It seems to be impossible to follow the CLDR rules

  &[before 1]๚<ฯ # should be "variable"

and

  &๛<ๆ # should be "variable"

exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary differnence instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.

There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.

Using the existing collation rules in the th_TH locale did not sort that
test file completely correct, I think my new collation rules based on
iso14651_t1 are better.
2023-09-21 10:34:35 +02:00
caiyinyu
672b91ba10 Revert "LoongArch: Add glibc.cpu.hwcap support."
This reverts commit a53451559d.
2023-09-21 09:10:11 +08:00
Joseph Myers
457bb77255 Update kernel version to 6.5 in header constant tests
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-pidfd-consts.py to 6.5.  (There are no new constants covered
by these tests in 6.5 that need any other header changes;
tst-mount-consts.py was updated separately along with a header
constant addition.)

Tested with build-many-glibcs.py.
2023-09-20 13:36:46 +00:00
caiyinyu
a53451559d LoongArch: Add glibc.cpu.hwcap support.
Key Points:
1. On lasx & lsx platforms, We must use _dl_runtime_{profile, resolve}_{lsx, lasx}
   to save vector registers.
2. Via "tunables", users can choose str/mem_{lasx,lsx,unaligned} functions with
   `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,...`.
   Note: glibc.cpu.hwcaps doesn't affect _dl_runtime_{profile, resolve}_{lsx, lasx}
   selection.

Usage Notes:
1. Only valid inputs: LASX, LSX, UAL. Case-sensitive, comma-separated, no spaces.
2. Example: `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL` turns on LASX & UAL.
   Unmentioned features turn off. With default ifunc: lasx > lsx > unaligned >
   aligned > generic, effect is: lasx > unaligned > aligned > generic; lsx off.
3. Incorrect GLIBC_TUNABLES settings will show error messages.
   For example: On lsx platforms, you cannot enable lasx features. If you do
   that, you will get error messages.
4. Valid input examples:
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX: lasx > aligned > generic.
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LSX,UAL: lsx > unaligned > aligned > generic.
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL,LASX,UAL,LSX,LASX,UAL: Repetitions
     allowed but not recommended. Results in: lasx > lsx > unaligned > aligned >
     generic.
2023-09-19 09:11:49 +08:00
Wilco Dijkstra
5bc9b3a1f6 math: Add a no-mathvec flag for sin (-0.0)
Add support for a no-mathvec flag to gen-auto-libm-tests.c.
Update input test sin (-0.0) to be skipped in vector math libraries and
regenerate testcases.

Reviewed-By: Paul Zimmermann  <Paul.Zimmermann@inria.fr>
2023-09-18 11:50:23 +01:00
Mike FABIAN
bb5bbc2070 Update to Unicode 15.1.0 [BZ #30854]
Unicode 15.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 15.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

    Total removed characters in newly generated CHARMAP: 0
    Total changed characters in newly generated CHARMAP: 0
    Total added characters in newly generated CHARMAP: 627
    Total removed characters in newly generated WIDTH: 0
    Total changed characters in newly generated WIDTH: 0
    Total added characters in newly generated WIDTH: 627

    alpha: Added 622 characters in new ctype which were not in old ctype
    graph: Added 627 characters in new ctype which were not in old ctype
    print: Added 627 characters in new ctype which were not in old ctype
    punct: Added 5 characters in new ctype which were not in old ctype
        The five characters added to punct are:
        2FFC;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT;So;0;ON;;;;;N;;;;;
        2FFD;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM LOWER RIGHT;So;0;ON;;;;;N;;;;;
        2FFE;IDEOGRAPHIC DESCRIPTION CHARACTER HORIZONTAL REFLECTION;So;0;ON;;;;;N;;;;;
        2FFF;IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION;So;0;ON;;;;;N;;;;;
        31EF;IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION;So;0;ON;;;;;N;;;;;

    The Unicode announcement blog entry says "[...] adds 627
    characters, [...] additions include 622 CJK unified ideographs in
    a new block, [...]", so that looks OK. The Unicode
    blog mentions "six completely new emoji" but they don't appear here as
    they are all sequences and not single code points.

Resolves: BZ #30854

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:03 +02:00
Mike FABIAN
71de3aead9 localedata/unicode-gen/utf8_gen.py: adapt regexp to get relevant lines from EastAsianWidth.txt
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:02 +02:00
Mike FABIAN
ba017b4f9d Fix regexp syntax warnings in localedata/unicode-gen/ctype_compatibility.py
Fix these:

$ python -m py_compile ./ctype_compatibility.py
./ctype_compatibility.py:146: SyntaxWarning: invalid escape sequence '\)'

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:02 +02:00
Siddhesh Poyarekar
973fe93a56 getaddrinfo: Fix use after free in getcanonname (CVE-2023-4806)
When an NSS plugin only implements the _gethostbyname2_r and
_getcanonname_r callbacks, getaddrinfo could use memory that was freed
during tmpbuf resizing, through h_name in a previous query response.

The backing store for res->at->name when doing a query with
gethostbyname3_r or gethostbyname2_r is tmpbuf, which is reallocated in
gethosts during the query.  For AF_INET6 lookup with AI_ALL |
AI_V4MAPPED, gethosts gets called twice, once for a v6 lookup and second
for a v4 lookup.  In this case, if the first call reallocates tmpbuf
enough number of times, resulting in a malloc, th->h_name (that
res->at->name refers to) ends up on a heap allocated storage in tmpbuf.
Now if the second call to gethosts also causes the plugin callback to
return NSS_STATUS_TRYAGAIN, tmpbuf will get freed, resulting in a UAF
reference in res->at->name.  This then gets dereferenced in the
getcanonname_r plugin call, resulting in the use after free.

Fix this by copying h_name over and freeing it at the end.  This
resolves BZ #30843, which is assigned CVE-2023-4806.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-15 14:38:28 -04:00
dengjianbo
780adf7aea LoongArch: Change to put magic number to .rodata section
Change to put magic number to .rodata section in memmove-lsx, and use
pcalau12i and %pc_lo12 with vld to get the data.
2023-09-15 09:07:47 +08:00
dengjianbo
24279aecf3 LoongArch: Add ifunc support for strrchr{aligned, lsx, lasx}
According to glibc strrchr microbenchmark test results, this implementation
could reduce the runtime time as following:

Name                Percent of rutime reduced
strrchr-lasx        10%-50%
strrchr-lsx         0%-50%
strrchr-aligned     5%-50%

Generic strrchr is implemented by function strlen + memrchr, the lasx version
will compare with generic strrchr implemented by strlen-lasx + memrchr-lasx,
the lsx version will compare with generic strrchr implemented by strlen-lsx +
memrchr-lsx, the aligned version will compare with generic strrchr implemented
by strlen-aligned + memrchr-generic.
2023-09-15 09:07:47 +08:00
dengjianbo
06251002d4 LoongArch: Add ifunc support for strcpy, stpcpy{aligned, unaligned, lsx, lasx}
According to glibc strcpy and stpcpy microbenchmark test results(changed
to use generic_strcpy and generic_stpcpy instead of strlen + memcpy),
comparing with the generic version, this implementation could reduce the
runtime as following:

Name              Percent of rutime reduced
strcpy-aligned    8%-45%
strcpy-unaligned  8%-48%, comparing with the aligned version, unaligned
                  version takes less instructions to copy the tail of data
		  which length is less than 8. it also has better performance
		  in case src and dest cannot be both aligned with 8bytes
strcpy-lsx        20%-80%
strcpy-lasx       15%-86%
stpcpy-aligned    6%-43%
stpcpy-unaligned  8%-48%
stpcpy-lsx        10%-80%
stpcpy-lasx       10%-87%
2023-09-15 09:07:47 +08:00
caiyinyu
c6c73e136a LoongArch: Replace deprecated $v0 with $a0 to eliminate 'as' Warnings. 2023-09-15 09:07:47 +08:00
caiyinyu
f5242db159 LoongArch: Add lasx/lsx support for _dl_runtime_profile. 2023-09-15 09:07:42 +08:00