Commit Graph

18 Commits

Author SHA1 Message Date
dengjianbo
780adf7aea LoongArch: Change to put magic number to .rodata section
Put the magic number in the .rodata section in memmove-lsx, and use
pcalau12i and %pc_lo12 with vld to load the data.
2023-09-15 09:07:47 +08:00
dengjianbo
24279aecf3 LoongArch: Add ifunc support for strrchr{aligned, lsx, lasx}
According to the glibc strrchr microbenchmark results, this implementation
reduces the runtime as follows:

Name                Percent of runtime reduced
strrchr-lasx        10%-50%
strrchr-lsx         0%-50%
strrchr-aligned     5%-50%

Generic strrchr is implemented as strlen + memrchr. Each version is compared
against a generic strrchr built from the matching pair: the lasx version
against strlen-lasx + memrchr-lasx, the lsx version against strlen-lsx +
memrchr-lsx, and the aligned version against strlen-aligned +
memrchr-generic.
2023-09-15 09:07:47 +08:00
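The "strlen + memrchr" formulation of generic strrchr described in the commit above can be sketched in C as follows. This is a minimal illustration, not the glibc source: `memrchr_sketch` and `generic_strrchr_sketch` are hypothetical names, and the reverse scan is written by hand so the example does not depend on the GNU `memrchr` extension.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memrchr: scan the first n bytes of s
   backwards and return the last byte equal to c, or NULL if none. */
static void *memrchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    while (n-- > 0) {
        if (*--p == (unsigned char)c)
            return (void *)p;
    }
    return NULL;
}

/* Generic strrchr as strlen + memrchr. The search covers len + 1 bytes
   so that looking for '\0' returns the terminator, as strrchr requires. */
static char *generic_strrchr_sketch(const char *s, int c)
{
    size_t len = strlen(s);
    return (char *)memrchr_sketch(s, c, len + 1);
}
```

Because the string is walked twice (once forward for the length, once backward for the match), the speed of this baseline depends on both underlying routines, which is presumably why each variant is compared against its own strlen and memrchr pair.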
dengjianbo
06251002d4 LoongArch: Add ifunc support for strcpy, stpcpy{aligned, unaligned, lsx, lasx}
According to the glibc strcpy and stpcpy microbenchmark results (changed
to use generic_strcpy and generic_stpcpy instead of strlen + memcpy),
this implementation reduces the runtime compared with the generic version
as follows:

Name              Percent of runtime reduced
strcpy-aligned    8%-45%
strcpy-unaligned  8%-48%; compared with the aligned version, the unaligned
                  version takes fewer instructions to copy a tail of data
                  shorter than 8 bytes. It also performs better when src
                  and dest cannot both be 8-byte aligned
strcpy-lsx        20%-80%
strcpy-lasx       15%-86%
stpcpy-aligned    6%-43%
stpcpy-unaligned  8%-48%
stpcpy-lsx        10%-80%
stpcpy-lasx       10%-87%
2023-09-15 09:07:47 +08:00
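For reference, the "strlen + memcpy" formulation that the commit above mentions as the previous benchmark baseline can be sketched in C like this. The function names are hypothetical, and this is only an illustration of the idea, not the glibc code or the generic_strcpy/generic_stpcpy used in the benchmark.

```c
#include <stddef.h>
#include <string.h>

/* stpcpy as strlen + memcpy: measure the source, copy it together with
   its terminator, and return a pointer to the copied '\0'. */
static char *stpcpy_sketch(char *dest, const char *src)
{
    size_t len = strlen(src);
    memcpy(dest, src, len + 1);
    return dest + len;
}

/* strcpy differs only in its return value: the start of dest. */
static char *strcpy_sketch(char *dest, const char *src)
{
    stpcpy_sketch(dest, src);
    return dest;
}
```

As with the standard functions, the caller must provide a destination large enough for the source string and its terminator, and the two buffers must not overlap.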
dengjianbo
693918b6dd LoongArch: Change loongarch to LoongArch in comments 2023-08-29 10:35:38 +08:00
dengjianbo
ea7698a616 LoongArch: Add ifunc support for memcmp{aligned, lsx, lasx}
According to the glibc memcmp microbenchmark results (with a generic
memcmp added), this implementation improves performance except when the
length is less than 3. Details are below:

Name             Percent of time reduced
memcmp-lasx      16%-74%
memcmp-lsx       20%-50%
memcmp-aligned   5%-20%
2023-08-29 10:35:38 +08:00
dengjianbo
1b1e9b7c10 LoongArch: Add ifunc support for memset{aligned, unaligned, lsx, lasx}
According to the glibc memset microbenchmark results, a few cases with
length less than 8 experience performance degradation in the LSX and LASX
versions. Overall, the LASX version reduces the runtime by about 15%-75%
and the LSX version by about 15%-50%.

The unaligned version uses unaligned memory accesses to set data whose
length is less than 64 and to make the address 8-byte aligned. For this
part, its performance is better than that of the aligned version. Compared
with the generic version, performance is close when the length is larger
than 128. When the length is 8-128, the unaligned version reduces the
runtime by about 30%-70% and the aligned version by about 20%-50%.
2023-08-29 10:35:38 +08:00
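One common way to use unaligned accesses for short memset lengths is to issue two fixed-size stores that may overlap, one anchored at the start and one at the end of the region. The C sketch below illustrates that general technique for lengths 8 to 16. It is an assumption about the kind of approach meant, not a transcription of the LoongArch assembly, and `set_8_to_16_sketch` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill n bytes (8 <= n <= 16) with two 8-byte stores that may overlap.
   memcpy with a constant size of 8 expresses an unaligned 8-byte store
   in portable C; the caller must guarantee the length range. */
static void set_8_to_16_sketch(unsigned char *dst, int c, size_t n)
{
    /* Replicate the fill byte into all eight bytes of a 64-bit word. */
    uint64_t pattern =
        (uint64_t)(unsigned char)c * UINT64_C(0x0101010101010101);

    memcpy(dst, &pattern, 8);          /* first 8 bytes */
    memcpy(dst + n - 8, &pattern, 8);  /* last 8 bytes, may overlap */
}
```

The two stores cover every byte for any length in the range without a per-byte loop or a branch on the exact length, which is where such a scheme would save instructions on short inputs.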
dengjianbo
55e84dc6ed LoongArch: Add ifunc support for memrchr{lsx, lasx}
According to the glibc memrchr microbenchmark, this implementation reduces
the runtime as follows:

Name            Percent of runtime reduced
memrchr-lasx    20%-83%
memrchr-lsx     20%-64%
2023-08-29 10:35:38 +08:00
dengjianbo
60bcb9acbf LoongArch: Add ifunc support for memchr{aligned, lsx, lasx}
According to the glibc memchr microbenchmark, this implementation reduces
the runtime as follows:

Name               Percent of runtime reduced
memchr-lasx        37%-83%
memchr-lsx         30%-66%
memchr-aligned     0%-15%
2023-08-29 10:35:38 +08:00
dengjianbo
f8664fe215 LoongArch: Add ifunc support for rawmemchr{aligned, lsx, lasx}
According to the glibc rawmemchr microbenchmark, a few cases tested with
char '\0' experience performance degradation because the lasx and lsx
versions do not handle '\0' separately. Overall, the rawmemchr-lasx
implementation reduces the runtime by about 40%-80%, the rawmemchr-lsx
implementation by about 40%-66%, and the rawmemchr-aligned
implementation by about 20%-40%.
2023-08-29 10:35:38 +08:00
dengjianbo
ddbb74f5c2 LoongArch: Add ifunc support for strncmp{aligned, lsx}
Based on the glibc microbenchmark, only a few short inputs experience
performance degradation with the strncmp-aligned and strncmp-lsx
implementations. Overall, strncmp-aligned reduces the runtime by 0%-10%
for aligned comparison and 10%-25% for unaligned comparison, and
strncmp-lsx reduces the runtime by about 0%-60%.
2023-08-24 17:19:47 +08:00
dengjianbo
82d9426e4a LoongArch: Add ifunc support for strcmp{aligned, lsx}
Based on the glibc microbenchmark, the strcmp-aligned implementation
reduces the runtime by 0%-10% for aligned comparison and 10%-20% for
unaligned comparison; the strcmp-lsx implementation reduces it by 0%-50%.
2023-08-24 17:19:47 +08:00
dengjianbo
e74d959862 LoongArch: Add ifunc support for strnlen{aligned, lsx, lasx}
Based on the glibc microbenchmark, the strnlen-aligned implementation
reduces the runtime by more than 10%, the strnlen-lsx implementation by
about 50%-78%, and the strnlen-lasx implementation by about 50%-88%.
2023-08-24 17:19:47 +08:00
dengjianbo
8944ba483f Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx} and memmove{aligned, unaligned, lsx, lasx}
These implementations improve the time to copy data in the glibc
microbenchmark as below:
memcpy-lasx       reduces the runtime about 8%-76%
memcpy-lsx        reduces the runtime about 8%-72%
memcpy-unaligned  reduces the runtime of unaligned data copying up to 40%
memcpy-aligned    reduces the runtime of unaligned data copying up to 25%
memmove-lasx      reduces the runtime about 20%-73%
memmove-lsx       reduces the runtime about 50%
memmove-unaligned reduces the runtime of unaligned data moving up to 40%
memmove-aligned   reduces the runtime of unaligned data moving up to 25%
2023-08-17 10:12:18 +08:00
dengjianbo
ba67bc8e0a Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and strchrnul{aligned, lsx, lasx}
These implementations improve the time to run strchr{nul}
microbenchmark in glibc as below:
strchr-lasx       reduces the runtime about 50%-83%
strchr-lsx        reduces the runtime about 30%-67%
strchr-aligned    reduces the runtime about 10%-20%
strchrnul-lasx    reduces the runtime about 50%-83%
strchrnul-lsx     reduces the runtime about 36%-65%
strchrnul-aligned reduces the runtime about 6%-10%
2023-08-17 10:12:18 +08:00
dengjianbo
135407f431 Loongarch: Add ifunc support and add different versions of strlen
strlen-lasx is implemented with LASX SIMD instructions (256-bit)
strlen-lsx is implemented with LSX SIMD instructions (128-bit)
strlen-align is implemented with LA basic instructions and never uses unaligned memory access
2023-08-14 09:47:09 +08:00
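The idea behind ifunc, choosing one of several implementations once based on what the CPU supports, can be approximated in portable C with a function pointer. The sketch below is only an analogy: the feature bits, the function names, and the preference order (LASX, then LSX, then the aligned fallback) are assumptions made for illustration, and the placeholder variants simply call `strlen` so the example runs on any machine. Real glibc selection goes through its own ifunc resolver machinery.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Hypothetical feature bits; not the real hardware capability flags. */
enum { CPU_HAS_LSX = 1, CPU_HAS_LASX = 2 };

/* Placeholder variants: each just calls strlen so the sketch is runnable.
   In the real library these would be three distinct implementations. */
static size_t strlen_aligned_sketch(const char *s) { return strlen(s); }
static size_t strlen_lsx_sketch(const char *s)     { return strlen(s); }
static size_t strlen_lasx_sketch(const char *s)    { return strlen(s); }

/* Resolver-style selection: prefer the widest vector unit available
   (an assumed order) and fall back to the basic-instruction version. */
static strlen_fn select_strlen_sketch(unsigned features)
{
    if (features & CPU_HAS_LASX)
        return strlen_lasx_sketch;
    if (features & CPU_HAS_LSX)
        return strlen_lsx_sketch;
    return strlen_aligned_sketch;
}
```

A caller would run the selection once, keep the returned pointer, and use it for every later call, which mirrors how an ifunc symbol is bound to a single implementation at load time.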
caiyinyu
db9c100749 LoongArch: Update libm-test-ulps. 2023-03-02 11:17:15 +08:00
caiyinyu
68d61026d5 LoongArch: Hard Float Support 2022-07-26 12:35:12 -03:00
caiyinyu
3d87c89815 LoongArch: Build Infrastructure 2022-07-26 12:35:12 -03:00