Notes for future devs:
* Add tools as you find they're needed, with version 0,0
* Bump version when you find an old tool that doesn't work
* Don't add a version just because you know it works
Co-authored-by: Lukasz Majewski <lukma@denx.de>
Co-authored-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.
The motivation is `strrchr-evex` needed to be refactored to not use
64-bit masked registers in preperation for AVX10.
Once vec-width masked register combining was removed, the EVEX and
EVEX512 implementations can easily be implemented in the same file
without any major overhead.
The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Although, note there are some
regressions in the test suite and it may be many of the cases that
make the total-geomean of improvement/regression across bench-strrchr
are cold. The point of the performance measurement is to show there
are no major regressions, but the primary motivation is preperation
for AVX10.
Benchmarks where taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
EVEX geometric_mean(N=5) of all benchmarks New / Original : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
Full check passes on x86.
* Transpose table layout for improved memory access
* Use half-vector special comparisons for AdvSIMD
* Improve register use near special-case branches
- Due to the presence of a function call, return value would get
mov-d out of x0 in order to facilitate PCS. By moving the final
computation after the branch this can be avoided
Also change SVE routines to use overloaded intrinsics for readability.
Use overloaded intrinsics for readability. Codegen does not
change, however while we're bringing the routines up-to-date with
recent improvements to other routines in AOR it is worth copying
this change over as well.
* Update ULP comment reflecting a new observed max in [-pi/2, pi/2]
* Use the same polynomial in AdvSIMD and SVE, rather than FTRIG instructions
* Improve register use near special-case branch
Also use overloaded intrinsics for SVE.
When -D_FORTIFY_SOURCE=2 was given during compilation,
sprintf and similar functions will check if their
first argument is in read-only memory and exit with
*** %n in writable segment detected ***
otherwise. To check if the memory is read-only, glibc
reads frpm the file "/proc/self/maps". If opening this
file fails due to too many open files (EMFILE), glibc
will now ignore this error.
Fixes [BZ #30932]
Signed-off-by: Volker Weißmann <volker.weissmann@gmx.de>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Rearrange lists of routines, tests, etc. into one-per-line in
nss/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Rearrange lists of routines, tests, etc. into one-per-line in
inet/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The iconv buffer sizes must not include the \0 string terminator.
And the output termination with *outbufpos = '\0' was OOB.
Consistently use non-null-terminated buffer sizes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The string parsing routine may end up writing beyond bounds of tunestr
if the input tunable string is malformed, of the form name=name=val.
This gets processed twice, first as name=name=val and next as name=val,
resulting in tunestr being name=name=val:name=val, thus overflowing
tunestr.
Terminate the parsing loop at the first instance itself so that tunestr
does not overflow.
This also fixes up tst-env-setuid-tunables to actually handle failures
correct and add new tests to validate the fix for this CVE.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
GLIBC_TUNABLES scrubbing happens earlier than envvar scrubbing and some
tunables are required to propagate past setxid boundary, like their
env_alias. Rely on tunable scrubbing to clean out GLIBC_TUNABLES like
before, restoring behaviour in glibc 2.37 and earlier.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux v5.10 added a mount option MS_NOSYMFOLLOW, which was added to
glibc in commit 0ca21427d9.
Add the corresponding statfs/statvfs flag bit, ST_NOSYMFOLLOW.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The bufsize on current Linux build is:
size_t bufsize = (type == 439963904 ? 2 : 1) * (12 + 4 + 255 + 1);
So with upper bound as 544 (2 * (12 + 4 + 255 + 1)). However, it might
increase to 2 * PACKETSIZE later with malloc. The default scratch_buffer
should fullfill the most usual allocation requirement.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Joe Simmons-Talbott <josimmon@redhat.com>
Read directly into the mips_abiflags struct rather than reading the
entire segment and using alloca when the passed buffer is not big enough.
Checked with build-many-glibcs.py on mips-linux-gnu
Tested-by: Ying Huang <ying.huang@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This commit add support for the new AVX10 cpu features:
https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf
We add checks for:
- `AVX10`: Check if AVX10 is present.
- `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support.
`make check` passes and cpuid output was checked against GNR/DMR on an
emulator.
getaddrinfo doesn't look for any RESOLVER defines for conditional
compilation. Therefore, remove the unnecessary -DRESOLVER build flag in
getaddrinfo's CFLAGS.
Checked on x86_64 for code generation changes; none found.
ISO C2x defines scanf length modifiers wN (for intN_t / int_leastN_t /
uintN_t / uint_leastN_t) and wfN (for int_fastN_t / uint_fastN_t).
Add support for those length modifiers, similar to the printf support
previously added.
Tested for x86_64 and x86.
If the binary to run is 'env', test-containers skips it and adds
any required environment variable on the process envs variables.
This simplifies the required code to spawn new process (no need
to build an env-like program).
However, this is an issue for recursive_remove if there is any
LD_PRELOAD, since test-container will not prepend the loader command
along with required paths. If the required preloaded library can
not be loaded by the system glibc, the 'post-clean rsync' will
eventually fail.
One example is if system glibc does not support DT_RELR and the
built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
with:
../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
Instead trying to figure out the required loader arguments on how
to spawn the 'rm -rf', replace the command with a nftw call.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
Compilation fails when building with -DNDEBUG after commit a3189f66a5.
Here is the error:
dl-close.c: In function ‘_dl_close_worker’:
dl-close.c:140:22: error: unused variable ‘nloaded’ [-Werror=unused-variable]
140 | const unsigned int nloaded = ns->_ns_nloaded;
Add __attribute_maybe_unused__ for‘nloaded’to fix it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now binutils use some E_MIPS_* macros and EF_MIPS_* macros, it is
difficult to decide which style macro we should use when we want
to add new ELF file header flags.
IRIX used to use EF_MIPS_* macros and in elf/elf.h there also has
comments "The following are unofficial names and should not be used".
So we should use EF_MIPS_* to keep same style with the beginning.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the
prologue get/save/set rounding mode operations for POWER9 and
later by using 'mffscrn' where possible, this was introduced by
commit f1c56cdff0.
GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn
which now returns the FPSCR fields in a double. This feature is
available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro
is defined.
GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds
this feature.
Changes are done to use __builtin_set_fpscr_rn instead of mffscrn
or mffscrni in __fe_mffscrn(rn).
Suggested-by: Carl Love <cel@us.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
AT_EMPTY_PATH is a requirement to implement fstat over fstatat,
however it does not prevent the kernel to read the path argument.
It is not an issue, but on x86-64 with SMAP-capable CPUs the kernel is
forced to perform expensive user memory access. After that regular
lookup is performed which adds even more overhead.
Instead, issue the fstat syscall directly on LFS fstat implementation
(32 bit architectures will still continue to use statx, which is
required to have 64 bit time_t support). it should be even a
small performance gain on non x86_64, since there is no need
to handle the path argument.
Checked on x86_64-linux-gnu.
During the review of a GCC analyzer test case, we found most stdio
functions accepting a FILE * argument expect it to be nonnull and just
segfault when the argument is NULL. Add nonnull attribute for them.
fflush and fflush_unlocked are well defined when __stream is NULL so
they are not touched.
For fputs, fgets, fread, fwrite, fprintf, vfprintf, and their unlocked
version, if __stream is empty but there is nothing to read or write,
they did not segfault. But the standard disallow __stream to be empty
here, so nonnull attribute is also added for them. Note that this may
blow up some old code already subtly broken.
Also add __nonnull for _chk variants and __fortify_function versions for
them.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf,
improving performance. Passes regress.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).
It seems to be impossible to follow the CLDR rules
&[before 1]๚<ฯ # should be "variable"
and
&๛<ๆ # should be "variable"
exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary differnence instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.
There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.
Using the existing collation rules in the th_TH locale did not sort that
test file completely correct, I think my new collation rules based on
iso14651_t1 are better.
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-pidfd-consts.py to 6.5. (There are no new constants covered
by these tests in 6.5 that need any other header changes;
tst-mount-consts.py was updated separately along with a header
constant addition.)
Tested with build-many-glibcs.py.
Add support for a no-mathvec flag to gen-auto-libm-tests.c.
Update input test sin (-0.0) to be skipped in vector math libraries and
regenerate testcases.
Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Unicode 15.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 15.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Total removed characters in newly generated CHARMAP: 0
Total changed characters in newly generated CHARMAP: 0
Total added characters in newly generated CHARMAP: 627
Total removed characters in newly generated WIDTH: 0
Total changed characters in newly generated WIDTH: 0
Total added characters in newly generated WIDTH: 627
alpha: Added 622 characters in new ctype which were not in old ctype
graph: Added 627 characters in new ctype which were not in old ctype
print: Added 627 characters in new ctype which were not in old ctype
punct: Added 5 characters in new ctype which were not in old ctype
The five characters added to punct are:
2FFC;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT;So;0;ON;;;;;N;;;;;
2FFD;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM LOWER RIGHT;So;0;ON;;;;;N;;;;;
2FFE;IDEOGRAPHIC DESCRIPTION CHARACTER HORIZONTAL REFLECTION;So;0;ON;;;;;N;;;;;
2FFF;IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION;So;0;ON;;;;;N;;;;;
31EF;IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION;So;0;ON;;;;;N;;;;;
The Unicode announcement blog entry says "[...] adds 627
characters, [...] additions include 622 CJK unified ideographs in
a new block, [...]", so that looks OK. The Unicode
blog mentions "six completely new emoji" but they don't appear here as
they are all sequences and not single code points.
Resolves: BZ #30854
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
When an NSS plugin only implements the _gethostbyname2_r and
_getcanonname_r callbacks, getaddrinfo could use memory that was freed
during tmpbuf resizing, through h_name in a previous query response.
The backing store for res->at->name when doing a query with
gethostbyname3_r or gethostbyname2_r is tmpbuf, which is reallocated in
gethosts during the query. For AF_INET6 lookup with AI_ALL |
AI_V4MAPPED, gethosts gets called twice, once for a v6 lookup and second
for a v4 lookup. In this case, if the first call reallocates tmpbuf
enough number of times, resulting in a malloc, th->h_name (that
res->at->name refers to) ends up on a heap allocated storage in tmpbuf.
Now if the second call to gethosts also causes the plugin callback to
return NSS_STATUS_TRYAGAIN, tmpbuf will get freed, resulting in a UAF
reference in res->at->name. This then gets dereferenced in the
getcanonname_r plugin call, resulting in the use after free.
Fix this by copying h_name over and freeing it at the end. This
resolves BZ #30843, which is assigned CVE-2023-4806.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
According to glibc strrchr microbenchmark test results, this implementation
could reduce the runtime time as following:
Name Percent of rutime reduced
strrchr-lasx 10%-50%
strrchr-lsx 0%-50%
strrchr-aligned 5%-50%
Generic strrchr is implemented by function strlen + memrchr, the lasx version
will compare with generic strrchr implemented by strlen-lasx + memrchr-lasx,
the lsx version will compare with generic strrchr implemented by strlen-lsx +
memrchr-lsx, the aligned version will compare with generic strrchr implemented
by strlen-aligned + memrchr-generic.
According to glibc strcpy and stpcpy microbenchmark test results(changed
to use generic_strcpy and generic_stpcpy instead of strlen + memcpy),
comparing with the generic version, this implementation could reduce the
runtime as following:
Name Percent of rutime reduced
strcpy-aligned 8%-45%
strcpy-unaligned 8%-48%, comparing with the aligned version, unaligned
version takes less instructions to copy the tail of data
which length is less than 8. it also has better performance
in case src and dest cannot be both aligned with 8bytes
strcpy-lsx 20%-80%
strcpy-lasx 15%-86%
stpcpy-aligned 6%-43%
stpcpy-unaligned 8%-48%
stpcpy-lsx 10%-80%
stpcpy-lasx 10%-87%