Commit Graph

40333 Commits

Samuel Thibault
6333a6014f __call_tls_dtors: Use call_function_static_weak 2023-09-04 20:03:37 +02:00
Bruno Haible
2897b231a6 intl: Treat C.UTF-8 locale like C locale (BZ# 16621)
The wiki page https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
says that "Setting LC_ALL=C.UTF-8 will ignore LANGUAGE just like it
does with LC_ALL=C." This patch implements it.

* intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
like the C locale.
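
A hedged sketch of the observable behavior (the "hello" text domain and
message are hypothetical):

  #include <libintl.h>
  #include <locale.h>
  #include <stdio.h>
  #include <stdlib.h>

  int
  main (void)
  {
    /* LANGUAGE would normally override the locale for translations.  */
    setenv ("LANGUAGE", "de", 1);
    setenv ("LC_ALL", "C.UTF-8", 1);
    setlocale (LC_ALL, "");
    bindtextdomain ("hello", "/usr/share/locale");  /* hypothetical domain */
    textdomain ("hello");
    /* With this patch, C.UTF-8 ignores LANGUAGE just like C does, so
       the untranslated msgid is printed.  */
    puts (gettext ("Hello, world!"));
    return 0;
  }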

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-09-04 15:31:36 +02:00
Samuel Thibault
8076906109 htl: Fix stack information for main thread
We can simply ask the kernel directly with vm_region rather than
assuming a one-page stack.
2023-09-03 21:11:29 +02:00
Samuel Thibault
89ade8d8cb htl: thread_local destructors support 2023-09-03 15:23:56 +02:00
Szabolcs Nagy
d2123d6827 elf: Fix slow tls access after dlopen [BZ #19924]
In short: __tls_get_addr checks the global generation counter, and if
the current dtv is older, _dl_update_slotinfo updates the dtv up to the
generation of the accessed module. So if the global generation is newer
than the generation of the module, then __tls_get_addr keeps hitting the
slow dtv update path. The dtv update path includes a number of checks
to see if any update is needed, and this already causes a measurable tls
access slowdown after dlopen.

It may be possible to detect up-to-date dtv faster.  But if there are
many modules loaded (> TLS_SLOTINFO_SURPLUS) then this requires at
least walking the slotinfo list.

This patch tries to update the dtv to the global generation instead, so
after a dlopen the tls access slow path is only hit once.  The modules
with larger generation than the accessed one were not necessarily
synchronized before, so additional synchronization is needed.

This patch uses acquire/release synchronization when accessing the
generation counter.

Note: in the x86_64 version of dl-tls.c the generation is only loaded
once, since relaxed mo is not faster than acquire mo load.
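
A hedged sketch of the acquire/release pairing on the generation
counter (C11 atomics; names are illustrative, not the actual glibc
code):

  #include <stdatomic.h>
  #include <stddef.h>

  static atomic_size_t global_gen;   /* stand-in for dl_tls_generation */

  /* dlopen path: publish a new generation after the slotinfo update so
     that readers which observe it also observe the update.  */
  void
  publish_generation (size_t new_gen)
  {
    atomic_store_explicit (&global_gen, new_gen, memory_order_release);
  }

  /* __tls_get_addr path: one acquire load; update the dtv all the way
     to this generation so the slow path is hit only once per dlopen.  */
  size_t
  read_generation (void)
  {
    return atomic_load_explicit (&global_gen, memory_order_acquire);
  }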

I have not benchmarked this. Tested by Adhemerval Zanella on aarch64,
powerpc, sparc, x86 who reported that it fixes the performance issue
of bug 19924.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-01 08:21:37 +01:00
H.J. Lu
1493622f4f x86: Check the lower byte of EAX of CPUID leaf 2 [BZ #30643]
The old Intel software developer manual specified that the low byte of
EAX of CPUID leaf 2 returned 1, which indicated the number of rounds of
CPUID leaf 2 needed to retrieve the complete cache information. The
newer Intel manual has been changed to state that it should always
return 1 and be ignored.  If the lower byte isn't 1, CPUID leaf 2 can't
be used.  In this case, we ignore CPUID leaf 2 and use CPUID leaf 4
instead.  If CPUID leaf 4 doesn't contain the cache information, cache
information isn't available at all.  This addresses BZ #30643.
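
A hedged sketch of the check using GCC's <cpuid.h> (illustrative, not
the actual glibc code):

  #include <cpuid.h>
  #include <stdio.h>

  int
  main (void)
  {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid (2, &eax, &ebx, &ecx, &edx))
      return 1;
    /* Per the newer manual the low byte should always be 1; if it
       isn't, fall back to CPUID leaf 4 for cache information.  */
    if ((eax & 0xff) != 1)
      puts ("leaf 2 unusable, query leaf 4 instead");
    else
      puts ("leaf 2 descriptors valid");
    return 0;
  }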
2023-08-29 12:57:41 -07:00
lijianglin
e1d3312015 add GB18030-2022 charmap and test the entire GB18030 charmap [BZ #30243]
Support GB18030-2022 by adding and changing some transcoding
relationships.  Details are as follows:
Add 25 transcoding relationships:
  UE81E 0x82359037
  UE826 0x82359038
  UE82B 0x82359039
  UE82C 0x82359130
  UE832 0x82359131
  UE843 0x82359132
  UE854 0x82359133
  UE864 0x82359134
  UE78D 0x84318236
  UE78F 0x84318237
  UE78E 0x84318238
  UE790 0x84318239
  UE791 0x84318330
  UE792 0x84318331
  UE793 0x84318332
  UE794 0x84318333
  UE795 0x84318334
  UE796 0x84318335
  UE816 0xfe51
  UE817 0xfe52
  UE818 0xfe53
  UE831 0xfe6c
  UE83B 0xfe76
  UE855 0xfe91
Change 6 transcoding relationships:
  U20087 0x95329031
  U20089 0x95329033
  U200CC 0x95329730
  U215D7 0x9536b937
  U2298F 0x9630ba35
  U241FE 0x9635b630
Test the entire GB18030 charmap, not only the Unicode BMP part.
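
A hedged example exercising one of the added mappings with iconv
(U+E816 -> 0xFE 0x51, from the list above):

  #include <iconv.h>
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    iconv_t cd = iconv_open ("GB18030", "UTF-8");
    if (cd == (iconv_t) -1)
      return 1;
    char in[] = "\xee\xa0\x96";               /* U+E816 in UTF-8 */
    char out[8];
    char *inp = in, *outp = out;
    size_t inleft = strlen (in), outleft = sizeof out;
    if (iconv (cd, &inp, &inleft, &outp, &outleft) == (size_t) -1)
      return 1;
    /* Expect the two bytes 0xfe 0x51 per the new mapping.  */
    printf ("%02hhx %02hhx\n", out[0], out[1]);
    iconv_close (cd);
    return 0;
  }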

Co-authored-by: yangyanchao <yangyanchao6@huawei.com>
Co-authored-by: liqingqing <liqingqing3@huawei.com>
Co-authored-by: Bruno Haible <bruno@clisp.org>
Reviewed-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Mike FABIAN <mfabian@redhat.com>
2023-08-29 19:02:30 +02:00
Joseph Myers
d3c34a2dd9 Use GMP 6.3.0, MPFR 4.2.1 in build-many-glibcs.py
This patch makes build-many-glibcs.py use the new GMP 6.3.0 and MPFR
4.2.1 releases.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
2023-08-29 14:11:35 +00:00
Colin Leroy-Mira
dfe8c44588 localedata: Translit common emojis to smileys [BZ #30649]
Add common emojis to the translit-able characters (mostly
faces and hearts), and translit them to old-fashioned
smileys.
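
A hedged sketch of observing the transliteration through iconv's
//TRANSLIT (the exact smiley produced comes from the locale data):

  #include <iconv.h>
  #include <locale.h>
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    setlocale (LC_ALL, "en_US.UTF-8");        /* locale with translit data */
    iconv_t cd = iconv_open ("ASCII//TRANSLIT", "UTF-8");
    if (cd == (iconv_t) -1)
      return 1;
    char in[] = "\xf0\x9f\x98\x80";           /* U+1F600 GRINNING FACE */
    char out[16];
    char *inp = in, *outp = out;
    size_t inleft = strlen (in), outleft = sizeof out;
    if (iconv (cd, &inp, &inleft, &outp, &outleft) == (size_t) -1)
      return 1;
    out[sizeof out - outleft] = '\0';
    puts (out);                               /* e.g. ":-)" */
    iconv_close (cd);
    return 0;
  }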

Signed-off-by: Colin Leroy-Mira <colin@colino.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-08-29 09:31:23 +02:00
Florian Weimer
c00b984fcd nscd: Skip unusable entries in first pass in prune_cache (bug 30800)
Previously, if an entry was marked unusable for any reason, but had
not timed out yet, the assert would trigger.

One way to get into such a state is if a data change is detected during
re-validation of an entry.  This causes the entry to be marked as not
usable.  If nscd exits soon after that, the clock jumps backwards, and
nscd is restarted, then the cache re-validation run after startup
triggers the removed assert.

The change is more complicated than just the removal of the assert
because entries marked as not usable should be garbage-collected in
the second pass.  To make this happen, it is necessary to update some
book-keeping data.
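
A hedged sketch of the two-pass structure (field and function names are
illustrative, not the actual nscd data structures):

  #include <stdbool.h>
  #include <stddef.h>
  #include <time.h>

  struct entry { bool usable; time_t timeout; bool marked; struct entry *next; };

  static void
  prune_cache (struct entry **headp, time_t now)
  {
    /* Pass 1: mark entries that timed out *or* were made unusable;
       previously an assert effectively required marked entries to have
       expired.  */
    for (struct entry *e = *headp; e != NULL; e = e->next)
      e->marked = !e->usable || e->timeout <= now;

    /* Pass 2: garbage-collect everything marked above.  */
    for (struct entry **p = headp; *p != NULL; )
      if ((*p)->marked)
        *p = (*p)->next;        /* unlink; real code also frees storage */
      else
        p = &(*p)->next;
  }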

Reviewed-by: DJ Delorie <dj@redhat.com>
2023-08-29 08:28:38 +02:00
dengjianbo
693918b6dd LoongArch: Change loongarch to LoongArch in comments 2023-08-29 10:35:38 +08:00
dengjianbo
ea7698a616 LoongArch: Add ifunc support for memcmp{aligned, lsx, lasx}
According to glibc memcmp microbenchmark test results (measured against
the generic memcmp), these implementations show a performance
improvement except when the length is less than 3; details as below:

Name             Percent of time reduced
memcmp-lasx      16%-74%
memcmp-lsx       20%-50%
memcmp-aligned   5%-20%
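
For reference, a hedged sketch of the ifunc mechanism that selects a
variant at load time (the probe and variant names are illustrative, not
the actual glibc code):

  #include <stddef.h>
  #include <string.h>

  static int
  memcmp_aligned (const void *a, const void *b, size_t n)
  { return memcmp (a, b, n); }            /* stand-in for the real variant */

  static int
  memcmp_lsx (const void *a, const void *b, size_t n)
  { return memcmp (a, b, n); }            /* stand-in for the LSX version */

  static int
  cpu_has_lsx (void) { return 0; }        /* hypothetical hwcap probe */

  /* The dynamic loader calls the resolver once and binds my_memcmp to
     whichever implementation it returns.  */
  static void *
  resolve_memcmp (void)
  {
    return cpu_has_lsx () ? (void *) memcmp_lsx : (void *) memcmp_aligned;
  }

  int my_memcmp (const void *, const void *, size_t)
    __attribute__ ((ifunc ("resolve_memcmp")));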
2023-08-29 10:35:38 +08:00
dengjianbo
1b1e9b7c10 LoongArch: Add ifunc support for memset{aligned, unaligned, lsx, lasx}
According to glibc memset microbenchmark test results, for the LSX and
LASX versions, a few cases with length less than 8 experience
performance degradation.  Overall, the LASX version could reduce the
runtime about 15%-75% and the LSX version about 15%-50%.

The unaligned version uses unaligned memory accesses to set data whose
length is less than 64 and to make the address 8-byte aligned.  For this
part, the performance is better than the aligned version.  Compared with
the generic version, the performance is close when the length is larger
than 128.  When the length is 8-128, the unaligned version could reduce
the runtime about 30%-70%, and the aligned version about 20%-50%.
2023-08-29 10:35:38 +08:00
dengjianbo
55e84dc6ed LoongArch: Add ifunc support for memrchr{lsx, lasx}
According to the glibc memrchr microbenchmark, this implementation could
reduce the runtime as follows:

Name            Percent of runtime reduced
memrchr-lasx    20%-83%
memrchr-lsx     20%-64%
2023-08-29 10:35:38 +08:00
dengjianbo
60bcb9acbf LoongArch: Add ifunc support for memchr{aligned, lsx, lasx}
According to the glibc memchr microbenchmark, this implementation could
reduce the runtime as follows:

Name               Percent of runtime reduced
memchr-lasx        37%-83%
memchr-lsx         30%-66%
memchr-aligned     0%-15%
2023-08-29 10:35:38 +08:00
dengjianbo
f8664fe215 LoongArch: Add ifunc support for rawmemchr{aligned, lsx, lasx}
According to the glibc rawmemchr microbenchmark, a few cases tested with
char '\0' experience performance degradation because the LASX and LSX
versions don't handle '\0' separately.  Overall, the rawmemchr-lasx
implementation could reduce the runtime about 40%-80%, the rawmemchr-lsx
implementation about 40%-66%, and the rawmemchr-aligned implementation
about 20%-40%.
2023-08-29 10:35:38 +08:00
Xi Ruoyao
3efa26749e LoongArch: Micro-optimize LD_PCREL
We are requiring Binutils >= 2.41, so explicit relocation syntax is
always supported by the assembler.  Use it to reduce one instruction.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
2023-08-29 10:35:38 +08:00
Xi Ruoyao
aac842d0ed LoongArch: Remove support code for old linker in start.S
We are requiring Binutils >= 2.41, so la.pcrel always works here.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
2023-08-29 10:35:38 +08:00
Xi Ruoyao
e757412c3e LoongArch: Simplify the autoconf check for static PIE
We are strictly requiring GAS >= 2.41 now, so we don't need to check
assembler capability anymore.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
2023-08-29 10:35:38 +08:00
Kir Kolyshkin
42c960a4f1 Add F_SEAL_EXEC from Linux 6.3 to bits/fcntl-linux.h.
This patch adds the new F_SEAL_EXEC constant from Linux 6.3 (see Linux
commit 6fd7353829c, "mm/memfd: add F_SEAL_EXEC") to bits/fcntl-linux.h.
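
A hedged usage sketch (requires Linux >= 6.3; on older kernels the
fcntl call fails):

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int
  main (void)
  {
    int fd = memfd_create ("sealed", MFD_ALLOW_SEALING);
    if (fd < 0)
      return 1;
    /* Seal the execute permission: later attempts to change the
       memfd's x-bits (e.g. via fchmod) are rejected.  */
    if (fcntl (fd, F_ADD_SEALS, F_SEAL_EXEC) < 0)
      perror ("F_ADD_SEALS");
    else
      puts ("F_SEAL_EXEC applied");
    close (fd);
    return 0;
  }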

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-28 14:51:39 -03:00
Joe Simmons-Talbott
46924663bd argp-parse: Get rid of alloca
Even though the alloca usage is relatively small and of fixed size, the
code can be written without using alloca.  Convert to local variables.

Checked on x86_64-linux-gnu.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-28 16:49:27 +00:00
Joe Simmons-Talbott
4d8b093933 gencat: Get rid of alloca.
Convert to scratch_buffers to avoid potential stack overflow.
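
A hedged sketch of the scratch_buffer pattern (a glibc-internal API
from <scratch_buffer.h>; not the actual gencat code):

  #include <scratch_buffer.h>
  #include <stdbool.h>
  #include <string.h>

  static bool
  copy_string (const char *src, size_t len)
  {
    struct scratch_buffer buf;
    scratch_buffer_init (&buf);         /* starts with on-stack storage */
    if (!scratch_buffer_set_array_size (&buf, len + 1, 1))
      return false;                     /* falls back to malloc; may fail */
    memcpy (buf.data, src, len);
    ((char *) buf.data)[len] = '\0';
    /* ... use buf.data ... */
    scratch_buffer_free (&buf);         /* frees heap storage, if any */
    return true;
  }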

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-28 16:42:53 +00:00
Adhemerval Zanella
87ced255bd m68k: Use M68K_SCALE_AVAILABLE on __mpn_lshift and __mpn_rshift
This patch adds a new macro, M68K_SCALE_AVAILABLE, similar to gmp's
scale_available_p (mpn/m68k/m68k-defs.m4), that expands to 1 if a
scale factor can be used in addressing modes.  It is used
instead of __mc68020__ for some optimization decisions.

Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
2023-08-25 10:07:24 -03:00
Adhemerval Zanella
b85880633f m68k: Fix build with -mcpu=68040 or higher (BZ 30740)
GCC currently does not define __mc68020__ for -mcpu=68040 or higher,
which breaks memcpy/memmove assumptions.  Since this memory copy
optimization seems only intended for m68020, disable it for other
m680X0 variants.

Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
2023-08-25 10:07:24 -03:00
Florian Weimer
3d9265467e elf: Check that --list-diagnostics output has the expected syntax
Parts of elf/tst-rtld-list-diagnostics.py have been copied from
scripts/tst-ld-trace.py.

The abnf module is entirely optional and used to verify the
ABNF grammar as included in the manual.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-25 14:19:16 +02:00
Florian Weimer
f21962ddfc manual: Document ld.so --list-diagnostics output
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-25 14:15:28 +02:00
Mark Wielaard
5a21cefd5a manual/jobs.texi: Add missing @item EPERM for getpgid
The missing @item makes it look like errno will be set to ESRCH
if a cross-session getpgid is not permitted.

Found by ulfvonbelow on irc.
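
A hedged illustration of the two distinct error cases the manual now
documents:

  #include <errno.h>
  #include <stdio.h>
  #include <unistd.h>

  int
  main (void)
  {
    if (getpgid ((pid_t) 1) < 0)
      {
        if (errno == EPERM)
          puts ("process is in another session and that is not permitted");
        else if (errno == ESRCH)
          puts ("no process with that ID");
      }
    return 0;
  }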
2023-08-25 11:43:30 +02:00
dengjianbo
ddbb74f5c2 LoongArch: Add ifunc support for strncmp{aligned, lsx}
Based on the glibc microbenchmark, only a few short inputs to the
strncmp-aligned and strncmp-lsx implementations experience performance
degradation.  Overall, strncmp-aligned could reduce the runtime 0%-10%
for aligned comparison and 10%-25% for unaligned comparison; strncmp-lsx
could reduce the runtime about 0%-60%.
2023-08-24 17:19:47 +08:00
dengjianbo
82d9426e4a LoongArch: Add ifunc support for strcmp{aligned, lsx}
Based on the glibc microbenchmark, the strcmp-aligned implementation
could reduce the runtime 0%-10% for aligned comparison and 10%-20% for
unaligned comparison; the strcmp-lsx implementation could reduce the
runtime 0%-50%.
2023-08-24 17:19:47 +08:00
dengjianbo
e74d959862 LoongArch: Add ifunc support for strnlen{aligned, lsx, lasx}
Based on the glibc microbenchmark, the strnlen-aligned implementation
could reduce the runtime by more than 10%, the strnlen-lsx
implementation by about 50%-78%, and the strnlen-lasx implementation by
about 50%-88%.
2023-08-24 17:19:47 +08:00
Guy-Fleury Iteriteka
1dc0bc8f07 htl: move pthread_attr_setdetachstate into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-11-gfleury@disroot.org>
2023-08-24 01:57:22 +02:00
Guy-Fleury Iteriteka
92a6c26470 htl: move pthread_attr_getdetachstate into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-10-gfleury@disroot.org>
2023-08-24 01:57:17 +02:00
Guy-Fleury Iteriteka
c2c9feebdc htl: move pthread_attr_setschedpolicy into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-9-gfleury@disroot.org>
2023-08-24 01:57:16 +02:00
Guy-Fleury Iteriteka
0f3a39072b htl: move pthread_attr_getschedpolicy into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-8-gfleury@disroot.org>
2023-08-24 01:57:14 +02:00
Guy-Fleury Iteriteka
fb2d92a5b3 htl: move pthread_attr_setinheritsched into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-7-gfleury@disroot.org>
2023-08-24 01:57:13 +02:00
Guy-Fleury Iteriteka
62cf5d2bb3 htl: move pthread_attr_getinheritsched into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-6-gfleury@disroot.org>
2023-08-24 01:57:11 +02:00
Guy-Fleury Iteriteka
79de1a0ca2 htl: move pthread_attr_getschedparam into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-5-gfleury@disroot.org>
2023-08-24 01:57:10 +02:00
Guy-Fleury Iteriteka
3caa6362d0 htl: move pthread_setschedparam into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-4-gfleury@disroot.org>
2023-08-24 01:57:08 +02:00
Guy-Fleury Iteriteka
a1a942fb5f htl: move pthread_getschedparam into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-3-gfleury@disroot.org>
2023-08-24 01:57:04 +02:00
Guy-Fleury Iteriteka
9dfa256216 htl: move pthread_equal into libc
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230716084414.107245-2-gfleury@disroot.org>
2023-08-24 01:56:57 +02:00
Florian Weimer
65a5112ede Linux: Avoid conflicting types in ld.so --list-diagnostics
The field auxv[*].a_val could be either an integer or a string,
depending on the a_type value.  Use a separate field, a_val_string, to
simplify mechanical parsing of the --list-diagnostics output.
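
Illustrative (not verbatim) shape of the output after the change:

  auxv[0x0].a_type=0x10
  auxv[0x0].a_val=0x1fb897
  auxv[0x6].a_type=0x1f
  auxv[0x6].a_val_string="/usr/bin/true"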

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-23 08:12:48 +02:00
Florian Weimer
f6c8204fd7 elf: Do not run constructors for proxy objects
Otherwise, the ld.so constructor runs for each audit namespace
and each dlmopen namespace.
2023-08-22 13:56:25 +02:00
H.J. Lu
a8ecb126d4 x86_64: Add log1p with FMA
On Skylake, it changes log1p bench performance by:

        Before       After     Improvement
max     63.349       58.347       8%
min     4.448        5.651        -30%
mean    12.0674      10.336       14%

The minimum code path is

 if (hx < 0x3FDA827A)                          /* x < 0.41422  */
    {
      if (__glibc_unlikely (ax >= 0x3ff00000))           /* x <= -1.0 */
        {
	   ...
        }
      if (__glibc_unlikely (ax < 0x3e200000))           /* |x| < 2**-29 */
        {
          math_force_eval (two54 + x);          /* raise inexact */
          if (ax < 0x3c900000)                  /* |x| < 2**-54 */
            {
	      ...
            }
          else
            return x - x * x * 0.5;

FMA and non-FMA code sequences look similar.  Non-FMA version is slightly
faster.  Since log1p is called by asinh and atanh, it improves asinh
performance by:

        Before       After     Improvement
max     75.645       63.135       16%
min     10.074       10.071       0%
mean    15.9483      14.9089      6%

and improves atanh performance by:

        Before       After     Improvement
max     91.768       75.081       18%
min     15.548       13.883       10%
mean    18.3713      16.8011      8%
2023-08-21 10:44:26 -07:00
Andreas Schwab
ce99601fa8 Remove references to the defunct db2 subdir
The db2 subdir was removed more than 20 years ago.
2023-08-21 18:20:53 +02:00
Mahesh Bodapati
f1c7ed0859 string: Fix tester build with fortify enable with gcc < 12
When building with fortify enabled, GCC < 12 issues a warning that the
fortify strncat wrapper might overflow the destination buffer (the
failure is tied to -Werror).

Checked on ppc64 and x86_64.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-08-18 07:59:55 -05:00
Stefan Liebler
f5f96b784b s390x: Fix static PIE condition for toolchain bootstrapping.
The static PIE configure check uses link tests.  When bootstrapping
a cross-toolchain, the link tests fail due to missing crt-files /
libc.so.  As we explicitly want to test an issue in binutils (ld),
we now also explicitly check for known linker versions.

See also commit 368b7c614b
S390: Use compile-only instead of also link-tests in configure.
2023-08-18 10:57:59 +02:00
Andreas Schwab
464fd8249e m68k: fix __mpn_lshift and __mpn_rshift for non-68020
From revision 03f3d275d0d6 in the gmp repository.
2023-08-17 21:56:14 +02:00
Sam James
369f373057 sysdeps: tst-bz21269: fix -Wreturn-type
Thanks to Andreas Schwab for reporting.

Fixes: 652b9fdb77
Signed-off-by: Sam James <sam@gentoo.org>
2023-08-17 09:30:57 +01:00
dengjianbo
8944ba483f Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx} and memmove{aligned, unaligned, lsx, lasx}
These implementations improve the time to copy data in the glibc
microbenchmark as below:
memcpy-lasx       reduces the runtime about 8%-76%
memcpy-lsx        reduces the runtime about 8%-72%
memcpy-unaligned  reduces the runtime of unaligned data copying up to 40%
memcpy-aligned    reduces the runtime of unaligned data copying up to 25%
memmove-lasx      reduces the runtime about 20%-73%
memmove-lsx       reduces the runtime about 50%
memmove-unaligned reduces the runtime of unaligned data moving up to 40%
memmove-aligned   reduces the runtime of unaligned data moving up to 25%
2023-08-17 10:12:18 +08:00
dengjianbo
ba67bc8e0a Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and strchrnul{aligned, lsx, lasx}
These implementations improve the time to run the strchr{nul}
microbenchmarks in glibc as below:
strchr-lasx       reduces the runtime about 50%-83%
strchr-lsx        reduces the runtime about 30%-67%
strchr-aligned    reduces the runtime about 10%-20%
strchrnul-lasx    reduces the runtime about 50%-83%
strchrnul-lsx     reduces the runtime about 36%-65%
strchrnul-aligned reduces the runtime about 6%-10%
2023-08-17 10:12:18 +08:00