Commit Graph

38218 Commits

Author SHA1 Message Date
Adhemerval Zanella
063f9ba220 elf: Avoid unnecessary slowdown from profiling with audit (BZ#15533)
The rtld-audit interface introduces a slowdown due to enabling
profiling instrumentation (as if LD_AUDIT implied LD_PROFILE).
However, instrumenting is only necessary if one of the audit libraries
provides PLT callbacks (la_pltenter or la_pltexit symbols).  Otherwise,
the slowdown can be avoided.

This patch adjusts the logic that enables profiling to iterate
over all audit modules and check whether any of them provides a PLT
hook.  To keep la_symbind working even without PLT callbacks, _dl_fixup
now calls the audit callback if the module implements it.

Co-authored-by: Alexander Monakov <amonakov@ispras.ru>

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
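As a rough illustration of the check described in 063f9ba220 above, the
sketch below walks a list of audit modules and reports whether any of
them supplies a PLT hook.  The struct audit_module type and the
audit_needs_plt_profiling name are hypothetical stand-ins chosen for
this example; they are not glibc's internal data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the loader's per-module audit record;
   glibc's real structure has more fields and different names.  */
struct audit_module
{
  void *pltenter;             /* la_pltenter address, or NULL if absent */
  void *pltexit;              /* la_pltexit address, or NULL if absent */
  struct audit_module *next;  /* next loaded audit module */
};

/* Return true only if at least one audit module provides a PLT hook,
   i.e. only then does PLT instrumentation need to be enabled.  */
bool
audit_needs_plt_profiling (const struct audit_module *head)
{
  for (const struct audit_module *m = head; m != NULL; m = m->next)
    if (m->pltenter != NULL || m->pltexit != NULL)
      return true;
  return false;
}
```

With a list in which no module sets either pointer, the function
returns false and the profiling path could be skipped.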
Adhemerval Zanella
8c0664e2b8 elf: Add _dl_audit_pltexit
It consolidates the code required to call la_pltexit audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
eff687e846 elf: Add _dl_audit_pltenter
It consolidates the code required to call la_pltenter audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
0b98a87487 elf: Add _dl_audit_preinit
It consolidates the code required to call la_preinit audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
cda4f265c6 elf: Add _dl_audit_symbind_alt and _dl_audit_symbind
It consolidates the code required to call la_symbind{32,64} audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
311c9ee54e elf: Add _dl_audit_objclose
It consolidates the code required to call la_objclose audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
c91008d349 elf: Add _dl_audit_objsearch
It consolidates the code required to call la_objsearch audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
3dac3959a5 elf: Add _dl_audit_activity_map and _dl_audit_activity_nsid
It consolidates the code required to call la_activity audit
callback.

Also, for a new Lmid_t the namespace link_map list is empty, so it
must be checked before it is used.  This can happen when an audit
module is used along with dlmopen.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Adhemerval Zanella
aee6e90f93 elf: Add _dl_audit_objopen
It consolidates the code required to call la_objopen audit callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-28 08:40:38 -03:00
Samuel Thibault
ae49f218da hurd: Fix static-PIE startup
hurd initialization stages use RUN_HOOK to run various initialization
functions.  That is however using absolute addresses which need to be
relocated, which is done later by csu.  We can however easily make the
linker compute relative addresses which thus don't need a relocation.
The new SET_RELHOOK and RUN_RELHOOK macros implement this.
2021-12-28 10:28:22 +01:00
Samuel Thibault
2ce0481d26 hurd: let csu initialize tls
Since 9cec82de71 ("htl: Initialize later"), we let csu initialize
pthreads. We can thus let it initialize tls later too, to better align
with the generic order.  Initialization however accesses ports which
links/unlinks into the sigstate for unwinding.  We can however easily
skip that during initialization.
2021-12-28 10:15:52 +01:00
Samuel Thibault
7b358de1af hurd: Fix XFAIL-ing mallocfork2 tests
They are using setpshared but are outside the htl directory.
2021-12-27 22:21:08 +01:00
Samuel Thibault
1c6e6e52e5 hurd: XFAIL more tests that require setpshared support 2021-12-27 22:15:43 +01:00
Samuel Thibault
53c38911b8 malloc: Add missing shared thread library flags 2021-12-27 22:10:15 +01:00
Samuel Thibault
422e4cd0ff stdio-common: Fix %m sprintf test output for GNU/Hurd
GNU/Hurd has slightly different error messages for undefined numbers,
due to the notion of error subsystems.
2021-12-27 21:23:05 +01:00
Noah Goldstein
cca457f9c5 x86: Optimize L(less_vec) case in memcmpeq-evex.S
No bug.
Optimizations are twofold.

1) Replace page cross and 0/1 checks with masked load instructions in
   L(less_vec). In applications this reduces branch-misses in the
   hot [0, 32] case.
2) Change control flow so that L(less_vec) case gets the fall through.

Change 2) helps copies in the [0, 32] size range but comes at the cost
of copies in the [33, 64] size range.  From profiles of GCC and
Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this
appears to be the right tradeoff.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-27 03:18:58 -06:00
Noah Goldstein
abddd61de0 x86: Optimize L(less_vec) case in memcmp-evex-movbe.S
No bug.
Optimizations are twofold.

1) Replace page cross and 0/1 checks with masked load instructions in
   L(less_vec). In applications this reduces branch-misses in the
   hot [0, 32] case.
2) Change control flow so that L(less_vec) case gets the fall through.

Change 2) helps copies in the [0, 32] size range but comes at the cost
of copies in the [33, 64] size range.  From profiles of GCC and
Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this
appears to be the right tradeoff.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-27 03:17:59 -06:00
H.J. Lu
268d812c19 elf: Remove AArch64 from comment for AT_MINSIGSTKSZ
Remove AArch64 from comment for AT_MINSIGSTKSZ to match

commit 7cd60e43a6def40ecb75deb8decc677995970d0b
Author: Chang S. Bae <chang.seok.bae@intel.com>
Date:   Tue May 18 13:03:15 2021 -0700

    uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ

    Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
    as generic ABI in glibc's generic elf.h, and this define will prevent
    future namespace conflicts. In particular, x86 is also using this
    generic definition.

in Linux kernel 5.14.
2021-12-23 06:48:24 -08:00
H.J. Lu
6e30181b4a math: Properly cast X_TLOSS to float [BZ #28713]
Add

 #define AS_FLOAT_CONSTANT_1(x) x##f
 #define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1(x)

to cast X_TLOSS to float at compile-time to fix:

FAIL: math/test-float-j0
FAIL: math/test-float-jn
FAIL: math/test-float-y0
FAIL: math/test-float-y1
FAIL: math/test-float-yn
FAIL: math/test-float32-j0
FAIL: math/test-float32-jn
FAIL: math/test-float32-y0
FAIL: math/test-float32-y1
FAIL: math/test-float32-yn

when compiling with GCC 12.

Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-12-23 06:45:47 -08:00
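The two-level macro in 6e30181b4a above can be tried in isolation.  The
sketch below assumes a stand-in constant named EXAMPLE_TLOSS, since the
message does not show the definition of X_TLOSS; the point is only that
the inner macro pastes an f suffix onto the already expanded literal, so
the result is a float constant with no run-time conversion.

```c
/* Two-level macro so that the argument is fully expanded before the 'f'
   suffix is pasted on; pasting directly would produce the single token
   EXAMPLE_TLOSSf instead of a suffixed literal.  */
#define AS_FLOAT_CONSTANT_1(x) x##f
#define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1 (x)

/* Illustrative value only; an assumption made for this example.  */
#define EXAMPLE_TLOSS 1.41484755040568800000e+16

/* Expands to the literal 1.41484755040568800000e+16f.  */
float
example_tloss_as_float (void)
{
  return AS_FLOAT_CONSTANT (EXAMPLE_TLOSS);
}
```

Because the suffix is applied at preprocessing time, sizeof applied to
AS_FLOAT_CONSTANT (EXAMPLE_TLOSS) is sizeof (float).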
Adhemerval Zanella
a4b4131355 Set __TIMESIZE default to 64
This is the expected size for newer ABIs.
2021-12-23 11:41:08 -03:00
Florian Weimer
9702a7901e stdio: Implement %#m for vfprintf and related functions
%#m prints errno as an error constant if one is available, or
a decimal number as a fallback.  This intends to address the gap
that strerrorname_np does not work well with printf for unknown
error codes due to its NULL return values in those cases.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-23 15:02:50 +01:00
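The behaviour that 9702a7901e above describes for %#m (error constant
when one is known, decimal number otherwise) can be approximated by
hand.  The sketch below is not glibc's implementation: errname_lookup is
a hypothetical two-entry table standing in for a real name lookup, and
format_errname is a helper name invented for this example.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical lookup covering only two constants; returns NULL for any
   other value, mirroring the NULL return the message attributes to
   strerrorname_np for unknown codes.  */
const char *
errname_lookup (int errnum)
{
  switch (errnum)
    {
    case ENOENT: return "ENOENT";
    case EINVAL: return "EINVAL";
    default:     return NULL;
    }
}

/* Write the constant name if one is available, else the decimal value.
   Returns what snprintf returns.  */
int
format_errname (char *buf, size_t size, int errnum)
{
  const char *name = errname_lookup (errnum);
  if (name != NULL)
    return snprintf (buf, size, "%s", name);
  return snprintf (buf, size, "%d", errnum);
}
```

The fallback branch is the part that a plain "%s" with a NULL-returning
lookup cannot express.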
Florian Weimer
cd0c333d2e elf: Remove unused NEED_DL_BASE_ADDR and _dl_base_addr
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-23 14:12:56 +01:00
Sunil K Pandey
f20f980c71 x86-64: Add vector acos/acosf implementation to libmvec
Implement vectorized acos/acosf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector acos/acosf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-22 13:03:14 -08:00
Andrea Monaco
c6d7d6312c intl/plural.y: Avoid conflicting declarations of yyerror and yylex
bison-3.8 includes these lines in the generated intl/plural.c:

  #if !defined __gettexterror && !defined YYERROR_IS_DECLARED
  void __gettexterror (struct parse_args *arg, const char *msg);
  #endif
  #if !defined __gettextlex && !defined YYLEX_IS_DECLARED
  int __gettextlex (YYSTYPE *yylvalp, struct parse_args *arg);
  #endif

Those default prototypes provided by bison conflict with the
declarations later on in plural.y.  This patch solves the issue.

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2021-12-22 14:46:39 +01:00
H.J. Lu
163f625cf9 elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688]
p_align does not have to be a multiple of the page size.  Only PT_LOAD
segment layout should be aligned to the page size.

1. Remove p_align check against the page size.
2. Use the page size, instead of p_align, to check PT_LOAD segment layout.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-22 05:12:30 -08:00
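A minimal sketch of the relaxed check from 163f625cf9 above, assuming a
64-bit ELF program header and a power-of-two page size.  The function
name pt_load_layout_ok is hypothetical, and the expression is written
from the description in the message rather than copied from the loader.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* p_align itself is not compared against the page size; only the
   segment layout (virtual address minus file offset) must be a
   multiple of the page size.  PAGE_SIZE is assumed to be a power of
   two.  Non-PT_LOAD headers are accepted unchanged.  */
bool
pt_load_layout_ok (const Elf64_Phdr *ph, size_t page_size)
{
  if (ph->p_type != PT_LOAD)
    return true;
  return ((ph->p_vaddr - ph->p_offset) & (page_size - 1)) == 0;
}
```

Under this check a segment with p_align of 0x40 passes as long as its
address and offset agree modulo the page size.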
H.J. Lu
d3e4f5a101 s_sincosf.h: Change pio4 type to float [BZ #28713]
s_cosf.c and s_sinf.c have

  if (abstop12 (y) < abstop12 (pio4))

where abstop12 takes a float argument, but pio4 is static const double.
pio4 is used only in calls to abstop12 and never in arithmetic.  Apply

-static const double pio4 = 0x1.921FB54442D18p-1;
+static const float pio4 = 0x1.921FB6p-1f;

to fix:

FAIL: math/test-float-cos
FAIL: math/test-float-sin
FAIL: math/test-float-sincos
FAIL: math/test-float32-cos
FAIL: math/test-float32-sin
FAIL: math/test-float32-sincos

when compiling with GCC 12.

Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-12-21 08:56:12 -08:00
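To see why a float constant is sufficient in d3e4f5a101 above, the
sketch below assumes that abstop12 extracts the top 12 bits of the
float representation with the sign bit masked off.  That definition is
an assumption made for this example (the message does not show it), and
the names float_bits, abstop12_sketch and pio4_float are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer.  */
uint32_t
float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Assumed behaviour: top 12 bits of the representation, sign masked.  */
uint32_t
abstop12_sketch (float x)
{
  return (float_bits (x) >> 20) & 0x7ff;
}

/* The float constant from the patch; no double-to-float conversion is
   needed when it is passed to a function taking float.  */
float
pio4_float (void)
{
  return 0x1.921FB6p-1f;
}
```

Since only the leading bits are compared, the double constant rounded to
float and the float constant give the same result.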
maminjie
e0fc721ce6 Linux: Fix 32-bit vDSO for clock_gettime on powerpc32
When the clock_id is CLOCK_PROCESS_CPUTIME_ID or CLOCK_THREAD_CPUTIME_ID
on a 32-bit powerpc 5.10 kernel, the 32-bit vDSO call succeeds (because
__kernel_clock_gettime in arch/powerpc/kernel/vdso32/gettimeofday.S
does not support these two IDs, the 32-bit time_t syscall is used),
but tp32.tv_sec is equal to 0, so the 64-bit time_t syscall is still
made, resulting in two system calls.

Fix commit 72e84d1db2.

Signed-off-by: maminjie  <maminjie2@huawei.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-21 09:47:16 -03:00
H.J. Lu
de8a0897e3 Regenerate ulps on x86_64 with GCC 12
Fix

FAIL: math/test-float-clog10
FAIL: math/test-float32-clog10

on Intel Core i7-1165G7 with GCC 12.
2021-12-20 15:25:00 -08:00
Joseph Myers
a94d9659cd Add ARPHRD_CAN, ARPHRD_MCTP to net/if_arp.h
Add the constant ARPHRD_MCTP, from Linux 5.15, to net/if_arp.h, along
with ARPHRD_CAN which was added to Linux in version 2.6.25 (commit
cd05acfe65ed2cf2db683fa9a6adb8d35635263b, "[CAN]: Allocate protocol
numbers for PF_CAN") but apparently missed for glibc at the time.

Tested for x86_64.
2021-12-20 15:38:32 +00:00
Adhemerval Zanella
691d9ae9e6 Remove unused tcb-offset
Some architectures do not use the auto-generated tcb-offsets.h.
2021-12-17 17:47:29 -03:00
Aurelien Jarno
225da459ce riscv: align stack before calling _dl_init [BZ #28703]
Align the stack pointer to 128 bits during the call to _dl_init() as
specified by the RISC-V ABI [1]. This fixes the elf/tst-align2 test.

Fixes bug 28703.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
2021-12-17 20:29:34 +01:00
Aurelien Jarno
d2e594d715 riscv: align stack in clone [BZ #28702]
The RISC-V ABI [1] mandates that "the stack pointer shall be aligned to
a 128-bit boundary upon procedure entry". This was not the case in clone.

This fixes the misc/tst-misalign-clone-internal and
misc/tst-misalign-clone tests.

Fixes bug 28702.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
2021-12-17 20:29:32 +01:00
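The 128-bit alignment rule quoted in the two riscv commits above comes
down to clearing the low four bits of the stack address.  The sketch
below shows that arithmetic in C; the real fixes are in assembly, and
align_stack_down is a hypothetical helper name used only here.

```c
#include <stdint.h>

/* Round a stack address down to a 16-byte (128-bit) boundary.  The
   stack grows downward, so rounding down keeps the result inside the
   memory the caller provided.  */
uintptr_t
align_stack_down (uintptr_t sp)
{
  return sp & ~(uintptr_t) 15;
}
```

An address that is already aligned is returned unchanged.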
Aurelien Jarno
94058f6cde elf: Fix tst-cpu-features-cpuinfo for KVM guests on some AMD systems [BZ #28704]
On KVM guests running on some AMD systems, the IBRS feature is reported
as a synthetic feature using the Intel feature, while the cpuinfo entry
stays the same. Handle that by first checking for the presence of the
Intel feature on AMD systems.

Fixes bug 28704.
2021-12-17 20:20:15 +01:00
Matheus Castanho
ae91d3df24 powerpc64[le]: Allocate extra stack frame on syscall.S
The syscall function does not allocate the extra stack frame for scv like other
assembly syscalls using DO_CALL_SCV. So after commit d120fb9941 changed the
offset that is used to save LR, syscall ended up using an invalid offset,
causing regressions on powerpc64. So make sure the extra stack frame is
allocated in syscall.S as well to make it consistent with other uses of
DO_CALL_SCV and avoid similar issues in the future.

Tested on powerpc, powerpc64, and powerpc64le (with and without scv)

Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
2021-12-17 15:40:53 -03:00
Maxim Kuvyrkov
c16dc431c8 Update copyright header in recently merged ab_GE locale
ab_GE locale was committed under DCO and this header
proposed in [1] suits it better.

[1] https://sourceware.org/pipermail/libc-alpha/2021-September/130692.html

Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
2021-12-17 18:22:21 +00:00
Siddhesh Poyarekar
2bbd07c715 fortify: Fix spurious warning with realpath
The length and object size arguments were swapped around for realpath.
Also add a smoke test so that any changes in this area get caught in
future.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-12-17 18:49:27 +05:30
Florian Weimer
b99b0f93ee nss: Use "files dns" as the default for the hosts database (bug 28700)
This matches what is currently in nss/nsswitch.conf.  The new ordering
matches what most distributions use in their installed configuration
files.

It is common to add localhost to /etc/hosts because the name does not
exist in the DNS, but is commonly used as a host name.

With the built-in "dns [!UNAVAIL=return] files" default, dns is
searched first and provides an answer for "localhost" (NXDOMAIN).
We never look at the files database as a result, so the contents of
/etc/hosts are ignored.  This means that "getent hosts localhost"
fails without a /etc/nsswitch.conf file, even though the host name
is listed in /etc/hosts.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-12-17 12:01:25 +01:00
Florian Weimer
ce1e5b1122 arm: Guard ucontext _rtld_global_ro access by SHARED, not PIC macro
Due to PIE-by-default, PIC is now defined in more cases.  libc.a
does not have _rtld_global_ro, and statically linking setcontext
fails.  SHARED is the right condition to use, so that libc.a
references _dl_hwcap instead of _rtld_global_ro.

For static PIE support, the !SHARED case would still have to be made
PIC.  This patch does not achieve that.

Fixes commit 23645707f1
("Replace --enable-static-pie with --disable-default-pie").

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-17 11:48:44 +01:00
Siddhesh Poyarekar
72e4a717bd Fix The GNU ToolChain Authors copyright notice
I (and maybe one or two others) added a (C) to the copyright notice
regardless of the contribution checklist[1] not mentioning it.  Fix all
these instances so that the notice reads as "Copyright The GNU Toolchain
Authors" across the source code.

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-17 07:59:45 +05:30
Patrick McGehearty
0a4df6f534 Remove upper limit on tunable MALLOC_MMAP_THRESHOLD
The current limit on MALLOC_MMAP_THRESHOLD is either 1 Mbyte (for
32-bit apps) or 32 Mbytes (for 64-bit apps).  This value was set by a
patch dated 2006 (15 years ago).  Attempts to set the threshold higher
are currently ignored.

The default behavior is appropriate for many highly parallel
applications where many processes or threads are sharing RAM. In other
situations where the number of active processes or threads closely
matches the number of cores, a much higher limit may be desired by the
application designer. By today's standards on personal computers and
small servers, 2 Gbytes of RAM per core is commonly available. On
larger systems 4 Gbytes or more of RAM is sometimes available.
Instead of raising the limit to match current needs, this patch
proposes to remove the limit of the tunable, leaving the decision up
to the user of a tunable to judge the best value for their needs.

This patch does not change any of the defaults for malloc tunables,
retaining the current behavior of the dynamic malloc mmap threshold.

bugzilla 27801 - Remove upper limit on tunable MALLOC_MMAP_THRESHOLD
Reviewed-by: DJ Delorie <dj@redhat.com>

malloc/
        malloc.c changed do_set_mmap_threshold to remove test
        for HEAP_MAX_SIZE.
2021-12-16 17:24:37 +00:00
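For programs that set the threshold at run time rather than through the
tunable discussed in 0a4df6f534 above, the corresponding call is
mallopt with M_MMAP_THRESHOLD.  The wrapper name set_mmap_threshold is
hypothetical, and treating mallopt as subject to the same limit as the
tunable is an assumption of this sketch.

```c
#include <malloc.h>
#include <stddef.h>

/* Ask malloc to serve requests of THRESHOLD bytes or more with mmap.
   mallopt returns 1 on success and 0 on error; a C library that still
   enforces the old upper limit may return 0 for large values.  */
int
set_mmap_threshold (size_t threshold)
{
  return mallopt (M_MMAP_THRESHOLD, (int) threshold);
}
```

A caller would typically check the return value and fall back to the
default behaviour when the request is rejected.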
Nart Tlisha
a16c5ab139 localedata: add new locale ab_GE
Add the Abkhazian language in the Georgia territory

The ab_GE locale was just recently added to CLDR; it should be available
in CLDR v41, https://github.com/unicode-org/cldr/pull/1402

The Abkhazian language has been added to Gnome for localization

The locale has been tested on Ubuntu 20.04, Mint 20.2 and Fedora 35 Beta

Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
2021-12-16 14:37:14 +00:00
Stefan Liebler
ff3cb03f38 Fix __minimal_malloc segfaults in __mmap due to stack-protector
Starting with commit b05fae4d8e
"elf: Use the minimal malloc on tunables_strdup",
I get lots of segfaults in static tests on s390x when also using, e.g.:
export GLIBC_TUNABLES="glibc.elision.enable=1"

tunables_strdup calls __minimal_malloc which tries to call __mmap
due to insufficient space left. __mmap itself first sets up a new
stack frame and segfaults when copying the stack-protector canary
from the thread pointer. The latter is not yet set up.

Thus this patch also turns off stack-protection for mmap.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-16 15:19:28 +01:00
Siddhesh Poyarekar
ae23fa3e5f __glibc_unsafe_len: Fix comment
We know that the length is *unsafe*.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-16 07:21:43 +05:30
Adhemerval Zanella
0f982c1827 malloc: Enable huge page support on main arena
This patch adds huge page support to main arena allocation,
enabled with the tunable glibc.malloc.hugetlb=2.  The patch essentially
disables the __glibc_morecore() sbrk() call (similar to what memory
tagging does when the sbrk() call does not support it) and falls back
to the default page size if the memory allocation fails.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
0849eed45d malloc: Move MORECORE fallback mmap to sysmalloc_mmap_fallback
So it can be used on hugepage code as well.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
c1beb51d08 malloc: Add Huge Page support to arenas
It is enabled by default for glibc.malloc.hugetlb set to 2 or higher.
It also uses a non-configurable minimum value and maximum value,
currently set to 1 and 4 times the selected huge page size,
respectively.

The arena allocation with huge pages does not use MAP_NORESERVE.  As
indicated by the kernel internal documentation [1], the flag might
trigger a SIGBUS on soft page faults if no pages are left in the pool
at the time of the memory access.

On systems without a reserved huge pages pool, the tests just stress
the mmap(MAP_HUGETLB) allocation failure.  To improve test coverage it
is required to create a pool with some allocated pages.

Checked on x86_64-linux-gnu with no reserved pages, 10 reserved pages
(which trigger mmap(MAP_HUGETLB) failures) and with 256 reserved pages
(which do not trigger mmap(MAP_HUGETLB) failures).

[1] https://www.kernel.org/doc/html/v4.18/vm/hugetlbfs_reserv.html#resv-map-modifications

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:39 -03:00
Adhemerval Zanella
98d5fcb8d0 malloc: Add Huge Page support for mmap
With the morecore hook removed, there is no easy way to provide huge
page support with the glibc allocator without resorting to transparent
huge pages.  And some users and programs do prefer to use huge pages
directly instead of THP for multiple reasons: no splitting and
re-merging by the VM, no TLB shootdowns for running processes, fast
allocation from the reserve pool, no competition with the rest of the
processes unlike THP, no swapping at all, etc.

This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means a specific page size that is matched
against the ones supported by the system.

Currently only memory allocated in sysmalloc() is handled; the arenas
still use the default system page size.

To test it, a new rule tests-malloc-hugetlb2 is added, which runs the
added tests with the required GLIBC_TUNABLES setting.  On systems
without a reserved huge pages pool, the tests just stress the
mmap(MAP_HUGETLB) allocation failure.  To improve test coverage it is
required to create a pool with some allocated pages.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:38 -03:00
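The general pattern behind 98d5fcb8d0 above (request huge pages
directly, and use ordinary pages when the request fails) can be
sketched outside the allocator.  This is an illustration for Linux
only, not the code in malloc.c; map_prefer_hugetlb is a hypothetical
name, and LEN is assumed to be a multiple of the huge page size.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Try to map LEN bytes backed by huge pages of the system default
   size; if that fails (for example, no pages reserved in the pool),
   retry with ordinary pages.  Returns NULL if both attempts fail.  */
void *
map_prefer_hugetlb (size_t len)
{
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  flags | MAP_HUGETLB, -1, 0);
  if (p == MAP_FAILED)
    p = mmap (NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}
```

On a system without a reserved pool the first mmap is expected to fail,
so the fallback path is the one exercised.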
Adhemerval Zanella
6cc3ccc67e malloc: Move mmap logic to its own function
So it can be used with different pagesize and flags.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:15 -03:00
Adhemerval Zanella
7478c9959a malloc: Add THP/madvise support for sbrk
To increase effectiveness with Transparent Huge Pages with madvise, the
large page size is used instead of the page size for the sbrk increment
for the main arena.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:15 -03:00
Adhemerval Zanella
5f6d8d97c6 malloc: Add madvise support for Transparent Huge Pages
Linux Transparent Huge Pages (THP) currently supports three different
states: 'never', 'madvise', and 'always'.  The 'never' state is
self-explanatory and 'always' will enable THP for all anonymous
pages.  However, 'madvise' is still the default for some systems, and
in that case THP will only be used if the memory range is explicitly
advised by the program through a madvise(MADV_HUGEPAGE) call.

To enable it, a new tunable is provided, 'glibc.malloc.hugetlb',
where setting it to a value different from 0 enables the madvise call.

This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call at sysmalloc() with sizes larger than the default huge
page size.  The madvise() call is disabled if the system does not
support THP or if it has the mode set to "never".  On Linux, only one
page size is supported for THP, even if the architecture supports
multiple sizes.

To test it, a new rule tests-malloc-hugetlb1 is added, which runs the
added tests with the required GLIBC_TUNABLES setting.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:14 -03:00
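The mmap-then-madvise sequence described in 5f6d8d97c6 above can be
sketched as follows for Linux.  The function name map_with_thp_hint and
the thp_size parameter are hypothetical; the caller is assumed to know
the THP size (0 meaning THP is unavailable), and "larger than" is
treated here as "at least", which is also an assumption.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map LEN bytes and, when the mapping is at least THP_SIZE bytes,
   advise the kernel that huge pages are welcome.  The madvise result
   is ignored on purpose: the mapping remains usable with ordinary
   pages if the hint is rejected.  Returns NULL if mmap fails.  */
void *
map_with_thp_hint (size_t len, size_t thp_size)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  if (thp_size != 0 && len >= thp_size)
    (void) madvise (p, len, MADV_HUGEPAGE);
  return p;
}
```

Passing 0 for thp_size skips the hint entirely, which corresponds to
the case where the system does not support THP.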