__morecore, __after_morecore_hook, and __default_morecore had not
been deprecated in commit 7d17596c19
("Mark malloc hook variables as deprecated"), probably by accident.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The buffer allocation uses the same strategy as strsignal.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The per-thread state is refactored to use two strategies:
1. The default one uses a TLS structure, which will be placed in the
static TLS space (using the __thread keyword).
2. Linux allocates it via struct pthread and accesses it through the
THREAD_* macros (both strategies are sketched below).
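A minimal sketch of the two approaches (USE_STRUCT_PTHREAD_STATE and the
string_buffer member are hypothetical names used only for illustration):

  /* Sketch only: selects between the two per-thread storage strategies.  */
  #if USE_STRUCT_PTHREAD_STATE            /* Strategy 2 (Linux).  */
  # define get_buffer()       THREAD_GETMEM (THREAD_SELF, string_buffer)
  # define set_buffer(value)  THREAD_SETMEM (THREAD_SELF, string_buffer, value)
  #else                                   /* Strategy 1 (default): static TLS.  */
  static __thread char *string_buffer;
  # define get_buffer()       string_buffer
  # define set_buffer(value)  (string_buffer = (value))
  #endif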
The default strategy has the disadvantage of increasing libc.so static
TLS consumption and thus decreasing the surplus available in
some scenarios (which might be mitigated by the BZ#25051 fix).
It is used only on Hurd, where accessing the thread storage in the
single-thread case is not straightforward (as far as I understand;
Hurd developers may correct me here).
The fallback static allocation used for allocation failure is also
removed: defining its size is problematic without synchronizing with
translated messages (to avoid partial translation) and the resulting
usage is not thread-safe.
Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
and s390x-linux-gnu.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The code for set_max_fast() stores an "impossibly small value"
instead of zero when the parameter is zero. However, for
small values of the parameter (e.g. 1 or 2) the computation
results in a zero being stored anyway.
This patch instead checks whether the parameter is small enough for the
computation to result in zero, so that a zero is never stored.
Key values which result in zero being stored:
x86-64: 1..7 (or other 64-bit)
i686: 1..11
armhfp: 1..3 (or other 32-bit)
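A sketch of the resulting macro (names follow malloc.c; shown only to
illustrate the shape of the check, not as the exact committed text --
MIN_CHUNK_SIZE / 2 is the "impossibly small" sentinel, and the threshold
covers every parameter value whose rounded result would be zero):

  #define set_max_fast(s)                                                 \
    global_max_fast = (((size_t) (s) <= MALLOC_ALIGN_MASK - 1)            \
                       ? MIN_CHUNK_SIZE / 2                               \
                       : (((s) + SIZE_SZ) & ~MALLOC_ALIGN_MASK))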
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add the test "tst-safe-linking" to verify that Safe-Linking works
as expected. The test checks these three main flows:
* tcache protection
* fastbin protection
* malloc_consolidate() correctness
As there is a 1/16 random chance that the alignment will remain
correct, the test checks each flow up to 10 times, using different random
values for the pointer corruption. As a result, the chance of a false
failure for a given tested flow is 2**(-40), which is highly unlikely.
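A heavily simplified sketch of one such flow (this is not the actual
tst-safe-linking code, which runs each flow in a subprocess and checks for
the expected abort):

  #include <stdint.h>
  #include <stdlib.h>

  int
  main (void)
  {
    void *a = malloc (0x20);
    void *b = malloc (0x20);
    free (a);
    free (b);        /* The tcache bin now holds b -> a (protected links).  */

    /* Simulate an attacker overwriting the protected "next" pointer with a
       random value; with probability 1/16 the revealed pointer is still
       aligned, which is why the real test retries with several values.  */
    *(uintptr_t *) b ^= (uintptr_t) rand ();

    malloc (0x20);   /* Returns b and unmasks the corrupted pointer.  */
    malloc (0x20);   /* Expected to abort on the invalid/unaligned chunk.  */
    return 1;        /* Reaching here means the corruption went undetected.  */
  }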
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Alignment checks should be performed on the user's buffer and NOT
on the mchunkptr as was done before. This caused bugs in 32-bit
versions, because 2 * sizeof(size_t) != MALLOC_ALIGNMENT there.
As the tcache works on user buffers, it uses the aligned_OK()
check; the rest work on the mchunkptr and therefore check using
misaligned_chunk().
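For reference, the two checks are roughly as follows (a sketch of the
malloc.c macros; on 32-bit targets MALLOC_ALIGNMENT can differ from
2 * SIZE_SZ, which is exactly why the distinction matters):

  /* Checks a user buffer pointer (what the tcache stores).  */
  #define aligned_OK(m)  (((unsigned long) (m) & MALLOC_ALIGN_MASK) == 0)

  /* Checks a chunk pointer: when MALLOC_ALIGNMENT != 2 * SIZE_SZ, it is the
     user buffer derived from the chunk, not the header, that must be aligned.  */
  #define misaligned_chunk(p)                                              \
    ((uintptr_t) (MALLOC_ALIGNMENT == 2 * SIZE_SZ ? (p) : chunk2mem (p))   \
     & MALLOC_ALIGN_MASK)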
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Removed unneeded '\' chars from end of lines and fixed some
indentation issues that were introduced in the original
Safe-Linking patch.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Safe-Linking is a security mechanism that protects single-linked
lists (such as the fastbin and tcache) from being tampered with by attackers.
The mechanism makes use of randomness from ASLR (mmap_base), and when
combined with chunk alignment integrity checks, it protects the "next"
pointers from being hijacked by an attacker.
While Safe-Unlinking protects double-linked lists (such as the small
bins), there wasn't any similar protection for attacks against
single-linked lists. This solution protects against 3 common attacks:
* Partial pointer override: modifies the lower bytes (Little Endian)
* Full pointer override: hijacks the pointer to an attacker's location
* Unaligned chunks: pointing the list to an unaligned address
The design assumes an attacker doesn't know where the heap is located,
and uses the ASLR randomness to "sign" the single-linked pointers. We
mark the pointer as P and the location in which it is stored as L, and
the calculation will be:
* PROTECT(P) := (L >> PAGE_SHIFT) XOR (P)
* *L = PROTECT(P)
This way, the random bits from the address L (which start at the bit
in the PAGE_SHIFT position) will be merged with the LSBs of the stored
protected pointer. This protection layer prevents an attacker from
modifying the pointer into a controlled value.
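In code form, the masking can be sketched as follows (the macro names follow
malloc.c; PAGE_SHIFT stands for the page-size shift in the formula above):

  /* pos is the address L where the pointer is stored; ptr is the pointer P.  */
  #define PROTECT_PTR(pos, ptr)                                            \
    ((__typeof (ptr)) ((((size_t) pos) >> PAGE_SHIFT) ^ ((size_t) ptr)))
  /* XOR is its own inverse, so revealing reuses the same operation.  */
  #define REVEAL_PTR(ptr)  PROTECT_PTR (&ptr, ptr)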
An additional check that the chunks are MALLOC_ALIGNed adds an
important layer:
* Attackers can't point to illegal (unaligned) memory addresses
* Attackers must guess correctly the alignment bits
On standard 32 bit Linux machines, an attack will directly fail 7
out of 8 times, and on 64 bit machines it will fail 15 out of 16
times.
This proposed patch was benchmarked and its effect on the overall
performance of the heap was negligible and couldn't be distinguished
from the default variance between tests on the vanilla version. A
similar protection was added to Chromium's version of TCMalloc
in 2012, and according to their documentation it had an overhead of
less than 2%.
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
If the test fails due to some unexpected failure after the children
are created, either in the signal handler by calling abort or in the main
loop, the created children might not be killed properly.
This patch fixes it by:
* Avoiding the abort in the signal handler: a flag is set to indicate
that an error has occurred, and a check for it is added in the main loop.
* Adding an atexit handler to kill the child processes (sketched below).
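A sketch of the atexit-based cleanup (NUM_CHILDREN and the pids array are
hypothetical names for the test's existing bookkeeping):

  #include <signal.h>
  #include <stdlib.h>
  #include <unistd.h>

  static pid_t pids[NUM_CHILDREN];       /* Filled in as children are forked.  */

  static void
  kill_children (void)
  {
    for (size_t i = 0; i < NUM_CHILDREN; i++)
      if (pids[i] > 0)
        kill (pids[i], SIGKILL);
  }

  /* Registered once, before forking: atexit (kill_children);  */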
Checked on x86_64-linux-gnu.
pvalloc is guaranteed to round up the allocation size to the page
size, so applications can assume that the memory region is larger
than the passed-in argument. The alloc_size attribute cannot express
that.
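For example (an illustration of why the attribute would be wrong, not actual
glibc code):

  #include <malloc.h>

  void
  example (void)
  {
    char *p = pvalloc (10);   /* The allocation is rounded up to a page.  */
    if (p != NULL)
      /* Valid, since the region is at least one page long; but with
         alloc_size (1) the compiler would assume only 10 bytes are
         accessible and could warn about or mis-optimize this store.  */
      p[100] = 1;
  }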
The test case is based on a suggestion from Jakub Jelinek.
This fixes commit 9bf8e29ca1 ("malloc:
make malloc fail with requests larger than PTRDIFF_MAX (BZ#23741)").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch moves the vDSO setup from libc to loader code, just after
the vDSO link_map setup. For the static case the initialization
is moved to _dl_non_dynamic_init instead.
Instead of using the mangled pointer, the vDSO data is set as
attribute_relro (on _rtld_global_ro for shared or _dl_vdso_* for
static). It is read-only even with partial relro.
It fixes BZ#24967 now that the vDSO pointer is set up before any
malloc interposition is called.
Also, vDSO calls should not be a problem for static dlopen as
indicated by BZ#20802. The vDSO pointer will be zero-initialized
and the syscall will be issued instead.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, powerpc64-linux-gnu,
powerpc-linux-gnu, s390x-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu. I also ran some tests on mips.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
I've updated copyright dates in glibc for 2020. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus libc.texinfo which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a fix to
sysdeps/unix/sysv/linux/powerpc/bits/termios-c_lflag.h where a typo in
the copyright notice meant it failed to be updated automatically.
Please remember to include 2020 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
do_set_tcache_max, do_set_mxfast:
Fix two instances of comparing "size_t < 0"
Both cases have an upper limit, so the "negative value" case
is already handled via unsigned overflow semantics.
do_set_tcache_max, do_set_tcache_count:
Fix return value on error. Note: currently not used.
mallopt:
Pass the return value of the helper functions to the user. Behavior
should only actually change for mxfast, where we restore the old
(pre-tunables) behavior.
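A simplified sketch of the resulting mxfast helper (names follow malloc.c;
probe and locking details omitted):

  static __always_inline int
  do_set_mxfast (size_t value)
  {
    if (value <= MAX_FAST_SIZE)   /* size_t is unsigned, so no "< 0" check.  */
      {
        set_max_fast (value);
        return 1;                 /* Success, now propagated by mallopt.  */
      }
    return 0;                     /* Out of range: report failure.  */
  }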
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
set_max_fast sets the "impossibly small" value based on,
eventually, MALLOC_ALIGNMENT. The comparison for the smallest
chunk used is, eventually, against MIN_CHUNK_SIZE. Note that i386
is the only platform where these are the same, so a smallest
chunk *would* be put in a no-fastbins fastbin.
This change calculates the "impossibly small" value
based on MIN_CHUNK_SIZE instead, so that we can know it will
always be impossibly small.
Fixes <total type="rest" size="..."> incorrectly showing as 0 most
of the time.
The rest value being wrong is significant because to compute the
actual amount of memory handed out via malloc, the user must subtract
it from <system type="current" size="...">. That result being wrong
makes investigating memory fragmentation issues like
<https://bugzilla.redhat.com/show_bug.cgi?id=843478> close to
impossible.
memusagestat may indirectly link against libpthread. The built
libpthread should be used, but that is only possible if it has been
built before the malloc programs.
GCC mainline has recently added warn_unused_result attributes to some
malloc-like built-in functions, where glibc previously had them in its
headers only for __USE_FORTIFY_LEVEL > 0. This results in those
attributes being newly in effect for building the glibc testsuite, so
resulting in new warnings that break the build where tests
deliberately call such functions and ignore the result. This patch
duly adds calls to DIAG_* macros around those calls to disable the
warning.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
* malloc/tst-calloc.c: Include <libc-diag.h>.
(null_test): Ignore -Wunused-result around calls to calloc.
* malloc/tst-mallocfork.c: Include <libc-diag.h>.
(do_test): Ignore -Wunused-result around call to malloc.
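The pattern added around such calls looks roughly like this (a sketch, not
the exact test body; the GCC version number passed to
DIAG_IGNORE_NEEDS_COMMENT is illustrative):

  #include <libc-diag.h>
  #include <stdint.h>
  #include <stdlib.h>

  static void
  null_test (void)
  {
    DIAG_PUSH_NEEDS_COMMENT;
    /* This call deliberately discards the result; tell GCC so.  */
    DIAG_IGNORE_NEEDS_COMMENT (10, "-Wunused-result");
    calloc (SIZE_MAX, 2);
    DIAG_POP_NEEDS_COMMENT;
  }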
Change the tcache->counts[] entries to uint16_t - this removes
the limit set by char and allows a larger tcache. Remove a few
redundant asserts.
bench-malloc-thread with 4 threads is ~15% faster on Cortex-A72.
Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc/malloc.c (MAX_TCACHE_COUNT): Increase to UINT16_MAX.
(tcache_put): Remove redundant assert.
(tcache_get): Remove redundant asserts.
(__libc_malloc): Check tcache count is not zero.
* manual/tunables.texi (glibc.malloc.tcache_count): Update maximum.
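After the change, the per-thread cache bookkeeping looks roughly like this
(a sketch based on malloc.c's names):

  typedef struct tcache_perthread_struct
  {
    /* counts[] is no longer limited to the range of a char.  */
    uint16_t counts[TCACHE_MAX_BINS];    /* Previously: char counts[...];  */
    tcache_entry *entries[TCACHE_MAX_BINS];
  } tcache_perthread_struct;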
The tcache counts[] array is a char, which has a very small range and thus
may overflow. When setting the tcache_count tunable, there is no overflow check.
However, the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.
[BZ #24531]
* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
(do_set_tcache_count): Only update if count is small enough.
* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.
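A sketch of the guarded tunable setter described above (simplified from
malloc.c):

  static __always_inline int
  do_set_tcache_count (size_t value)
  {
    /* Only accept values that the counts[] element type can represent
       (MAX_TCACHE_COUNT); otherwise keep the previous setting.  */
    if (value <= MAX_TCACHE_COUNT)
      mp_.tcache_count = value;
    return 1;
  }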
This synchronization method has a lower overhead and makes
it more likely that the signal arrives during one of the critical
functions.
Also test for fork deadlocks explicitly.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The memusagestat is the only binary that has its own link line which
causes it to be linked against the existing installed C library. It
has been this way since it was originally committed in 1999, but I
don't see any reason as to why. Since we want all the programs we
build locally to be against the new copy of glibc, change the build
to be like all other programs.
Remove do_set_mallopt_check prototype since it is unused.
* malloc/arena.c (do_set_mallopt_check): Removed.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
As discussed previously on libc-alpha [1], this patch follows up on the idea
and adds both the __attribute_alloc_size__ attribute on the malloc functions
(malloc, calloc, realloc, reallocarray, valloc, pvalloc, and memalign) and a
limit on the maximum requested allocation size of up to PTRDIFF_MAX (taking
into consideration internal padding and alignment).
This aligns glibc with the size expectation gcc encodes in the default
-Walloc-size-larger-than= warning, which warns for allocations larger than
PTRDIFF_MAX. It also aligns with gcc's expectations regarding libc and
allocation sizes, as described in PR#67999 [2] and the previously discussed
ISO C11 issues [3] on libc-alpha.
From the RFC thread [4] and previous discussion, it seems that the consensus
is to limit the requested size only for the malloc functions, not the system
allocation ones (mmap, sbrk, etc.).
The implementation changes checked_request2size to check for both overflow
and a maximum object size of PTRDIFF_MAX. No additional checks are done
in sysmalloc, so it can still issue mmap with values larger than
PTRDIFF_MAX, depending on the requested size.
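The reworked helper can be sketched as follows (close to the shape described
above; the exact malloc.c text may differ in details):

  /* Reject any request that would exceed PTRDIFF_MAX; below that limit the
     padding and alignment applied by request2size cannot overflow size_t.  */
  static inline bool
  checked_request2size (size_t req, size_t *sz)
  {
    if (__glibc_unlikely (req > PTRDIFF_MAX))
      return false;
    *sz = request2size (req);
    return true;
  }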
The __attribute_alloc_size__ attribute applies only to functions that return
a pointer, which means it cannot be applied to posix_memalign (see remarks in
GCC PR#87683 [5]). The runtime checks that limit the maximum requested
allocation size do apply to posix_memalign.
Checked on x86_64-linux-gnu and i686-linux-gnu.
[1] https://sourceware.org/ml/libc-alpha/2018-11/msg00223.html
[2] https://gcc.gnu.org/bugzilla//show_bug.cgi?id=67999
[3] https://sourceware.org/ml/libc-alpha/2011-12/msg00066.html
[4] https://sourceware.org/ml/libc-alpha/2018-11/msg00224.html
[5] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87683
[BZ #23741]
* malloc/hooks.c (malloc_check, realloc_check): Use
__builtin_add_overflow on overflow check and adapt to
checked_request2size change.
* malloc/malloc.c (__libc_malloc, __libc_realloc, _mid_memalign,
__libc_pvalloc, __libc_calloc, _int_memalign): Limit maximum
allocation size to PTRDIFF_MAX.
(REQUEST_OUT_OF_RANGE): Remove macro.
(checked_request2size): Change to inline function and limit maximum
requested size to PTRDIFF_MAX.
(__libc_malloc, __libc_realloc, _int_malloc, _int_memalign): Limit
maximum allocation size to PTRDIFF_MAX.
(_mid_memalign): Use _int_memalign call for overflow check.
(__libc_pvalloc): Use __builtin_add_overflow on overflow check.
(__libc_calloc): Use __builtin_mul_overflow for overflow check and
limit maximum requested size to PTRDIFF_MAX.
* malloc/malloc.h (malloc, calloc, realloc, reallocarray, memalign,
valloc, pvalloc): Add __attribute_alloc_size__.
* stdlib/stdlib.h (malloc, realloc, reallocarray, valloc): Likewise.
* malloc/tst-malloc-too-large.c (do_test): Add check for allocation
larger than PTRDIFF_MAX.
* malloc/tst-memalign.c (do_test): Disable -Walloc-size-larger-than=
around tests of malloc with negative sizes.
* malloc/tst-posix_memalign.c (do_test): Likewise.
* malloc/tst-pvalloc.c (do_test): Likewise.
* malloc/tst-valloc.c (do_test): Likewise.
* malloc/tst-reallocarray.c (do_test): Replace call to reallocarray
with resulting size allocation larger than PTRDIFF_MAX with
reallocarray_nowarn.
(reallocarray_nowarn): New function.
* NEWS: Mention the malloc function semantic change.
If an error occurs during the tracing operation, particularly during a
call to lock_and_info() which calls _dl_addr, we may end up calling back
into the malloc subsystem, re-locking the loader lock, and deadlocking. For all
all intents and purposes the call to _dl_addr can call any of the malloc
family API functions and so we should disable all tracing before calling
such loader functions. This is similar to the strategy that the new
malloc tracer takes when calling the real malloc, namely that all
tracing ceases at the boundary to the real function and any faults at
that point are the purview of the library (though the new tracer does
this on a per-thread basis in an MT-safe fashion). Since the new tracer
and the hook deprecation are not yet complete we must fix these issues
where we can.
Tested on x86_64 with no regressions.
Co-authored-by: Kwok Cheung Yeung <kcy@codesourcery.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Fixes bug 24216. This patch adds security checks for the bk and bk_nextsize
pointers of chunks in the large bin when inserting a chunk from the unsorted
bin. It was possible to write the pointer to victim (the newly inserted chunk)
to arbitrary memory locations if the bk or bk_nextsize pointers of the next
large bin chunk got corrupted.
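The added checks are of roughly this shape (a sketch; the exact placement
interleaves with the existing nextsize handling in _int_malloc):

  /* Before linking victim into the large bin through fwd/bck, verify that
     the neighbouring list pointers still form consistent doubly linked
     lists, so victim's address cannot be written through a forged pointer.  */
  if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
    malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
  ...
  if (__glibc_unlikely (bck->fd != fwd))
    malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");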
One group of warnings seen with -Wextra is warnings for static or
inline not at the start of a declaration (-Wold-style-declaration).
This patch fixes various such cases for inline, ensuring it comes at
the start of the declaration (after any static). A common case of the
fix is "static inline <type> __always_inline"; the definition of
__always_inline starts with __inline, so the natural change is to
"static __always_inline <type>". Other cases of the warning may be
harder to fix (one pattern is a function definition that gets
rewritten to be static by an including file, "#define funcname static
wrapped_funcname" or similar), but it seems worth fixing these cases
with inline anyway.
Tested for x86_64.
* elf/dl-load.h (_dl_postprocess_loadcmd): Use __always_inline
before return type, without separate inline.
* elf/dl-tunables.c (maybe_enable_malloc_check): Likewise.
* elf/dl-tunables.h (tunable_is_name): Likewise.
* malloc/malloc.c (do_set_trim_threshold): Likewise.
(do_set_top_pad): Likewise.
(do_set_mmap_threshold): Likewise.
(do_set_mmaps_max): Likewise.
(do_set_mallopt_check): Likewise.
(do_set_perturb_byte): Likewise.
(do_set_arena_test): Likewise.
(do_set_arena_max): Likewise.
(do_set_tcache_max): Likewise.
(do_set_tcache_count): Likewise.
(do_set_tcache_unsorted_limit): Likewise.
* nis/nis_subr.c (count_dots): Likewise.
* nptl/allocatestack.c (advise_stack_range): Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Likewise.
(do_sin): Likewise.
(reduce_sincos): Likewise.
(do_sincos): Likewise.
* sysdeps/unix/sysv/linux/x86/elision-conf.c
(do_set_elision_enable): Likewise.
(TUNABLE_CALLBACK_FNDECL): Likewise.
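For illustration, the shape of the change for one of the functions listed
above:

  /* Before (-Wold-style-declaration warns): __always_inline expands to
     __inline ..., which here appears after the return type.  */
  static inline int
  __always_inline
  do_set_arena_max (size_t value);

  /* After: the inline keyword comes at the start, right after static.  */
  static __always_inline int
  do_set_arena_max (size_t value);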
One of the warnings that appears with -Wextra is "ordered comparison
of pointer with integer zero" in malloc.c:tcache_get, for the
assertion:
assert (tcache->entries[tc_idx] > 0);
Indeed, a "> 0" comparison does not make sense for
tcache->entries[tc_idx], which is a pointer. My guess is that
tcache->counts[tc_idx] is what's intended here, and this patch changes
the assertion accordingly.
Tested for x86_64.
* malloc/malloc.c (tcache_get): Compare tcache->counts[tc_idx]
with 0, not tcache->entries[tc_idx].
Commit 6923f6db1e ("malloc: Use current
(C11-style) atomics for fastbin access") caused a substantial
performance regression on POWER and Aarch64, and the old atomics,
while hard to prove correct, seem to work in practice.
This commit removes the custom memcpy implementation from _int_realloc
for small chunk sizes. The ncopies variable has the wrong type, and
an integer wraparound could cause the existing code to copy too few
elements (leaving the new memory region mostly uninitialized).
Therefore, removing this code fixes bug 24027.
This one tests for BZ#23907 where the double free
test didn't check the tcache bin bounds before dereferencing
the bin.
[BZ #23907]
* malloc/tst-tcfree3.c: New.
* malloc/Makefile: Add it.
The previous check could read beyond the end of the tcache entry
array. If the e->key == tcache cookie check happened to pass, this
would result in crashes.
This commit is in preparation for turning the macro into a proper
function. The output arguments of the macro were in fact unused.
Also clean up uses of __builtin_expect.
On Thu, Jan 11, 2018 at 3:50 PM, Florian Weimer <fweimer@redhat.com> wrote:
> On 11/07/2017 04:27 PM, Istvan Kurucsai wrote:
>>
>> + next = chunk_at_offset (victim, size);
>
>
> For new code, we prefer declarations with initializers.
Noted.
>> + if (__glibc_unlikely (chunksize_nomask (victim) <= 2 * SIZE_SZ)
>> +     || __glibc_unlikely (chunksize_nomask (victim) > av->system_mem))
>> +   malloc_printerr("malloc(): invalid size (unsorted)");
>> + if (__glibc_unlikely (chunksize_nomask (next) < 2 * SIZE_SZ)
>> +     || __glibc_unlikely (chunksize_nomask (next) > av->system_mem))
>> +   malloc_printerr("malloc(): invalid next size (unsorted)");
>> + if (__glibc_unlikely ((prev_size (next) & ~(SIZE_BITS)) != size))
>> +   malloc_printerr("malloc(): mismatching next->prev_size (unsorted)");
>
>
> I think this check is redundant because prev_size (next) and chunksize
> (victim) are loaded from the same memory location.
I'm fairly certain that it compares mchunk_size of victim against
mchunk_prev_size of the next chunk, i.e. the size of victim in its
header and footer.
>> + if (__glibc_unlikely (bck->fd != victim)
>> +     || __glibc_unlikely (victim->fd != unsorted_chunks (av)))
>> +   malloc_printerr("malloc(): unsorted double linked list corrupted");
>> + if (__glibc_unlikely (prev_inuse(next)))
>> +   malloc_printerr("malloc(): invalid next->prev_inuse (unsorted)");
>
>
> There's a missing space after malloc_printerr.
Noted.
> Why do you keep using chunksize_nomask? We never investigated why the
> original code uses it. It may have been an accident.
You are right, I don't think it makes a difference in these checks. So
the size local can be reused for the checks against victim. For next,
leaving it as such avoids the masking operation.
> Again, for non-main arenas, the checks against av->system_mem could be made
> tighter (against the heap size). Maybe you could put the condition into a
> separate inline function?
We could also do a chunk boundary check similar to what I proposed in
the thread for the first patch in the series to be even more strict.
I'll gladly try to implement either but believe that refining these
checks would bring less benefit than in the case of the top chunk.
Intra-arena or intra-heap overlaps would still be doable here with
unsorted chunks and I don't see any way to counter that besides more
generic measures like randomizing allocations and your metadata
encoding patches.
I've attached a revised version with the above comments incorporated
but without the refined checks.
Thanks,
Istvan
From a12d5d40fd7aed5fa10fc444dcb819947b72b315 Mon Sep 17 00:00:00 2001
From: Istvan Kurucsai <pistukem@gmail.com>
Date: Tue, 16 Jan 2018 14:48:16 +0100
Subject: [PATCH v2 1/1] malloc: Additional checks for unsorted bin integrity
I.
Ensure the following properties of chunks encountered during binning:
- victim chunk has reasonable size
- next chunk has reasonable size
- next->prev_size == victim->size
- valid double linked list
- PREV_INUSE of next chunk is unset
* malloc/malloc.c (_int_malloc): Additional binning code checks.
The House of Force is a well-known technique to exploit heap
overflow. In essence, this exploit takes three steps:
1. Overwrite the size of the top chunk with a very large value (e.g. -1).
2. Request x bytes from the top chunk. As the size of the top chunk
is corrupted, x can be arbitrarily large and the top chunk will
still be advanced by x.
3. The next allocation from top chunk will thus be controllable.
If we verify the size of the top chunk at step 2, we can stop such an attack.
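The check added at step 2 is of roughly this shape, at the point in
_int_malloc where the top chunk is about to be split (a sketch based on the
description above):

  victim = av->top;
  size = chunksize (victim);

  /* A top chunk larger than the memory the arena actually obtained from
     the system can only be the result of an overflow into its size field.  */
  if (__glibc_unlikely (size > av->system_mem))
    malloc_printerr ("malloc(): corrupted top size");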