Improve small memsets by avoiding branches and using overlapping stores.
Use DC ZVA for copies over 128 bytes. Remove unnecessary code for ZVA sizes
other than 64 and 128. Performance of random memset benchmark improves by 24%
on Neoverse N1.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
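The overlapping-store idea for small memsets can be sketched in portable C. This is only an illustration, not the AArch64 assembly: small_memset is a hypothetical name, and the fixed-size memcpy calls stand in for unaligned 8-byte and 4-byte stores. Each size class is handled by one store at the start and one store ending at the last byte, so no per-byte loop or exact-length branch is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the glibc AArch64 code: fill N bytes (N <= 16)
   at DST with byte C using a fixed number of stores per size class.  */
static void
small_memset (unsigned char *dst, int c, size_t n)
{
  assert (n <= 16);
  /* Replicate the byte into all eight lanes of a 64-bit value.  */
  uint64_t v = 0x0101010101010101ULL * (unsigned char) c;

  if (n >= 8)
    {
      /* 8..16 bytes: one store at the start, one ending at dst + n.
         The two stores overlap whenever n < 16.  */
      memcpy (dst, &v, 8);
      memcpy (dst + n - 8, &v, 8);
    }
  else if (n >= 4)
    {
      /* 4..7 bytes: same trick with 4-byte stores.  */
      memcpy (dst, &v, 4);
      memcpy (dst + n - 4, &v, 4);
    }
  else if (n > 0)
    {
      /* 1..3 bytes: first, middle and last byte cover every case.  */
      dst[0] = (unsigned char) c;
      dst[n / 2] = (unsigned char) c;
      dst[n - 1] = (unsigned char) c;
    }
}
```

For n == 11, for example, the first store writes bytes 0..7 and the second writes bytes 3..10; the overlap is harmless because both stores write the same value.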
Since the last operation is destructive, the first argument to the FMA
also has to be the first argument to the special-case in order to
avoid unnecessary MOVs. Reorder arguments and adjust special-case
bounds to facilitate this.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
As recently discussed, document that freopen does not work with
streams opened with functions such as popen, fmemopen, open_memstream
or fopencookie. I've filed
<https://austingroupbugs.net/view.php?id=1855> to clarify this issue
in POSIX.
Tested with "make info" and "make html".
Using TLS directly introduces a GLIBC_PRIVATE ABI dependency
into libc_nonshared.a, and thus indirectly into applications.
Adding the !defined LIBC_NONSHARED condition deactivates direct
TLS access, and libc_nonshared.a code switches to using
__errno_location, like application code.
Currently, this has no effect because there is no code in
libc_nonshared.a that accesses errno.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Sync tzselect, zdump, zic to TZDB 2024b. This patch incorporates the
following TZDB source code changes:
6903dde3 Release 2024b
812aff32 Improve historical transitions in Mexico 1921-1997
52662566 Adjust to mailing list software change
7748036b Mention Internet RFC 9557
339e81d1 Mention Levine’s proposal to replace leap seconds
b4e6ad2d No leap second on 2024-12-31
7eb5bf88 Asia/Choibalsan is now an alias for Asia/Ulaanbaatar
43450cbf Improve historical data for Portugal and former possessions.
13d7348b Typo and validation fixes.
3c39cde8 Fix typo for “removed” in a comment
03fd9e45 More documentation updates for POSIX.1-2024
eb3bcceb POSIX.1-2024 is now published
913b0410 tzselect: support POSIX.1-2024 offset range
b5318b55 Document POSIX.1-2024 better
837609b7 Fix typo when making .txt man pages
d56ae6ee SUPPORT_C89 now defaults to 1, not 0
b1fe113d Port ! to Solaris make
8f1fd321 Avoid crash in Solaris 10 /usr/xpg4/bin/make
e0fcfdd6 Use ‘export VAR=VAL’ syntax
eba43166 Avoid an awk invocation via $'...'
36479a80 Avoid some subshells in tzselect
7f6cf054 * tzselect.ksh: Assume POSIX.2 awk.
a1cf1daf * tzselect.ksh: Assume POSIX.2 $PWD.
a9b8e536 Assume POSIX.2 command substitution
eaa4ef16 Avoid subshells when possible
9dac9eb7 Prefer $PWD to $(pwd) in Makefile
fada6a4c Prefer $(CMD) to `CMD` in Makefile
3e871b9a Assume POSIX.2 and eschew ‘expr’
c5d67805 difftime isn’t pure either
5857c056 * CONTRIBUTING: Document build assumptions.
6822cc82 ‘make check’ no longer depends on curl+Internet
cc6eb255 Document GCC bug 114833 and workaround
bcbc86bf Scale back on function attribute use
c0789e46 C23 [[reproducible]] and [[unsequenced]] fixups
bbd88154 More updates to GCC_DEBUG_FLAGS for GCC 14
1a35b7c8 Spelling fixes
f71085f2 POSIX.1-2024 removes asctime_r, ctime_r
70856f8e Adjust to refactored location of ctime, ctime_r
aacd151d Update GCC_DEBUG_FLAGS for GCC 14
967dcf3b Sub-second history for Maputo and Zurich
782d0826 Make EET, MET and WET links
a0b09c02 Mark CET, CST6CDT etc. as obsolescent
db7fb40d Document SMPTE timecodes and rolling leaps
97232e18 Don’t be so sure about leap seconds going away
5b6a74fb Update some URLs
a75a6251 * zic.8: Tweak for consistency.
1e75b31f Document what %s means before any rule applies
00c96cbb Conform to RFC 8536 section 3.2 for default type
3e944959 Document problems with stripped-down TZif readers
20fc91cf Shanks is likely wrong about Maputo switch to CAT
d99589b6 * zic.8: Add missing tab character.
94e6b3b0 Switch to %z in main dataform
2cd57b93 Treat W-Eur like Port when reguarding
ad6f6d94 Check that main.zi agrees with sources
a43b030f .gitignore: Add .pdf, .ps, .s. Remove obsolete ‘yearistype’.
253ca020 * theory.html: ‘CLT’ → ‘LTC’ (per Michael H Deckers)
a3dee8c8 * NEWS: ‘how’ → ‘now’ (thanks to Paul Goyette).
ea6341c5 * theory.html: Mention NASA and CLT (per Arthur David Olson).
0dcebe37 America/Scoresbysund matches America/Nuuk from now on
b1e07fb0 Update Vzic link (thanks to Allen Winter)
a4b05030 Fix wday/mday typo in previous patch
732a4803 Document how to detect mktime failure reliably
a64067e9 ziguard.awk: generalize for proposed Portugal patch
59c861fd Line up zdump examples
66c106c9 tzfile.5: srcfix
e5553001 Fix .RS/.RE problem in tzfile.5
d647eb01 Add Doctorow book
59d4a1ba Asia/Almaty matches Asia/Tashkent from now on
d4d3c3ba * asia: Update Philippine URLs (thanks to Guy Harris).
9fc11a27 Port unlikely overflow check to C23
b52a2969 Fix 2023d NEWS typo
e48c5b53 Cite "The NTP Leap Second File"
b1dc2122 Update Israel tz-link
6cf4e912 Extrapolate less from the 2022 CGPM resolution.
It fixes glibc build with gcc master [1].
Checked on x86_64-linux-gnu and on i686-linux-gnu.
[1] https://sourceware.org/pipermail/libc-alpha/2024-September/159571.html
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
As reported in bug 23675 and shown up in the recently added tests of
different cases of freopen (relevant part of the test currently
conditioned under #if 0 to avoid a failure resulting from this bug),
freopen wrongly forces the stream to unoriented even when a mode with
,ccs= is specified, though such a mode is supposed to result in a
wide-oriented stream. Move the clearing of _mode to before the actual
reopening occurs, so that the main fopen implementation can leave a
wide-oriented stream in the ,ccs= case.
Tested for x86_64.
Add new file libio/tst-fclosed-unopened.c that tests whether fclose on
an unopened file returns EOF.
Calling fclose on unopened files normally causes a use-after-free bug,
however the standard streams are an exception since they are not
deallocated by fclose.
fclose returning EOF for unopened files is not part of the external
contract but there are dependencies on this behaviour. For example,
gnulib's close_stdout in lib/closeout.c.
Tested for x86_64.
Signed-off-by: Aaron Merey <amerey@redhat.com>
As reported in bug 32140, freopen leaks the FILE object when it
returns NULL: there is no valid use of the FILE * pointer (including
passing to freopen again or to fclose) after such an error return, so
the underlying object should be freed. Add code to free it.
Note 1: while I think it's clear from the relevant standards that the
object should be freed and the FILE * can't be used after the call in
this case (the stream is closed, which ends the lifetime of the FILE),
it's entirely possible that some existing code does in fact try to use
the existing FILE * in some way and could be broken by this change.
(Though the most common case for freopen may be stdin / stdout /
stderr, which _IO_deallocate_file explicitly checks for and does not
deallocate.)
Note 2: the deallocation is only done in the _IO_IS_FILEBUF case.
Other kinds of streams bypass all the freopen logic handling closing
the file, meaning a call to _IO_deallocate_file would neither be safe
(the FILE might still be linked into the list of all open FILEs) nor
sufficient (other internal memory allocations associated with the file
would not have been freed). I think the validity of freopen for any
other kind of stream will need clarifying with the Austin Group, but
if it is valid in any such case (where "valid" means "not undefined
behavior so required to close the stream" rather than "required to
successfully associate the stream with the new file in cases where
fopen would work"), more significant changes would be needed to ensure
the stream gets fully closed.
Tested for x86_64.
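From the caller's side, the rule described above (no valid use of the FILE * after freopen returns NULL) might be followed as in this sketch. redirect_stream is a hypothetical helper name, not a glibc interface; it simply forgets the old pointer on failure instead of passing it to fclose or to freopen again.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: try to redirect *FPP to PATH.  On failure the old
   stream has already been closed by freopen, so the caller's pointer is
   cleared rather than reused.  Returns 0 on success, -1 on failure.  */
static int
redirect_stream (FILE **fpp, const char *path, const char *mode)
{
  assert (fpp != NULL && *fpp != NULL);
  FILE *np = freopen (path, mode, *fpp);
  if (np == NULL)
    {
      *fpp = NULL;   /* The old FILE * must not be used any more.  */
      return -1;
    }
  *fpp = np;
  return 0;
}
```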
As reported in bug 32134, freopen does not clear the flags set in
fp->_flags2 by the "e", "m" or "c" mode characters. Clear these so
that they can be set or not as appropriate from the mode string passed
to freopen. The relevant test for "e" in tst-freopen2-main.c is
enabled accordingly; "c" is expected to be covered in a separately
written test (and while tst-freopen2-main.c does include transitions
to and from "m", that's not really a semantic flag intended to result
in behaving in an observably different way).
Tested for x86_64.
This makes it possible to monitor the exact file system operations
performed by glibc and to inject errors.
Hurd does not have <sys/mount.h>. To get the sources to compile
at least, the same approach as in support/test-container.c is used.
Reviewed-by: DJ Delorie <dj@redhat.com>
Upon error, return the errno value set by the __getdents call
in __readdir_unlocked. Previously, kernel-reported errors
were ignored.
Reviewed-by: DJ Delorie <dj@redhat.com>
Use static functions for readdir/readdir_r, so that
-D_FILE_OFFSET_BITS=64 does not improperly redirect calls to the wrong
implementation.
Reviewed-by: DJ Delorie <dj@redhat.com>
And include the required licensing information. The only
change is a removed trailing empty line in
LICENSES/exceptions/Linux-syscall-note.
Bundling <linux/fuse.h> is the recommended way to deal with
the evolution of the FUSE userspace interface because
structs change sizes over time. The kernel maintains
compatibility, but source-level compatibility on recompilation
may require additional code that is aware of older struct sizes.
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
freopen is rather minimally tested in libio/tst-freopen and
libio/test-freopen. Add some more thorough tests, covering different
cases for change of mode in particular. The tests are run for both
freopen and freopen64 (given that those functions have two separate
copies of much of the code, so any bug fix directly in the freopen
code would probably need applying in both places).
Note that there are two parts of the tests disabled because of bugs
discovered through running the tests, with bug numbers given in
comments. I expect to address those separately. The tests also don't
cover changes to cancellation ("c" in mode); I think that will better
be handled through a separate test. Also to handle separately:
testing on stdin / stdout / stderr; documenting lack of support for
streams opened with popen / fmemopen / open_memstream / fopencookie;
maybe also a chroot test without /proc; maybe also more thorough tests
for large file handling on 32-bit systems (freopen64).
Tested for x86_64.
_wide_data and _mode are not available in legacy code, so do not attempt
to free the wide backup buffer in legacy code.
Resolves: BZ #32137 and BZ #27821
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
As reported in bug 32045, it's incorrect for strtod/nan functions to
set errno based on overflowing payload (strtod should only set errno
for overflow / underflow of its actual result, and potentially if
nothing in the string can be parsed as a number at all; nan should be
a pure function that never sets it). Save and restore errno around
the internal strtoull call and add associated test coverage.
Tested for x86_64.
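The save-and-restore pattern mentioned above can be sketched as follows. parse_payload is a hypothetical stand-in for the internal payload parsing, not the actual glibc code; the point is only that an ERANGE raised by strtoull for an oversized payload does not escape to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not the glibc internal code: parse a NaN payload
   with strtoull without letting an overflow in the payload leak out
   through errno.  */
static unsigned long long
parse_payload (const char *s)
{
  assert (s != NULL);
  int saved_errno = errno;
  unsigned long long payload = strtoull (s, NULL, 0);
  errno = saved_errno;   /* Discard any ERANGE set by strtoull.  */
  return payload;
}
```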
There are two separate sets of tests of NaN payloads in glibc:
* libm-test-{get,set}payload* verify that getpayload, setpayload,
setpayloadsig and __builtin_nan functions are consistent in their
payload handling.
* test-nan-payload verifies that strtod-family functions and the
not-built-in nan functions are consistent in their payload handling.
Nothing, however, connects the two sets of functions (i.e., verifies
that strtod / nan are consistent with getpayload / setpayload /
__builtin_nan).
Improve test-nan-payload to check actual payload value with getpayload
rather than just verifying that the strtod and nan functions produce
the same NaN. Also check that the NaNs produced aren't signaling and
extend the tests to cover _FloatN / _FloatNx.
Tested for x86_64.
On Linux most descriptors that do not correspond to file system
entities (such as anonymous pipes and sockets) have file permissions
that can be changed. While it is possible to create a custom file
system that returns (say) EINVAL for an fchmod attempt, testing this
does not appear to be useful.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
In __syscall_cancel_arch, there's a tail call to __syscall_do_cancel.
On P10, since the caller uses the TOC and the callee is using
PC-relative addressing, there's only a branch instruction with no NOPs
to restore the TOC, which causes the build error. The fix involves adding
the NOTOC directive to the branch instruction, informing the linker
not to generate a TOC stub, thus resolving the issue.
Some of the strtod tests use type-generic machinery in tst-strtod.h to
test the strto* functions for all floating types, while others only
test double even when the tests are in fact meaningful for all
floating types.
Convert the tests of the internal __strtod_internal interface to cover
all floating types. I haven't tried to convert them to use newer test
interfaces in other ways, just made the changes necessary to use the
type-generic machinery. As an internal interface, there are no
aliases for different types with the same ABI (however,
__strtold_internal is defined even if long double has the same ABI as
double), so macros used by the type-generic testing code are redefined
as needed to avoid expecting such aliases to be present.
Tested for x86_64.
As reported in bug 30220, the implementation of strtod-family
functions has a bug in the following case: the input string would,
with infinite exponent range, take one more bit to represent than is
available in the normal precision of the return type; the value
represented is in the subnormal range; and there are no nonzero bits
in the value, below those that can be represented in subnormal
precision, other than the least significant bit and possibly the
0.5ulp bit. In this case, round_and_return ends up discarding the
least significant bit.
Fix by saving that bit to merge into more_bits (it can't be merged in
at the time it's computed, because more_bits mustn't include this bit
in the case of after-rounding tininess detection checking if the
result is still subnormal when rounded to normal precision, so merging
this bit into more_bits needs to take place after that check).
Tested for x86_64.
Add tests of underflow in tst-strtod-round, and thus also test for
errno being unchanged when there is neither overflow nor underflow.
The errno setting before the function call to test for being unchanged
is adjusted to set errno to 12345 instead of 0, so that any bugs where
strtod sets errno to 0 would be detected.
This doesn't add any new test inputs for tst-strtod-round, and in
particular doesn't cover the edge cases of underflow the way
tst-strtod-underflow does (none of the existing test inputs for
tst-strtod-round actually exercise cases that have underflow with
before-rounding tininess detection but not with after-rounding
tininess detection), but at least it provides some coverage (as per
the recent discussions) that ordinary non-overflowing non-underflowing
inputs to these functions do not set errno.
Tested for x86_64.
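The sentinel approach described above can be illustrated with a small sketch. errno_after_strtod and ERRNO_SENTINEL are hypothetical names chosen for this example; the actual test uses its own machinery. Starting from a nonzero value means that a bug where strtod stores 0 into errno is detected as well as one where it stores ERANGE.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Nonzero starting value, so that a stray "errno = 0" inside the
   function under test is noticed too.  */
enum { ERRNO_SENTINEL = 12345 };

/* Hypothetical helper: return the value of errno after strtod parses S.  */
static int
errno_after_strtod (const char *s)
{
  assert (s != NULL);
  errno = ERRNO_SENTINEL;
  /* volatile keeps the call from being treated as unused.  */
  volatile double result = strtod (s, NULL);
  (void) result;
  return errno;
}
```

An ordinary input such as "1.5" should leave the sentinel in place, while inputs that overflow or underflow double, such as "1e400" or "1e-400", are expected to yield ERANGE.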
Reference this new section from the O_PATH documentation.
And document the functions openat, openat64, fstatat, fstatat64.
(The safety assessment for fstatat was already obsolete because
current glibc assumes kernel support for the underlying system
call.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch uses 'Avoid_Non_Temporal_Memset' flag to access
the non-temporal memset implementation for hygon processors.
Test Results:

hygon1 arch
x86_memset_non_temporal_threshold = 8MB
size    new performance time / old performance time
1MB     0.994
4MB     0.996
8MB     0.670
16MB    0.343
32MB    0.355

hygon2 arch
x86_memset_non_temporal_threshold = 8MB
size    new performance time / old performance time
1MB     1
4MB     1
8MB     1.312
16MB    0.822
32MB    0.830

hygon3 arch
x86_memset_non_temporal_threshold = 8MB
size    new performance time / old performance time
1MB     1
4MB     0.990
8MB     0.737
16MB    0.390
32MB    0.401

For hygon arch with this patch, non-temporal stores can improve
performance by 20% - 65%.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Add a hygon branch in the dl_init_cacheinfo function to initialize
cache size variables for hygon processors. Also add the
handle_hygon() function to get cache information.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Add a new architecture type arch_kind_hygon to split the Hygon branch
from AMD. This allows Hygon processors to use settings
that are suitable for their own characteristics.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
With bad luck, the first time_now call can happen just before a second
boundary, and mach_msg can sleep just long enough for the second time_now
call to count one second too many (or even more if scheduling is really
unlucky). So we have to protect against returning a bogus negative value
in that case.
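The kind of guard described above might look like the following sketch. It assumes the function returns the unslept remainder in whole seconds, as sleep does; remaining_seconds and its parameters are hypothetical names, not the Hurd code.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds left of a REQUESTED sleep, given the
   whole-second clock readings taken BEFORE and AFTER it.  The result is
   clamped so that counting one second too many never goes negative.  */
static unsigned int
remaining_seconds (unsigned int requested, time_t before, time_t after)
{
  time_t elapsed = after - before;
  if (elapsed < 0)
    elapsed = 0;   /* Assumption: treat a clock step backwards as no progress.  */
  if (elapsed >= (time_t) requested)
    return 0;      /* Slept at least as long as requested.  */
  return requested - (unsigned int) elapsed;
}
```

With a one-second request that straddles two second boundaries, elapsed is 2 and the unclamped subtraction would wrap or go negative; the clamp returns 0 instead.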
This patch modifies the current Power9 implementation of strcpy and
stpcpy to optimize it for Power9 and Power10.
No new Power10 instructions are used, so the original Power9 strcpy
is modified instead of creating a new implementation for Power10.
The changes also affect stpcpy, which uses the same implementation
with some additional code before returning.
Improvements compared to the old Power9 version:
Use simple comparisons for the first ~512 bytes:
The main loop is good for long strings, but comparing 16B each time is
better for shorter strings. After aligning the address to 16 bytes, we
unroll the loop four times, checking 128 bytes each time. There may be
some overlap with the main loop for unaligned strings, but it is better
for shorter strings.
Loop with 64 bytes for longer strings:
Use 4 consecutive lxv/stxv instructions.
Showed an average improvement of 13%.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The current racy approach is to enable asynchronous cancellation
before making the syscall and restore the previous cancellation
type once the syscall returns, and check if cancellation has happened
during the cancellation entrypoint.
As described in BZ#12683, this approach shows 2 problems:
1. Cancellation can act after the syscall has returned from the
kernel, but before userspace saves the return value. It might
result in a resource leak if the syscall allocated a resource or a
side effect (partial read/write), and there is no way for the program
to handle it with cancellation handlers.
2. If a signal is handled while the thread is blocked at a cancellable
syscall, the entire signal handler runs with asynchronous
cancellation enabled. This can lead to issues if the signal
handler calls functions which are async-signal-safe but not
async-cancel-safe.
For the cancellation to work correctly, there are 5 points at which the
cancellation signal could arrive:
  [ ... )[ ... )[ syscall ]( ...
     1      2        3    4   5
1. Before initial testcancel, e.g. [*... testcancel)
2. Between testcancel and syscall start, e.g. [testcancel...syscall start)
3. While syscall is blocked and no side effects have yet taken
place, e.g. [ syscall ]
4. Same as 3 but with side-effects having occurred (e.g. a partial
read or write).
5. After syscall end e.g. (syscall end...*]
And libc wants to act on cancellation in cases 1, 2, and 3 but not
in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually
happen in the next cancellable entrypoint without any further external
event.
The proposed solution for each case is:
1. Do a conditional branch based on whether the thread has received
a cancellation request;
2. It can be caught by the signal handler determining that the saved
program counter (from the ucontext_t) is in some address range
beginning just before the "testcancel" and ending with the
syscall instruction.
3. SIGCANCEL can be caught by the signal handler and determine that
the saved program counter (from the ucontext_t) is in the address
range beginning just before "testcancel" and ending with the first
uninterruptable (via a signal) syscall instruction that enters the
kernel.
4. In this case, except for certain syscalls that ALWAYS fail with
EINTR even for non-interrupting signals, the kernel will reset
the program counter to point at the syscall instruction during
signal handling, so that the syscall is restarted when the signal
handler returns. So, from the signal handler's standpoint, this
looks the same as case 2, and thus it's taken care of.
5. For syscalls with side-effects, the kernel cannot restart the
syscall; when it's interrupted by a signal, the kernel must cause
the syscall to return with whatever partial result is obtained
(e.g. partial read or write).
6. The saved program counter points just after the syscall
instruction, so the signal handler won't act on cancellation.
This is similar to 4. since the program counter is past the syscall
instruction.
So the proposed fixes are:
1. Remove the enable_asynccancel/disable_asynccancel function usage in
cancellable syscall definition and instead make them call a common
symbol that will check if cancellation is enabled (__syscall_cancel
at nptl/cancellation.c), call the arch-specific cancellable
entry-point (__syscall_cancel_arch), and cancel the thread when
required.
2. Provide an arch-specific generic system call wrapper function
that contains global markers. These markers will be used in the
SIGCANCEL signal handler to check whether the interruption happened
in a valid syscall and whether the syscall has side-effects.
A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
is provided. However, the markers may not be set on correct
expected places depending on how INTERNAL_SYSCALL_NCS is
implemented by the architecture. It is expected that all
architectures add an arch-specific implementation.
3. Rewrite the SIGCANCEL asynchronous handler to check both the
cancellation type and whether the current IP from the signal handler
falls between the global markers, and act accordingly.
4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to
use the appropriate cancelable syscalls.
5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
provide cancelable futex calls.
Some architectures require specific support on syscall handling:
* On i386 the syscall cancel bridge needs to use the old int80
instruction because, with the optimized vDSO symbol, the resulting PC
value for an interrupted syscall points to an address outside the expected
markers in __syscall_cancel_arch. It has been discussed on LKML [1]
how the kernel could help userland to accomplish this, but afaik the
discussion has stalled.
Also, sysenter should not be used directly by libc since its calling
convention is set by the kernel depending on the underlying x86 chip
(check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60).
* mips o32 is the only kABI that requires 7-argument syscalls, and to
avoid adding a requirement on all architectures to support them, mips
support is added with extra internal defines.
Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu,
powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and
x86_64-linux-gnu.
[1] https://lkml.org/lkml/2016/3/8/1105
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The test io/tst-mkdirat doesn't verify the permissions on the created
directory (thus, doesn't verify at all anything about how mkdirat uses
the mode argument). Add checks of this to the existing test.
Tested for x86_64.
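A permission check of this kind might look like the sketch below. mkdirat_mode is a hypothetical helper, not the code added to io/tst-mkdirat; it clears the umask around the call so that the requested mode is what should appear on the new directory, then reads the mode back with fstatat.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create NAME under DIRFD with MODE and return the
   rwx permission bits the new directory actually has, or -1 on failure.  */
static int
mkdirat_mode (int dirfd, const char *name, mode_t mode)
{
  assert (name != NULL);
  mode_t old_umask = umask (0);   /* Keep the umask from masking MODE.  */
  int ret = mkdirat (dirfd, name, mode);
  umask (old_umask);
  if (ret != 0)
    return -1;

  struct stat st;
  if (fstatat (dirfd, name, &st, 0) != 0)
    return -1;
  return (int) (st.st_mode & 0777);
}
```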
There is very little test coverage for getline (only a minimal
stdio-common/tstgetln.c which doesn't verify anything about the
results of the getline calls). Add some more thorough tests
(generally using fopencookie for convenience in testing various cases
for what the input and possible errors / EOF in the file read might
look like).
Note the following regarding testing of error cases:
* Nothing is said in the specifications about what if anything might
be written into the buffer, and whether it might be reallocated, in
error cases. The expectation of the tests (required to avoid memory
leaks on error) is that at least on error cases, the invariant that
lineptr points to at least n bytes is maintained.
* The optional EOVERFLOW error case specified in POSIX, "The number of
bytes to be written into the buffer, including the delimiter
character (if encountered), would exceed {SSIZE_MAX}.", doesn't seem
practically testable, as any case reading so many characters (half
the address space) would also be liable to run into allocation
failure (ENOMEM) along the way.
* If a read error occurs part way through reading an input line, it
seems unclear whether a partial line should be returned by getline
(avoid input getting lost), which is what glibc does at least in the
fopencookie case used in this test, or whether getline should return
-1 (error) (so avoiding the program misbehaving by processing a
truncated line as if it were complete). (There was a short,
inconclusive discussion about this on the Austin Group list on 9-10
November 2014.)
* The POSIX specification of getline inherits errors from fgetc. I
didn't try to cover fgetc errors systematically, just one example of
such an error.
Tested for x86_64 and x86.
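The buffer invariant discussed above (lineptr stays valid and must be freed by the caller, including after a failing call) can be illustrated with a short sketch. It uses fmemopen rather than fopencookie purely for brevity, and count_lines is a hypothetical helper, not part of the new tests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: count the lines getline returns for the nonempty
   string TEXT and store the length of the last one in *LAST_LEN.
   Returns -1 if the stream cannot be opened or a read error occurs.  */
static int
count_lines (const char *text, size_t *last_len)
{
  assert (text != NULL && text[0] != '\0' && last_len != NULL);
  /* The stream is opened read-only, so TEXT is never written to.  */
  FILE *fp = fmemopen ((void *) text, strlen (text), "r");
  if (fp == NULL)
    return -1;

  char *line = NULL;   /* getline allocates and grows this buffer.  */
  size_t cap = 0;
  ssize_t got;
  int lines = 0;
  *last_len = 0;
  while ((got = getline (&line, &cap, fp)) != -1)
    {
      lines++;
      *last_len = (size_t) got;
    }

  int failed = ferror (fp);   /* Distinguish EOF from a read error.  */
  free (line);                /* Needed even when getline failed.  */
  fclose (fp);
  return failed ? -1 : lines;
}
```

A final line without a trailing newline is still returned by getline, so an input such as "a\nbc\nlast" is reported as three lines.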
This will avoid future cases like a57cbbd853 ("malloc: Link
threading tests with $(shared-thread-library)") missing the memcheck
cases added in 251843e16f ("malloc: Link threading tests with
$(shared-thread-library)").
Tests for if_nameindex, if_nametoindex, and if_indextoname
Tests that valid results are consistent.
Tests that invalid parameters fail correctly.
Reviewed-by: Florian Weimer <fweimer@redhat.com>