When static PIE is enabled by default, we shouldn't use crtbeginS.o and
crtendS.o for non-PIE static executables. Check $($(@F)-no-pie) to use
crtbeginT.o and crtend.o to create non-PIE static executables.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
This prints some information from struct cpu_features, and the midr_el1
and dczid_el0 system register contents on every CPU.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This is surprisingly difficult to implement if the goal is to produce
reasonably sized output. With the current approaches to output
compression (suppressing zeros and repeated results between CPUs,
folding ranges of identical subleaves, dealing with the %ecx
reflection issue), the output is less than 600 KiB even for systems
with 256 logical CPUs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Sync tzselect, zdump, zic to TZDB 2024a.
This patch incorporates the following TZDB source code changes,
listed roughly in descending order of importance.
zic now supports links to links, needed for future tzdata
zic now defaults to '-b slim'
zic now updates output files atomically
zic has new options -R, -l -, -p -
zic -r now uses -00 for unspecified timestamps
zdump now uses [lo,hi) for both -c and -t
Fix several integer overflow bugs
zic now checks input bytes more carefully
Simplify and fix new TZDIR setup
Default time_t to 64 bits on glibc 2.34+ 32-bit
zic now generates TZ strings that conform to POSIX when all-year DST
zic -v now shows extreme-int tm_year transitions
Fix zic bug in last time type of Asia/Gaza etc.
Fix zic bug with Palestine after 2075
Fix bug uncovered by recent change to Iran history
Fix 'zic -b fat' bug with Port Moresby 32-bit data
Fix zic bug with -r @X where X is deduced from TZ
Fix bug with zic -r cutoff before 1st transition
Fix leap second expiry and truncation
Fix zic bug on Linux 2.6.16 and 2.6.17
Fix bug with 'zic -d /a/b/c' if /a is unwriteable
Don't mistruncate TZif files at leap seconds
Fix zdump undefined behavior if !USE_LTZ
zdump -v reports localtime+gmtime failures better
Fix zdump diagnostic for missing timezone
Don't assume nonempty argv
Port better to C23
Do not assume negative >> behavior
I18nize zdump a bit better
Port zdump to right_only installations
New tzselect menu option 'now'
tzselect can now use current time to help choose
Improve tzselect behavior for Turkey etc.
tzselect: do not create temporary files
tzselect: work around mawk bug with {2,}
tzselect: Port to POSIX awk, which prohibits -v newlines
Do not use empty RE in tzselect
Don't set TZ in tzselect
Avoid sed, expr in tzselect
tzselect: Fix problems with spaces in TZDIR
Improve tzselect diagnostics
Remove zic workaround for Qt bug 53071
Remove zic support for "min" in Rule lines
Remove zic support for zic -y, Rule TYPEs, pacificnew
Remove tzselect workaround for Bash 1.14.7 bug
* SHARED-FILES: Update to match current sync.
* config.h.in (HAVE_STRERROR): Remove; no longer needed.
* timezone/Makefile ($(objpfx)zic.o): Depend on tzdir.h.
($(objpfx)tzdir.h): New rule to build a placeholder.
* timezone/private.h, timezone/tzfile.h, timezone/version:
* timezone/zdump.c, timezone/zic.c: Copy verbatim from TZDB 2024a.
* manual/search.texi (Array Search Function):
Correct the statement about lfind’s mean runtime:
it is proportional to a number (not that number),
and this is true only if random elements are searched for.
Relax the constraint on bsearch’s array argument:
POSIX says it need not be sorted, only partially sorted.
Say that the first arg passed to bsearch’s comparison function
is the key, and the second arg is an array element, as
POSIX requires. For bsearch and qsort, say that the
comparison function should not alter the array, as POSIX
requires. For qsort, say that the comparison function
must define a total order, as POSIX requires, that
it should not depend on element addresses, that
the original array index can be used for stable sorts,
and that if qsort still works if memory allocation fails.
Be more consistent in calling the array elements
“elements” rather than “objects”.
Co-authored-by: Zack Weinberg <zack@owlfolio.org>
When -mapxf is used to build glibc, the resulting glibc will never run
on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used.
This requires GCC which defines __APX_F__ for -mapxf with commit:
1df56719bd8 x86: Define __APX_F__ for -mapxf
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The a4ed0471d7 removed the generic version which is included by
features.h and used by Hurd.
Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.
The implementations of trunc functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The implementations of floor functions using x87 floating point (i386 and
86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The implementations of ceil functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.
This caused two test failures as these tests expected the flag 0x00000020
to be unused. Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).
FAIL: misc/tst-preadvwritev2
FAIL: misc/tst-preadvwritev64v2
This patch adds the flag, adjusts the test and adds details to
documentation.
Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The ifunc variants now uses the powerpc implementation which in turn
uses the compiler builtin. Without the proper -mcpu switch the builtin
does not generate the expected optimization.
Checked on powerpc-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI. Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).
The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi
[1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html
[2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign
Reviewed-by: DJ Delorie <dj@redhat.com>
Use same strategy as bench-strstr.c (93eebae516 and 80b2bfb535)
and use json_ctx for output to help standardize format across all
benchtests.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
As indicated in a recent thread, this it is a simple brute-force
algorithm that checks the whole needle at a matching character pair
(and does so 1 byte at a time after the first 64 bytes of a needle).
Also it never skips ahead and thus can match at every haystack
position after trying to match all of the needle, which generic
implementation avoids.
As indicated by Wilco, a 4x larger needle and 16x larger haystack gives
a clear 65x slowdown both basic_strstr and __strstr_avx512:
"ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512",
"__strstr_sse2_unaligned", "__strstr_generic"],
{
"len_haystack": 65536,
"len_needle": 1024,
"align_haystack": 0,
"align_needle": 0,
"fail": 1,
"desc": "Difficult bruteforce needle",
"timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
},
{
"len_haystack": 1048576,
"len_needle": 4096,
"align_haystack": 0,
"align_needle": 0,
"fail": 1,
"desc": "Difficult bruteforce needle",
"timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
}
PS: I don't have an AVX512 capable machine to verify this issues, but
skimming through the code it does seems to follow what Wilco has
described.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The value of l_scope is only valid post relocation, so this original
check was triggering undefined behavior. Instead just directly check to
see if the object has been relocated, at which point using l_scope is
safe.
Reported-by: Andreas Schwab <schwab@suse.de>
Closes: BZ #31317
Fixes: e0590f41fe ("RISC-V: Enable static-pie.")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Previously, HTL would always allocate non-executable stacks. This has
never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and
makes all pages implicitly executable. Since GNU Mach on AArch64
supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE
immediately breaks any code that (unfortunately, still) relies on
executable stacks.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>
While we could support it on any architecture, the tunable is currently
only defined on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>
We would like to avoid statically defining any specific page size on
aarch64-gnu, and instead make sure that everything uses the dynamic
page size, available via vm_page_size and GLRO(dl_pagesize).
There are currently a few places in glibc that require EXEC_PAGESIZE
to be defined. Per Roland's suggestion [0], drop the static
GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not
defined), and don't require EXEC_PAGESIZE definition for libio to
enable mmap usage.
[0]: https://mail.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-4-bugaevc@gmail.com>
We'd like to avoid committing to a specific size of virtual address
space (i.e. the value of VM_AARCH64_T0SZ) on AArch64. While the current
version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we
should try to avoid relying on it when we can. This piece of logic in
_hurdsig_getenv () doesn't actually care about the size of user-
accessible virtual address space, it just wants to preempt faults on any
addresses starting from the value of the P pointer and above. So, use
(unsigned long int) -1 instead of VM_MAX_ADDRESS.
While at it, change the casts to (unsigned long int) and not just
(long int), since the type of struct hurd_signal_preemptor.{first,last}
is unsigned long int.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-3-bugaevc@gmail.com>
Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and
_hurd_critical_section_unlock () inline implementations (that were
already guarded by #if defined _LIBC) to the internal version of the
header. While at it, add <tls.h> to the includes, and use
__LIBC_NO_TLS () unconditionally.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>
The log incorrectly prints, setcontext failed. Update this to indicate
that actually swapcontext failed.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
On OpenRISC variadic functions and regular functions have different
calling conventions so this wrapper is needed to translate. This
wrapper is copied from x86_64/x32. I don't know the build system enough
to find a cleaner way to share the code between x86_64/x32 and or1k
(maybe Implies?), so I went with the straight copy.
This fixes test failures:
misc/tst-prctl
nptl/tst-setgetname
This test failure:
math/test-fenv
If rounding mode and exception macros are defined then the fenv tests
run and always fail. This patch adds an ifdef using the
__or1k_hard_float__ macro provided by gcc to avoid defining these fenv
macros when they cnnot be used. This is similar to what is done in csky.
Note, I will post the or1k hard-float support soon. So, I prefer to
leave the hard-float bits here for now.
Old Linux kernels disable SVE after every system call. Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0,
except for 5.14.0 which was patched. Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.
Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions. If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.
Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The following three changes have been added to provide initial Power11 support.
1. Add the directories to hold Power11 files.
2. Add support to select Power11 libraries based on AT_PLATFORM.
3. Let submachine=power11 be set automatically.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This patch adds a new feature for powerpc. In order to get faster
access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for
implementing __builtin_cpu_supports() in GCC) without the overhead of
reading them from the auxiliary vector, we now reserve space for them
in the TCB.
This is an ABI change for GLIBC 2.39.
Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The aarch64 uses 'trad' for traditional tls and 'desc' for tls
descriptors, but unlike other targets it defaults to 'desc'. The
gnutls2 configure check does not set aarch64 as an ABI that uses
TLS descriptors, which then disable somes stests.
Also rename the internal machinery fron gnu2 to tls descriptors.
Checked on aarch64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
ARM _dl_tlsdesc_dynamic slow path has two issues:
* The ip/r12 is defined by AAPCS as a scratch register, and gcc is
used to save the stack pointer before on some function calls. So it
should also be saved/restored as well. It fixes the tst-gnu2-tls2.
* None of the possible VFP registers are saved/restored. ARM has the
additional complexity to have different VFP bank sizes (depending of
VFP support by the chip).
The tst-gnu2-tls2 test is extended to check for VFP registers, although
only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic
does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test
it and it is almost a decade since newer hardware was released).
With this patch there is no need to mark tst-gnu2-tls2 as XFAIL.
Checked on arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>