This patch fix the static build for strftime, which uses __wcschr.
Current powerpc32 implementation defines the __wcschr be an alias to
__wcschr_ppc32 and current implementation misses the correct alias for
static build.
It also changes the default wcschr.c logic so a IFUNC implementation
should just define WCSCHR and undefine the required alias/internal
definitions.
These new memcpy functions are the 32-bit version of x86_64 SSE2 unaligned
memcpy. Memcpy average performace benefit is 18% on Silvermont, other
platforms also improved about 35%, benchmarked on Silvermont, Haswell, Ivy
Bridge, Sandy Bridge and Westmere, performance results attached in
https://sourceware.org/ml/libc-alpha/2014-07/msg00157.html
* sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S: New file.
* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/bcopy.S: Select the sse2_unaligned
version if bit_Fast_Unaligned_Load is set.
* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
bcopy-sse2-unaligned, memcpy-sse2-unaligned,
memmove-sse2-unaligned and mempcpy-sse2-unaligned.
* sysdeps/i386/i686/multiarch/ifunc-impl-list.c (MAX_IFUNC): Set
to 4.
(__libc_ifunc_impl_list): Test __bcopy_sse2_unaligned,
__memmove_chk_sse2_unaligned, __memmove_sse2_unaligned,
__memcpy_chk_sse2_unaligned, __memcpy_sse2_unaligned,
__mempcpy_chk_sse2_unaligned, and __mempcpy_sse2_unaligned.
Use of strftime, a C90 function, ends up bringing in wcschr, which is
not a C90 function. Although not a conformance bug (C90 reserves
wcs*), this is still contrary to glibc practice of avoiding relying on
those reservations; this patch arranges for the internal uses to use
__wcschr instead, with wcschr being a weak alias. This is more
complicated than some such patches because of the various IFUNC
definitions of wcschr (which include code redefining libc_hidden_def
in a way that involves creating __GI_wcschr manually and so also needs
to create __GI___wcschr after the change of internal uses to use
__wcschr).
Tested for x86_64 and 32-bit x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
2014-12-10 Joseph Myers <joseph@codesourcery.com>
Adhemerval Zanella <azanella@linux.vnet.ibm.com>
[BZ #17634]
* wcsmbs/wcschr.c [!WCSCHR] (wcschr): Define as __wcschr.
Undefine after defining function. Define as weak alias of
__wcschr. Use libc_hidden_weak.
* include/wchar.h (__wcschr): Declare. Use libc_hidden_proto.
* sysdeps/i386/i686/multiarch/wcschr-c.c [IS_IN (libc) && SHARED]
(libc_hidden_def): Also define __GI___wcschr alias.
* sysdeps/i386/i686/multiarch/wcschr.S (wcschr): Rename to
__wcschr and define as weak alias of __wcschr.
* sysdeps/powerpc/power6/wcschr.c [!WCSCHR] (WCSCHR): Define as
__wcschr.
[!WCSCHR] (DEFAULT_WCSCHR): Define.
[DEFAULT_WCSCHR] (__wcschr): Use libc_hidden_def.
[DEFAULT_WCSCHR] (wcschr): Define as weak alias of __wcschr. Use
libc_hidden_weak. Do not use libc_hidden_def.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c
[IS_IN (libc) && SHARED] (libc_hidden_def): Also define
__GI___wcschr alias.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c
[IS_IN (libc)] (wcschr): Define as macro expanding to
__redirect_wcschr.
[IS_IN (libc)] (__wcschr_ppc): Use __redirect_wcschr in typeof.
[IS_IN (libc)] (__wcschr_power6): Likewise.
[IS_IN (libc)] (__wcschr_power7): Likewise.
[IS_IN (libc)] (__libc_wcschr): New. Define with libc_ifunc
instead of wcschr.
[IS_IN (libc)] (wcschr): Undefine and define as weak alias of
__libc_wcschr.
[!IS_IN (libc)] (libc_hidden_def): Do not undefine and redefine.
* sysdeps/powerpc/powerpc64/multiarch/wcschr.c (wcschr): Rename to
__wcschr and define as weak alias of __wcschr. Use
libc_hidden_builtin_def.
* sysdeps/x86_64/wcschr.S (wcschr): Rename to __wcschr and define
as weak alias of __wcschr. Use libc_hidden_weak.
* time/alt_digit.c (_nl_get_walt_digit): Use __wcschr instead of
wcschr.
* time/era.c (_nl_init_era_entries): Likewise.
* conform/Makefile (test-xfail-ISO/time.h/linknamespace): Remove
variable.
(test-xfail-XPG3/time.h/linknamespace): Likewise.
(test-xfail-XPG4/time.h/linknamespace): Likewise.
This patch fixes __ieee754_logl (-LDBL_MAX) on x86_64 and x86 not to
subtract 1 from its argument and so cause spurious overflow in
FE_DOWNWARD mode. (For any argument strictly less than -1, it doesn't
matter whether or not 1 is subtracted before computing log1p, as long
as the result doesn't overflow to -Inf.)
Tested x86_64 and x86. (This particular case lacks test coverage,
since the testsuite doesn't cover -lieee, but it will be covered by
tests after the following patch to test pow in all rounding modes,
which was the context in which this bug was found.)
[BZ #17022]
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Do not subtract 1
from arguments -2 or below.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise.
According to ISO C Annex F, log (1) should be +0 in all rounding
modes, but some implementations in glibc wrongly return -0 in
round-downward mode (mapping to log1p (x - 1) is problematic because 1
- 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch
fixes this. (It helps with some implementations of other functions
such as acosh, log2 and log10 that call out to log, but not enough to
enable all-rounding-modes testing for those functions without further
fixes to other implementations of them.)
Tested x86_64 and x86 and ulps updated accordingly, and did spot tests
for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu
implementations shadowed by those in sysdeps/i386/i686/fpu.
[BZ #16731]
* sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value
when x - 1 is zero.
* sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise.
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when
argument is 1.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is
zero.
* math/libm-test.inc (log_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* sysdeps/i386/i686/fpu/multiarch/Makefile (sysdep_routines):
Add s_sinf-sse2, s_conf-sse2.
* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.
* sysdeps/ieee754/flt-32/s_sinf.c (SINF, SINF_FUNC): Add macros
for using routine as __sinf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/ieee754/flt-32/s_cosf.c (COSF, COSF_FUNC): Add macros
for using routine as __cosf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix Copyright.
* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix Copyright.
* sysdeps/x86_64/fpu/s_sinf.S: New file.
* sysdeps/x86_64/fpu/s_cosf.S: New file.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
* math/libm-test.inc (cos_test): Add more test cases.
(sin_test): Likewise.
(sincos_test): Likewise.
2012-08-15 Liubov Dmitrieva <liubov.dmitrieva@gmail.com>
[BZ #14195]
* sysdeps/i386/i686/multiarch/strcmp-sssse3.S: Fix
segmentation fault for a case of two empty input strings.
* string/test-strncasecmp.c (check1): Renamed to...
(bz12205): ...this.
(bz14195): Add new testcase for two empty input strings and N > 0.
(test_main): Call new testcase, adapt for renamed function.
Fixes:
In file included from ../sysdeps/i386/i686/multiarch/wcschr-c.c:8:0:
../wcsmbs/wcschr.c:26:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
../wcsmbs/wcschr.c:37:1: warning: data definition has no type or storage class [enabled by default]
../wcsmbs/wcschr.c:37:1: warning: type defaults to ‘int’ in declaration of ‘__hidden_ver1’ [enabled by default]
../wcsmbs/wcschr.c:37:1: warning: parameter names (without types) in function declaration [enabled by default]
2012-05-14 Liubov Dmitrieva <liubov.dmitrieva@gmail.com>
* sysdeps/i386/i686/fpu/multiarch/Makefile: New file.
* sysdeps/i386/i686fpu/multiarch/e_expf.c: New file.
* sysdeps/i386/i686fpu/multiarch/e_expf-ia32.S: New file.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: New file.
The proper define to check "am I in a shared lib" is "SHARED", not "PIC".
The two new memset_chk functions incorrectly depend on "PIC".
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
I've improved the following implementation of memcpy:
"sysdeps/i386/i686/multiarch/memcpy-ssse3.S".
The patch includes some minor style fixes, but the important part is
just using prefetch loops for the case:
DATA_CACHE_SIZE_HALF <= len < SHARED_CACHE_SIZE_HALF and
src and dst pointers have unequal 16 byte alignments.
This gives from 6% - 50% performance boost on the atom machine, about
24,73% in geometric mean.
Wrong copy algorithm for last bytes, not thread safety.
In some particular cases it uses the destination
memory beyond the string end for
16-byte load, puts changes into that part that is relevant
to destination string and writes whole 16-byte chunk into memory.
I have a test case where the memory beyond the string end contains
malloc/free data, that appear corrupted in case free() updates
it in between the 16-byte read and 16-byte write.
32bit memset-sse2.S assumes cache size is multiple of 128 bytes. If
it isn't true, memset-sse2.S will fail. For example, a processor can
have 24576 KB L3 cache and 20 cores. That is 2516582 byte per core. Half
of it is 1258291, which isn't helpful for vector instructions. This
patch rounds cache sizes to multiple of 256 bytes and adds "raw" cache
sizes.