glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-25 20:21:07 +00:00

Author	SHA1	Message	Date
Szabolcs Nagy	55f82d328d	aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking symbols that reference functions that may follow a variant PCS with different register usage convention from the base PCS. DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that have R__JUMP_SLOT relocations for symbols marked with STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT). elf/elf.h (STO_AARCH64_VARIANT_PCS): Define. (DT_AARCH64_VARIANT_PCS): Define.	2019-06-13 09:44:44 +01:00
Adhemerval Zanella	1192696069	powerpc: Remove optimized finite The powerpc finite optimization do not show much gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc64-linux-gnu-power7 (--with-cpu=power7): - Generic sysdeps/ieee754 implementation: "isfinite": { "": { "duration": 5.0082e+09, "iterations": 2.45299e+09, "max": 43.824, "min": 2.008, "mean": 2.04167 }, "INF": { "duration": 4.66554e+09, "iterations": 2.28288e+09, "max": 35.73, "min": 2.008, "mean": 2.04371 }, "NAN": { "duration": 4.66274e+09, "iterations": 2.28716e+09, "max": 34.161, "min": 2.009, "mean": 2.03866 } } - power7 optimized one: "isfinite": { "": { "duration": 4.99111e+09, "iterations": 2.65566e+09, "max": 25.015, "min": 1.716, "mean": 1.87942 }, "INF": { "duration": 4.6783e+09, "iterations": 2.0999e+09, "max": 35.264, "min": 1.868, "mean": 2.22787 }, "NAN": { "duration": 4.67915e+09, "iterations": 2.08678e+09, "max": 38.099, "min": 1.869, "mean": 2.24228 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_finite* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_finite* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	a72186761b	math: Use wordsize-64 version for finite - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra finite call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ... * sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	6427a6ac8c	powerpc: Remove optimized isinf The powerpc isinf optimizations onyl adds complexity: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input pattern and branch implementation for INF and denormal that does: return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000) Although it does show slight better latency than generic algorithm (as below), it is only for power7 and requires it to override it for power8. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	a8c590f789	math: Use wordsize-64 version for isinf - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra isinf call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ... * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:39 -03:00
Adhemerval Zanella	2666f96390	powerpc: Remove optimized isnan The powerpc isnan optimizations are not really a gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power5, power6, and power6x are just micro-optimization to improve the Load-Hit-Store hazards from floating-point to general register transfer, and current GCC already has support to minimize it by inserting either extra nops or group dispatch instructions. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc-linux-gnu-power4 (which uses the hp-timing support on benchtests): - Generic sysdeps/ieee754 implementation: "isnan": { "": { "duration": 4.98415e+09, "iterations": 2.34516e+09, "max": 45.925, "min": 2.052, "mean": 2.12529 }, "INF": { "duration": 4.74057e+09, "iterations": 1.69761e+09, "max": 91.01, "min": 2.052, "mean": 2.79249 }, "NAN": { "duration": 4.74071e+09, "iterations": 1.68768e+09, "max": 282.343, "min": 2.052, "mean": 2.809 } } - power7 optimized one: $ ./testrun.sh benchtests/bench-isnan "isnan": { "": { "duration": 4.96842e+09, "iterations": 2.56297e+09, "max": 50.048, "min": 1.872, "mean": 1.93854 }, "INF": { "duration": 4.76648e+09, "iterations": 1.54213e+09, "max": 373.408, "min": 2.661, "mean": 3.09084 }, "NAN": { "duration": 4.76845e+09, "iterations": 1.54515e+09, "max": 51.016, "min": 2.736, "mean": 3.08607 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The generic implementation requires getting the floating point status, disable the invalid operation bit, and restore the floating-point status. Each operation is costly and requires flushing the FP pipeline. Using the same scenarion for the previous analysis: "isnan": { "": { "duration": 5.08284e+09, "iterations": 6.2898e+08, "max": 41.844, "min": 8.057, "mean": 8.08108 }, "INF": { "duration": 4.97904e+09, "iterations": 6.16176e+08, "max": 39.661, "min": 8.057, "mean": 8.08055 }, "NAN": { "duration": 4.98695e+09, "iterations": 5.95866e+08, "max": 29.728, "min": 8.345, "mean": 8.36925 } } - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_isnan.c: Remove file. * sysdeps/powerpc/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S: Remove file * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:36 -03:00
Adhemerval Zanella	197dbda1a1	math: Use wordsize-64 version for isnan - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra isnan call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ... * sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 14:32:18 -03:00
Adhemerval Zanella	2731a326b1	benchtests: Add isnan/isinf/isfinite benchmark * benchtests/Makefile (bench-math): Add isnan, isinf, and isfinite. (CFLAGS-bench-isnan.c, CFLAGS-bench-isinf.c, CFLAGS-bench-isfinite.c): New rule. * benchtests/isnan-input: New file. * benchtests/isinf-input: New file. * benchtests/isfinite-input: New file. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 11:46:30 -03:00
Adhemerval Zanella	e41d66e41a	powerpc: copysign cleanup GCC always expand copysign{f} for all possible cpus, so calling the libm is only done if user explicitly states to disable the builtin (which is done usually not for performance reason). So to provide ifunc variant for copysign is just unrequired complexity, since libm will be called on non-performance critical code. This patch removes both powerpc32 and powerpc64 ifunc variants and consolidates the powerpc implementation on sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_copysign.c: New file. * sysdeps/powerpc/fpu/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and s_copysign-ppc32. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls): Remove s_copysign-power6 s_copysign-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 11:46:26 -03:00
Adhemerval Zanella	21bd039bb4	powerpc: consolidate rint This patches consolidates all the powerpc rint{f} implementations on the generic sysdeps/powerpc/fpu/s_rint{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode, round_to_integer_float, round_mode): Add RINT handling. (reset_fenv_mode): New symbol. * sysdeps/powerpc/fpu/s_rint.c (__rint): Use generic implementation. * sysdeps/powerpc/fpu/s_rintf.c (__rintf): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_rint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-06-12 11:46:22 -03:00
Florian Weimer	cfa611447b	libio: freopen of default streams crashes in old programs [BZ #24632 ] As seen with very old i386 GCC binaries.	2019-06-12 14:48:33 +02:00
Florian Weimer	744e829637	Linux: Deprecate <sys/sysctl.h> and sysctl Now that there are no internal users of __sysctl left, it is possible to add an unconditional deprecation warning to <sys/sysctl.h>. To avoid a test failure due this warning in check-install-headers, skip the test for sys/sysctl.h. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-06-12 14:32:08 +02:00
Florian Weimer	5dad6ffbb2	<sys/stat.h>: Use Linux UAPI header for statx if available and useful This will automatically import new STATX_* constants. It also avoids a conflict between <sys/stat.h> and <linux/stat.h>.	2019-06-12 13:04:43 +02:00
Florian Weimer	4e75c2a43b	<sys/cdefs.h>: Add __glibc_has_include macro	2019-06-12 13:04:43 +02:00
Wilco Dijkstra	680942b016	Improve performance of memmem This patch significantly improves performance of memmem using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly. By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear. Small needles up to size 2 use a dedicated linear search. Very long needles use the Two-Way algorithm (to avoid increasing stack size or slowing down the common case, inlining is disabled). The performance gain is 6.6 times on English text on AArch64 using random needles with average size 8. Tested against GLIBC testsuite and randomized tests. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> * string/memmem.c (__memmem): Rewrite to improve performance.	2019-06-12 11:42:34 +01:00
Wilco Dijkstra	5e0a7ecb66	Improve performance of strstr This patch significantly improves performance of strstr using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly. By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear. Small needles up to size 3 use a dedicated linear search. Very long needles use the Two-Way algorithm. The performance gain using the improved bench-strstr on Cortex-A72 is 5.8 times basic_strstr and 3.7 times twoway_strstr. Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test (https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> * string/str-two-way.h (two_way_short_needle): Add inline to avoid warning. (two_way_long_needle): Block inlining. * string/strstr.c (strstr2): Add new function. (strstr3): Likewise. (STRSTR): Completely rewrite strstr to improve performance.	2019-06-12 11:38:52 +01:00
Wilco Dijkstra	80b2bfb535	Benchmark strstr hard needles Benchmark needles which exhibit worst-case performance. This shows that basic_strstr is quadratic and thus unsuitable for large needles. On the other hand the Two-way and new strstr implementations are linear with increasing needle sizes. The slowest cases of the two implementations are within a factor of 2 on several different microarchitectures. Two-way is slowest on inputs which cause a branch mispredict on almost every character. The new strstr is slowest on inputs which almost match and result in many calls to memcmp. Thanks to Szabolcs for providing various hard needles. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> * benchtests/bench-strstr.c (test_hard_needle): New function.	2019-06-11 15:52:21 +01:00
Joseph Myers	e6e2424390	Fix malloc tests build with GCC 10. GCC mainline has recently added warn_unused_result attributes to some malloc-like built-in functions, where glibc previously had them in its headers only for __USE_FORTIFY_LEVEL > 0. This results in those attributes being newly in effect for building the glibc testsuite, so resulting in new warnings that break the build where tests deliberately call such functions and ignore the result. Thus patch duly adds calls to DIAG_* macros around those calls to disable the warning. Tested with build-many-glibcs.py for aarch64-linux-gnu. * malloc/tst-calloc.c: Include <libc-diag.h>. (null_test): Ignore -Wunused-result around calls to calloc. * malloc/tst-mallocfork.c: Include <libc-diag.h>. (do_test): Ignore -Wunused-result around call to malloc.	2019-06-10 22:12:08 +00:00
Florian Weimer	51ea67d548	Linux: Add getdents64 system call No 32-bit system call wrapper is added because the interface is problematic because it cannot deal with 64-bit inode numbers and 64-bit directory hashes. A future commit will deprecate the undocumented getdirentries and getdirentries64 functions. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2019-06-07 09:27:01 +02:00
Paul A. Clarke	de751ebc9e	[powerpc] get_rounding_mode: utilize faster method to get rounding mode Add support to use 'mffsl' instruction if compiled for POWER9 (or later). Also, mask the result to avoid bleeding unrelated bits into the result of _FPU_GET_RC(). Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2019-06-06 14:11:56 -05:00
Florian Weimer	28dd393922	riscv: Do not use __has_include__ The user-visible preprocessor construct is called __has_include.	2019-06-06 11:24:32 +02:00
Paul A. Clarke	0158473d8f	[powerpc] fegetexcept: utilize function instead of duplicating code fegetexcept() included code which exactly duplicates the code in fenv_reg_to_exceptions(). Replace with a call to that function. 2019-06-05 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code with call to equivalent function.	2019-06-05 19:38:10 -05:00
Florian Weimer	e863dbf6b2	iconv: Use __twalk_r in __gconv_release_shlib	2019-06-04 14:13:44 +02:00
Andreas Schwab	4802be92c8	Fix iconv buffer handling with IGNORE error handler (bug #18830 )	2019-06-04 14:03:04 +02:00
Joseph Myers	dc91a19e6f	Add INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to netinet/in.h. This patch adds INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to netinet/in.h. Tested for x86_64. * inet/netinet/in.h (INADDR_ALLSNOOPERS_GROUP): New macro.	2019-06-03 11:16:02 +00:00
Florian Weimer	6a1a9a495a	Fix data of ChangeLog entry	2019-06-01 14:41:37 +02:00
Florian Weimer	6b33f373c7	arm: Remove ioperm/iopl/inb/inw/inl/outb/outw/outl support Linux only supports the required ISA sysctls on StrongARM devices, which are armv4 and no longer tested during glibc development and probably bit-rotted by this point. (No reported test results, and the last discussion of armv4 support was in the glibc 2.19 release notes.)	2019-06-01 13:33:49 +02:00
Florian Weimer	0bb8f8c791	Linux: Add oddly-named arm syscalls to syscall-names.list <asm/unistd.h> on arm defines the following macros: #define __ARM_NR_breakpoint (__ARM_NR_BASE+1) #define __ARM_NR_cacheflush (__ARM_NR_BASE+2) #define __ARM_NR_usr26 (__ARM_NR_BASE+3) #define __ARM_NR_usr32 (__ARM_NR_BASE+4) #define __ARM_NR_set_tls (__ARM_NR_BASE+5) #define __ARM_NR_get_tls (__ARM_NR_BASE+6) These do not follow the regular __NR_* naming convention and have so far been ignored by the syscall-names.list consistency checks. This commit adds these names to the file, preparing for the availability of these names in the regular __NR_* namespace.	2019-06-01 13:33:49 +02:00
Gabriel F. T. Gomes	9250e6610f	powerpc: Fix build failures with current GCC Since GCC commit 271500 (svn), also known as the following commit on the git mirror: commit 61edec870f9fdfb5df3fa4e40f28cbaede28a5b1 Author: amodra <amodra@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Wed May 22 04:34:26 2019 +0000 [RS6000] Don't pass -many to the assembler glibc builds are failing when an assembly implementation does not declare the correct '.machine' directive, or when no such directive is declared at all. For example, when a POWER6 instruction is used, but '.machine power6' is not declared, the assembler will fail with an error similar to the following: ../sysdeps/powerpc/powerpc64/power8/strcmp.S: Assembler messages: 24 ../sysdeps/powerpc/powerpc64/power8/strcmp.S:55: Error: unrecognized opcode: `cmpb' This patch adds '.machine powerN' directives where none existed, as well as it updates '.machine power7' directives on POWER8 files, because the minimum binutils version required to build glibc (binutils 2.25) now provides this machine version. It also adds '-many' to the assembler command used to build tst-set_ppr.c. Tested for powerpc, powerpc64, and powerpc64le, as well as with build-many-glibcs.py for powerpc targets. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2019-05-30 11:10:48 -03:00
Adhemerval Zanella	fbd6c928bb	Remove unused get_clockfreq files The patch `6e8ba7fd57` meant to remove the all get_clockfreq.c. This patch removes the missing files for sparcv9 and x86_64. Checked against a build to x86_64-linux-gnu and sparcv9-linux-gnu. * sysdeps/unix/sysv/linux/sparc/sparc32/sparcv9/get_clockfreq.c: Remove file. * sysdeps/unix/sysv/linux/x86_64/get_clockfreq.c: Likewise.	2019-05-29 10:26:30 -03:00
Adhemerval Zanella	e47308c98d	powerpc: generic nearbyint/nearbyintf This patches consolidates all the powerpc nearbyint{f} implementations on the generic sysdeps/powerpc/fpu/s_nearbyint{f}. * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add NEARBYINT handling. * sysdeps/powerpc/fpu/s_nearbyint.c: New file. * sysdeps/powerpc/fpu/s_nearbyintf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.	2019-05-28 18:16:48 -03:00
mansayk	157cda1ff0	tt_RU: Add lang_name [BZ #24370 ] This commit adds a lang_name according to CLDR-35.1. [BZ #24370] * localedata/locales/tt_RU (lang_name): Add from CLDR-35.1.	2019-05-28 22:13:32 +02:00
mansayk	182a3746b8	tt_RU: Fix orthographic mistakes in mon and abmon sections [BZ #24369 ] This commit fixes some errors and converts all month names to lowercase. The content is synchronized with CLDR-35.1 now but trailing dots are removed from abmon values in order to maintain consistency with the previous values and with many other locales which do the same. [BZ #24369] * localedata/locales/tt_RU (mon): Update from CLDR-35.1, fix errors. (abmon): Likewise, but remove the trailing dots.	2019-05-28 22:11:22 +02:00
Joseph Myers	c6df1ce3d5	Add IGMP_MRDISC_ADV from Linux 5.1 to netinet/igmp.h. This patch adds the IGMP_MRDISC_ADV macro from Linux 5.1 to netinet/igmp.h. Tested for x86_64. * inet/netinet/igmp.h (IGMP_MRDISC_ADV): New macro.	2019-05-28 12:01:12 +00:00
Florian Weimer	85188d8211	nptl: Add comment to __pthread_get_minstack about external users	2019-05-27 12:57:45 +02:00
Florian Weimer	5c23c82195	nss_dns: Check for proper A/AAAA address alignment Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2019-05-24 22:14:04 +02:00
Joseph Myers	bee1f2c413	Add F_SEAL_FUTURE_WRITE from Linux 5.1 to bits/fcntl-linux.h. This patch adds the new F_SEAL_FUTURE_WRITE constant from Linux 5.1 to bits/fcntl-linux.h. Tested for x86_64. * sysdeps/unix/sysv/linux/bits/fcntl-linux.h [__USE_GNU] (F_SEAL_FUTURE_WRITE): New macro.	2019-05-23 13:20:48 +00:00
Alexandra Hájková	481c30cb95	elf: Add tst-ldconfig-bad-aux-cache test [BZ #18093 ] This test corrupts /var/cache/ldconfig/aux-cache and executes ldconfig to check it will not segfault using the corrupted aux_cache. The test uses the test-in-container framework. Verified no regressions on x86_64.	2019-05-23 11:49:44 +02:00
Zack Weinberg	cb755eede7	Add ChangeLog entry for previous commit.	2019-05-22 15:09:32 -04:00
Wilco Dijkstra	46ae07324b	Improve string benchtest timing Improve string benchtest timing. Many tests run for 0.01s which is way too short to give accurate results. Other tests take over 40 seconds which is way too long. Significantly increase the iterations of the short running tests. Reduce number of alignment variations in the long running memcpy walk tests so they take less than 5 seconds. As a result most tests take at least 0.1s and all finish within 5 seconds. * benchtests/bench-memcpy-random.c (do_one_test): Use medium iterations. * benchtests/bench-memcpy-walk.c (test_main): Reduce alignment tests. * benchtests/bench-memmem.c (do_one_test): Use small iterations. * benchtests/bench-memmove-walk.c (test_main): Reduce alignment tests. * benchtests/bench-memset-walk.c (test_main): Reduce alignment tests. * benchtests/bench-strcasestr.c (do_one_test): Use small iterations. * benchtests/bench-string.h (INNER_LOOP_ITERS): Increase iterations. (INNER_LOOP_ITERS_MEDIUM): New define. (INNER_LOOP_ITERS_SMALL): New define. * benchtests/bench-strpbrk.c (do_one_test): Use medium iterations. * benchtests/bench-strsep.c (do_one_test): Use small iterations. * benchtests/bench-strspn.c (do_one_test): Use medium iterations. * benchtests/bench-strstr.c (do_one_test): Use small iterations. * benchtests/bench-strtok.c (do_one_test): Use small iterations.	2019-05-21 15:19:06 +01:00
Adhemerval Zanella	004e52febf	sysvipc: Add missing bit of semtimedop s390 consolidation This patch add the missing SEMTIMEDOP_IPC_ARGS definions on powerpc and sparc ipc_priv.h. Checked on powerpc64le-linux-gnu and with a build for sparc64-linux-gnu. * sysdeps/unix/sysv/linux/powerpc/ipc_priv.h (SEMTIMEDOP_IPC_ARGS): New define. * sysdeps/unix/sysv/linux/sparc/sparc64/ipc_priv.h (SEMTIMEDOP_IPC_ARGS): Likewise.	2019-05-21 10:43:10 -03:00
Florian Weimer	c9c15ac316	wcsmbs: Fix data race in __wcsmbs_clone_conv [BZ #24584 ] This also adds an overflow check and documents the synchronization requirement in <gconv.h>.	2019-05-21 12:04:55 +02:00
Florian Weimer	7e740ab2e7	libio: Fix gconv-related memory leak [BZ #24583 ] struct gconv_fcts for the C locale is statically allocated, and __gconv_close_transform deallocates the steps object. Therefore this commit introduces __wcsmbs_close_conv to avoid freeing the statically allocated steps objects.	2019-05-21 12:03:54 +02:00
Florian Weimer	09e1b0e3f6	libio: Remove codecvt vtable [BZ #24588 ] The codecvt vtable is not a real vtable because it also contains the conversion state data. Furthermore, wide stream support was added to GCC 3.0, after a C++ ABI bump, so there is no compatibility requirement with libstdc++. This change removes several unmangled function pointers which could be used with a corrupted FILE object to redirect execution. (libio vtable verification did not cover the codecvt vtable.) Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-05-20 21:54:57 +02:00
Florian Weimer	75c51570c7	support: Expose sbindir as support_sbindir_prefix	2019-05-20 21:15:35 +02:00
Mike Crowe	b62bb3bc68	support: Add missing EOL terminators on timespec The original implementations of test_timespec_before_impl and test_timespec_equal_or_after in `5198399651` were missing the backslash required for a newline. Checked on x86_64-linux-gnu. * support/timespec.c: Add backslash to correct newline in failure message. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-05-20 15:06:17 -03:00
Mike Crowe	ff6bec7d47	support: Correct confusing comment * support/timespec.h: Correct confusing comment. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-05-20 15:05:35 -03:00
Adhemerval Zanella	236c18e568	sysvipc: Consolidate semtimedop s390 This patch consolidates the s390-32 semtimedop implementation by defining a arch-specific SEMTIMEDOP_IPC_ARGS to rearrange the arguments expected by s390 Linux kABI. The idea is to avoid have multiples semtimedop implementation changes for Linux v5.1 change to enable wire-up sysvipc support. Checked with a s390-linux-gnu and s390x-linux-gnu and checking that resulting semtimedop objects did not change. * sysdeps/unix/sysv/linux/ipc_priv.h (SEMTIMEDOP_IPC_ARGS): New define. * sysdpes/unix/sysv/linux/s390/ipc_priv.h: New file. * sysdeps/unix/sysv/linux/s390/semtimedop.c: Remove file. * sysdeps/unix/sysv/linux/semtimedop.c (semtimedop): Use SEMTIMEDOP_IPC_ARGS for calls with __NR_ipc.	2019-05-20 12:25:31 -03:00
Adhemerval Zanella	dfba907fed	sysvipc: Fix compat msgctl (BZ#24570) The __IPC64 flags is meant to be used to enable the new sysv struct format when the architectures supports it (ARCH_WANT_IPC_PARSE_VERSION config flag on Linux kernel). This currently issue only affects alpha. [BZ #24570] * sysdeps/unix/sysv/linux/msgctl.c (__old_msgctl): Remove __IPC_64 usage.	2019-05-20 12:25:28 -03:00
Joseph Myers	1388600877	Add NT_ARM_PACA_KEYS and NT_ARM_PACG_KEYS from Linux 5.1 to elf.h. This patch adds the new NT_ARM_PACA_KEYS and NT_ARM_PACG_KEYS from Linux 5.1 to glibc's elf.h. Tested for x86_64. * elf/elf.h (NT_ARM_PACA_KEYS): New macro. (NT_ARM_PACG_KEYS): Likewise.	2019-05-20 11:51:58 +00:00

1 2 3 4 5 ...

23556 Commits