Adjustments for libm ulps added with commit d8b82cad1b,
495fd99f3a, and 5ba3cc691c.
I also adjusted some exp10 ulps definition that was higher than needed.
In the past the "-ftree-loop-linear" switch provided a measurable
improvement in performance for certain functions. At some point it
was assigned as the responsibility of Graphite in GCC. It has been
found that even with Graphite enabled these flags no longer perform
any appreciable improvement over the baseline.
Graphite now has some open bugs which need to be fixed in order for it
to provide measurable performance improvements but it lacks active
development. As a result some compiler distributors may disable
Graphite. If Graphite is disabled then building GLIBC will fail if
the "-ftree-loop-linear" switch is used.
This patch removes the use of "-ftree-loop-linear" as unnecessary.
This patch provides optimized logb (1.2x on PPC32 and 2.5x on PPC64),
logbf (1.1x on PPC32 and 2.2x on PPC64), and logbl (1.3x on PPC32 and
50% on PPC64) for the POWER7 processor.
* sysdeps/powerpc/elf/ifunc-sel.h: Moved to ...
* sysdeps/powerpc/ifunc-sel.h: ... here.
* sysdeps/powerpc/elf/rtld-global-offsets.sym: Moved to ...
* sysdeps/powerpc/rtld-global-offsets.sym: ... here.
* sysdeps/powerpc/powerpc32/crti.S: New file.
* sysdeps/powerpc/powerpc32/crtn.S: New file.
* sysdeps/powerpc/powerpc64/crti.S: New file.
* sysdeps/powerpc/powerpc64/crtn.S: New file.
This patch tries to organize the implies files for ppc, since there are
a number of processors and most of them are compatible with each other
(backwards compatible).
Having in mind that we start the search for processor-specific files in
the sysdeps/unix/sysv/linux tree
(sysdeps/unix/sysv/linux/powerpc/powerpc[32|64]/[processor]/fpu to be
exact), we would like to grab any linux-specific code from that tree
prior to going through the other tree (sysdeps/powerpc/...).
For that, i removed the Implies files that were originally inside the
fpu directories and placed then in the non-fpu directories (still inside
the unix/sysv/linux tree). If no processor-specific/linux-specific files
could be found, we "imply" the other tree's (sysdeps/powerpc/...) fpu
directory for that specific processor AND also the non-fpu directory for
that same tree.
If, again, no processor-specific code is found, we read another Implies
file that will point to the most compatible processor that we should
grab code from, and so on, until we reach the power4 processor.
So, in summary, the Implies files will live inside these directories
now:
* sysdeps/unix/sysv/linux/powerpc/powerpc[32|64]/[processor]
* sysdeps/powerpc/powerpc[32|64]/[processor]
Practical example of the order we will use to pick power6-specific code
with the new structure.
sysdeps/unix/sysv/linux/powerpc/powerpc[32|64]/power6/fpu ->
sysdeps/unix/sysv/linux/powerpc/powerpc[32|64]/power6 ->
sysdeps/powerpc/powerpc[32|64]/power6/fpu ->
sysdeps/powerpc/powerpc[32|64]/power6 ->
sysdeps/powerpc/powerpc[32|64]/power5+/fpu ->
sysdeps/powerpc/powerpc[32|64]/power5+ ->
sysdeps/powerpc/powerpc[32|64]/power5/fpu ->
sysdeps/powerpc/powerpc[32|64]/power5 ->
sysdeps/powerpc/powerpc[32|64]/power4/fpu ->
sysdeps/powerpc/powerpc[32|64]/power4 (from here, it'll go to the
generic path as usual)
I've just committed STT_GNU_IFUNC ppc/ppc64 support into prelink,
and this patch is needed on the glibc side. Without it ld.so segfaults,
as in dl-conflict.c sym_map is always NULL. While dl-machine.h could use
RESOLVE_CONFLICT_FIND_MAP macro to compute it, it doesn't make sense,
because with prelink we know it is already properly relocated (all relative
relocations are applied by prelink).
2009-01-11 Ryan S. Arnold <rsa@us.ibm.com>
[BZ #9726]
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_SET_DI_FPSCR,
_SET_SI_FPSCR): Clobber fp0 to prevent erroneous test-case passes.
2009-01-08 Ryan S. Arnold <rsa@us.ibm.com>
[BZ #9726]
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
(__CONTEXT_FUNC_NAME): Fix mtfsf to use fp31 instead of fp0.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S
(__CONTEXT_FUNC_NAME): Fix mtfsf to use fp31 instead of fp0.
2008-11-13 Ryan S. Arnold <rsa@us.ibm.com>
[BZ #6411]
* sysdeps/powerpc/fpu/Makefile: Added test case tst-setcontext-fpscr.
* sysdeps/powerpc/fpu/feholdexcpt.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/fenv_libc.h (fesetenv_register): Dynamically
choose mtfsf insn based on PPC_FEATURE_HAS_DFP.
(relax_fenv_state): Same as above.
(FPSCR_29): Reserve bit in ISA 2.05.
(FPSCR_NI): Provide define for compat.
* sysdeps/powerpc/fpu/fesetenv.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/feupdateenv.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: New file. Test case to
test setcontext and swapcontext with dynamic 64-bit FPSCR detection.
* sysdeps/powerpc/powerpc32/fpu/__longjmp-common.S (__longjmp): Adjust
access to hwcap to account for hwcap size increase to uint64_t.
* sysdeps/powerpc/powerpc32/fpu/setjmp-common.S (__sigsetjmp ):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/getcontext-common.S
(*setcontext): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6/fpu/setcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6/fpu/swapcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
(*setcontext): dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP. Adjust access to hwcap to account for hwcap size
increase to uint64_t.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S
(*swapcontext): dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP. Adjust access to hwcap to account for hwcap size
increase to uint64_t.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6/fpu/setcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6/fpu/swapcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S
(*setcontext): dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S
(*swapcontext): dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP.
2008-08-14 Ryan S. Arnold <rsa@us.ibm.com>
[BZ #6845]
* sysdeps/powerpc/fpu/bits/mathinline.h (__signbitl): Copy new
__signbitl definition and __LONG_DOUBLE_128__ guard from:
* sysdeps/unix/sysv/linux/powerpc/bits/mathinline.h: Remove as
redundant. Functions which call floating point assembler operations
should go into a sysdeps powerpc/fpu directory.
2008-08-01 Steven Munroe <sjmunroe@us.ibm.com>
Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
[BZ #6817]
* sysdeps/powerpc/dl-procinfo.c (_dl_powerpc_cap_flags):
Added the members 'vsx' and 'arch_2_06'.
(_dl_powerpc_platforms): Add the member 'power7'.
* sysdeps/powerpc/dl-procinfo.h: Modify _DL_HWCAP_FIRST
to reflect the changes required by VSX and ISA 2.06.
Modify _DL_PLATFORMS_COUNT to reflect the addition of
'power7'.
Defined PPC_PLATFORM_POWER7.
(_dl_string_platform): Add support for POWER7.
* sysdeps/powerpc/sysdep.h: Define bit masks for VSX
capability and ISA 2.06.
__fe_nomask_env.
* sysdeps/powerpc/fpu/fe_nomask.c: Add libm_hidden_def.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/fe_nomask.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu/fe_nomask.c: Likewise.
* sysdeps/powerpc/bits/fenv.h: Make safe for C++.
* sysdeps/unix/sysv/linux/powerpc/bits/mathinline.h: New file.
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Rename
function from fegetexcept and make old name weak alias.
* include/fenv.h: Declare __fegetexcept.
* sysdeps/powerpc/fpu/fedisblxcpt.c: Use __fegetexcept instead of
fegetexcept.
* sysdeps/powerpc/fpu/feenablxcpt.c: Likewise.
* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Avoid call
to fetestexcept.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Use __frexpl
instead of frexpl to avoid local PLT.
* math/s_significandl.c (__significandl): Use __ilogbl instead of
ilogbl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __ldexpl
instead of ldexpl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c (__ieee754_expl): Use
__roundl not roundl to avoid local PLT.
* sysdeps/ieee754/ldbl-128/e_j0l.c: Use function names which avoid
local PLTs. Use __sincosl instead of separate sinl and cosl
calls.
* sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_lround.S (__lround): Fixed erroneous
result when x is +/-nextafter(+/-0.5,-/+1) i.e. all 1's in the
mantissa.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S (__llround):
Likewise. Also account for when x is an odd number between 2^52
and 2^53-1.
* sysdeps/powerpc/powerpc64/fpu/s_llround.S (__llround): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S (__llroundf): Likewise.
* math/libm-test.inc (lround_test, llround_test): Added test cases to
detect aforementioned erroneous conditions.
2008-01-24 Steven Munroe <sjmunroe@us.ibm.com>
[BZ #5741]
* sysdeps/powerpc/powerpc64/dl-machine.h (PPC_DCBT, PPC_DCBF):
Define additonal Data Cache Block instruction macros.
(elf_machine_fixup_plt): Add dcbt for opd and plt entries.
Replace dcbst with dcbf and sync with sync/isync.
* sysdeps/powerpc/powerpc32/power4/hp-timing.h: New file.
* sysdeps/powerpc/powerpc64/hp-timing.h [_ARCH_PWR4] (HP_TIMING_NOW):
For ISA 2.01 and later replace mftb with mfspr 268.
* sysdeps/i386/i686/memcpy.S: Optimize copying of equally aligned
buffers.