Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13980R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@14025 2bbb7eff-a529-9590-31e7-b0007b416f81
This patch implements basics for Xfermode SSE optimization. Based on
these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
implementation for other modes will come in future. With this patch
performance of Xfermode_Multiply will improve about 45%. Here are the
data on desktop i7-3770.
before:
Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
after:
Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
BUG=
R=mtklein@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/202903004
git-svn-id: http://skia.googlecode.com/svn/trunk@14006 2bbb7eff-a529-9590-31e7-b0007b416f81
Reason for revert:
GYP's failing on most (all?) bots.
Original issue's description:
> ARM Skia NEON patches - 35 - First AArch64 support
>
> Aarch64 support
>
> This change contains the necessary modifications to have Skia build and
> run properly on an ARMv8 processor in aarch64 execution state.
>
> Here's a list of the changes:
>
> - add an arm64 target to the build system + SK_CPU_ARM64 flag
>
> - MatrixTest was failing when built in Release mode. Fused MAC
> instructions were generated which made some intermediate results
> more accurate. As the test relies on result comparison, the more
> precise results when compared to others led to a gap bigger than
> what was tolerated. As I don't know if some actual skia code relies
> on results being comparable, I've disabled fused MAC instruction
> with -ffp-contract=off for arm64.
>
> - Modify include/core/SkOnce.h to have barriers work.
>
> - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
>
> - use existing Xfermode optimisations with modifications that can be
> removed in the future when toolchains are ready. Also save a few
> instructions is two Xfermodes (will apply to ARM too).
>
> - use existing SkBoxBlur and SkMorphology optimisations.
>
> - use existing SkBlitMask optimisations
>
> - use existing BitmapProcState and Convolution optimisations.
>
> Future changes will include:
>
> - Blitters (only partialy merged upstream)
>
> - SkUtils (there's little value in sending asm optimisations without
> having them benchmarked on real hardware).
>
> Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
>
> BUG=skia:
>
> Committed: http://code.google.com/p/skia/source/detail?r=13980R=djsollen@google.com, reed@google.com, halcanary@google.com, kevin.petit@arm.comTBR=djsollen@google.com, halcanary@google.com, kevin.petit@arm.com, reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/216113005
git-svn-id: http://skia.googlecode.com/svn/trunk@13983 2bbb7eff-a529-9590-31e7-b0007b416f81
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@13980 2bbb7eff-a529-9590-31e7-b0007b416f81
Split off from https://codereview.chromium.org/140503007/.
The eventual goal is to create our Android.mk from gyp. This patch
adds an option for skia_android_framework with the right settings.
The follow-up (https://codereview.chromium.org/140503007/) will
use scripts to create the final makefile.
gyp/android_deps.gyp:
Use different dependencies for the framework than for building Skia
normally.
gyp/android_framework_lib.gyp:
Like skia_lib, specifies the minimum needed for building Skia, in this
case for the framework.
gyp/common_conditions.gypi:
Add settings specific to skia_android_framework. In some cases this
means turning off flags and defines.
gyp/common.gypi
Turn off SK_DEBUG and SK_DEVELOPER when building for the framework.
This allows the framework to create a single makefile which can be
modified to add SK_DEBUG and SK_DEVELOPER as desired.
gyp/common_variables.gypi:
Add skia_android_framework.
gyp/core.gyp:
Don't depend on cpufeatures, and add the cutils library for
skia_android_framework.
gyp/freetype.gyp:
skia_android_framework-specific options:
Don't include freetype_static as a dependency.
Include the proper folders.
Include the android library.
gyp/images.gyp:
Don't export libjpeg as a dependency for targets that include images
for the framework.
Also reorder image decoders to match the Android order, leaving our
most commonly used ones last (and therefore first in the chain for
trying them).
gyp/libwebp.gyp:
Use the system webp when building for the Android framework. Specify
the correct settings for the framework.
gyp/opts.gyp:
Specify a default set of files to compile when there are no possible
optimizations.
gyp/pdf.gyp:
Add dependencies for Android framework.
gyp/zlib.gyp:
Include the zlib folder, and undefine SK_ZLIB_INCLUDE.
BUG=skia:1975
R=djsollen@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13298
Review URL: https://codereview.chromium.org/153093003
git-svn-id: http://skia.googlecode.com/svn/trunk@13304 2bbb7eff-a529-9590-31e7-b0007b416f81
Split off from https://codereview.chromium.org/140503007/.
The eventual goal is to create our Android.mk from gyp. This patch
adds an option for skia_android_framework with the right settings.
The follow-up (https://codereview.chromium.org/140503007/) will
use scripts to create the final makefile.
gyp/android_deps.gyp:
Use different dependencies for the framework than for building Skia
normally.
gyp/android_framework_lib.gyp:
Like skia_lib, specifies the minimum needed for building Skia, in this
case for the framework.
gyp/common_conditions.gypi:
Add settings specific to skia_android_framework. In some cases this
means turning off flags and defines.
gyp/common.gypi
Turn off SK_DEBUG and SK_DEVELOPER when building for the framework.
This allows the framework to create a single makefile which can be
modified to add SK_DEBUG and SK_DEVELOPER as desired.
gyp/common_variables.gypi:
Add skia_android_framework.
gyp/core.gyp:
Don't depend on cpufeatures, and add the cutils library for
skia_android_framework.
gyp/freetype.gyp:
skia_android_framework-specific options:
Don't include freetype_static as a dependency.
Include the proper folders.
Include the android library.
gyp/images.gyp:
Don't export libjpeg as a dependency for targets that include images
for the framework.
Also reorder image decoders to match the Android order, leaving our
most commonly used ones last (and therefore first in the chain for
trying them).
gyp/libwebp.gyp:
Use the system webp when building for the Android framework. Specify
the correct settings for the framework.
gyp/opts.gyp:
Specify a default set of files to compile when there are no possible
optimizations.
gyp/pdf.gyp:
Add dependencies for Android framework.
gyp/zlib.gyp:
Include the zlib folder, and undefine SK_ZLIB_INCLUDE.
BUG=skia:1975
R=djsollen@google.com
Review URL: https://codereview.chromium.org/153093003
git-svn-id: http://skia.googlecode.com/svn/trunk@13298 2bbb7eff-a529-9590-31e7-b0007b416f81
BitmapProcState: new factorised code
This one basically factorises the clamp and repeat transformations with
some performance improvements. It has the benefit of being faster, much
easier to maintain (nearly three times less code for more work
done :-)), and more complete (all persp transformations weren't optimised
in the previous version).
It also introduces the use of can_truncate_to_fixed_for_decal where
useful.
The effect on benchmarks ranges from a 5% penalty to a 25% gain on a
Cortex-A9 and from a 5% penalty to a 100% gain on a Cortex-A15.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23835006
git-svn-id: http://skia.googlecode.com/svn/trunk@13218 2bbb7eff-a529-9590-31e7-b0007b416f81
Using OTHER_CPLUSPLUSFLAGS instead of OTHER_CFLAGS will append -mssse3 into the
argument list instead of overwriting as the old note warns about. (So it's
actually there twice now for the files in opts_ssse3, and we can still build if
we remove -mssse3 from common_conditions.gypi.)
We could also just delete this clause entirely given that
common_conditions.gypi sets it anyway. Which do you think is best? This code
won't compile unless _someone_ has set -mssse3. Seems to me the redundancy
helps communicate that and protect against changes in common_conditions.gypi.
BUG=
R=epoger@google.com, bungeman@google.com
Author: mtklein@google.com
Review URL: https://chromiumcodereview.appspot.com/21279005
git-svn-id: http://skia.googlecode.com/svn/trunk@10573 2bbb7eff-a529-9590-31e7-b0007b416f81
$ compare-android.sh bench --match bitmap_ --repeat 30
master -> ssse3
N=30 p=0.001000 (corrected to 0.000033)
sig? speedup bench
n -1.16% bitmap_scale_filter_256_64
y -0.72% bitmap_8888_A_scale_bicubic
y -0.21% bitmap_index8_A
n -0.00% bitmap_565
n -0.00% bitmap_scale_filter_90_80
n 0.03% bitmap_8888_A_source_transparent
y 0.06% bitmap_index8
y 0.30% bitmap_8888_A_source_stripes_two
n 0.34% bitmap_scale_filter_80_90
y 0.42% bitmap_8888_A
y 0.44% bitmap_8888_A_source_opaque
n 0.53% bitmap_scale_filter_90_10
y 0.71% bitmap_8888_A_source_stripes_three
y 0.91% bitmap_8888_A_scale_rotate_bicubic
y 1.04% bitmap_8888_update
n 1.19% bitmap_scale_filter_10_90
n 1.39% bitmap_scale_filter_90_90
y 1.77% bitmap_8888_update_volatile
y 1.89% bitmap_8888
y 2.37% bitmap_scale_filter_30_90
y 9.57% bitmap_scale_filter_64_256
n 17.86% bitmap_scale_filter_90_30
y 25.40% bitmap_8888_A_scale_rotate_bilerp
y 27.19% bitmap_8888_scale_rotate_bilerp
y 27.23% bitmap_8888_update_scale_rotate_bilerp
y 27.29% bitmap_8888_update_volatile_scale_rotate_bilerp
y 55.08% bitmap_8888_A_scale_bilerp
y 58.75% bitmap_8888_update_volatile_scale_bilerp
y 58.90% bitmap_8888_scale_bilerp
y 58.92% bitmap_8888_update_scale_bilerp
Overall speedup: 10.52%
BUG=skia:1111
R=djsollen@google.com
Review URL: https://codereview.chromium.org/21203005
git-svn-id: http://skia.googlecode.com/svn/trunk@10474 2bbb7eff-a529-9590-31e7-b0007b416f81
This is a step toward targets declaring their deps in a sane fashion.
This change resolves cycles by forcing core to the root, then everything
in skia_lib pointing toward core as best possible, then everything
outside skia_lib depending on skia_lib for things in skia_lib. This
prevents double definitions where a symbol is provided by both the
skia_lib shared object and and a statically linked component of skia_lib.
R=djsollen@google.com
Review URL: https://codereview.chromium.org/19823003
git-svn-id: http://skia.googlecode.com/svn/trunk@10231 2bbb7eff-a529-9590-31e7-b0007b416f81
Need to avoid linking in .a things which are already provided by .so things.
git-svn-id: http://skia.googlecode.com/svn/trunk@10222 2bbb7eff-a529-9590-31e7-b0007b416f81
This is a step toward targets declaring their deps in a sane fashion.
This change resolves cycles by forcing core to the root,
then opts, ports, and utils depending on core, then everything else.
We will need some other change to resolve the fact that
core, opts, ports, and utils depend on each other and other targets which
depend on them. Outside of these targets, things look ok.
R=djsollen@google.com
Review URL: https://codereview.chromium.org/19823003
git-svn-id: http://skia.googlecode.com/svn/trunk@10217 2bbb7eff-a529-9590-31e7-b0007b416f81