djsollen
933834851f
Enable the SSSE3 compile time check on all platforms (3rd attempt)
...
BUG=skia:2746
R=halcanary@google.com , mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/403583002
2014-07-22 07:20:18 -07:00
krajcevski
630598cbb8
Add support for NEON intrinsics to speed up texture compression. We can
...
now convert the time that we would have spent uploading the texture to
compressing it giving a net 50% memory savings for these things.
Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
R=robertphillips@google.com , mtklein@google.com , kevin.petit@arm.com
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/390453002
2014-07-14 12:00:04 -07:00
krajcevski
d41dab43e6
Revert of Add support for NEON intrinsics to speed up texture compression. We can ( https://codereview.chromium.org/390453002/ )
...
Reason for revert:
Breaking chrome.
Original issue's description:
> Add support for NEON intrinsics to speed up texture compression. We can
> now convert the time that we would have spent uploading the texture to
> compressing it giving a net 50% memory savings for these things.
>
> Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
R=robertphillips@google.com , mtklein@google.com , kevin.petit@arm.com
TBR=kevin.petit@arm.com , mtklein@google.com , robertphillips@google.com
NOTREECHECKS=true
NOTRY=true
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/384053003
2014-07-11 13:15:15 -07:00
krajcevski
bc9205be0a
Add support for NEON intrinsics to speed up texture compression. We can
...
now convert the time that we would have spent uploading the texture to
compressing it giving a net 50% memory savings for these things.
R=robertphillips@google.com , mtklein@google.com , kevin.petit@arm.com
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/390453002
2014-07-11 12:12:27 -07:00
djordje.pesut
a26bbb95a6
MIPS: added optimizations for functions from SkBitmapProcState
...
gain is ~30%
following functions are optimized:
SI8_D16_nofilter_DX
SI8_opaque_D32_nofilter_DX
R=djsollen@google.com , teodora.petrovic@gmail.com
Author: djordje.pesut@imgtec.com
Review URL: https://codereview.chromium.org/336533003
2014-07-08 02:24:16 -07:00
henrik.smiding
5f7f9d04dc
Add SSE4 version of BlurImage optimizations.
...
Adds an SSE4.1 version of the existing BlurImage optimizations.
Performance of blur_image_filter_* benchmarks show a 10-50%
improvement on Linux/Ubuntu Core i7.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
R=mtklein@google.com , reed@chromium.org
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/366593004
2014-07-07 08:05:40 -07:00
reed
82cb86f613
Revert of Add SSE4 version of BlurImage optimizations. ( https://codereview.chromium.org/366593004/ )
...
Reason for revert:
breaks linker on chrome
[04:36:09.966000] [503/5965] LIB obj\chrome\installer_util.lib
[04:36:10.466000] FAILED: C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True skia.dll "C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:skia.dll.lib /DLL /OUT:skia.dll @skia.dll.rsp" 2 mt.exe rc.exe "obj\skia\skia.skia.dll.intermediate.manifest" obj\skia\skia.skia.dll.generated.manifest
[04:36:10.466000] skia.opts_check_x86.obj : error LNK2019: unresolved external symbol "bool __cdecl SkBoxBlurGetPlatformProcs_SSE4(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs_SSE4@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z) referenced in function "bool __cdecl SkBoxBlurGetPlatformProcs(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z)
[04:36:10.466000]
[04:36:10.466000] skia.dll : fatal error LNK1120: 1 unresolved externals
Original issue's description:
> Add SSE4 version of BlurImage optimizations.
>
> Adds an SSE4.1 version of the existing BlurImage optimizations.
> Performance of blur_image_filter_* benchmarks show a 10-50%
> improvement on Linux/Ubuntu Core i7.
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
R=mtklein@google.com , henrik.smiding@intel.com
TBR=henrik.smiding@intel.com , mtklein@google.com
NOTREECHECKS=true
NOTRY=true
Author: reed@chromium.org
Review URL: https://codereview.chromium.org/375503003
2014-07-06 18:51:29 -07:00
henrik.smiding
2830632ce9
Add SSE4 version of BlurImage optimizations.
...
Adds an SSE4.1 version of the existing BlurImage optimizations.
Performance of blur_image_filter_* benchmarks show a 10-50%
improvement on Linux/Ubuntu Core i7.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
R=mtklein@google.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/366593004
2014-07-04 04:23:17 -07:00
henrik.smiding
3bb195ef0d
Add SSE4 optimization of S32A_Opaque_Blitrow
...
Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
instruction set. Special case for when alpha is zero or opaque.
Performance increase of 10%-400% compared to the existing SSE2
optimization (measured on Silvermont architecture).
Noticeable in ~25 different skia bench subtests, especially in
bitmap_8888_*, repeatTile_*, and morph_*.
bitmap_8888_A - 100% faster
bitmap_8888_A_source_transparent - 250% faster
bitmap_8888_A_source_opaque - 25% faster
bitmap_8888_A_scale_bicubic - 75% faster
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
R=reed@google.com , mtklein@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/289473009
2014-06-27 08:03:17 -07:00
mtklein
db6346a5b1
Revert of Add SSE4 optimization of S32A_Opaque_Blitrow ( https://codereview.chromium.org/289473009/ )
...
NOTREECHECKS=true
NOTRY=true
Reason for revert:
Valgrind bot's seeing this code use uninitialized memory, and it's somehow blocking our roll into Chrome too:
> ld: warning: could not create compact unwind for
S32A_Opaque_BlitRow32_SSE4_asm:
> stack subq instruction is too different from dwarf stack size
> [10339/10982 | 3247.792] PACKAGE FRAMEWORK "Chromium Framework.framework",
> POSTBUILDS
> FAILED: ./gyp-mac-tool package-framework "Chromium Framework.framework" A &&
> (export
> BUILT_PRODUCTS_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
> export CONFIGURATION=Release; export CONTENTS_FOLDER_PATH="Chromium
> Framework.framework/Versions/A"; export
> DYLIB_INSTALL_NAME_BASE=@executable_path/../Versions/37.0.2056.0; export
> EXECUTABLE_NAME="Chromium Framework"; export EXECUTABLE_PATH="Chromium
> Framework.framework/Versions/A/Chromium Framework"; export
> FULL_PRODUCT_NAME="Chromium Framework.framework"; export
> INFOPLIST_PATH="Chromium Framework.framework/Versions/A/Resources/Info.plist";
> export
LD_DYLIB_INSTALL_NAME="@executable_path/../Versions/37.0.2056.0/Chromium
> Framework.framework/Chromium Framework"; export MACH_O_TYPE=mh_dylib; export
> PRODUCT_NAME="Chromium Framework"; export
> PRODUCT_TYPE=com.apple.product-type.framework; export
>
SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.6.sdk;
> export
>
SRCROOT=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/../../chrome;
> export SOURCE_ROOT="${SRCROOT}"; export
> TARGET_BUILD_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
> export TEMP_DIR="${TMPDIR}"; export
UNLOCALIZED_RESOURCES_FOLDER_PATH="Chromium
> Framework.framework/Versions/A/Resources"; export WRAPPER_NAME="Chromium
> Framework.framework"; (cd ../../chrome && ../build/mac/tweak_info_plist.py
> "--breakpad=1" "--breakpad_uploads=0" "--keystone=0" "--scm=1"
> "--branding=Chromium" && ln -fns Versions/Current/Libraries
> "${BUILT_PRODUCTS_DIR}/${WRAPPER_NAME}/Libraries" &&
> tools/build/mac/verify_order _ChromeMain
> "${BUILT_PRODUCTS_DIR}/${EXECUTABLE_PATH}"); G=$?; ((exit $G) || rm -rf
> 'Chromium Framework.framework') && exit $G) && touch "Chromium
> Framework.framework"
> tools/build/mac/verify_order: unordered symbols in
> /Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/Chromium
> Framework.framework/Versions/A/Chromium Framework:
> S32A_Opaque_BlitRow32_SSE4_asm
> _S32A_Opaque_BlitRow32_SSE4_asm
> ninja: build stopped: subcommand failed.
Original issue's description:
> Add SSE4 optimization of S32A_Opaque_Blitrow
>
> Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
> instruction set. Special case for when alpha is zero or opaque.
>
> Performance increase of 10%-400% compared to the existing SSE2
> optimization (measured on Silvermont architecture).
> Noticeable in ~25 different skia bench subtests, especially in
> bitmap_8888_*, repeatTile_*, and morph_*.
>
> bitmap_8888_A - 100% faster
> bitmap_8888_A_source_transparent - 250% faster
> bitmap_8888_A_source_opaque - 25% faster
> bitmap_8888_A_scale_bicubic - 75% faster
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
>
> Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
R=reed@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com , henrik.smiding@intel.com , mtklein@chromium.org
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/336413007
2014-06-17 17:37:05 -07:00
henrik.smiding
b5c281e1e0
Add SSE4 optimization of S32A_Opaque_Blitrow
...
Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
instruction set. Special case for when alpha is zero or opaque.
Performance increase of 10%-400% compared to the existing SSE2
optimization (measured on Silvermont architecture).
Noticeable in ~25 different skia bench subtests, especially in
bitmap_8888_*, repeatTile_*, and morph_*.
bitmap_8888_A - 100% faster
bitmap_8888_A_source_transparent - 250% faster
bitmap_8888_A_source_opaque - 25% faster
bitmap_8888_A_scale_bicubic - 75% faster
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
R=reed@google.com , mtklein@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/289473009
2014-06-17 11:32:47 -07:00
djordje.pesut
632a4546b0
MIPS: added optimization for functions from SkBlitRow.
...
gain is ~40%
following function are optimized:
S32_D565_Blend
S32A_D565_Opaque_Dither
S32_D565_Opaque_Dither
S32_D565_Blend_Dither
S32A_D565_Opaque
S32A_D565_Blend
S32_Blend_BlitRow32
R=djsollen@google.com , teodora.petrovic@gmail.com
Author: djordje.pesut@imgtec.com
Review URL: https://codereview.chromium.org/326913004
2014-06-11 06:56:10 -07:00
jvanverth
71804cc303
Revert of Add SSE4 optimization of S32A_Opaque_Blitrow ( https://codereview.chromium.org/289473009/ )
...
Reason for revert:
Buildbot failures on Mac 10.6 and Mac 10.7.
R=reed@google.com , mtklein@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com , henrik.smiding@intel.com
TBR=reed@google.com
NOTRY=True
Original issue's description:
> Add SSE4 optimization of S32A_Opaque_Blitrow
>
> Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
> instruction set. Special case for when alpha is zero or opaque.
>
> Performance increase of 10%-400% compared to the existing SSE2
> optimization (measured on Silvermont architecture).
> Noticeable in ~25 different skia bench subtests, especially in
> bitmap_8888_*, repeatTile_*, and morph_*.
>
> bitmap_8888_A - 100% faster
> bitmap_8888_A_source_transparent - 250% faster
> bitmap_8888_A_source_opaque - 25% faster
> bitmap_8888_A_scale_bicubic - 75% faster
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
Author: jvanverth@google.com
Review URL: https://codereview.chromium.org/311053009
2014-06-05 08:19:23 -07:00
henrik.smiding
e2527b1476
Add SSE4 optimization of S32A_Opaque_Blitrow
...
Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
instruction set. Special case for when alpha is zero or opaque.
Performance increase of 10%-400% compared to the existing SSE2
optimization (measured on Silvermont architecture).
Noticeable in ~25 different skia bench subtests, especially in
bitmap_8888_*, repeatTile_*, and morph_*.
bitmap_8888_A - 100% faster
bitmap_8888_A_source_transparent - 250% faster
bitmap_8888_A_source_opaque - 25% faster
bitmap_8888_A_scale_bicubic - 75% faster
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
R=reed@google.com , mtklein@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/289473009
2014-06-05 07:50:54 -07:00
kevin.petit
866b95d65d
ARM Skia NEON patches - 38 - arm64 8888 blitters
...
Enable NEON on arm64 for most 8888 blitters
This patch enables NEON optimisation for the Color32, S32_Blend,
S32A_Opaque blitters on arm64.
Here are the perf improvements vs the existing code:
Color32:
========
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -2.39% | 23.78% |
+-------+------------+------------+
| 2 | -5.46% | 8.88% |
+-------+------------+------------+
| 4 | -4.74% | 4.89% |
+-------+------------+------------+
| 8 | 67.74% | 107.12% |
+-------+------------+------------+
| 16 | 40.03% | 101.20% |
+-------+------------+------------+
| 64 | 11.09% | 98.40% |
+-------+------------+------------+
| 256 | -2.20% | 74.81% |
+-------+------------+------------+
| 1024 | -4.28% | 78.90% |
+-------+------------+------------+
S32_Blend:
==========
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | 7.84% | -6.75% |
+-------+------------+------------+
| 2 | 28.95% | 39.77% |
+-------+------------+------------+
| 4 | 5.80% | 8.26% |
+-------+------------+------------+
| 8 | 1.35% | 33.80% |
+-------+------------+------------+
| 16 | -2.13% | 41.13% |
+-------+------------+------------+
| 64 | -4.91% | 42.84% |
+-------+------------+------------+
| 256 | -6.53% | 48.72% |
+-------+------------+------------+
| 1024 | -6.65% | 46.66% |
+-------+------------+------------+
S32A_Opaque:
============
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -7.51% | -19.06% |
+-------+------------+------------+
| 2 | -5.02% | -27.70% |
+-------+------------+------------+
| 4 | 15.38% | -21.66% |
+-------+------------+------------+
| 8 | -0.98% | 1.05% |
+-------+------------+------------+
| 16 | -7.35% | 3.34% |
+-------+------------+------------+
| 64 | 50.53% | 94.63% |
+-------+------------+------------+
| 256 | 71.17% | 164.10% |
+-------+------------+------------+
| 1024 | 79.58% | 197.60% |
+-------+------------+------------+
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com , mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/302283003
2014-06-03 10:08:07 -07:00
commit-bot@chromium.org
ab08437e39
Revert "Temporarily disable NEON on Android framework builds."
...
R=scroggo@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/294183002
git-svn-id: http://skia.googlecode.com/svn/trunk@14844 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-05-22 13:42:34 +00:00
bungeman@google.com
daeec365ff
Remove non-existent header file from Android opts.
...
R=djsollen@google.com
Review URL: https://codereview.chromium.org/274793004
git-svn-id: http://skia.googlecode.com/svn/trunk@14657 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-05-08 19:41:48 +00:00
djsollen@google.com
901c43c26f
Temporarily disable NEON on Android framework builds.
...
The GCC 4.8 compiler has an AARCH64 bug that generated non-PIC output
that fails to link.
R=scroggo@google.com
Review URL: https://codereview.chromium.org/266883011
git-svn-id: http://skia.googlecode.com/svn/trunk@14597 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-05-06 19:47:07 +00:00
commit-bot@chromium.org
8c4953c6f1
Cleanup of SSE optimization files.
...
General cleanup of optimization files for x86/SSEx.
Renamed the opts_check_SSE2.cpp file to _x86, since it's not specific
to SSE2. Commented out the ColorRect32 optimization, since it's
disabled anyway, to make it more visible.
Also fixed a lot of indentation, inclusion guards, spelling,
copyright headers, braces, whitespace, and sorting of includes.
Author: henrik.smiding@intel.com
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
R=reed@google.com , mtklein@google.com , tomhudson@google.com , djsollen@google.com , joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/264603002
git-svn-id: http://skia.googlecode.com/svn/trunk@14464 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-30 14:58:46 +00:00
commit-bot@chromium.org
c524e98f1e
Xfermode: SSE2 implementation of multiply_modeproc
...
This patch implements basics for Xfermode SSE optimization. Based on
these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
implementation for other modes will come in future. With this patch
performance of Xfermode_Multiply will improve about 45%. Here are the
data on desktop i7-3770.
before:
Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
after:
Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=14006
Committed: http://code.google.com/p/skia/source/detail?r=14050
R=mtklein@google.com , robertphillips@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/202903004
git-svn-id: http://skia.googlecode.com/svn/trunk@14107 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-09 15:43:46 +00:00
commit-bot@chromium.org
479d1ae9b6
Fixes to Android.mk generation for arm64.
...
Remove warning about no optimizations for arm64 and rebaseline the
associated test.
Exclude _opts_none.cpps when building arm64, to avoid double definitions.
BUG=skia:1975
R=halcanary@google.com , djsollen@google.com
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/229393002
git-svn-id: http://skia.googlecode.com/svn/trunk@14104 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-09 13:34:26 +00:00
commit-bot@chromium.org
77815fd74d
Revert of Xfermode: SSE2 implementation of multiply_modeproc ( https://codereview.chromium.org/202903004/ )
...
Reason for revert:
It looks like serialization is broken. The serialize and pipe-cross-process tests are failing and turning (at least the Ubuntu12 and Win7) bots red
Original issue's description:
> Xfermode: SSE2 implementation of multiply_modeproc
>
> This patch implements basics for Xfermode SSE optimization. Based on
> these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
> implementation for other modes will come in future. With this patch
> performance of Xfermode_Multiply will improve about 45%. Here are the
> data on desktop i7-3770.
> before:
> Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
> after:
> Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
>
> BUG=
>
> Committed: http://code.google.com/p/skia/source/detail?r=14006
>
> Committed: http://code.google.com/p/skia/source/detail?r=14050
R=mtklein@google.com , qiankun.miao@intel.com
TBR=mtklein@google.com , qiankun.miao@intel.com
NOTREECHECKS=true
NOTRY=true
BUG=
Author: robertphillips@google.com
Review URL: https://codereview.chromium.org/224253003
git-svn-id: http://skia.googlecode.com/svn/trunk@14053 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-03 18:53:33 +00:00
commit-bot@chromium.org
c311873927
Xfermode: SSE2 implementation of multiply_modeproc
...
This patch implements basics for Xfermode SSE optimization. Based on
these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
implementation for other modes will come in future. With this patch
performance of Xfermode_Multiply will improve about 45%. Here are the
data on desktop i7-3770.
before:
Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
after:
Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=14006
R=mtklein@google.com , robertphillips@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/202903004
git-svn-id: http://skia.googlecode.com/svn/trunk@14050 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-03 18:26:40 +00:00
commit-bot@chromium.org
6f2d4d4679
ARM Skia NEON patches - 35 - First AArch64 support
...
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13980
R=djsollen@google.com , reed@google.com , mtklein@google.com , halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@14025 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-02 15:03:56 +00:00
commit-bot@chromium.org
079d298600
Revert of Xfermode: SSE2 implementation of multiply_modeproc ( https://codereview.chromium.org/202903004/ )
...
Reason for revert:
Breaking builds
Original issue's description:
> Xfermode: SSE2 implementation of multiply_modeproc
>
> This patch implements basics for Xfermode SSE optimization. Based on
> these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
> implementation for other modes will come in future. With this patch
> performance of Xfermode_Multiply will improve about 45%. Here are the
> data on desktop i7-3770.
> before:
> Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
> after:
> Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
>
> BUG=
>
> Committed: http://code.google.com/p/skia/source/detail?r=14006
R=mtklein@google.com , qiankun.miao@intel.com
TBR=mtklein@google.com , qiankun.miao@intel.com
NOTREECHECKS=true
NOTRY=true
BUG=
Author: robertphillips@google.com
Review URL: https://codereview.chromium.org/219243009
git-svn-id: http://skia.googlecode.com/svn/trunk@14007 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-01 14:17:44 +00:00
commit-bot@chromium.org
25f7455f3a
Xfermode: SSE2 implementation of multiply_modeproc
...
This patch implements basics for Xfermode SSE optimization. Based on
these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
implementation for other modes will come in future. With this patch
performance of Xfermode_Multiply will improve about 45%. Here are the
data on desktop i7-3770.
before:
Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
after:
Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
BUG=
R=mtklein@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/202903004
git-svn-id: http://skia.googlecode.com/svn/trunk@14006 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-04-01 14:01:32 +00:00
commit-bot@chromium.org
d643a90ee2
Revert of ARM Skia NEON patches - 35 - First AArch64 support ( https://codereview.chromium.org/143423004/ )
...
Reason for revert:
GYP's failing on most (all?) bots.
Original issue's description:
> ARM Skia NEON patches - 35 - First AArch64 support
>
> Aarch64 support
>
> This change contains the necessary modifications to have Skia build and
> run properly on an ARMv8 processor in aarch64 execution state.
>
> Here's a list of the changes:
>
> - add an arm64 target to the build system + SK_CPU_ARM64 flag
>
> - MatrixTest was failing when built in Release mode. Fused MAC
> instructions were generated which made some intermediate results
> more accurate. As the test relies on result comparison, the more
> precise results when compared to others led to a gap bigger than
> what was tolerated. As I don't know if some actual skia code relies
> on results being comparable, I've disabled fused MAC instruction
> with -ffp-contract=off for arm64.
>
> - Modify include/core/SkOnce.h to have barriers work.
>
> - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
>
> - use existing Xfermode optimisations with modifications that can be
> removed in the future when toolchains are ready. Also save a few
> instructions is two Xfermodes (will apply to ARM too).
>
> - use existing SkBoxBlur and SkMorphology optimisations.
>
> - use existing SkBlitMask optimisations
>
> - use existing BitmapProcState and Convolution optimisations.
>
> Future changes will include:
>
> - Blitters (only partialy merged upstream)
>
> - SkUtils (there's little value in sending asm optimisations without
> having them benchmarked on real hardware).
>
> Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
>
> BUG=skia:
>
> Committed: http://code.google.com/p/skia/source/detail?r=13980
R=djsollen@google.com , reed@google.com , halcanary@google.com , kevin.petit@arm.com
TBR=djsollen@google.com , halcanary@google.com , kevin.petit@arm.com , reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/216113005
git-svn-id: http://skia.googlecode.com/svn/trunk@13983 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-03-28 18:05:29 +00:00
commit-bot@chromium.org
7a0e27577d
ARM Skia NEON patches - 35 - First AArch64 support
...
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com , reed@google.com , mtklein@google.com , halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@13980 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-03-28 17:56:14 +00:00
commit-bot@chromium.org
e72a408359
Updates to gyp files for building Android.mk
...
R=djsollen@google.com
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/180873012
git-svn-id: http://skia.googlecode.com/svn/trunk@13624 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-02-28 16:07:39 +00:00
commit-bot@chromium.org
5764165127
Split opts_check_arm.cpp into per-class files
...
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/157863003
git-svn-id: http://skia.googlecode.com/svn/trunk@13381 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-02-10 15:01:05 +00:00
scroggo@google.com
d4adfa37fb
Reland "Gyp file changes for the android framework."
...
Relands https://codereview.chromium.org/153093003/ , which was reverted
with https://skia.googlesource.com/skia.git/+/eb6295044b97db05ec40625dcebc2459b2a38a98
This reverts commit 6b32be1402eb6c549d5ba1db71860e24f9de2991.
BUG=skia:1975
R=djsollen@google.com
Review URL: https://codereview.chromium.org/154053002
git-svn-id: http://skia.googlecode.com/svn/trunk@13321 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-02-05 16:35:12 +00:00
scroggo@google.com
eb6295044b
Gyp file changes for the android framework.
...
Split off from https://codereview.chromium.org/140503007/ .
The eventual goal is to create our Android.mk from gyp. This patch
adds an option for skia_android_framework with the right settings.
The follow-up (https://codereview.chromium.org/140503007/ ) will
use scripts to create the final makefile.
gyp/android_deps.gyp:
Use different dependencies for the framework than for building Skia
normally.
gyp/android_framework_lib.gyp:
Like skia_lib, specifies the minimum needed for building Skia, in this
case for the framework.
gyp/common_conditions.gypi:
Add settings specific to skia_android_framework. In some cases this
means turning off flags and defines.
gyp/common.gypi
Turn off SK_DEBUG and SK_DEVELOPER when building for the framework.
This allows the framework to create a single makefile which can be
modified to add SK_DEBUG and SK_DEVELOPER as desired.
gyp/common_variables.gypi:
Add skia_android_framework.
gyp/core.gyp:
Don't depend on cpufeatures, and add the cutils library for
skia_android_framework.
gyp/freetype.gyp:
skia_android_framework-specific options:
Don't include freetype_static as a dependency.
Include the proper folders.
Include the android library.
gyp/images.gyp:
Don't export libjpeg as a dependency for targets that include images
for the framework.
Also reorder image decoders to match the Android order, leaving our
most commonly used ones last (and therefore first in the chain for
trying them).
gyp/libwebp.gyp:
Use the system webp when building for the Android framework. Specify
the correct settings for the framework.
gyp/opts.gyp:
Specify a default set of files to compile when there are no possible
optimizations.
gyp/pdf.gyp:
Add dependencies for Android framework.
gyp/zlib.gyp:
Include the zlib folder, and undefine SK_ZLIB_INCLUDE.
BUG=skia:1975
R=djsollen@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13298
Review URL: https://codereview.chromium.org/153093003
git-svn-id: http://skia.googlecode.com/svn/trunk@13304 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-02-04 16:08:48 +00:00
scroggo@google.com
1c9bd55ea3
Gyp file changes for the android framework.
...
Split off from https://codereview.chromium.org/140503007/ .
The eventual goal is to create our Android.mk from gyp. This patch
adds an option for skia_android_framework with the right settings.
The follow-up (https://codereview.chromium.org/140503007/ ) will
use scripts to create the final makefile.
gyp/android_deps.gyp:
Use different dependencies for the framework than for building Skia
normally.
gyp/android_framework_lib.gyp:
Like skia_lib, specifies the minimum needed for building Skia, in this
case for the framework.
gyp/common_conditions.gypi:
Add settings specific to skia_android_framework. In some cases this
means turning off flags and defines.
gyp/common.gypi
Turn off SK_DEBUG and SK_DEVELOPER when building for the framework.
This allows the framework to create a single makefile which can be
modified to add SK_DEBUG and SK_DEVELOPER as desired.
gyp/common_variables.gypi:
Add skia_android_framework.
gyp/core.gyp:
Don't depend on cpufeatures, and add the cutils library for
skia_android_framework.
gyp/freetype.gyp:
skia_android_framework-specific options:
Don't include freetype_static as a dependency.
Include the proper folders.
Include the android library.
gyp/images.gyp:
Don't export libjpeg as a dependency for targets that include images
for the framework.
Also reorder image decoders to match the Android order, leaving our
most commonly used ones last (and therefore first in the chain for
trying them).
gyp/libwebp.gyp:
Use the system webp when building for the Android framework. Specify
the correct settings for the framework.
gyp/opts.gyp:
Specify a default set of files to compile when there are no possible
optimizations.
gyp/pdf.gyp:
Add dependencies for Android framework.
gyp/zlib.gyp:
Include the zlib folder, and undefine SK_ZLIB_INCLUDE.
BUG=skia:1975
R=djsollen@google.com
Review URL: https://codereview.chromium.org/153093003
git-svn-id: http://skia.googlecode.com/svn/trunk@13298 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-02-04 14:50:20 +00:00
commit-bot@chromium.org
a96176dc03
ARM Skia NEON patches - 20 - New improved BitmapProcState code
...
BitmapProcState: new factorised code
This one basically factorises the clamp and repeat transformations with
some performance improvements. It has the benefit of being faster, much
easier to maintain (nearly three times less code for more work
done :-)), and more complete (all persp transformations weren't optimised
in the previous version).
It also introduces the use of can_truncate_to_fixed_for_decal where
useful.
The effect on benchmarks ranges from a 5% penalty to a 25% gain on a
Cortex-A9 and from a 5% penalty to a 100% gain on a Cortex-A15.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com , mtklein@google.com , luisjoseromeroesclusa@hotmail.com , reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23835006
git-svn-id: http://skia.googlecode.com/svn/trunk@13218 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-01-28 15:18:54 +00:00
senorblanco@chromium.org
0ded88d431
[Reland of r13154, since the Housekeeping bot seems to have reverted it in r13155. Next time I'll put the "do not disturb" sign on my commit.]
...
Refactor SkMorphologyImageFilter, CPU and GPU paths. This required making opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com , djsollen@google.com , reed@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13154
BUG=skia:
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13168 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-01-24 15:43:50 +00:00
skia.committer@gmail.com
1878a44c74
Sanitizing source files in Housekeeper-Nightly
...
git-svn-id: http://skia.googlecode.com/svn/trunk@13155 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-01-23 18:48:56 +00:00
senorblanco@chromium.org
76d4d04b18
Refactor SkMorphologyImageFilter, CPU and GPU paths. This required making opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
...
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com , djsollen@google.com , reed@google.com
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13154 2bbb7eff-a529-9590-31e7-b0007b416f81
2014-01-23 18:45:23 +00:00
commit-bot@chromium.org
b6872c06e1
Add support for MIPS to android build
...
R=borenet@google.com , scroggo@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/109323004
git-svn-id: http://skia.googlecode.com/svn/trunk@12592 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-12-10 12:53:56 +00:00
senorblanco@chromium.org
1d62f42e21
Implement a NEON version of the RGBA gaussian blur. This shows a 9-15% speedup on Nexus-10.
...
R=mtklein@google.com , mtklein
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 33063.23
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 32800.25
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 33017.88
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 32743.35
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 21024.04
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 22904.15
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 18738.08
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 18798.98
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30180.96
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 29861.90
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30178.98
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 29911.25
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19344.35
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19957.07
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17158.84
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17330.73
Review URL: https://codereview.chromium.org/99933004
git-svn-id: http://skia.googlecode.com/svn/trunk@12486 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-12-04 18:19:45 +00:00
commit-bot@chromium.org
611fde182a
Remove the comments settings for vim tab width and expansion variables.
...
These add unnecessary bloat for everyone to carry around, so we just
remove them now.
The same change was made in chromium by Tony in
http://codereview.chromium.org/7310019 - crrev.com/92046
BUG=None
TEST=./gyp_skia
R=mtklein@google.com
Author: tfarina@chromium.org
Review URL: https://codereview.chromium.org/92673003
git-svn-id: http://skia.googlecode.com/svn/trunk@12443 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-12-02 22:23:03 +00:00
rmistry@google.com
d6bab02386
Reverting r12427
...
git-svn-id: http://skia.googlecode.com/svn/trunk@12428 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-12-02 13:50:38 +00:00
skia.committer@gmail.com
5b39f5ba9c
Sanitizing source files in Housekeeper-Nightly
...
git-svn-id: http://skia.googlecode.com/svn/trunk@12427 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-12-02 13:36:22 +00:00
commit-bot@chromium.org
dbe7f52412
ARM Skia NEON patches - 16/17 - Blitmask
...
Blitmask: NEON optimised version of the D32_A8 functions
Here are the microbenchmark results I got for the D32_A8
functions:
Cortex-A9:
==========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -14% | -39,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -3% | -29,9% | -25% |
+-------+--------+--------+--------+
| 4 | -11,3% | -22% | -14,5% |
+-------+--------+--------+--------+
| 8 | +128% | +66,6% | +105% |
+-------+--------+--------+--------+
| 16 | +159% | +102% | +149% |
+-------+--------+--------+--------+
| 64 | +189% | +136% | +189% |
+-------+--------+--------+--------+
| 256 | +126% | +102% | +149% |
+-------+--------+--------+--------+
| 1024 | +67,5% | +81,4% | +123% |
+-------+--------+--------+--------+
Cortex-A15:
===========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -24% | -46,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -18,5% | -35,5% | -28% |
+-------+--------+--------+--------+
| 4 | -5,2% | -17,5% | -15,5% |
+-------+--------+--------+--------+
| 8 | +72% | +65,8% | +84,7% |
+-------+--------+--------+--------+
| 16 | +168% | +117% | +149% |
+-------+--------+--------+--------+
| 64 | +165% | +110% | +145% |
+-------+--------+--------+--------+
| 256 | +106% | +99,6% | +141% |
+-------+--------+--------+--------+
| 1024 | +93,7% | +94,7% | +130% |
+-------+--------+--------+--------+
Blitmask: add NEON optimised PlatformBlitRowProcs16
Here are the microbenchmark results (speedup vs. C code):
+-------+-----------------+-----------------+
| | Cortex-A9 | Cortex-A15 |
| count +--------+--------+--------+--------+
| | Blend | Opaque | Blend | Opaque |
+-------+--------+--------+--------+--------+
| 1 | -19,2% | -36,7% | -33,6% | -44,7% |
+-------+--------+--------+--------+--------+
| 2 | -12,6% | -27,8% | -39% | -48% |
+-------+--------+--------+--------+--------+
| 4 | -11,5% | -21,6% | -37,7% | -44,3% |
+-------+--------+--------+--------+--------+
| 8 | +141% | +59,7% | +123% | +48,7% |
+-------+--------+--------+--------+--------+
| 16 | +213% | +119% | +214% | +121% |
+-------+--------+--------+--------+--------+
| 64 | +212% | +105% | +242% | +167% |
+-------+--------+--------+--------+--------+
| 256 | +289% | +167% | +249% | +207% |
+-------+--------+--------+--------+--------+
| 1024 | +273% | +169% | +146% | +220% |
+-------+--------+--------+--------+--------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com , mtklein@google.com , reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23719002
git-svn-id: http://skia.googlecode.com/svn/trunk@12420 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-11-27 17:08:36 +00:00
senorblanco@chromium.org
f376f5de93
Implement a NEON version of morphology. This is good for ~2.2X speedup on Tegra3.
...
R=mtklein@google.com , mtklein, reed@google.com
Review URL: https://codereview.chromium.org/68123003
git-svn-id: http://skia.googlecode.com/svn/trunk@12219 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-11-11 16:48:51 +00:00
senorblanco@chromium.org
27eec46d69
SSE2 implementation of RGBA box blurs. This yields ~2X perf improvement on Xeon ES-2690.
...
R=mtklein@google.com
Review URL: https://codereview.chromium.org/61643011
git-svn-id: http://skia.googlecode.com/svn/trunk@12204 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-11-08 20:49:04 +00:00
senorblanco@chromium.org
1986756c06
Speculative Android build fix.
...
TBR=robertphillips
Review URL: https://codereview.chromium.org/52693003
git-svn-id: http://skia.googlecode.com/svn/trunk@12041 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-10-30 22:38:15 +00:00
senorblanco@chromium.org
7a47ad3bac
Implement SSE2-based implementations of the morphology filters (dilate & erode). This gives a 3-5X speedup over the naive implementation, and also mitigates a timing-based security attack in Chrome ( https://code.google.com/p/chromium/issues/detail?id=251711 ).
...
NOTE: this will require a corresponding GYP change on the Skia roll into Chrome: https://codereview.chromium.org/52453004/
R=mtklein@google.com , reed@google.com
Review URL: https://codereview.chromium.org/52603004
git-svn-id: http://skia.googlecode.com/svn/trunk@12038 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-10-30 21:57:04 +00:00
commit-bot@chromium.org
cd7992ba55
ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs
...
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
Committed: http://code.google.com/p/skia/source/detail?r=11813
R=djsollen@google.com , mtklein@google.com , reed@google.com , robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11843 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-10-17 16:29:34 +00:00
robertphillips@google.com
dfe0f43e11
Reverting r11813 (ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs - https://codereview.chromium.org/26627004 ) due to Chromium compilation faliures.
...
git-svn-id: http://skia.googlecode.com/svn/trunk@11833 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-10-17 00:09:17 +00:00
commit-bot@chromium.org
b4c29c5363
ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs
...
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
R=djsollen@google.com , mtklein@google.com , reed@google.com , robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11813 2bbb7eff-a529-9590-31e7-b0007b416f81
2013-10-16 16:24:08 +00:00