bungeman 2d80dd2647 Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #14 id:260001 of https://codereview.chromium.org/874863002/)
Reason for revert:
This kills Mac 10.6 bots.

FAILED: c++ -MMD -MF obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o.d -DSK_INTERNAL -DSK_GAMMA_SRGB -DSK_GAMMA_APPLY_TO_A8 -DSK_SCALAR_TO_FLOAT_EXCLUDED -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=1 -DSK_SUPPORT_GPU=1 -DSK_SUPPORT_OPENCL=0 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_BUILD_FOR_MAC -DSK_CRASH_HANDLER -DSK_DEVELOPER=1 -I../../src/core -I../../src/utils -I../../include/c -I../../include/config -I../../include/core -I../../include/pathops -I../../include/pipe -I../../include/utils/mac -I../../include/effects -O0 -gdwarf-2 -mmacosx-version-min=10.6 -arch x86_64 -mssse3 -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wno-unused-parameter -Wno-invalid-offsetof -msse4.1  -c ../../src/opts/SkBlitRow_opts_SSE4.cpp -o obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o
../../src/opts/SkBlitRow_opts_SSE4.cpp:15:27: warning: x86intrin.h: No such file or directory
../../src/opts/SkBlitRow_opts_SSE4.cpp: In function 'void S32A_Opaque_BlitRow32_SSE4(SkPMColor*, const SkPMColor*, int, U8CPU)':
../../src/opts/SkBlitRow_opts_SSE4.cpp:40: error: '_mm_testz_si128' was not declared in this scope
../../src/opts/SkBlitRow_opts_SSE4.cpp:45: error: '_mm_testc_si128' was not declared in this scope

Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent.  _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots.  (< 1.0x is a win.)
>
> DM says it draws pixel-perfect compared to the old code.
>
> Microbenchmarks:
>                bitmap_RGBA_8888_A_source_stripes_two	  14us -> 14.4us	1.03x
>              bitmap_RGBA_8888_A_source_stripes_three	14.3us -> 14.5us	1.01x
>                        bitmap_RGBA_8888_scale_bilerp	61.9us -> 62.2us	1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp	 102us ->  101us	0.99x
>                 bitmap_RGBA_8888_scale_rotate_bilerp	 103us ->  101us	0.99x
>                               bitmap_RGBA_8888_scale	18.4us -> 18.2us	0.99x
>              bitmap_RGBA_8888_A_scale_rotate_bicubic	  71us ->   70us	0.99x
>          bitmap_RGBA_8888_update_scale_rotate_bilerp	 103us ->  101us	0.99x
>               bitmap_RGBA_8888_A_scale_rotate_bilerp	 112us ->  109us	0.98x
>                     bitmap_RGBA_8888_update_volatile	5.72us -> 5.58us	0.98x
>                                     bitmap_RGBA_8888	5.73us -> 5.58us	0.97x
>                              bitmap_RGBA_8888_update	5.78us ->  5.6us	0.97x
>                      bitmap_RGBA_8888_A_scale_bilerp	70.7us ->   68us	0.96x
>                     bitmap_RGBA_8888_A_scale_bicubic	23.7us -> 21.8us	0.92x
>                                   bitmap_RGBA_8888_A	13.9us -> 10.9us	0.78x
>                     bitmap_RGBA_8888_A_source_opaque	  14us -> 6.29us	0.45x
>                bitmap_RGBA_8888_A_source_transparent	  14us -> 3.65us	0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
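
The skip logic described in the original issue's description can be sketched roughly as follows. This is a minimal illustration, not the reverted Skia code: every name here (`BlockAlpha`, `classify4`, `srcover`, `blit_row`) is hypothetical, it looks at 4 pixels per step rather than 16, it assumes premultiplied pixels with alpha in the top byte, and the per-pixel blend is a plain scalar stand-in. The ptest intrinsics are used only when the compiler defines `__SSE4_1__`; otherwise a scalar check with the same meaning is used.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1 header declaring _mm_testz_si128 / _mm_testc_si128
#endif

// All names below are hypothetical; this is an illustration, not Skia's code.
enum class BlockAlpha { kAllTransparent, kAllOpaque, kMixed };

// Assumption: alpha lives in the top byte of each 32-bit premultiplied pixel.
static const uint32_t kAlphaMask = 0xFF000000u;

// Classify 4 consecutive pixels by looking only at their alpha bytes.
static BlockAlpha classify4(const uint32_t* src) {
#if defined(__SSE4_1__)
    const __m128i px   = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    const __m128i mask = _mm_set1_epi32(static_cast<int>(kAlphaMask));
    if (_mm_testz_si128(px, mask)) return BlockAlpha::kAllTransparent;  // (px & mask) == 0
    if (_mm_testc_si128(px, mask)) return BlockAlpha::kAllOpaque;       // (~px & mask) == 0
    return BlockAlpha::kMixed;
#else
    // Scalar fallback with the same meaning as the two ptest checks.
    uint32_t all_and = kAlphaMask, all_or = 0;
    for (int i = 0; i < 4; ++i) { all_and &= src[i]; all_or |= src[i]; }
    if ((all_or & kAlphaMask) == 0)           return BlockAlpha::kAllTransparent;
    if ((all_and & kAlphaMask) == kAlphaMask) return BlockAlpha::kAllOpaque;
    return BlockAlpha::kMixed;
#endif
}

// Simple scalar src-over for premultiplied pixels: out = src + dst * (255 - sa) / 255.
static uint32_t srcover(uint32_t src, uint32_t dst) {
    const uint32_t inv = 255 - (src >> 24);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t s = (src >> shift) & 0xFF;
        const uint32_t d = (dst >> shift) & 0xFF;
        const uint32_t c = s + (d * inv + 127) / 255;
        out |= (c > 255 ? 255u : c) << shift;
    }
    return out;
}

// Blend src over dst, skipping the blend for uniform blocks of 4 pixels.
static void blit_row(uint32_t* dst, const uint32_t* src, int count) {
    assert(count >= 0);
    int i = 0;
    for (; i + 4 <= count; i += 4) {
        switch (classify4(src + i)) {
            case BlockAlpha::kAllTransparent:  // premultiplied: src is all zero, dst unchanged
                break;
            case BlockAlpha::kAllOpaque:       // src fully covers dst: plain copy
                std::memcpy(dst + i, src + i, 4 * sizeof(uint32_t));
                break;
            case BlockAlpha::kMixed:           // fall back to the real blend
                for (int j = 0; j < 4; ++j) dst[i + j] = srcover(src[i + j], dst[i + j]);
                break;
        }
    }
    for (; i < count; ++i) dst[i] = srcover(src[i], dst[i]);  // leftover pixels
}
```

Both shortcut paths give the same result the blend would for those blocks, so the skip only changes speed. That fits the benchmark numbers above, where the fully opaque and fully transparent sources show the largest gains.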

TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842

Review URL: https://codereview.chromium.org/874033004
2015-01-26 14:32:09 -08:00
animations
bench add bench for building mipmaps 2015-01-26 12:28:54 -08:00
bin run clean branch baseline only once 2015-01-23 10:39:55 -08:00
debugger Make SkStream *not* ref counted. 2015-01-21 12:09:53 -08:00
dm Write dm.json periodically instead of only once at the end. 2015-01-23 05:48:00 -08:00
docs Remove references to out/Debug/tests executable. 2014-10-09 08:56:55 -07:00
experimental experimental/skp_to_pdf_md5 optionally also outputs pdf files 2015-01-24 13:04:57 -08:00
forth
gm Add sbix font to coloremoji gm. 2015-01-26 14:08:52 -08:00
gyp Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #14 id:260001 of https://codereview.chromium.org/874863002/) 2015-01-26 14:32:09 -08:00
include Fix Chrome build 2015-01-26 07:00:05 -08:00
platform_tools android_run_skia: dump logcat on failure 2015-01-22 10:43:34 -08:00
resources Remove PDF JPEG shortcut, since it fails on grayscale JPEGs. 2014-12-02 06:37:21 -08:00
samplecode s/sk_tools::DrawCheckerboard/sk_tool_utils::draw_checkerboard/ 2015-01-26 12:49:00 -08:00
site site/dev/contrib/directory <= https://status.skia.org/ 2015-01-26 13:46:41 -08:00
src Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #14 id:260001 of https://codereview.chromium.org/874863002/) 2015-01-26 14:32:09 -08:00
tests Alter gpu veto 2015-01-26 11:29:36 -08:00
third_party DM warning-free on win64 2014-12-12 16:41:46 -08:00
tools s/sk_tools::DrawCheckerboard/sk_tool_utils::draw_checkerboard/ 2015-01-26 12:49:00 -08:00
.gitignore Cleanup: Remove bug_chomper entry from .gitignore. 2014-11-24 17:53:55 -08:00
AUTHORS Revert of Revert of Add gpu support for Apple specific 'Vertex Arrays' functions (patchset #1 id:1 of https://codereview.chromium.org/750973003/) 2014-11-24 11:22:37 -08:00
codereview.settings Add Project to skia 2014-06-20 09:39:15 -07:00
CONTRIBUTING Add CONTRIBUTING file 2014-01-13 15:06:26 +00:00
CQ_COMMITTERS Add Herb as a Skia committer 2014-12-12 12:37:28 -08:00
DEPS Revert of Roll libwebp to v0.4.2 (latest stable) to fix annoying build warning. (patchset #1 id:1 of https://codereview.chromium.org/807553002/) 2014-12-15 12:23:00 -08:00
Doxyfile Fix links to skia-buildbot code in preparation for deletion 2014-10-14 04:44:44 -07:00
gyp_skia allow caller to override the default output directory for gyp 2014-09-29 11:42:25 -07:00
gyp_skia.py Roll gyp deps from 1765 to 1796. 2013-11-21 18:11:14 +00:00
LICENSE
make.bat Link to skiadocs site, since that is the canonical location for documentation. 2014-10-13 17:56:30 -03:00
make.py Fix reference to non-existant 'tests' target. 2014-10-13 17:51:57 -03:00
Makefile Revert "Revert "delete old things!"" 2015-01-20 10:23:02 -08:00
OWNERS add root files from chrome 2013-08-13 19:11:15 +00:00
PRESUBMIT.py PRESUBMIT should only check owners for the top level include directory 2014-08-26 14:00:55 -07:00
README Point to skiadocs in our README. 2014-05-09 04:30:09 +00:00
README.chromium add root files from chrome 2013-08-13 19:11:15 +00:00
skia.gyp Remove the comments settings for vim tab width and expansion variables. 2013-12-02 22:23:03 +00:00
SKP_VERSION Update SKP version 2015-01-25 22:29:44 -08:00
whitespace.txt force a build with new --config flags 2015-01-20 10:24:19 -05:00

Skia is a complete 2D graphics library for drawing Text, Geometries, and Images.

See full details, and build instructions, at https://sites.google.com/site/skiadocs/home