Mike Klein e7f89fc257 improve HSW 16->8 bit pack
__builtin_convertvector(..., U8x4) is producing a fairly long
sequence of code to convert U16x4 to U8x4 on HSW:
    vextracti128  $0x1,%ymm2,%xmm3
    vmovdqa       0x1848(%rip),%xmm4
    vpshufb       %xmm4,%xmm3,%xmm3
    vpshufb       %xmm4,%xmm2,%xmm2
    vpunpcklqdq   %xmm3,%xmm2,%xmm2
    vextracti128  $0x1,%ymm0,%xmm3
    vpshufb       %xmm4,%xmm3,%xmm3
    vpshufb       %xmm4,%xmm0,%xmm0
    vpunpcklqdq   %xmm3,%xmm0,%xmm0
    vinserti128   $0x1,%xmm2,%ymm0,%ymm0

We can do much better with _mm256_packus_epi16:
    vinserti128   $0x1,%xmm0,%ymm2,%ymm3
    vperm2i128    $0x31,%ymm0,%ymm2,%ymm0
    vpackuswb     %ymm0,%ymm3,%ymm0

vpackuswb packs the values in a somewhat surprising order,
which the first two instructions get us lined up for.
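To make the ordering concrete, here is a portable scalar sketch of the idea. It is not the code from this change: the type aliases and the `model_*`, `pack_naive` and `pack_in_order` helpers are hypothetical names that only imitate the documented behavior of `_mm256_packus_epi16` and `_mm256_permute2x128_si256`. The immediates 0x20 and 0x31 are one assumed way to line the halves up; the listing above gets the first of those two arrangements from vinserti128 instead.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for one 256-bit register each.
using U16x16 = std::array<uint16_t, 16>;
using U8x32  = std::array<uint8_t, 32>;

// vpackuswb reads each input as a signed 16-bit value and saturates to [0,255].
static uint8_t saturate_u8(uint16_t v) {
    if (v & 0x8000) return 0;    // negative when read as signed
    if (v > 255)    return 255;
    return static_cast<uint8_t>(v);
}

// Scalar model of _mm256_packus_epi16: each 128-bit lane is packed on its own,
// so the output is a.lane0, b.lane0, a.lane1, b.lane1.
static U8x32 model_packus_epi16(const U16x16& a, const U16x16& b) {
    U8x32 out{};
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 8; i++) {
            out[16*lane + i]     = saturate_u8(a[8*lane + i]);
            out[16*lane + 8 + i] = saturate_u8(b[8*lane + i]);
        }
    }
    return out;
}

// Scalar model of _mm256_permute2x128_si256: each output lane is selected from
// {a.lane0, a.lane1, b.lane0, b.lane1} by a 2-bit field of imm.
static U16x16 model_permute2x128(const U16x16& a, const U16x16& b, int imm) {
    assert((imm & 0x88) == 0);   // this model does not handle the zeroing bits
    auto pick = [&](int sel, int i) -> uint16_t {
        const U16x16& src = (sel & 2) ? b : a;
        return src[8*(sel & 1) + i];
    };
    U16x16 out{};
    for (int i = 0; i < 8; i++) {
        out[i]     = pick(imm & 3, i);
        out[8 + i] = pick((imm >> 4) & 3, i);
    }
    return out;
}

// Packing directly interleaves the lanes: lo[0..7], hi[0..7], lo[8..15], hi[8..15].
static U8x32 pack_naive(const U16x16& lo, const U16x16& hi) {
    return model_packus_epi16(lo, hi);
}

// Rearranging lanes first yields lo[0..15] followed by hi[0..15].
static U8x32 pack_in_order(const U16x16& lo, const U16x16& hi) {
    U16x16 a = model_permute2x128(lo, hi, 0x20);   // {lo.lane0, hi.lane0}
    U16x16 b = model_permute2x128(lo, hi, 0x31);   // {lo.lane1, hi.lane1}
    return model_packus_epi16(a, b);
}
```

With lo holding 0..15 and hi holding 16..31, pack_naive puts 16 at index 8, while pack_in_order returns 0..31 in sequence.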

This is a pretty noticeable speedup, 7-8% on some benchmarks.

The same sort of change could also be made for SSE2 and SSE4.1
using _mm_packus_epi16, but the difference there is much less
dramatic.  Might as well stick to focusing on HSW.

Change-Id: I0d6765bd67e0d024d658a61d19e6f6826b4d392c
Reviewed-on: https://skia-review.googlesource.com/30420
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-08-03 13:24:46 +00:00
animations first cut at a checkbox 2009-10-21 19:41:10 +00:00
bench Compute correct bounds for DrawShadowRec. 2017-07-31 13:55:32 +00:00
bin add a Win/Clang build bot 2017-07-31 19:53:51 +00:00
debugger Add Make[backend] calls for creating GrContexts 2017-07-25 14:33:03 +00:00
dm Roll ANGLE 2017-08-01 17:08:03 +00:00
docs fix self references 2017-07-31 16:49:12 +00:00
example Add Make[backend] calls for creating GrContexts 2017-07-25 14:33:03 +00:00
experimental Add Make[backend] calls for creating GrContexts 2017-07-25 14:33:03 +00:00
fuzz use unique_ptr for codec factories 2017-07-25 15:35:23 +00:00
gm Fix sweep_tiling GM sizing 2017-08-02 14:23:36 +00:00
gn Tell clang/win to emulate MSVC 2015 2017-07-31 20:58:55 +00:00
include Add a private API for writing the clip to the stencil 2017-08-02 19:56:07 +00:00
infra Chromium lkgr is no longer updated. Use lkcr 2017-08-03 12:41:36 +00:00
platform_tools Enable ios on Raspberry Pi 2017-04-25 16:56:41 +00:00
resources Fix double delete in SkBmpCodec 2017-07-14 16:25:54 +00:00
samplecode clang on windows support 2017-07-31 18:39:23 +00:00
site Fix typo XPS to SVG 2017-08-02 20:53:36 +00:00
src improve HSW 16->8 bit pack 2017-08-03 13:24:46 +00:00
tests Revert "Revert "support for 'half' types in sksl, plus general numeric type improvements"" 2017-08-02 18:47:00 +00:00
third_party link libwebpmux in system-webp builds 2017-08-01 14:33:38 +00:00
tools Chromium lkgr is no longer updated. Use lkcr 2017-08-03 12:41:36 +00:00
.clang-format Mark flatennable macros as block beginning/ending in .clang-format 2017-01-09 15:31:36 +00:00
.gitignore clang on windows support 2017-07-31 18:39:23 +00:00
.gn Basic standalone GN configs. 2016-07-21 12:25:45 -07:00
AUTHORS Added support for building for tvOS 2017-03-14 22:55:04 +00:00
BUILD.gn 8-bit hacking 2017-08-03 01:54:58 +00:00
codereview.settings Make uploading to Gerrit the default for Skia 2016-11-09 19:07:56 +00:00
CONTRIBUTING Fix references to https://sites.google.com/site/skiadocs/. 2015-02-03 13:12:54 -02:00
CQ_COMMITTERS Moved committer list to chrome-infra-auth and deleted it from the repo 2015-09-02 13:37:54 -07:00
DEPS Roll skia/third_party/externals/angle2/ 6c58b0620..a0bcc50be (2 commits) 2017-08-03 00:23:26 +00:00
Doxyfile Make the housekeeper upload doxygen to a newer bucket 2016-10-04 13:23:57 -07:00
LICENSE BUG=skia:5602 2016-09-02 11:19:34 -07:00
PRESUBMIT.py Update CQ extra trybots after switch to Debian 2017-06-29 19:35:40 +00:00
public.bzl exclude SkJumper_stages_8bit.cpp from Google3 build 2017-08-03 04:56:37 +00:00
README Fix references to https://sites.google.com/site/skiadocs/. 2015-02-03 13:12:54 -02:00
README.chromium Update README.chromium. 2015-06-11 13:19:24 -07:00
whitespace.txt Revert "Revert "Make GrAtlasTextOp a non-legacy GrMeshDrawOp"" 2017-07-19 12:17:34 +00:00

Skia is a complete 2D graphics library for drawing Text, Geometries, and Images.

See full details, and build instructions, at https://skia.org.