skia2/resources
Mike Klein 8c1e0effbb sketch out structure for ops with immediates
Lots of x86 instructions can take their right-hand-side argument from
memory directly rather than a register.  We can use this to avoid the
need to allocate a register for many constants.
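
As a rough illustration of the builder-side peephole, here is a minimal C++ sketch. The Builder, Instruction, and Op types are simplified, hypothetical stand-ins and not the real SkVM API; only the names Op::splat and mul_f32_imm come from this description. When the right-hand side of a multiply is a splat, the constant's bits are embedded in an immediate-aware op instead of referencing the splat's register.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, heavily simplified stand-ins for a program builder.
enum class Op { load, splat, mul_f32, mul_f32_imm };

struct Instruction {
    Op      op;
    int     x, y;   // argument value IDs, or -1 when unused
    int32_t immy;   // immediate bits, used by splat and mul_f32_imm
};

struct Builder {
    std::vector<Instruction> program;

    // Append an instruction and return its value ID.
    int push(Op op, int x, int y, int32_t immy) {
        program.push_back({op, x, y, immy});
        return static_cast<int>(program.size()) - 1;
    }

    int load() { return push(Op::load, -1, -1, 0); }

    int splat(float f) {
        int32_t bits;
        std::memcpy(&bits, &f, sizeof bits);  // reinterpret the float as raw bits
        return push(Op::splat, -1, -1, bits);
    }

    int mul_f32(int x, int y) {
        // Peephole: if the right-hand side is a splat, embed its bits as an
        // immediate so the backend can read the constant from memory.
        if (program[y].op == Op::splat) {
            return push(Op::mul_f32_imm, x, -1, program[y].immy);
        }
        return push(Op::mul_f32, x, y, 0);
    }
};
```

In this sketch the original splat stays in the program; whether an unused splat is later dropped is left out and would be up to a separate dead-code pass.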

The strategy in this CL is one of several I've been stewing over, the
simplest of those strategies, I think.  There are some trade-offs,
particularly on ARM; this naive ARM implementation means we'll load&op
every time, even though the load part of the operation can logically be
hoisted.  From here on I'm going to just briefly enumerate a few other
approaches that allow the optimization on x86 and still allow the
immediate splats to hoist on ARM.

1) don't do it on ARM
A very simple approach is to simply not perform this optimization on
ARM.  ARM has more vector registers than x86, and so register pressure
is lower there.  We're going to end up with splatted constants in
registers anyway, so maybe just let that happen the normal way instead
of some roundabout complicated hack like I'll talk about in 2).  The
only downside in my mind is that this approach would make high-level
program descriptions platform dependent, which isn't so bad, but it's
been nice to be able to compare and diff debug dumps.

2) split Op::splat up
The next-simplest approach would address the hoisting problem by
splitting splats into two Ops internally, one inner Op::immediate that
guarantees at least the constant is in memory and is compatible with
immediate-aware Ops like mul_f32_imm, and an outer Op::constant that
depends on that Op::immediate and further guarantees that constant has
been broadcast into a register to be compatible with non-immediate-aware
ops like div_f32.  When building a program, immediate-aware ops would
peek for Op::constants as they do today for Op::splats, but instead of
embedding the immediate themselves, they'd replace their dependency with
the inner Op::immediate.

On x86 these new Ops would work just as advertised, with Op::immediate a
runtime no-op, Op::constant the usual vbroadcastss.  On ARM
Op::immediate needs to go all the way and splat out a register to make
the constant compatible with immediate-aware ops, and the Op::constant
becomes a noop now instead.  All this comes together to let the
Op::immediate splat hoist up out of the loop while still feeding
Op::mul_f32_imm and co.  It's a rather complicated approach to solving
this issue, but I might want to explore it just to see how bad it is.
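
The dependency rewrite in 2) might look roughly like the C++ sketch below. SplitBuilder, SplitInstruction, and SplitOp are hypothetical, simplified types invented for illustration; only the op names immediate, constant, mul_f32_imm, and div_f32 come from the description above. A splat becomes an inner immediate plus an outer constant, immediate-aware ops rewire themselves to the inner immediate, and everything else keeps depending on the constant.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified types; not the real SkVM builder.
enum class SplitOp { load, immediate, constant, mul_f32, mul_f32_imm, div_f32 };

struct SplitInstruction {
    SplitOp op;
    int     x, y;   // argument value IDs, or -1 when unused
    int32_t immy;   // constant bits, only meaningful for SplitOp::immediate
};

struct SplitBuilder {
    std::vector<SplitInstruction> program;

    // Append an instruction and return its value ID.
    int push(SplitOp op, int x, int y, int32_t immy) {
        program.push_back({op, x, y, immy});
        return static_cast<int>(program.size()) - 1;
    }

    int load() { return push(SplitOp::load, -1, -1, 0); }

    // A splat is split in two: an inner immediate holding the bits, and an
    // outer constant that depends on it.
    int splat(float f) {
        int32_t bits;
        std::memcpy(&bits, &f, sizeof bits);
        int imm = push(SplitOp::immediate, -1, -1, bits);
        return push(SplitOp::constant, imm, -1, 0);
    }

    // Immediate-aware op: peek for a constant and depend on its inner
    // immediate instead.
    int mul_f32(int x, int y) {
        if (program[y].op == SplitOp::constant) {
            return push(SplitOp::mul_f32_imm, x, program[y].x, 0);
        }
        return push(SplitOp::mul_f32, x, y, 0);
    }

    // Non-immediate-aware op: keeps depending on the outer constant.
    int div_f32(int x, int y) { return push(SplitOp::div_f32, x, y, 0); }
};
```

How each backend lowers immediate and constant (no-op versus an actual broadcast) is the platform-specific part described above and is deliberately not modeled here.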

3) do it inside the x86 JIT
The conceptually best approach is to find a way to do this peepholing
only inside the x86 JIT, avoiding the need for new
Op::mul_f32_imm and co.  ARM and the interpreter don't benefit from this
peephole, so the x86 JIT is the logical owner of this optimization.
Finding a clean way to do this without too much disruption is the least
baked idea I've got here, though I think the most desirable long-term.

Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Debug-All-SK_USE_SKVM_BLITTER,Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SK_USE_SKVM_BLITTER
Change-Id: Ie9c6336ed08b6fbeb89acf920a48a319f74f3643
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/254217
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
2019-11-12 20:17:55 +00:00
android_fonts Move lang to list in Android font manager. 2017-08-14 19:29:05 +00:00
diff_canvas_traces SkRemoteGlyphCache Add tracing to diff canvas 2019-10-24 17:09:31 +00:00
empty_images Do not create an SkRawCodec with zero dimensions 2016-12-02 22:23:35 +00:00
fonts Font resolution: all unit tests working 2019-11-08 17:24:14 +00:00
icc_profiles update ColorSpaceTest.cpp to remove MakeICC 2018-05-23 17:11:11 +00:00
images Initial version of rescaling async readback API 2019-05-17 16:39:10 +00:00
invalid_images Update piex and add test image 2018-02-22 21:32:48 +00:00
lua Update SampleLua and lua files. 2018-05-22 15:51:25 +00:00
nima skeletal animation support added to API and software backend 2018-06-29 19:34:28 +00:00
particles Particles: Fake 3D example 2019-10-17 20:10:05 +00:00
skottie [skottie] Streamlined gradient stop merger 2019-11-05 19:44:11 +00:00
text Shaper Tests: make a macro, rename test resources. 2019-05-03 17:16:36 +00:00
Cowboy.svg Add animated cowboy sample from WebKit tests, and fix. 2017-09-25 21:14:09 +00:00
crbug769134.fil Avoid uninitialized memory in readByteArrayAsData 2017-09-28 19:51:32 +00:00
ducky.jpg Clamp RGB outputs of GrYUVtoRGBEffect. 2019-11-11 20:04:15 +00:00
ducky.png Clamp RGB outputs of GrYUVtoRGBEffect. 2019-11-11 20:04:15 +00:00
nov-talk-sequence.txt demo tweaks, scale up perlin, add call to flush for fps 2015-11-09 13:10:30 -08:00
pdf_command_stream.txt SkPDF/Bench: add bench for SkPDFSharedStream (deflate) 2016-02-24 15:17:20 -08:00
README Add animated cowboy sample from WebKit tests, and fix. 2017-09-25 21:14:09 +00:00
SkVMTest.expected sketch out structure for ops with immediates 2019-11-12 20:17:55 +00:00

The resources directory includes some third-party content used by Skia.
Licenses for that content are included in this file.

Openclipart

Openclipart uses the Creative Commons Zero 1.0 Public Domain License every time
an artist uploads a piece of clipart to Openclipart to make it clear the artist
is releasing the creative work for anyone to use for any reason, even
commercially. This act of "sharing" is the foundation Openclipart is based upon.
More details on the license can be found at
https://creativecommons.org/publicdomain/zero/1.0/.

LGPL or compatible (as implied by inclusion in KDE SVN)
http://websvn.kde.org/trunk/tests/ksvgtests/custom/cowboy.svg