unnest matrix multiply

As written, the order in which skvm::Builder sees the values passed as
arguments to mad() depends on C++ function argument evaluation order,
which is unspecified.  This can be counterintuitive, and since
skvm::Builder uses the order in which it originally saw arguments to
break ties when reordering programs, it can affect program order and
register pressure.
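
The dependence on evaluation order can be sketched outside skvm.  The
following is a minimal illustration, not Skia code: ToyBuilder, its
uniform() and mad() methods, and both helper functions are hypothetical
stand-ins that only record the order in which values are first mentioned.

```cpp
#include <vector>

// Hypothetical stand-in for a builder: it only records the order in which
// each "uniform" is first mentioned.  This is not the skvm::Builder API.
struct ToyBuilder {
    std::vector<int> seen;

    float uniform(int index, float value) {
        seen.push_back(index);  // remember when this value was mentioned
        return value;
    }
    float mad(float x, float y, float z) { return x * y + z; }
};

// Nested form: the compiler may evaluate the arguments of each mad() call
// in any order, so the contents of `seen` are not portable.
std::vector<int> nested_order() {
    ToyBuilder b;
    (void)b.mad(b.uniform(0, 1.0f), 2.0f,
          b.mad(b.uniform(1, 1.0f), 2.0f,
                b.uniform(2, 1.0f)));
    return b.seen;
}

// Unnested form: each statement completes before the next one starts, so
// the values are always mentioned in the order 2, 1, 0.
std::vector<int> unnested_order() {
    ToyBuilder b;
    float acc = b.uniform(2, 1.0f);
    acc = b.mad(b.uniform(1, 1.0f), 2.0f, acc);
    acc = b.mad(b.uniform(0, 1.0f), 2.0f, acc);
    (void)acc;
    return b.seen;
}
```

Calling unnested_order() always yields the sequence 2, 1, 0, while
nested_order() is only guaranteed to mention all three values in some
order that may differ between compilers.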

This change keeps the math happening in the same order, but takes
precise control over when we mention (and thus when we load) each matrix
coefficient.  The program goes from looking like this, loading uniforms
in the reverse of the order they're needed,

    ...
    r4 = uniform32 arg(0) 4C
    r3 = uniform32 arg(0) 50
    r12 = uniform32 arg(0) 54
    r13 = uniform32 arg(0) 58
    r14 = uniform32 arg(0) 5C
    r14 = mad_f32 r13 r8 r14
    r14 = mad_f32 r12 r9 r14
    r14 = mad_f32 r3 r11 r14
    r14 = mad_f32 r4 r6 r14
    ...

to something nicer like this that reuses the same temporaries to load
and accumulate the uniforms in the order they're needed,

    ...
    r7 = uniform32 arg(0) 5C
    r11 = uniform32 arg(0) 58
    r7 = mad_f32 r11 r8 r7
    r11 = uniform32 arg(0) 54
    r7 = mad_f32 r11 r9 r7
    r11 = uniform32 arg(0) 50
    r7 = mad_f32 r11 r10 r7
    r11 = uniform32 arg(0) 4C
    r7 = mad_f32 r11 r6 r7
    ...

In all, this cuts three unnecessary temporary registers from programs
using SkColorFilter_Matrix, and would be enough to get gm/skvm.cpp all
JITing again if all the instructions it used were implemented... (next).

Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Debug-All-SK_USE_SKVM_BLITTER
Change-Id: Ie03a5da476a49eeb950e74290001a0625cf61177
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/253126
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Mike Klein 2019-11-06 12:10:35 -06:00 committed by Skia Commit-Bot
parent 3dce06be0e
commit d0792e6de6

@@ -104,11 +104,11 @@ bool SkColorFilter_Matrix::program(skvm::Builder* p,
     skvm::F32 rgba[4];
     for (int j = 0; j < 4; j++) {
-        rgba[j] = p->mad(m(0+j*5), R,
-                  p->mad(m(1+j*5), G,
-                  p->mad(m(2+j*5), B,
-                  p->mad(m(3+j*5), A,
-                         m(4+j*5)))));
+        rgba[j] = m(4+j*5);
+        rgba[j] = p->mad(m(3+j*5), A, rgba[j]);
+        rgba[j] = p->mad(m(2+j*5), B, rgba[j]);
+        rgba[j] = p->mad(m(1+j*5), G, rgba[j]);
+        rgba[j] = p->mad(m(0+j*5), R, rgba[j]);
     }
     // Clamp back to bytes.
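
The arithmetic in this hunk can be sketched with plain floats to show what
each row computes.  This is only an illustration under stated assumptions:
apply_color_matrix is a hypothetical helper, the matrix is assumed to be 20
floats in row-major 4x5 order (matching the m(i+j*5) indexing above), and
mad is modelled as an unfused x*y+z, which may differ from what skvm emits.

```cpp
#include <array>

// Hypothetical plain-float model of the unnested loop.  It accumulates each
// output channel starting from the translation term m[4+j*5], then folds in
// A, B, G, R in that order, as the new code does.
std::array<float, 4> apply_color_matrix(const std::array<float, 20>& m,
                                        const std::array<float, 4>& in) {
    const float R = in[0], G = in[1], B = in[2], A = in[3];
    std::array<float, 4> rgba{};
    for (int j = 0; j < 4; j++) {
        float acc = m[4 + j*5];        // translation term for row j
        acc = m[3 + j*5] * A + acc;
        acc = m[2 + j*5] * B + acc;
        acc = m[1 + j*5] * G + acc;
        acc = m[0 + j*5] * R + acc;
        rgba[j] = acc;
    }
    return rgba;
}
```

With an identity matrix (ones at indices 0, 6, 12, and 18, zeros elsewhere)
the function returns its input unchanged, which is a quick sanity check on
the indexing.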