Avoid scale by alpha if unnecessary

~15% improvement for S32_alpha_D32_filter_DX on skylake-x.

nanobench result on 7900X (fixed frequency @ 3.2GHz)
                                before    after
bitmaprect_FF_filter_trans      524µs     453µs

Change-Id: I1c0c05915ecd3dc6f59da5eb49b5ae1c6cd98814
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/288436
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
jiepan 2020-05-07 14:09:05 +08:00 committed by Skia Commit-Bot
parent 832e1fe8c8
commit e947efb9d2

@@ -139,9 +139,11 @@ static void decode_packed_coordinates_and_weight(U32 packed, Out* v0, Out* v1, O
 // Get back to [0,255] by dividing by maximum weight 16x16 = 256.
 sum >>= 8;
-// Scale by [0,256] alpha.
-sum *= s.fAlphaScale;
-sum >>= 8;
+// Scale by alpha if needed.
+if(s.fAlphaScale < 256) {
+    sum *= s.fAlphaScale;
+    sum >>= 8;
+}
 // Pack back to 8-bit channels, undoing to_16x4().
 return skvx::bit_pun<skvx::Vec<8,uint32_t>>(skvx::cast<uint8_t>(sum));
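
Why skipping the scale is safe: with alpha in [0,256], a scale of 256 multiplies by 256 and then shifts right by 8, which leaves the value unchanged. The sketch below is a scalar stand-in for one channel of `sum`, not the actual skvx vector code. The function names `scale_always`, `scale_if_needed`, and `variants_agree` are hypothetical, and the channel is assumed to already be in [0,255] as the comment in the diff states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the old code path: always multiply by the
// [0,256] alpha scale and shift back down by 8 bits.
inline uint16_t scale_always(uint16_t channel, unsigned alpha_scale) {
    return static_cast<uint16_t>((channel * alpha_scale) >> 8);
}

// Hypothetical scalar model of the new code path: skip the multiply and
// shift when alpha_scale is 256, since (x * 256) >> 8 == x.
inline uint16_t scale_if_needed(uint16_t channel, unsigned alpha_scale) {
    if (alpha_scale < 256) {
        channel = static_cast<uint16_t>((channel * alpha_scale) >> 8);
    }
    return channel;
}

// Exhaustively compares both variants over every [0,255] channel value
// and every [0,256] alpha scale.
inline bool variants_agree() {
    for (unsigned a = 0; a <= 256; ++a) {
        for (unsigned c = 0; c <= 255; ++c) {
            const uint16_t ch = static_cast<uint16_t>(c);
            if (scale_always(ch, a) != scale_if_needed(ch, a)) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `variants_agree()` checks that the branch changes no results in this model, so any gain would come only from skipping the multiply and shift in the opaque case.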