skia2

Author	SHA1	Message	Date
Florin Malita	0022f5cf1b	[SkTrimPathEffect] Preserve wrap-around continuity In inverted mode (Mode::kInverted), the trim result represents the logical segment [stop..start] (wrapping around at the path's end). We currently emit two segments [0..start] and [stop..1], in that exact order. This behavior breaks continuity for single closed contour paths. Update SkTrimPath to 1) emit the segments in the correct order ([stop..1],[0..start]) 2) skip the connecting moveTo for closed paths Bug: skia:10107 Change-Id: Icd280554ba7291c985f504793feff104df2a4a99 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/281882 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-04-07 14:40:17 +00:00
Florin Malita	0ba30328d0	[skottie] Corner pin effect Note: works for well-formed poly-to-poly (perspective) transforms, but doesn't support AE's degenerate corners semantics (concave/inverted polys) at this point. Bug: skia:10100 Change-Id: I5b3492b008302495b616867c139c6e5ad6dc57df Reviewed-on: https://skia-review.googlesource.com/c/skia/+/281595 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-04-05 16:47:06 +00:00
Mike Klein	4067a9429a	the return of bit_clear bit_clear is at least useful as a special case for select(), which helps with code readability. Add is_NaN() and use these all together in sweep gradient. Change-Id: I57a54f8956f85e0db0662b33f8446b8dc7342d8d Reviewed-on: https://skia-review.googlesource.com/c/skia/+/281685 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-04-05 16:44:16 +00:00
Mike Klein	aa68109a59	special-case overhaul - new this-> convention: never use it when calling common public Builder methods like splat(), bit_and(), etc like you'd see in normal user code, but always use it when calling private methods like this->push(), this->isImm(), this->allImm(). - use c++17 if-statements to scope this->allImm() variables tighter. - check for x.id == y.id cases where applicable, including a tweak to min() and max() to make them able to hit the special case. - add special cases for I32 +,-,*, and remove an old unimportant unit test that assumed we didn't fold these. - add special cases for select(), and use select() in a few more places where it's clearer and now just as efficient. Change-Id: Idaac9250ac5a95a48d33eeba1cc4380c8c91629d Reviewed-on: https://skia-review.googlesource.com/c/skia/+/281678 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-04-05 15:41:36 +00:00
Mike Klein	cca2acfb77	remove little-used bit_clear() and bytes() bit_clear() is just another bit_and(), and bytes() is a way of expressing pshufb that we never really use (yet). Can always add them back later, but there's some extra complexity to think about for each that I'd like to not think about now: - common sub-expression elimination between bit_and and bit_clear - large constant management JIT'ing bytes Change-Id: I3a54afa963231fec1d5de949acc647e3430ed0d8 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/281557 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-04-03 18:16:54 +00:00
Mike Reed	6e9b179d20	move ducky images into images Change-Id: I819c23d38989f81d2e493ad8eea0c22cfd364284 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/279036 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Reed <reed@google.com>	2020-03-25 12:19:56 +00:00
Florin Malita	ae58199380	[skottie] Initial drop shadow style support Plumb layer style parsing, and extend existing DropShadowAdapter to support both drop shadow style and drop shadow effect. Change-Id: Id99a419dacd06dc38dc4cf84ff4ecb92218c45f7 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/279020 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-03-25 12:14:26 +00:00
Herb Derby	43f7641e1b	Arrange instruction to reduce register pressure When converting from Instructions to OptimizedInstructions place instructions that reduce register pressure earlier in the instruction list. This change reduces some register pressure in SkVM, and improves the bitmap_RGBA_8888_A_scale_bilerp benchmark by about 5%. Change-Id: If5f6385bd2f7720701d1c827265062b35491a790 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/276485 Commit-Queue: Herb Derby <herb@google.com> Reviewed-by: Mike Klein <mtklein@google.com>	2020-03-17 21:37:18 +00:00
Mike Klein	5caf7dee25	restore Op::round While I think trunc(mad(x, scale, 0.5)) is fine for doing our float to fixed point conversions, round(mul(x, scale)) was kind of better all around: - better rounding than +0.5 and trunc - faster when mad() is not an fma - often now no need to use the constant 0.5f or have it in a register - allows the mul() in to_unorm to use mul_f32_imm Those last two points are key... this actually frees up 2 registers in the x86 JIT when using to_unorm(). So I think maybe we can resurrect round and still guarantee our desired intra-machine stability by committing to using instructions that follow the current rounding mode, which is what [v]cvtps2dq inextricably uses. Left some notes on the ARM impl... we're rounding to nearest even there, which is probably the current mode anyway, but to be more correct we need a slightly longer impl that rounds float->float then "truncates". Unsure whether it matters in practice. Same deal in the unit test that I added back, now testing negative and 0.5 cases too. The expectations assume the current mode is nearest even. I had the idea to resurrect this when I was looking at adding _imm Ops for fma_f32. I noticed that the y and z arguments to an fma_f32 were by far most likely to be constants, and when they are, they're by far likely to both be constants, e.g. 255.0f & 0.5f from to_unorm(8,...). 
llvm disassembly for SkVM_round unit test looks good: ~ $ llc -mcpu=haswell /tmp/skvm-jit-1231521224.bc -o - .section __TEXT,__text,regular,pure_instructions .macosx_version_min 10, 15 .globl "_skvm-jit-1231521224" ## -- Begin function skvm-jit-1231521224 .p2align 4, 0x90 "_skvm-jit-1231521224": ## @skvm-jit-1231521224 .cfi_startproc cmpl $8, %edi jl LBB0_3 .p2align 4, 0x90 LBB0_2: ## %loopK ## =>This Inner Loop Header: Depth=1 vcvtps2dq (%rsi), %ymm0 vmovupd %ymm0, (%rdx) addl $-8, %edi addq $32, %rsi addq $32, %rdx cmpl $8, %edi jge LBB0_2 LBB0_3: ## %hoist1 xorl %eax, %eax testl %edi, %edi jle LBB0_6 .p2align 4, 0x90 LBB0_5: ## %loop1 ## =>This Inner Loop Header: Depth=1 vcvtss2si (%rsi,%rax), %ecx movl %ecx, (%rdx,%rax) decl %edi addq $4, %rax testl %edi, %edi jg LBB0_5 LBB0_6: ## %leave vzeroupper retq .cfi_endproc ## -- End function Change-Id: Ib59eb3fd8a6805397850d93226c6c6d37cc3ab84 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/276738 Auto-Submit: Mike Klein <mtklein@google.com> Commit-Queue: Herb Derby <herb@google.com> Reviewed-by: Herb Derby <herb@google.com>	2020-03-12 21:10:34 +00:00
Mike Klein	7c0332cd35	re-enable fnma - hook up fmls.4s as fnma_f32 - add fneg.4s - use fneg.4s + fmls.4s to impl fms_f32 - more tests to exercise these Change-Id: I60173a5e4618ab968a9361e15334a1d63c001372 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/275412 Commit-Queue: Herb Derby <herb@google.com> Reviewed-by: Herb Derby <herb@google.com>	2020-03-05 21:58:07 +00:00
Mike Klein	f8db8a5516	disable fnma peephole Change-Id: I66b251826211653586c64e8495cdfde7d7d8f1a2 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/275410 Reviewed-by: Herb Derby <herb@google.com> Reviewed-by: Mike Klein <mtklein@google.com>	2020-03-05 20:28:27 +00:00
Herb Derby	d4c4f0c004	Add Multiply-Subtract (fms) to SkVM Add fms op and instruction generation. Do fms and fnma instruction selection. TODO: Add the ops to Arm Change-Id: I7e53abd7f4752eb99c31dcbff1f2ea7cf28af6c9 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/275197 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-03-05 18:44:18 +00:00
Mike Klein	823d319f3b	do our own fma discovery Peephole add(F32,F32) for an argument that is a mul(). As a flourish, only generate Op::fma_f32 on machines we know support real fused mul-adds. This removes the ambiguity of whether Op::mad_f32 is an FMA or not; the new Op::fma_f32 is always an FMA, and otherwise you'll just see ordinary mul-add. No more Op::mad_f32. Change-Id: I38016a2430774583116d8d6a8ada677012c1a8fc Reviewed-on: https://skia-review.googlesource.com/c/skia/+/275138 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Herb Derby <herb@google.com>	2020-03-04 22:29:30 +00:00
Mike Klein	cb50b117e3	get rid of troublesome Op::round We really only need to_unorm(), and that's fine with trunc(mad(x, scale, 0.5)). Change-Id: I1561c678501963a9ae53c22994fc906159fc7199 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/275075 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com> Reviewed-by: Mike Klein <mtklein@google.com>	2020-03-04 22:26:01 +00:00
Florin Malita	ae2da5e7f9	[skottie] Add another text grouping test TBR= Change-Id: I9c1ca3e834fdc5b017446811db52582a326d777b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/274740 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-03-03 15:17:20 +00:00
Florin Malita	960f3d4cd1	[skottie] Text anchor point grouping support Implement all AE grouping modes: character/word/line/all. -- character grouping was already supported (default mode) -- for word and line grouping, expand the existing domain mapping logic to also track cumulative advance and max(ascent) per span, then use this info to compute anchor point boxes -- for "all" grouping, the anchor point box coincides with the text box (https://helpx.adobe.com/after-effects/using/animating-text.html#text_anchor_point_properties) TBR= Change-Id: I8564f1349d167d82c31862d8f7e57615cdae0dcf Reviewed-on: https://skia-review.googlesource.com/c/skia/+/274201 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-03-02 23:07:39 +00:00
Florin Malita	c4fae744d0	[skottie] Fix handling of time-reversed precomp layers With time-reverse enabled - inPoint/outPoint are reversed - time stretch is negative Bug: skia:9958 Change-Id: I5c1197251608aad4b0417cde6ca2600b1b2822fc Reviewed-on: https://skia-review.googlesource.com/c/skia/+/273808 Commit-Queue: Avinash Parchuri <aparchur@google.com> Reviewed-by: Avinash Parchuri <aparchur@google.com>	2020-02-28 19:42:51 +00:00
Florin Malita	917dddece6	[skottie] Add support for text grouping alignment AE allows relative adjustments for text animator transform anchor points [1]. [1] https://helpx.adobe.com/after-effects/using/animating-text.html#text_anchor_point_properties Change-Id: If98d6b522e73a768ed2358d918867d2aefd09071 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/274044 Reviewed-by: Ben Wagner <bungeman@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-02-28 18:59:05 +00:00
Florin Malita	9642b31a71	[skottie] Add support for text animator blur In addition to transforms/opacity/etc, text animators can target per-glyph blur. Change-Id: I6ab63a6e49a64beaf63fc955f0b672a5b8ba84ba Reviewed-on: https://skia-review.googlesource.com/c/skia/+/272886 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-02-24 14:01:16 +00:00
Florin Malita	178b860769	[skottie] Initial support for per-character 3D When per-character 3D is enabled, text properties can be animated in 3 dimensions. - position and scale become 3-value vectors - in addition to existing "r" (really rz), rotation gains "rx" and "ry" - instead of specializing for 3D, expand the existing structures to handle both 3D and 2D modes - also ensure that sksg::Transform does not flatten to SkMatrix Change-Id: I426a7ee1ff38c1702deb85e9f1db80f6069f36d6 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/272648 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-02-21 21:14:02 +00:00
Florin Malita	0de01c05b7	[skottie] Clip overflowing paragraph lines AE discards lines with baselines outside the paragraph box. This aligns Skottie's behavior with AE for default/top-alignment (but not for any of the custom vertical alignment modes). Bug: skia:9933 Change-Id: Id0318f0744bf89580774e89494faf19bfb6f6d14 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/272376 Reviewed-by: Ben Wagner <bungeman@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-02-20 18:35:15 +00:00
Ben Wagner	d38f00a12a	Skip degenerate contours in glyphs. Stroking in Skia follows the SVG rules of adding end caps to degenerate contours. Skip all degenerate contours and degenerate curves on contours to avoid this. Bug: skia:9820 Change-Id: I320beeeb3728f39c764729454dcb128a05524d35 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/268166 Commit-Queue: Ben Wagner <bungeman@google.com> Reviewed-by: Herb Derby <herb@google.com>	2020-02-13 16:22:42 +00:00
Mike Klein	5cdeb390d0	only emit _imm ops when JITing for x86 There are probably ways to make this more efficient by only optimizing what's necessary (e.g. try JIT first, then interpreter only if it fails) and some other performance improvements to make, but for now I want to focus mostly on keeping things simple and correct. The line between Builder::done() and Program::Program() is particularly fuzzy and becoming fuzzier here, and I think that'll be something that'll change eventually. This makes SkVMTest debug dumps more portable, though perhaps less useful. Might kill that feature soon now that SkVM is tested more thoroughly in unit tests and GMs and bots and such. Change-Id: Id9ce8daaf8570e5bea8b10f1a80b97f5b33d45dc Reviewed-on: https://skia-review.googlesource.com/c/skia/+/269941 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-02-10 19:26:05 +00:00
Leon Scroggins III	42a604f431	Allow decoding without color conversion - part 2 Bug: b/135133301 Follow-on to `196f319b`. - Add SkCodec::getICCProfile to match the SkAndroidCodec version. - Update comments on getPixels() regarding how the SkColorSpace on the SkImageInfo is treated. - Add two new images that have ICC profiles that do not map to an SkColorSpace. Add a test to verify that they have the un-transformed color we expect. - Stop uploading ColorCodecSrc images decoded to a null SkColorSpace to Gold. Though they may be correct, they do not match other images they're compared against. The new test above verifies that we do not do color conversion with a null SkColorSpace. Change-Id: I08635e4262f16500fab32ef97511d305c2c06483 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/269236 Reviewed-by: Derek Sollenberger <djsollen@google.com> Commit-Queue: Leon Scroggins <scroggo@google.com>	2020-02-07 19:24:33 +00:00
Mike Klein	4bb619554e	move instruction specialization later This adds a specialization pass to Builder::optimize() and moves the x86-specific _imm ops there, rewriting with the Builder API itself. I'm only using the private Builder::push() call for the moment, but that's enough to make me feel confident that this is a good way forward: it's still all going through CSE that way. We're still doing this any time we're on x86, not when targeting the JIT, but that'll come next, see the new TODOs. It's mildly better for the interpreter to not use the _imm ops, but this is really all still warmup for optimizations with less mild opinions. I'm not proud of the switch/goto impl but it's the clearest I found. Change-Id: I30594b403832343528b95967724fd50324cd79d1 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/269232 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-02-07 16:21:08 +00:00
Mike Klein	ed9b1f1c1e	refactor out a middle representation Kind of brewing a big refactor here, to give me some room between skvm::Builder and skvm::Program to do optimizations, backend specializations and analysis. As a warmup, I'm trying to split up today's Builder::Instruction into two forms, first just what the user requested in Builder (this stays Builder::Instruction) then a new type representing any transformation or analysis we've done to it (OptimizedInstruction). Roughly six important optimizations happen in SkVM today, in this order: 1) constant folding 2) backend-specific instruction specialization 3) common sub-expression elimination 4) reordering + dead code elimination 5) loop invariant and lifetime analysis 6) register assignment At head 1-5 all happen in Builder, and 2 is particularly awkward to have there (e.g. mul_f32 -> mul_f32_imm). 6 happens in Program per-backend, and that seems healthy. As of this CL, 1-3 happen in Builder, 4-5 now on this middle OptimizedInstruction format, and 6 still in Program. I'd like to get to the point where 1 stays in Builder, 2-5 all happen on this middle IR, and 6 stays in Program. That ought to let me do things like turn mul_f32 -> mul_f32_imm when it's good to and still benefit from things like common sub-expression elimination and code reordering happening after that transformation. And then, I hope that's also a good spot to do more complicated transformations, like lowering gather8 into gather32 plus some fix up when targeting an x86 JIT but not anywhere else. Today's Builder is too early to know whether we should do this or not, and in Program it's actually kind of awkward to do this sort of thing while also having to do register assignment. Some middle might be right. Change-Id: I9c00268a084f07fbab88d05eb441f1957a0d7c67 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/269181 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-02-06 20:09:53 +00:00
Florin Malita	4a6a640299	[skottie] Add support for ADBE Pro Levels2 effect Similar to existing ADBE Easy Levels2, but provides separate mapping controls per channel. Change-Id: Ibc58c58e1e8cb8793d6eb819998c1804ccbbf859 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/268936 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-02-05 21:05:08 +00:00
Robert Phillips	d4f68317fe	Add SW decode of ETC1 and a GM The GM exercises the compressed image formats using externally created resources Note: the original image for the new flower resources can be found on Wikimedia Commons and has a "CC0 1.0 Universal Public Domain Dedication" license. Bug: skia:9680 Change-Id: I6c5f9a12fcbbecdc3ba548dbb078bc21522073fe Reviewed-on: https://skia-review.googlesource.com/c/skia/+/267836 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Robert Phillips <robertphillips@google.com>	2020-02-03 13:56:15 +00:00
Florin Malita	cc982ecbca	[skottie] Cleanup: convert remaining effects to new adapter pattern Also add a couple of tests for missing coverage. TBR= Change-Id: I420c71d73657c5d003fda94a4c43dde20a1a6b87 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/267556 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-01-30 15:30:08 +00:00
Florin Malita	ad9110026b	[skottie] Separate text resize options The sk_vj text property (Skottie extension) is currently mixing vertical alignment and resizing semantics into a single enum. This precludes certain valid combinations. Split the resize options into a separate enum (ResizePolicy), and ensure support for all combinations. Before: "sk_vj": 0 -> Shaper::VAlign::kVisualTop "sk_vj": 1 -> Shaper::VAlign::kVisualCenter "sk_vj": 2 -> Shaper::VAlign::kVisualBottom "sk_vj": 3 -> Shaper::VAlign::kVisualResizeToFit "sk_vj": 4 -> Shaper::VAlign::kVisualDownscaleToFit After: "sk_vj": 0 -> Shaper::VAlign::kVisualTop "sk_vj": 1 -> Shaper::VAlign::kVisualCenter "sk_vj": 2 -> Shaper::VAlign::kVisualBottom "sk_rs": 0 -> Shaper::ResizePolicy::kNone "sk_rs": 1 -> Shaper::ResizePolicy::kScaleToFit "sk_rs": 2 -> Shaper::ResizePolicy::kDownscaleToFit Bug: skia:9809, skia:9810 Change-Id: I631ae1fa31a9bc9c6958bb480354138591d504ff Reviewed-on: https://skia-review.googlesource.com/c/skia/+/267040 Reviewed-by: Ben Wagner <bungeman@google.com> Reviewed-by: Isabel Ren <isabelren@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-01-29 00:09:42 +00:00
Florin Malita	7c7cd30550	[skottie] Add custom props rendering GM Also fix a couple of custom props issues: - solid layer colors were not dispatched - text values were not sync'ed TBR= Change-Id: I827f8c1d8c8bb73b03f05de15e1c7c96753a631e Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264936 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2020-01-17 13:53:02 +00:00
Mike Klein	57bdb24d0e	skip no-op masks extract() can generate silly instruction patterns like v0 = ... v1 = shr v0 24 v2 = bit_and v1 FF v3 = whatever v2 ... This CL skips those pointless bit_ands when we see the mask is an immediate and (0xFFFFFFFF>>shift) == mask. Change-Id: I2bb3847fbb2efdf24d024870ac37b37bb8f9aa3c Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263101 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-01-08 21:59:24 +00:00
Mike Klein	a6434a5ef5	refactor bit ops - Remove extract... it's not going to have any special impl. I've left it on skvm::Builder as an inline compound method. - Add no-op shift short circuits. - Add immediate ops for bit_{and,or,xor,clear}. This comes from me noticing that the masks for extract today are always immediates, and then when I started converting it to be (I32, int shift, int mask), I realized it might be even better to break it up into its component pieces. There's no backend that can do extract any better than shift-then-mask, so might as well leave it that way so we can dedup, reorder, and specialize those micro ops. Will follow up soon to get this all JITing again, and these can-we-JIT test changes will be reverted. Change-Id: I0835bcd825e417104ccc7efc79e9a0f2f4897841 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263217 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-01-08 21:20:54 +00:00
Mike Klein	b5c435579e	upgrade debugging tools - Add instruction numbers to program dumps. - Dump the program when an assertion fails, and print the failing condition or an optional other value (e.g. if alpha outside [0,1], print alpha). With all that and the new commented assert enabled, I'm seeing that sometimes we get a bilerp alpha of 0x3f800001, just a little more than 1.0f. Fix still tbd. Change-Id: I2c20e41ae370d8cd2963e2dbf0fd91aa0fd50061 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262808 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2020-01-07 18:26:22 +00:00
Ben Wagner	ab51c2ce08	Add more variation support on Mac. With the recent transition to creating fonts from data as CTFonts and dropping variation support from macOS 10.11 and earlier, it is now possible to reliably make variation clones and get the axis information. Change-Id: Ia9a0922ac94a29e1508d2e74d4ce973751044866 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/259421 Reviewed-by: Herb Derby <herb@google.com> Reviewed-by: Dominik Röttsches <drott@chromium.org> Commit-Queue: Ben Wagner <bungeman@google.com>	2019-12-13 18:16:13 +00:00
Florin Malita	46a331b93f	[skottie] Cascading track matte support Currently, we treat track matte source layers (tagged with td:1) as single-shot mask triggers: we apply once to the following layer, then move on. But track mattes can cascade: a layer with a matte can itself be applied as a track matte for the following layer. Also, for matte/masking purposes, only the layer content is being considered (ignoring blend mode and any masks applied to the matte itself). To support this, refactor the layer attachment code: - instead of tracking the presence of a single-shot matte source, always track previous layer content trees - instead of triggering matte attachment in the presence of a matte source, trigger based on the matte target property (tt: X) - log errors on unknown matte modes Change-Id: I6c71d4007e1e27d3f3a139344bbf367d7bc6e29d Reviewed-on: https://skia-review.googlesource.com/c/skia/+/259820 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-12-12 21:42:11 +00:00
Florin Malita	e1fa70000a	[skottie] Invert effect support https://helpx.adobe.com/after-effects/using/channel-effects.html#invert_effect Change-Id: Iac8e291ab9cb57714c50f1e40cecb66b3dc64ee1 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/259276 Commit-Queue: Florin Malita <fmalita@chromium.org> Reviewed-by: Mike Klein <mtklein@google.com>	2019-12-11 23:07:04 +00:00
Florin Malita	6cc49538b3	[skottie] Fix precomposed camera sizing Precomp layers can have a different size vs. main composition. Instead of relying on the global animation (main comp) size, use the current (pre)comp size when setting up cameras. Change-Id: I54106375fb39dde2bfd11e14a38e5ec3e7190764 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/258156 Commit-Queue: Florin Malita <fmalita@chromium.org> Commit-Queue: Mike Reed <reed@google.com> Auto-Submit: Florin Malita <fmalita@chromium.org> Reviewed-by: Mike Reed <reed@google.com>	2019-12-05 14:34:15 +00:00
Brian Osman	db2e7641be	Particles: SkImageBinding to allow sampling an image from script Provides functionality similar to AE property maps Change-Id: I1705706a6b7e25fbab55465f2e20d0b145330b0b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/255977 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-12-03 15:22:20 +00:00
Brian Osman	d12f2786e2	Use ResourceProvider in particles Currently just for image drawable, but going to use this for references to other kinds of data in bindings, too. Change-Id: Ic6673530013337bbaadd2d3f1c040626ec24ffb8 Bug: skia:9513 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/256776 Commit-Queue: Brian Osman <brianosman@google.com> Reviewed-by: Kevin Lubick <kjlubick@google.com>	2019-11-27 16:45:23 +00:00
Mike Klein	1cb05993af	all-constant peepholes This adds a bunch of tests for ops that can all be evaluated directly in skvm::Builder. You can see the sort of effect this has by looking at the diffs for SkVMTest.expected... lots of `v3 = sub_f32 v2 v2` transformed to `v3 = splat 0 (0)` and that sort of thing. My favorite part is handling many assert_true() calls at compile time! While the old inter-Op code parallels aren't as clear now, these new early-out tests kind of work like comments explaining each op. I find that nice. I found it hard to parse so many uses of the word "splat" so I did go back to isImm() from isSplat(), and added allImm() to test for and read several immediates all at once. Some of this is less C++17 than I'd like. :/ Change-Id: Ie8187d5d184195e3c0c92d613508fb708c28302f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/255814 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-11-21 22:00:11 +00:00
Florin Malita	ad76b2ee25	[skottie] One-node camera support So far Skottie has been assuming all cameras are two-node (have a point of interest). AE also supports one-node cameras, where the camera does not auto-orient towards a POI but starts off perpendicular to the z == 0 plane. (https://helpx.adobe.com/after-effects/how-to/camera-animation.html) Change-Id: Id565de7d8feb9a762940ac372c1bbbcce2e2dfc6 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/254559 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-11-14 18:45:18 +00:00
Mike Klein	8c1e0effbb	sketch out structure for ops with immediates Lots of x86 instructions can take their right hand side argument from memory directly rather than a register. We can use this to avoid the need to allocate a register for many constants. The strategy in this CL is one of several I've been stewing over, the simplest of those strategies I think. There are some trade offs particularly on ARM; this naive ARM implementation means we'll load&op every time, even though the load part of the operation can logically be hoisted. From here on I'm going to just briefly enumerate a few other approaches that allow the optimization on x86 and still allow the immediate splats to hoist on ARM. 1) don't do it on ARM A very simple approach is to simply not perform this optimization on ARM. ARM has more vector registers than x86, and so register pressure is lower there. We're going to end up with splatted constants in registers anyway, so maybe just let that happen the normal way instead of some roundabout complicated hack like I'll talk about in 2). The only downside in my mind is that this approach would make high-level program descriptions platform dependent, which isn't so bad, but it's been nice to be able to compare and diff debug dumps. 2) split Op::splat up The next less-simple approach to this problem could fix this by splitting splats into two Ops internally, one inner Op::immediate that guarantees at least the constant is in memory and is compatible with immediate-aware Ops like mul_f32_imm, and an outer Op::constant that depends on that Op::immediate and further guarantees that constant has been broadcast into a register to be compatible with non-immediate-aware ops like div_f32. When building a program, immediate-aware ops would peek for Op::constants as they do today for Op::splats, but instead of embedding the immediate themselves, they'd replace their dependency with the inner Op::immediate. 
On x86 these new Ops would work just as advertised, with Op::immediate a runtime no-op, Op::constant the usual vbroadcastss. On ARM Op::immediate needs to go all the way and splat out a register to make the constant compatible with immediate-aware ops, and the Op::constant becomes a noop now instead. All this comes together to let the Op::immediate splat hoist up out of the loop while still feeding Op::mul_f32_imm and co. It's a rather complicated approach to solving this issue, but I might want to explore it just to see how bad it is. 3) do it inside the x86 JIT The conceptually best approach is to find a way to do this peepholing only inside the JIT only on x86, avoiding the need for new Op::mul_f32_imm and co. ARM and the interpreter don't benefit from this peephole, so the x86 JIT is the logical owner of this optimization. Finding a clean way to do this without too much disruption is the least baked idea I've got here, though I think the most desirable long-term. Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Debug-All-SK_USE_SKVM_BLITTER,Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SK_USE_SKVM_BLITTER Change-Id: Ie9c6336ed08b6fbeb89acf920a48a319f74f3643 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/254217 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-11-12 20:17:55 +00:00
Brian Salomon	c75bc031ef	Clamp RGB outputs of GrYUVtoRGBEffect. The matrices we're using can produce very slightly out of range color channels. This gives surprising results when in shader blending is used for color burn and color dodge. After this change we clamp the RGB values to 0..1 before applying premul. Adds a GM modeled on a blink layout test that shows the problem using SkImageMakeFromYUVAPixmaps. Bug: skia:9619 Change-Id: I446d39763a7f5a2f7c5f61d94d163927d851baa3 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/253879 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Salomon <bsalomon@google.com>	2019-11-11 20:04:15 +00:00
Mike Klein	4135cf0b57	use round() instead of trunc() to f32->unorm This does open us up to a little bit of possible inconsistency of rounding when right on a x.5 (sometimes we'll +0.5 and trunc, sometimes round to nearest, sometimes round according to the default mode which is usually round to nearest) but I think that inconsistency may be worth the free register not needing a splat(0.5f) buys us. A few invisible diffs. Change-Id: I9af092c937ccf7c5891c2ab3cb298d217e4a9e9f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/253725 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Mike Reed <reed@google.com>	2019-11-08 21:28:07 +00:00
Mike Klein	6e4aad91c3	rename to_i32 -> trunc, and add round This plumbs through round but doesn't use it. I want that change to be its own CL. It's nice to have assembler support and the name changes even if I revert using round. Change-Id: I6d67ec5c63546069eb7cc1c91599b599bafcda66 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/253724 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-11-08 21:00:51 +00:00
Julia Lavrova	2e30fde046	Font resolution: all unit tests working Change-Id: Ie6ee30901d599ceefa42651add79bb0288c54c48 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/249004 Commit-Queue: Julia Lavrova <jlavrova@google.com> Reviewed-by: Ben Wagner <bungeman@google.com> Reviewed-by: Julia Lavrova <jlavrova@google.com>	2019-11-08 17:24:14 +00:00
Florin Malita	91a1ec34bf	[skottie] Streamlined gradient stop merger Refactor as a single interpolating loop, based on careful selection of lerp coefficients. Change-Id: I58786cddb2f042b53dcbac80c2346736429be102 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/252858 Commit-Queue: Florin Malita <fmalita@chromium.org> Reviewed-by: Mike Reed <reed@google.com>	2019-11-05 19:44:11 +00:00
Florin Malita	73a722ce97	[skottie] Fix trim path mode interpretation "m": 1 -> parallel trim "m": 2 -> serial trim (we had these backwards) TBR= Bug: skia:9599 Change-Id: Ib764c04a96c3a1e627553d8b8588028a411b5240 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/252796 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-11-05 18:31:24 +00:00
Mike Klein	e8356ad35d	indent loop so it stands out Change-Id: Iea0f804b1b2fed9e663e45c33fb54a91b10fd07b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/252652 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-11-05 11:33:54 +00:00
Florin Malita	e96214c32b	[skottie] Add a couple more 3D tests TBR= Change-Id: I0602ae6bf30d4c41ecfd9b5995968364c60ce391 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/252556 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-11-04 19:08:43 +00:00
Florin Malita	512eb94916	[skottie] Fix layer blend modes under mattes The layer blend mode should be applied post-masking (after compositing with the matte layer). TBR= Change-Id: Ie84760526cd9be95f08bc68bc5a8dbfb635ca905 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/251316 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-10-29 01:49:26 +00:00
Florin Malita	c6fbedc507	[skottie] 3D layer parenting refinements Observed AE layer parenting semantics: * layers are flagged as either 2D or 3D * camera applies to 3D layers, but not to 2D layers * parented 3D layers treat their ancestor transform chain as 3D (SkMatrix44) * parented 2D layers treat their ancestor transform chain as 2D (SkMatrix, ignoring 3D components) This means that for a given layer, we may need to build two distinct transform chains - depending on the type of descendant layer being considered. Furthermore, transforms are animatable and their animators are scoped to a layer controller. Since we're potentially building two versions of the transform node, we need to ensure all animators for both of them are transferred to the controller object (we still want to only instantiate a single layer controller and render tree to avoid duplication). IOW, all dependent layer transforms need to be considered before "sealing off" a given layer controller. In order to avoid a layer dependency/topological sort, we can split off the transform tree construction into a separate pass. High-level changes: -- replace existing LayerAttachContext with CompositionBuilder (holds LayerBuilders and other Composition-wide state) -- replace LayerRec with LayerBuilder (holds Layer-wide state and also caches transform nodes) -- pass 1: for each LayerBuilder, transitively build and cache a transform chain of a type (2d/3d) determined by the leaf (entry point) layer -- pass 2: for each LayerBuilder, build the actual layer content render tree and instantiate the layer controller objects Bug: skia:8914 Change-Id: I9f7efcf4819424282fd3dda98f5621ba12fd001b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/251001 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-10-28 20:29:54 +00:00
Hal Canary	e107faa062	SkRemoteGlyphCache Add tracing to diff canvas Use `extra_cflags=["-DSK_CAPTURE_DRAW_TEXT_BLOB"]` to enable. Change-Id: I1d6db478ee91696cdce090647b889c17a83a2718 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/250259 Commit-Queue: Hal Canary <halcanary@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-10-24 17:09:31 +00:00
Mike Klein	d48488b5ea	reorder to minimize register pressure Rewrite program instructions so that each value becomes available as late as possible, just before it's used by another instruction. This reorders blocks of instructions to reduce the number of temporary registers in flight. Take this example of the sort of program that we naturally write, noting the registers needed as we progress down the right: src = load32 ... (1) sr = extract src ... (2) sg = extract src ... (3) sb = extract src ... (4) sa = extract src ... (4, src dies) dst = load32 ... (5) dr = extract dst ... (6) dg = extract dst ... (7) db = extract dst ... (8) da = extract dst ... (8, dst dies) r = add sr dr (7, sr and dr die) g = add sg dg (6, sg and dg die) b = add sb db (5, sb and db die) a = add sa da (4, sa and da die) rg = pack r g ... (3, r and g die) ba = pack b a ... (2, b and a die) rgba = pack rg ba ... (1, rg and ba die) store32 rgba ... (0, rgba dies) That original ordering of the code needs 8 registers (perhaps with a temporary 9th, but we'll ignore that here). This CL will rewrite the program to something more like this by recursively issuing inputs only once needed: src = load32 ... (1) sr = extract src ... (2) dst = load32 ... (3) dr = extract dst ... (4) r = add sr dr (3, sr and dr die) sg = extract src ... (4) dg = extract dst ... (5) g = add sg dg (4, sg and dg die) rg = pack r g (3, r and g die) sb = extract src ... (4) db = extract dst ... (5) b = add sb db (4, sb and db die) sa = extract src ... (4, src dies) da = extract dst ... (4, dst dies) a = add sa da (3, sa and da die) ba = pack b a (2, b and a die) rgba = pack rg ba ... (1, rg and ba die) store32 rgba ... (0) That trims 3 registers off the example, just by reordering! I've added the real version of this example to SkVMTest.cpp. (Its 6th register comes from holding the 0xff byte mask used by extract, in case you're curious). 
I'll admit it's not exactly easy to work out how this reordering works without a pen and paper or trial and error. I've tried to make the implementation preserve the original program's order as much as makes sense (i.e. when order is an otherwise arbitrary choice) to keep it somewhat sane to follow. This reordering naturally skips dead code, so pour one out for ☠️ . We lose our cute dead code emoji marker, but on the other hand all code downstream of Builder::done() can assume every instruction is live. Change-Id: Iceffcd10fd7465eae51a39ef8eec7a7189766ba2 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/249999 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-10-22 21:49:05 +00:00
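The "recursively issuing inputs only once needed" idea from d48488b5ea can be sketched on a toy IR. Everything here is an assumption made for illustration: `Inst`, `reorder`, and `peak_registers` are hypothetical names, not skvm::Builder's real types, and the register count ignores constants and temporaries. Instructions with side effects (stores) anchor the schedule, each pulls in its not-yet-issued inputs depth first, and anything never reached is dead code and is dropped.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy IR for illustration; not skvm::Builder's real types.
struct Inst {
    std::vector<int> inputs;  // indices of earlier instructions this one reads
    bool has_side_effect;     // e.g. a store; produces no value, anchors the schedule
};

// Issue `id` only after (recursively) issuing any inputs not issued yet.
static void issue(const std::vector<Inst>& program, int id,
                  std::vector<bool>& done, std::vector<int>& order) {
    if (done[id]) { return; }
    for (int in : program[id].inputs) {
        issue(program, in, done, order);
    }
    done[id] = true;
    order.push_back(id);
}

// New instruction order: walk the side-effecting instructions in their
// original order and pull inputs in as late as possible.
std::vector<int> reorder(const std::vector<Inst>& program) {
    std::vector<bool> done(program.size(), false);
    std::vector<int> order;
    for (int id = 0; id < static_cast<int>(program.size()); id++) {
        if (program[id].has_side_effect) {
            issue(program, id, done, order);
        }
    }
    return order;
}

// Peak number of values alive at once when running `program` in `order`.
// Assumes every value-producing instruction in `order` is read at least once.
int peak_registers(const std::vector<Inst>& program, const std::vector<int>& order) {
    // last_use[v] is the position in `order` of the final reader of value v.
    std::vector<int> last_use(program.size(), -1);
    for (int pos = 0; pos < static_cast<int>(order.size()); pos++) {
        for (int in : program[order[pos]].inputs) {
            last_use[in] = pos;
        }
    }
    int live = 0, peak = 0;
    for (int pos = 0; pos < static_cast<int>(order.size()); pos++) {
        const Inst& inst = program[order[pos]];
        // Deduplicate inputs so a value read twice is only released once.
        std::vector<int> ins = inst.inputs;
        std::sort(ins.begin(), ins.end());
        ins.erase(std::unique(ins.begin(), ins.end()), ins.end());
        for (int in : ins) {
            if (last_use[in] == pos) { live--; }  // this input dies here
        }
        if (!inst.has_side_effect) { live++; }    // the result needs a register
        peak = std::max(peak, live);
    }
    return peak;
}
```

Fed the 18-instruction example from the message (loads, extracts, adds, packs, one store), this sketch produces the same order as the second listing, and the peak drops from 8 registers to 5.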
Florin Malita	c1b501c352	[skottie] Shift Channels effect support (https://helpx.adobe.com/after-effects/using/channel-effects.html#shift_channels_effect) Limitations: no HSL sources for now. Change-Id: Iffd63f2bbfc8c5f1de00846412be26847e822420 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/250036 Commit-Queue: Florin Malita <fmalita@chromium.org> Reviewed-by: Mike Klein <mtklein@google.com> Reviewed-by: Mike Reed <reed@google.com>	2019-10-22 20:36:01 +00:00
Brian Osman	eddfc3562f	Particles: Fake 3D example Change-Id: I6d29290eb2962262bb080a86dc829c39986cae4f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/249226 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-10-17 20:10:05 +00:00
Brian Osman	e8bcc56951	Fix a couple minor bugs in particle code - Copy effect state to particle uniforms before each script, so changes from spawn or update are visible. - Guard path binding against out of range access - New effect that actually stresses both of these conditions Change-Id: Ice6112793099e515438af8bb863e9e1bf03d08b1 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/249125 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-10-17 14:55:34 +00:00
Mike Klein	0f61c12737	add used_in_loop bit to skvm::Builder::Instruction Most hoisted values are used in the loop body (and that's really the whole point of hoisting) but some are just temporaries to help produce other hoisted values. This used_in_loop bit helps us distinguish the two, and lets us recycle registers holding temporary hoisted values not used in the loop. The can-we-recycle logic now becomes: - is this a real value? - is it time for it to die? - is it either not hoisted or a hoisted temporary? The set-death-to-infinity approach for hoisted values is now gone. That worked great for hoisted values used inside the loop, but was too conservative for hoisted temporaries. This lifetime extension was preventing us from recycling those registers, pinning enough registers that we run out and fail to JIT. Small amounts of refactoring to make this clearer: - move the Instruction hash function definition near its operator== - rename the two "hoist" variables to "can_hoist" for Instructions and "try_hoisting" for the JIT approach - add ↟ to mark hoisted temporaries, _really_ hoisted values. There's some redundancy here between tracking the can_hoist bit, the used_in_loop bit, and lifetime tracking. I think it should be true, for instance, that !can_hoist && !used_in_loop implies an instruction is dead code. I plan to continue refactoring lifetime analysis (in particular reordering instructions to decrease register pressure) so hopefully by the time I'm done that metadata will shake out a little crisper. Change-Id: I6460ca96d1cbec0315bed3c9a0774cd88ab5be26 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/248986 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-10-16 18:29:06 +00:00
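The three-part can-we-recycle check listed in 0f61c12737 can be written as a single predicate. This is a sketch under stated assumptions: `ValueInfo`, `kNoValue`, and `can_recycle` are hypothetical names, the field names only echo the commit message, and "time for it to die" is taken to mean the current instruction is the value's last reader.

```cpp
#include <cassert>

// Hypothetical per-value metadata; the field names echo the commit message,
// but the layout is an assumption made for this sketch.
struct ValueInfo {
    int  death;         // index of the last instruction that reads this value
    bool can_hoist;     // loop-invariant, may be computed before the loop
    bool used_in_loop;  // read by some instruction inside the loop body
};

constexpr int kNoValue = -1;  // assumed sentinel for an unused argument slot

// May the register holding value `id` be reused after instruction `current`?
bool can_recycle(int id, const ValueInfo& v, int current) {
    const bool is_real_value = (id != kNoValue);
    const bool time_to_die   = (v.death == current);
    // Hoisted values used in the loop stay pinned; hoisted temporaries and
    // ordinary loop-body values do not.
    const bool not_pinned    = !v.can_hoist || !v.used_in_loop;
    return is_real_value && time_to_die && not_pinned;
}
```

With this shape a hoisted temporary (`can_hoist` set, `used_in_loop` clear) gives its register back at its death, while a hoisted value read inside the loop never does.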
Brian Osman	5b43113e75	Interpreter: Reflect all uniform variables in ByteCode Gives enough information to locate variables by name (using the same scheme as glGetUniformLocation), and provide hints about type and size. Bug: skia:9513 Change-Id: I9444f1042471967a79c9f05167dcdb78eca41bad Reviewed-on: https://skia-review.googlesource.com/c/skia/+/244502 Reviewed-by: Ethan Nicholas <ethannicholas@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-10-16 15:35:48 +00:00
Florin Malita	59e72b71b5	[skottie] Luma matte support Expand matte support to include normal/inverted luma modes [1]. [1] https://helpx.adobe.com/after-effects/using/alpha-channels-masks-mattes.html#track_mattes_and_traveling_mattes TBR= Bug: skia:9390 Change-Id: Ie6555852e70449e4343944c70d2f9b8a98bb33cb Reviewed-on: https://skia-review.googlesource.com/c/skia/+/248701 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-10-16 13:17:51 +00:00
Brian Osman	df18296f98	Add accessors to get/set SkParticleEffect fields Simplify burst handling. Scripts should just add to burst (if they want to handle programmatic bursting, as well). Update most effects to handle dynamic updates to position better, and add a sample effect meant to be used with mouse tracking. Change-Id: Ia302e1d04e62e2b07974807c44067786cc10a8ad Bug: skia:9513 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/248798 Commit-Queue: Brian Osman <brianosman@google.com> Reviewed-by: Brian Osman <brianosman@google.com>	2019-10-15 14:54:50 +00:00
Brian Osman	7edfb69406	Remove SkCurve and SkColorCurve This was only being used in one effect (and for no good reason). SkSL is plenty powerful to re-implement something similar if required, at no real performance cost. Re-implemented the one effect that used it with simpler math in the script, updated the copy of that effect in the gallery. Docs-Preview: https://skia.org/?cl=247040 Change-Id: I68c86d6550dd4f003f6ba5ecd0febab37b86540b Bug: skia:9513 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/247040 Reviewed-by: Kevin Lubick <kjlubick@google.com> Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-10-08 15:38:20 +00:00
Brian Osman	647c7a97d3	Particles: New confetti effect, minor tweaks elsewhere Confetti mimics the look of a standard skottie asset Change-Id: Iffeedeb24182c4ac2d3ec390614bc1861b821376 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/246518 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-10-07 14:06:56 +00:00
Brian Osman	559ffe4a23	Particles: Added particle flags for tracking state (one-time triggers, etc) Also removed some older effects that weren't interesting, improved others, cleaned up the unused functions in several, and renamed most of them to reflect which feature they're demonstrating. Change-Id: Ib44a00ec3d25e852a1d1661918137ba13d30c86b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/244119 Reviewed-by: Michael Ludwig <michaelludwig@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-09-25 15:58:09 +00:00
Brian Osman	9a8b846baf	Particles: Sub-effect spawning and some slight refactoring * Added a new binding type, SkEffectBinding. This stores another entire effect params structure (so the JSON is just nested). The name is a callable value that spawns a new instance of that effect, inheriting the parameters of the spawning effect or particle (depending on which kind of script made the call). * Broke up the monolithic update function into some helpers, got some code reuse with the script calling logic. * Unlike particle capacity, there is no upper limit on child effects (yet), so it's easy to trigger runaway memory and CPU consumption. Be careful. * Added death scripts to effects and particles, which are a common place to want to spawn sub-effects. Like spawn, these run on each loop, but for one-shots they play at the end. Even with loops, this is helpful for timing sub-effects (see fireworks2.json). * Finally, added a much more comprehensive example effect, raincloud.json. This includes a total of three effects, to generate a cloud, raindrops, and splashes when those drops hit "the ground". Change-Id: I3d7b72bcbb684642cd9723518b67ab1c7d7a538a Reviewed-on: https://skia-review.googlesource.com/c/skia/+/242479 Reviewed-by: Michael Ludwig <michaelludwig@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-09-19 15:18:13 +00:00
Brian Osman	d46cb9729b	Particle effect scripting update This change adds another layer of complexity and control to the particle system. There are now two code chunks: the old code that's run per-particle, and new code that's run for the effect itself. This allows for effect lifetime to be set by the script (eg, randomly), as well as the emission rate. Rate can vary over time (see pulse.json), and particles can be emitted in bursts by setting the effect's burst field (see fireworks.json). Additionally, the effect has its own frame of reference and color, which becomes the default state for newly emitted particles. This allows synchronizing state across particles in various interesting ways (see color in fireworks.json). Change-Id: Iec2f7a3427ce1d6411ed7ef5b3023cbef2e8a134 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/240498 Reviewed-by: Brian Osman <brianosman@google.com> Reviewed-by: Michael Ludwig <michaelludwig@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-09-16 17:48:04 +00:00
Florin Malita	e359aa35d1	[sksg] Fix mask/context overrides interaction We're currently letting render context overrides (opacity, color filters, blend mode, etc) spill down the descendent/mask content tree. This is not ideal, as mask content isolation breaks atomicity assumptions for deferred overrides. Case in point: motion blur uses SkBlendMode::kPlus to accumulate content "layers" - but since mask content gets rendered into a separate layer, it fails to produce the expected result. The fix is to realize all context overrides on the top-level mask layer (we already allocate this layer, so there's no reason to defer downstream anyway). Change-Id: Icbb7e403f90feecfae5846697f559a03d8aa4097 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/239036 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-09-04 14:09:44 +00:00
Florin Malita	165ca3f85b	[skottie] Text selector ease-high/ease-low support Change-Id: Ia879868df677cabca6d5fcd09845efdb6147ee8e Reviewed-on: https://skia-review.googlesource.com/c/skia/+/238177 Reviewed-by: Ben Wagner <bungeman@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-08-29 21:34:10 +00:00
Brian Osman	8a97782956	Move common particle code to an automatically-injected header Change-Id: If99e1802c8187ebd98b67717d744c6695bb25900 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/238118 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-08-29 18:49:33 +00:00
Florin Malita	b9fb29f21e	[skottie] Shaper downscale-to-fit vertical alignment mode Introduce a new hybrid valign extension, kVisualDownscaleToFit (sk_vj: 4): - when the text shaped at the requested size fits within the box, center vertically (same as kVisualCenter) - otherwise, scale down until it fits (same as kVisualResizeToFit) Change-Id: I8e096a49e2b87582e1bd42161657ec4ef561ebdf Reviewed-on: https://skia-review.googlesource.com/c/skia/+/235601 Reviewed-by: Ben Wagner <bungeman@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-08-19 18:59:06 +00:00
Florin Malita	feacb0fb34	[skottie] Add support for multiple range selectors Text animators can have more than one range selector. (depends on https://github.com/bodymovin/bodymovin-extension/pull/21) TBR= Change-Id: Id7f73386853f0e0f9e3c0f15d5a87ec1653ba873 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/234319 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-08-14 17:34:02 +00:00
Mike Klein	f996311003	extend lifetimes for hoisted used in loop This makes the register recycling checks a bit more precise. At head we never recycle a register that's holding a hoisted value, which is overly conservative. We really should never recycle a register that's still needed. By extending the lifetime of any hoisted value that's used in the loop, we prevent that, while still allowing hoisted values that are only used in hoisted computation to be reused. This takes just a small tweak in the JIT code (removing the !hoisted({x,y,z}) checks), and a somewhat larger refactoring in the interpreter, making both hoisted and non-hoisted code go through the same recycling register assignment flow. There's one diff in the existing cases where we now reuse a hoisted register, and I've added a second test just to make sure it's covered explicitly. Change-Id: I25b37ab1f1fea3042d7fd167529abc8fed1dddff Reviewed-on: https://skia-review.googlesource.com/c/skia/+/233239 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-08-13 02:08:16 +00:00
Florin Malita	17b9d1d1de	[skottie] Initial Hue/Saturate effect support Due to limitations in BodyMovin/AE JSX, full effect data is not available (specifically the "channel range" property). We only support static master hue, static master saturation and static master lightness at this point. This CL also introduces a new animation builder pattern: DiscardableAdapterBase and attachDiscardableAdapter(). The former is a base class for adapters with full animator ownership. This enables a) capturing raw adapter pointers in animator lambdas and b) syncing to SG only once, after all local animators are updated. The latter is a helper for managing adapter creation and optional destruction (when all adapter properties are static we can discard it). Change-Id: Iecc4b78830e5464e7958cb12cdfd75a61010aa25 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/231956 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-08-02 18:04:14 +00:00
Mike Klein	8ac9f4e5b2	flesh out SkVM ops a bit more Add missing comparison and selection ops, bit casts, 16-bit memory operations, gathers, uniform loads, and fill in math holes where reasonable. Update some names to be a bit more regular. I think all instructions are implemented in the interpreter, and many tested. More testing and JITs to follow. Change-Id: I8cf377e8b72a86ac950e020892ce82b39e9d7277 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229893 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-07-29 20:43:10 +00:00
Brian Osman	e59acb79b8	Particles: Merge spawn & update into one code string with two functions Change-Id: If57fb79db8f8c5fd185fefaa202167c8082dd846 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229921 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Brian Osman <brianosman@google.com>	2019-07-25 23:51:07 +00:00
Brian Osman	d6108add51	Particles: Use list of lines for multi-line string serialization Change-Id: Ic81b3433b485ca9ce0e60bd10ec12706e673ee89 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229917 Commit-Queue: Brian Osman <brianosman@google.com> Commit-Queue: Mike Klein <mtklein@google.com> Auto-Submit: Brian Osman <brianosman@google.com> Reviewed-by: Mike Reed <reed@google.com> Reviewed-by: Mike Klein <mtklein@google.com>	2019-07-25 20:55:43 +00:00
Brian Osman	fe49163cd1	Major rewrite of the particle system based on the SkSL interpreter This removes all of the fixed-function particle affector classes. Instead, each particle effect just has two SkSL snippets, one for spawn logic, and one for update logic. Each one gets an inout copy of the particle struct. Ultimately, this makes the effects much simpler and smaller, while also being far more flexible (you can do whatever you want with any values you want). Finally, because the interpreter is vectorized and a particular effect's scripts are usually tuned to the specific behaviors desired, it's faster on basically every effect I compared. I re-created all of the old effects in the new system. Many just use pure SkSL (no curves or anything). Some of the old curve and path/text stuff was very handy, though - so those are now exposed as external values in the interpreter. Basically, an effect can have any number of named "bindings" that are a callable thing. This can be a path, text (shortcut for making fancy paths), curve, or color curve. The path ones return a float4 with position and normal, the curves return one or four floats. ... and this transposes all of the particle data storage into SoA form, so that it can use the much faster interpreter entry point. Change-Id: Iebe711c45994c4201041b12d171af976bc5e758e Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222057 Commit-Queue: Brian Osman <brianosman@google.com> Reviewed-by: Mike Reed <reed@google.com>	2019-07-25 19:59:03 +00:00
Mike Klein	5e533c9e1f	move hoist analysis back into Builder Even if a JIT ultimately doesn't end up hoisting any values, it's going to want this information while it decides. Writing it in one place also ensures we only get it wrong in one place... I'm not extending the lifetime of hoisted instructions here in Builder. That's something to leave to the backend so they have the flexibility of which of these values to hoist, if any. If they don't hoist, they'll need to know when the value dies. Moving this information back here lets the test expectation goldens reflect the hoist bit again too. Kind of nice. Change-Id: Ib165ca898a97c1d822cb28fe24f15bae4d570a17 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229024 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-07-22 19:34:06 +00:00
Mike Klein	c2fb3b4b72	split deaths() out of other analysis I'm slowly refactoring my way to where hoisting and register assignment are done in backend-specific ways, but this liveness analysis is always going to be useful for each backend. Use deaths() to restore friendly ☠️ dead code markers in test dumps. Change-Id: I3ab94665bbbbf0788b0b27e00d644eba927dff47 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228113 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Mike Klein <mtklein@google.com>	2019-07-17 18:11:10 +00:00
Florin Malita	5f1108ce46	[skottie] Motion blur support Unlike all other Skottie effects, motion blur requires sampling at multiple points on the timeline. To support this: 1) Introduce MotionBlurEffect - a custom SG render node which can drive the timeline of its subtree using an sksg::Animator. 2) Introduce MotionBlurController to swap for a regular LayerController when needed. MotionBlurController dispatches time ticks to MotionBlurEffect instead of directly to the layer animators. The actual motion blur impl is based on https://skia-review.googlesource.com/c/skia/+/221416. Motion blur requires Lottie files exported with this BodyMovin patch: https://github.com/bodymovin/bodymovin-extension/pull/15 Change-Id: I075e101ea91ec9aa300bac35ee810fd539f1aced Reviewed-on: https://skia-review.googlesource.com/c/skia/+/225416 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-07-09 13:02:17 +00:00
Florin Malita	97054c421e	[skottie] Add forgotten linear-wipe test TBR= Change-Id: I643fbe9491d2e134f631435444ec220af9250fc1 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/225423 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-07-03 15:06:42 +00:00
Mike Klein	aab45b5638	add misc. value programs to SkVMTest.expected Noticed we were only dumping the final register programs for the integer code. Might as well also track the value programs. Change-Id: I417c5c655b632691557bbbb136dcbd3f3167af9a Reviewed-on: https://skia-review.googlesource.com/c/skia/+/225324 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-07-02 23:13:06 +00:00
Florin Malita	afd2c10c98	[skottie] Use hybrid bounds for custom Shaper VAlign modes We used to rely solely on visual bounds for vertical alignment. That had the downside of leading/trailing empty lines being ignored. Then https://skia-review.googlesource.com/c/skia/+/220916 switched to using typographical bounds. This approach produces results in line with AE, but allows some glyphs to overflow the alignment boundary. This CL introduces a hybrid approach: 1) for standard AE text alignment, continue to use typographical bounds 2) for Skottie VAlign extensions (sk_vj), use the union of typographical and visual bounds - this should mitigate both issues mentioned above Change-Id: Ifd3ccae3d721728ce67942206160ebe92056d3a2 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/224188 Reviewed-by: Ben Wagner <bungeman@google.com> Reviewed-by: Avinash Parchuri <aparchur@google.com>	2019-06-28 11:35:09 +00:00
Florin Malita	b0944553df	[skottie] Venetian Blinds effect Change-Id: I50e133dea448e044fef45379490cb85b39eea3bc Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223856 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-06-26 13:13:10 +00:00
Mike Klein	2b7b2a2331	add bit_clear I was just reading the ARM docs and realized that their BIC ("BIt Clear") is the same as SSE's ANDN ("AND Not") instruction. It's kind of a neat little tool to have laying around... comes up more than you'd think, and it's sometimes the clearest way to express what you're doing, as in the changed program here where the comment is "mask away the low bits". That's a bit_clear with a mask for what you want to clear away! And the real reason to write this up is that I want to have a CL to point to that shows how to add an instruction top to bottom. Change-Id: I99690ed9c1009427b3986955e7ae6264de4d215c Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223120 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com> Reviewed-by: Mike Reed <reed@google.com>	2019-06-24 16:31:15 +00:00
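The operation described in 2b7b2a2331 is small enough to state in scalar form. A minimal sketch, assuming a free function named `bit_clear` for illustration (the SkVM op itself works on vector values): it keeps the bits of `x` that are not set in `mask`, which is the effect of ARM's BIC and of an and-not.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model for illustration: clear in `x` every bit that is set in `mask`.
inline uint32_t bit_clear(uint32_t x, uint32_t mask) {
    return x & ~mask;
}

// "Mask away the low bits": drop the low byte of a 32-bit value.
inline uint32_t clear_low_byte(uint32_t x) {
    return bit_clear(x, 0xFFu);
}
```

Passing the bits to remove, rather than the bits to keep, is what makes a comment like "mask away the low bits" read directly off the call.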
Mike Klein	a1167abcae	split out Analysis struct from Instruction Instruction is the fundamental data, and Analysis derived from it. The fields in Analysis are only* needed in Builder::done(), and this split seems to help clarify what done() can tweak (Analysis) and what it cannot (fProgram, Instructions). done() is now const. No speed change as far as I can tell. * As you may notice looking at the test expectations, making analysis ephemeral means that dump() can no longer print the skull for dead code or the arrow for hoisted. The register program that's also in the expectation file still reflects both of these optimizations, so we're not really losing any information. Just maybe less demo-friendly. Change-Id: I79feb57558525591baf3faadeb59c418c12793f3 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223119 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-24 16:00:41 +00:00
Mike Klein	0c3346643a	refactor to remove the need for death schedule This cuts the overhead bench from about 19µs to about 15µs. The key insight here is that the only registers that might become available after any given instruction are the ones that hold that instruction's inputs. We can check when they become available directly from the original Builder::Program, without needing a side death schedule data structure. Marking hoisted instructions as having life == program size helps make this logic a little simpler to reason through. Change-Id: Ifb9957f2d0e323e0e5d07996a2cc988f7c8b4c3f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223117 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-24 15:44:10 +00:00
Florin Malita	d7b321afa2	[skottie] Radial wipe effect Implement radial wipe with a sweep gradient shader mask filter. The implementation is slightly convoluted because edge feathering requires a real blur, which in turn requires content layer isolation. So there are two distinct operation modes: - no feather -> draw the content directly into the dest buffer, with the mask filter deferred in SG context - feather -> draw the content into a separate layer, then blend (dstOut) the composed blur+shader mask on top Change-Id: I253701aff42db8010ce463762252c262e2c5d92b Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222596 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-06-21 14:03:45 +00:00
Mike Klein	397fc88fc0	first VEX ymm vector ops - 32x8 i32 add,sub,mul - add I32_Naive bench/test builder to get better i32 mul coverage - minor refactoring all over Change-Id: I13cc19ff37a2da0bcff289ba51baac08f456d6c5 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222485 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-06-20 18:20:00 +00:00
Florin Malita	60e60dfc50	[skottie] Add support for motion tile phase The motion tile phase is a one-dimensional shift, applied to every other row or column (based on a selector property). Implement using a masking shader (covering the static rows/cols), and blend mode shader composition (srcIn for static/pass-through rows/cols, and srcOut for phased rows/cols). TBR= Change-Id: I336c150e5d4900962dc2de801a4e1572cf4b5d59 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221339 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-06-18 14:01:22 +00:00
Florin Malita	b97824d4d1	[skottie] Motion tile effect Implement support for AE's Motion Tile effect [1]. This is the first effect which needs layer size information, so the CL includes related plumbing. Limitations: no phase support at this point. [1] https://helpx.adobe.com/after-effects/using/stylize-effects.html#motion_tile_effect Change-Id: I023bf8a9d3e3d2a48458fa94218f143e6aac4c9f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221244 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-06-17 18:20:15 +00:00
Mike Klein	342b1b2753	proposed: add bytes() op I'm staring at this assembly, vmovups (%rsi), %ymm3 vpsrld $24, %ymm3, %ymm4 vpslld $16, %ymm4, %ymm15 vorps %ymm4, %ymm15, %ymm4 vpsubw %ymm4, %ymm0, %ymm4 Just knowing that could be vmovups (%rsi), %ymm3 vpshufb 0x??(%rip), %ymm3, %ymm4 vpsubw %ymm4, %ymm0, %ymm4 That is, instead of shifting, shifting, and bit-oring to create the 0a0a scale factor from ymm3, we could just byte shuffle directly using some pre-baked control pattern (stored at the end of the program like other constants) pshufb lets you arbitrarily remix bytes from its argument and zero bytes, and NEON has a similar family of vtbl instructions, even including that same feature of injecting zeroes. I think I've got this working, and the speedup is great, from 0.19 to 0.16 ns/px for I32_SWAR, and from 0.43 to 0.38 ns/px for I32. Change-Id: Iab850275e826b4187f0efc9495a4b9eab4402c38 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220871 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-17 15:29:34 +00:00
Florin Malita	5fe7429bab	[skottie] Fix layer transform vs. effects interactions Turns out, in addition to solid layers, pre-comp and image layer effects are also subject to layer transforms. TBR= No-Try: true Change-Id: Ie235ff19374b8e0246eeec8e08079a2340e2a92a Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221177 Commit-Queue: Florin Malita <fmalita@chromium.org> Reviewed-by: Florin Malita <fmalita@chromium.org>	2019-06-17 12:26:13 +00:00
Florin Malita	e47d8afabd	[skottie] Add support for Transform distort effect Yet another way to transform a layer, disguised as a distort effect. TBR= Change-Id: Ic2d5479fa6ae27b460de60875924f73f77fc7f71 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221001 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Florin Malita <fmalita@chromium.org>	2019-06-14 16:58:31 +00:00
Mike Klein	4c4945a252	trim another instruction of I32_SWAR Now that we've got shr_16x2, extract(..., 8, splat(0x00ff00ff)) is better done as shr_16x2(..., 8). This swaps a 16-bit shift in for the 32-bit shift, a wash, but lets us drop the bit_and at the end, saving one whole instruction. This places I32_SWAR a tiny little bit faster than the code in Opts, like .19 ns/px vs .20 ns/px for Opts. Change-Id: I4160dc03ecc8b855c0773a927f1510ad5cbb4b87 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220856 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-13 21:58:05 +00:00
Mike Klein	7f061fb53b	streamline srcover math in I32_SWAR This is the final bunny I've got in my hat, I think... Remembering that none of the s += d*invA adds can overflow, we can use a single 32-bit add to add them all at once. This means we don't have to unpack the src pixel into rb/ga halves. We need only extract the alpha for invA. This brings I32_SWAR even with the Opts code! curr/maxrss loops min median mean max stddev samples config bench 36/36 MB 133 0.206ns 0.211ns 0.208ns 0.211ns 1% ▁▇▁█▁▇▁▇▁▇ nonrendering SkVM_4096_I32_SWAR 37/37 MB 152 0.432ns 0.432ns 0.434ns 0.444ns 1% ▃▁▁▁▁▃▁▁█▁ nonrendering SkVM_4096_I32 37/37 MB 50 0.781ns 0.794ns 0.815ns 0.895ns 5% ▆▂█▃▅▂▂▁▂▁ nonrendering SkVM_4096_F32 37/37 MB 76 0.773ns 0.78ns 0.804ns 0.907ns 6% ▄█▅▁▁▁▁▂▁▁ nonrendering SkVM_4096_RP 37/37 MB 268 0.201ns 0.203ns 0.203ns 0.204ns 0% █▇▆▆▆▆▁▆▆▆ nonrendering SkVM_4096_Opts Change-Id: Ibf0a9c5d90b35f1e9cf7265868bd18b7e0a76c43 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220805 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-06-13 21:32:45 +00:00
Mike Klein	57cb5ba122	i16x2 sub/shr More i16x2 ops, as seemed immediately useful in I32_SWAR. I32_SWAR: 0.27 ns/px --> 0.25 ns/px I32: 0.43 ns/px F32: 0.76 ns/px RP: 0.8 ns/px Opts: 0.2 ns/px Change-Id: I04fed0d1ed1c4218d0cafb45fd0ee6d68880de80 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220801 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-13 19:17:34 +00:00
Mike Klein	3538908983	baby steps into 16-bit ops I figure the easiest way to expose 16-bit operations is to expose 16x2 pair operations... this means we can continue to always work with the same size vector. Switching from 32-bit multiplies to 16-bit multiplies is going to deliver the most oomph... they cost roughly half what 32-bit multiplies do on x86. Speed now: I32_SWAR: 0.27 ns/px I32: 0.43 ns/px F32: 0.76 ns/px RP: 0.8 ns/px Opts: 0.2 ns/px Change-Id: I8350c71722a9bde714ba18f97b8687fe35cc749f Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220709 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>	2019-06-13 18:44:44 +00:00
Mike Klein	821f5e8dfe	remove mul_unorm8/mad_unorm8 I just kind of remembered that if we're doing (xy+x)/256 and x is a destination channel and y is 255-sa, then you can get the +x for free by multiplying by 256-sa instead. (d * (255-sa) + d) (d * (255-sa + 1)) (d * (256-sa) ) Duh. This is a trick we play in a lot of legacy code and I've just now realized it's exactly equivalent to the trick I want to play here... sigh. Folding this math in kind of makes mul/mad_unorm8 moot. Speed's getting good: I32_SWAR: 0.3 ns/px I32 : 0.55 ns/px F32 : 0.8 ns/px RP : 0.8 ns/px Opts : 0.2 ns/px Change-Id: I4d10db51ea80a3258c36e97b6b334ad253804613 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220708 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@google.com>	2019-06-13 18:21:44 +00:00
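
Note on the arithmetic in 821f5e8dfe and 7f061fb53b: the following is a minimal C++ sketch of the two tricks those messages describe, not the SkVM code itself. The function names (scale_legacy, scale_folded, srcover_swar) are hypothetical, and the pixel layout (premultiplied 8888 with alpha in the top byte) is an assumption made for illustration. The first pair of functions shows that (d*(255-sa) + d)/256 equals d*(256-sa)/256; the last applies that multiplier to the rb and ga halves of the destination and then adds the source with a single 32-bit add.

```cpp
#include <cstdint>

// Legacy form: scale channel d by y = 255 - sa as (d*y + d) / 256.
inline uint32_t scale_legacy(uint32_t d, uint32_t sa) {
    uint32_t y = 255 - sa;
    return (d * y + d) >> 8;
}

// Folded form: the "+ d" is absorbed by multiplying by 256 - sa instead.
inline uint32_t scale_folded(uint32_t d, uint32_t sa) {
    return (d * (256 - sa)) >> 8;
}

// SWAR srcover sketch. Assumes premultiplied pixels with alpha in bits 24..31,
// so every source channel is <= source alpha and the final add cannot carry
// from one channel into the next.
inline uint32_t srcover_swar(uint32_t s, uint32_t d) {
    uint32_t invA = 256 - (s >> 24);  // 1..256

    // Two 8-bit channels sit in two 16-bit lanes; 255 * 256 still fits a lane.
    uint32_t rb = (((d & 0x00ff00ffu) * invA) >> 8) & 0x00ff00ffu;

    // For g and a, the product's high bytes already land in bits 8..15 and
    // 24..31, so masking replaces the shift.
    uint32_t ga = (((d >> 8) & 0x00ff00ffu) * invA) & 0xff00ff00u;

    // One 32-bit add covers all four channels at once.
    return s + (rb | ga);
}
```

Only the source alpha is extracted here; the source pixel is never unpacked into rb/ga halves, which matches the saving described in 7f061fb53b.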

378 Commits