ff4decc35e
This is just like mul(F32,F32) but optimizes 0*x == 0. Use it in SkSLVMGenerator; sksl already applies this optimization. PS2 has a sneaky version using % as a fast_mul() operator, and PS3 has a sneakier version using ** instead. We could of course write this all out using fast_mul() the long way, but I found that quickly became difficult to read. Change-Id: Iae35ce54411abc00e7729e178eb6a10f151a5304 Reviewed-on: https://skia-review.googlesource.com/c/skia/+/368838 Reviewed-by: Brian Osman <brianosman@google.com> Commit-Queue: Mike Klein <mtklein@google.com>
27 lines
648 B
Plaintext
27 lines
648 B
Plaintext
9 registers, 24 instructions:
|
|
0 r0 = uniform32 ptr0 0
|
|
1 r1 = uniform32 ptr0 C
|
|
2 r2 = splat 40000000 (2)
|
|
3 r0 = trunc r0
|
|
4 r1 = mul_i32 r0 r1
|
|
5 r0 = splat 1 (1.4012985e-45)
|
|
6 r3 = splat 2 (2.8025969e-45)
|
|
7 r4 = splat 3 (4.2038954e-45)
|
|
loop:
|
|
8 r5 = index
|
|
9 r5 = mul_f32 r2 r5
|
|
10 r5 = trunc r5
|
|
11 r5 = add_i32 r5 r1
|
|
12 r5 = shl_i32 r5 2
|
|
13 r6 = gather32 ptr0 4 r5
|
|
14 r7 = add_i32 r5 r0
|
|
15 r7 = gather32 ptr0 4 r7
|
|
16 r8 = add_i32 r5 r3
|
|
17 r8 = gather32 ptr0 4 r8
|
|
18 r5 = add_i32 r5 r4
|
|
19 r5 = gather32 ptr0 4 r5
|
|
20 store32 ptr1 r6
|
|
21 store32 ptr2 r7
|
|
22 store32 ptr3 r8
|
|
23 store32 ptr4 r5
|