skia2/tests/sksl/runtime/RecursiveComparison_Vectors.skvm
John Stiles 059d34594e Optimize SkVM bit-clears.
SkVM has a `bit_clear` opcode dedicated to the operation `x & ~y`, but
the optimizer was not smart enough to combine a bit-and with a bit-not
and replace it with a bit-clear. Now, it can.

Change-Id: Ida5345c3def0a4bf7afa08bb7f7835e1e2e37677
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/524225
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Arman Uguray <armansito@google.com>
2022-03-24 21:04:23 +00:00
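
The rewrite described in the commit message above can be illustrated with a small sketch. This is not Skia's optimizer: the `Inst` record, `fuse_bit_clear`, and `evaluate` are hypothetical names over a toy instruction list, assumed here only to show that `bit_and(x, bit_not(y))` and `bit_clear(x, y)` compute the same 32-bit value `x & ~y`.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # values are treated as 32-bit lanes


@dataclass(frozen=True)
class Inst:
    # Hypothetical toy IR, not SkVM's real data structure.
    op: str           # "input", "bit_not", "bit_and", or "bit_clear"
    args: tuple = ()  # indices of earlier instructions, or an input name


def fuse_bit_clear(program):
    """Rewrite bit_and(x, bit_not(y)), in either operand order, as bit_clear(x, y).

    Instruction indices are preserved; a bit_not that becomes unused is left
    in place for a separate dead-code pass to remove.
    """
    out = []
    for inst in program:
        if inst.op == "bit_and":
            a, b = inst.args
            if program[b].op == "bit_not":
                inst = Inst("bit_clear", (a, program[b].args[0]))
            elif program[a].op == "bit_not":
                inst = Inst("bit_clear", (b, program[a].args[0]))
        out.append(inst)
    return out


def evaluate(program, inputs):
    """Interpret the toy program and return the value of its last instruction."""
    vals = []
    for inst in program:
        if inst.op == "input":
            vals.append(inputs[inst.args[0]] & MASK32)
        elif inst.op == "bit_not":
            vals.append(~vals[inst.args[0]] & MASK32)
        elif inst.op == "bit_and":
            vals.append(vals[inst.args[0]] & vals[inst.args[1]])
        elif inst.op == "bit_clear":
            vals.append(vals[inst.args[0]] & ~vals[inst.args[1]] & MASK32)
        else:
            raise ValueError(f"unknown op: {inst.op}")
    return vals[-1]


# x & ~y written as a bit_not feeding a bit_and.
program = [
    Inst("input", ("x",)),
    Inst("input", ("y",)),
    Inst("bit_not", (1,)),
    Inst("bit_and", (0, 2)),
]
fused = fuse_bit_clear(program)
```

In this sketch the final instruction of `fused` becomes `bit_clear` on the two inputs, and both versions evaluate to the same result for any pair of inputs.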

25 registers, 111 instructions:
0 r0 = uniform32 ptr0 4
1 r1 = uniform32 ptr0 8
2 r2 = uniform32 ptr0 C
3 r3 = uniform32 ptr0 10
4 r4 = uniform32 ptr0 14
5 r5 = uniform32 ptr0 18
6 r6 = uniform32 ptr0 1C
7 r7 = uniform32 ptr0 20
8 r8 = splat 0 (0)
9 r9 = div_f32 r0 r2
10 r10 = div_f32 r2 r0
11 r11 = mul_f32 r0 r2
12 r8 = sub_f32 r8 r0
13 r8 = mul_f32 r2 r8
14 r12 = splat 42280000 (42)
15 r12 = mul_f32 r1 r12
16 r13 = splat 422C0000 (43)
17 r13 = mul_f32 r1 r13
18 r14 = splat 42300000 (44)
19 r14 = mul_f32 r1 r14
20 r15 = splat 42340000 (45)
21 r15 = mul_f32 r1 r15
22 r16 = splat 3F800000 (1)
23 r16 = add_f32 r0 r16
24 r17 = mul_f32 r12 r16
25 r18 = mul_f32 r8 r16
26 r19 = mul_f32 r11 r16
27 r20 = mul_f32 r13 r16
28 r21 = eq_f32 r12 r17
29 r22 = eq_f32 r8 r18
30 r23 = eq_f32 r11 r19
31 r24 = eq_f32 r13 r20
32 r22 = bit_and r21 r22
33 r22 = bit_and r23 r22
34 r22 = bit_and r24 r22
35 r17 = neq_f32 r12 r17
36 r18 = neq_f32 r8 r18
37 r19 = neq_f32 r11 r19
38 r20 = neq_f32 r13 r20
39 r18 = bit_or r17 r18
40 r18 = bit_or r19 r18
41 r18 = bit_or r20 r18
42 r18 = bit_and r22 r18
43 r18 = bit_clear r22 r18
44 r22 = mul_f32 r9 r16
45 r16 = mul_f32 r10 r16
46 r19 = eq_f32 r9 r22
47 r23 = eq_f32 r10 r16
48 r19 = bit_and r21 r19
49 r19 = bit_and r23 r19
50 r19 = bit_and r24 r19
51 r22 = neq_f32 r9 r22
52 r16 = neq_f32 r10 r16
53 r22 = bit_or r17 r22
54 r22 = bit_or r16 r22
55 r22 = bit_or r20 r22
56 r22 = bit_and r18 r22
57 r22 = bit_and r18 r22
58 r19 = bit_and r19 r22
59 r19 = bit_clear r22 r19
60 r22 = splat 40000000 (2)
61 r22 = add_f32 r0 r22
62 r18 = mul_f32 r12 r22
63 r20 = mul_f32 r13 r22
64 r16 = mul_f32 r14 r22
65 r17 = mul_f32 r15 r22
66 r10 = eq_f32 r12 r18
67 r24 = eq_f32 r13 r20
68 r23 = eq_f32 r14 r16
69 r21 = eq_f32 r15 r17
70 r24 = bit_and r10 r24
71 r24 = bit_and r23 r24
72 r24 = bit_and r21 r24
73 r18 = neq_f32 r12 r18
74 r20 = neq_f32 r13 r20
75 r16 = neq_f32 r14 r16
76 r17 = neq_f32 r15 r17
77 r20 = bit_or r18 r20
78 r20 = bit_or r16 r20
79 r20 = bit_or r17 r20
80 r20 = bit_and r19 r20
81 r20 = bit_and r19 r20
82 r24 = bit_and r24 r20
83 r24 = bit_clear r20 r24
84 r20 = mul_f32 r9 r22
85 r19 = mul_f32 r8 r22
86 r22 = mul_f32 r11 r22
87 r17 = eq_f32 r9 r20
88 r16 = eq_f32 r8 r19
89 r15 = eq_f32 r11 r22
90 r16 = bit_and r17 r16
91 r16 = bit_and r15 r16
92 r16 = bit_and r10 r16
93 r20 = neq_f32 r9 r20
94 r19 = neq_f32 r8 r19
95 r22 = neq_f32 r11 r22
96 r19 = bit_or r20 r19
97 r19 = bit_or r22 r19
98 r19 = bit_or r18 r19
99 r19 = bit_and r24 r19
100 r19 = bit_and r24 r19
101 r16 = bit_and r16 r19
102 r16 = bit_clear r19 r16
103 r4 = select r16 r0 r4
104 r5 = select r16 r1 r5
105 r6 = select r16 r2 r6
106 r7 = select r16 r3 r7
loop:
107 store32 ptr1 r4
108 store32 ptr2 r5
109 store32 ptr3 r6
110 store32 ptr4 r7