We now have two functions `writeOpLoad` and `writeOpStore` which are
in charge of writing SpvOpLoad and SpvOpStore instructions.
`writeOpStore` also keeps track of pointer stores in a "store cache."
Subsequent loads from that same pointer will be found in the cache and
will return the value stored in that pointer instead.
Such a cache definitely cannot work in the face of control flow, so we
make the following concessions:
- `pruneReachableOps` is now `pruneConditionalOps`. Any pointers that
are altered inside a potentially-unreachable block are cleared from
the cache entirely.
- The entire store cache is cleared at all OpLabels within a loop.
The cache also cannot work in the presence of swizzled stores, so we
make another significant concession:
- The entire store cache is cleared whenever we store into a non-memory
pointer (e.g., assigning into a swizzled LValue, such as `foo.xz`).
Despite these significant limitations, this manages to dramatically
shrink many real-world examples.
Change-Id: I0981a0cf7b45b064e153e9ada271494c8e00cad5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/530054
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Previously, we only handled the simple case of extracting from an
OpConstantComposite. Now we also handle the complex case of extracting
from an OpCompositeConstruct, where vectors can be composed of other
vectors.
This is particularly challenging because OpCompositeConstruct can
contain SpvIds from almost any other instruction, so we need to be
able to decode those instructions and figure out their type. For
instance:
%5 = OpFAdd Vec2 %1 %2
%6 = OpFAdd Scalar %3 %4
%7 = OpCompositeConstruct FloatVec3 %5 %6
%8 = OpCompositeExtract %7 2
The %8 (OpCompositeExtract) could be replaced with %6 but we need to
peek at the type in *both* OpFAdd instructions to decode this. It
only works when the affected instructions are in-cache, so many
opportunities are currently not optimized because their code still
uses the original, uncached form of writeInstruction.
Change-Id: I5719ae6284f32e1d6f2c898eca282c22b94fc764
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/529743
Reviewed-by: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
If we start with an OpConstantComposite, then we do an
OpCompositeExtract from it, we can look up the result directly from op
cache and avoid doing any work. This helps our matrix code a lot.
Change-Id: Idfbdc0c69676b9c1e91cdc57bf0d6382b9b5d8d4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/529339
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Previously, we didn't usually generate OpConstantComposite ops for
matrices. Now, a matrix assembled from constants should come out as a
constant.
Change-Id: I458718901686dffb84e4079a81017d61195420d3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/529338
Reviewed-by: Arman Uguray <armansito@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
The output changes here are almost entirely a wash, because we already
had support for caching scalars and vectors. Almost all changes are just
inconsequential reorderings of IDs, and the removal of RelaxedPrecision
decorators on constants (which were not meaningful).
Change-Id: I45340c4a240cb504b7c4a934b3db178d2f39ec99
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/528709
Reviewed-by: Arman Uguray <armansito@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Some GPUs (Adrenos in particular) perform noticeably better when we
use OpConstantComposite instead of OpCompositeConstruct. This also gives
us some deduplication of redundant ops.
Change-Id: I53b7a3e1cf61e51647a661a08ff4c7b53ee60f10
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/528636
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit eb68973c2f.
Reason for revert: ES2 conformance test checks this
Original change's description:
> Disallow matrix ctors which overflow a column.
>
> The GLSL spec allows matrix constructors containing vectors that would
> split between multiple columns of the matrix. However, in practice, this
> does not actually work well on a lot of GPUs!
>
> - "cast not allowed", "internal error":
> Tegra 3
> Quadro P400
> GTX 660
> GTX 960
> - Compiles, but generates wrong result:
> RadeonR9M470X
> RadeonHD7770
>
> Since this isn't a pattern we expect to see in user code, we now report
> it as an error at compile time. mat2(vec4) is treated as an exceptional
> case and still allowed.
>
> Change-Id: Id6925984a2d1ec948aec4defcc790a197a96cf86
> Bug: skia:12443
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/449518
> Commit-Queue: John Stiles <johnstiles@google.com>
> Auto-Submit: John Stiles <johnstiles@google.com>
> Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Bug: skia:12443
Change-Id: I5a32744c88b9b830ad657488824c8c7dd0b0a652
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/458056
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
The GLSL spec allows matrix constructors containing vectors that would
split between multiple columns of the matrix. However, in practice, this
does not actually work well on a lot of GPUs!
- "cast not allowed", "internal error":
Tegra 3
Quadro P400
GTX 660
GTX 960
- Compiles, but generates wrong result:
RadeonR9M470X
RadeonHD7770
Since this isn't a pattern we expect to see in user code, we now report
it as an error at compile time. mat2(vec4) is treated as an exceptional
case and still allowed.
Change-Id: Id6925984a2d1ec948aec4defcc790a197a96cf86
Bug: skia:12443
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/449518
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
This exposes a bug in the Metal code generator which will be resolved
in a followup CL.
Change-Id: If073835dbee474ea9a805eb92b42dc1fca2afbd0
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/448378
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>