This is a reland of 5820b0c3f3
It is updated in patchset 2 to clean up pointers passed into memcpy, and to
optimize the bounds calculation in GrPerspQuad. This should fix a performance
regression caused by the move away from caching 1/w. The Sk4f::invert() does not
always preserve 1/1 == 1, which led to bounds slightly outside of clips and
thus forced Skia to keep the scissor test enabled. The fix also restores the
optimization of skipping the 1/w division when the quad is known to be 2D.
Original change's description:
> Use specialized quad lists in rectangle ops
>
> Hopefully reduces memory footprint of GrFillRectOp and GrTextureOp
>
> The original rect code (GrAAFillRectOp) stored 2 SkMatrices (18 floats), 2
> SkRects (8 floats) an SkPMColor4f (4 floats) and a flag (1 int) for a total
> of 124 bytes per quad that was stored in the op.
>
> The first pass at the rectangle consolidation switched to storing device and
> local quads as GrPerspQuads (32 floats), an SkPMColor4f (4 floats) and a flag
> (1 int) for a total of 148 bytes per quad. After landing, several memory
> regressions appeared in Chrome and our perf monitor.
>
> Several intertwined approaches are taken here. First, GrPerspQuad no longer
> caches 1/w, which makes a quad 12 floats instead of 16. Second, a specialized
> list type is defined that allows storing the x, y, and extra metadata together
> for quads, but keeps the w components separate. When the quad type isn't
> perspective, w is not stored at all since it is implicitly 1 and can be
> reconstituted at tessellation time. This brings the total per quad to either
> 84 or 116 bytes, depending on if the op list needs perspective information.
>
> Bug: chromium:915025
> Bug: chromium:917242
> Change-Id: If37ee122847b0c32604bb45dc2a1326b544f9cf6
> Reviewed-on: https://skia-review.googlesource.com/c/180644
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
Bug: chromium:915025, chromium:917242
Change-Id: I98a1bf83fd7d393604823d567c57d7e06fad5e55
Reviewed-on: https://skia-review.googlesource.com/c/182203
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Chris Dalton <csmartdalton@google.com>
This reverts commit 5820b0c3f3.
Reason for revert: Unanticipated gold image differences and performance regressions
Original change's description:
> Use specialized quad lists in rectangle ops
>
> Hopefully reduces memory footprint of GrFillRectOp and GrTextureOp
>
> The original rect code (GrAAFillRectOp) stored 2 SkMatrices (18 floats), 2
> SkRects (8 floats) an SkPMColor4f (4 floats) and a flag (1 int) for a total
> of 124 bytes per quad that was stored in the op.
>
> The first pass at the rectangle consolidation switched to storing device and
> local quads as GrPerspQuads (32 floats), an SkPMColor4f (4 floats) and a flag
> (1 int) for a total of 148 bytes per quad. After landing, several memory
> regressions appeared in Chrome and our perf monitor.
>
> Several intertwined approaches are taken here. First, GrPerspQuad no longer
> caches 1/w, which makes a quad 12 floats instead of 16. Second, a specialized
> list type is defined that allows storing the x, y, and extra metadata together
> for quads, but keeps the w components separate. When the quad type isn't
> perspective, w is not stored at all since it is implicitly 1 and can be
> reconstituted at tessellation time. This brings the total per quad to either
> 84 or 116 bytes, depending on if the op list needs perspective information.
>
> Bug: chromium:915025
> Bug: chromium:917242
> Change-Id: If37ee122847b0c32604bb45dc2a1326b544f9cf6
> Reviewed-on: https://skia-review.googlesource.com/c/180644
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
Change-Id: I6067b6c0e103d08787626a0a8eff753a0f0c97b6
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: chromium:915025, chromium:917242
Reviewed-on: https://skia-review.googlesource.com/c/181167
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Hopefully reduces memory footprint of GrFillRectOp and GrTextureOp
The original rect code (GrAAFillRectOp) stored 2 SkMatrices (18 floats), 2
SkRects (8 floats) an SkPMColor4f (4 floats) and a flag (1 int) for a total
of 124 bytes per quad that was stored in the op.
The first pass at the rectangle consolidation switched to storing device and
local quads as GrPerspQuads (32 floats), an SkPMColor4f (4 floats) and a flag
(1 int) for a total of 148 bytes per quad. After landing, several memory
regressions appeared in Chrome and our perf monitor.
Several intertwined approaches are taken here. First, GrPerspQuad no longer
caches 1/w, which makes a quad 12 floats instead of 16. Second, a specialized
list type is defined that allows storing the x, y, and extra metadata together
for quads, but keeps the w components separate. When the quad type isn't
perspective, w is not stored at all since it is implicitly 1 and can be
reconstituted at tessellation time. This brings the total per quad to either
84 or 116 bytes, depending on if the op list needs perspective information.
Bug: chromium:915025
Bug: chromium:917242
Change-Id: If37ee122847b0c32604bb45dc2a1326b544f9cf6
Reviewed-on: https://skia-review.googlesource.com/c/180644
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>