Fixes a bug with name conflicts in the final SkSL.
Bug: skia:10526
Change-Id: Ic238f89dd778c186e775ecbaabfbaed9e426274f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305563
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Change-Id: Iaa0829d72d0da1469df2da23102ff0e3572b641b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305556
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
This existed because GrTexture used to be public.
Change-Id: I5e507084ae12058a20481b517b9130b41c793d29
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305521
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
* Combine find and makeMRU.
* Switch from SkMutex to SkSpinLock
In the GM 'paragraph_$', this reduces time spent acquiring the lock from 1.2%
to 0.7%, as measured by Instruments and nanobench.
Change-Id: I33e3af31825f175c9de42f001acf68ffe3623a8a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305564
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
GrTAllocator implies relatively limited use cases, while GrTBlockLinkedList
helps clarify the underlying data structure (and its associated advantages
and disadvantages). I am not beholden to the name, so am happy to have
a discussion on alternatives like GrTLinkedList or GrTBlockList or
GrTBlockArray.
Change-Id: I5b10801d8593991d5e804c4074a81efb1dd110ee
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304396
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Only works on the raster backend for now.
Bug: skia:10344
Change-Id: I2c82d5345ae83a2bb2744ab27e3d971c57ea989e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303271
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Did some cleanup to remove repetition that was distracting from the
thing being tested.
Change-Id: Ie385c6ec2d1325a1bd0ba5c2270e7f2ddd1d24b2
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305076
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
vpermps (added here) makes this very easy,
with an index controlling what 32-bit values go where.
An index of the form {0,2,4,6|?,?,?,?} will put the 4 low 32-bit halves
of 4 64-bit values in lanes 0,1,2,3. We can use that twice to get all 8
low halves, then use our new vperm2f128 to put them together. Conveniently,
vpermps can also load directly from memory:
vpermps (%rdi), {0,2,4,6|?,?,?,?}, lo
vpermps 32(%rdi), {0,2,4,6|?,?,?,?}, hi
vperm2f128 0x20, lo,hi, dst
We don't care what those top four indices are for load64_lo, so we'll
use them as the indices for load64_hi. That makes the full index
{0,2,4,6|1,3,5,7}, and load64_hi will just vperm2f128 the other 128 bits
of lo/hi:
vpermps (%rdi), {?,?,?,?|1,3,5,7}, lo
vpermps 32(%rdi), {?,?,?,?|1,3,5,7}, hi
vperm2f128 0x31, lo,hi, dst
vpermps needs its index in a register, so we use a temporary for that.
Our logical lo can alias dst, and hi can alias that index, so it's just
one extra temporary register in the end.
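For illustration, the same dance expressed with AVX2 intrinsics (a sketch
only, not the JIT's actual code; the helper name and signature are made up):

  #include <immintrin.h>

  // ptr points at 8 contiguous 64-bit values (64 bytes); lo32 receives the
  // eight low 32-bit halves and hi32 the eight high halves.
  static inline void load64_sketch(const void* ptr, __m256* lo32, __m256* hi32) {
      // {0,2,4,6|1,3,5,7}: lanes 0-3 gather the low halves of the four 64-bit
      // values in each 32-byte chunk, lanes 4-7 gather the high halves.
      const __m256i idx = _mm256_setr_epi32(0,2,4,6, 1,3,5,7);
      __m256 lo = _mm256_permutevar8x32_ps(_mm256_loadu_ps((const float*)ptr    ), idx);  // vpermps
      __m256 hi = _mm256_permutevar8x32_ps(_mm256_loadu_ps((const float*)ptr + 8), idx);  // vpermps
      *lo32 = _mm256_permute2f128_ps(lo, hi, 0x20);   // vperm2f128: low 128-bit halves
      *hi32 = _mm256_permute2f128_ps(lo, hi, 0x31);   // vperm2f128: high 128-bit halves
  }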
Change-Id: Ie6a4efbf12ddada45dd09c0f580fa7350cf3019e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305171
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Add some packing instructions to make it possible.
The gist is that we've got
r(x) = {a,b,c,d|e,f,g,h}
r(y) = {i,j,k,l|m,n,o,p}
where r(x) holds each low 32-bit half of a 64-bit value,
and r(y) holds the high halves. We want to write
a,i,b,j,c,k,d,l,e,m,...
So first the vpunpck[lh]dq instructions produce
L = {a,i,b,j|e,m,f,n}
H = {c,k,d,l|g,o,h,p}
which gets us halfway there. The vperm2f128s select the low (0x20) or
high (0x31) 128-bit halves of L/H, so we end up writing to memory
dst+0: a,i,b,j,c,k,d,l
dst+32: e,m,f,n,g,o,h,p
Existing tests cover that store64 works.
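For illustration, the same shuffle in AVX2 intrinsics (a sketch, not the
JIT's actual code; names are made up):

  #include <immintrin.h>

  // lo32 holds the eight low 32-bit halves and hi32 the eight high halves;
  // writes the eight interleaved 64-bit values (64 bytes) to dst.
  static inline void store64_sketch(void* dst, __m256i lo32, __m256i hi32) {
      __m256i L = _mm256_unpacklo_epi32(lo32, hi32);   // vpunpckldq: {a,i,b,j|e,m,f,n}
      __m256i H = _mm256_unpackhi_epi32(lo32, hi32);   // vpunpckhdq: {c,k,d,l|g,o,h,p}
      _mm256_storeu_si256((__m256i*)dst    , _mm256_permute2f128_si256(L, H, 0x20));  // dst+0
      _mm256_storeu_si256((__m256i*)dst + 1, _mm256_permute2f128_si256(L, H, 0x31));  // dst+32
  }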
Change-Id: Ic00ad9bdb448b79867584c27cf0114a42ed32379
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305156
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Does not add kNearest as a mip map mode yet.
Bug: skia:10344
Change-Id: Ida80cbf19e2b1eed5209d0680837fb45e54b4261
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303481
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Change-Id: Ic44e24057b95bb014504f02a736fb4341afc8971
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304856
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I82276e38ea4a5524127176eb5a34066b6cb06d88
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304799
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
This reverts commit 423dd689df.
Reason for revert: had missed a corner case when the tail block was empty
Original change's description:
> Revert "Support moving blocks from one allocator to another"
>
> This reverts commit 0f064041bf.
>
> Reason for revert: unit test failures on some windows bots
>
> Original change's description:
> > Support moving blocks from one allocator to another
> >
> > Uses this new capability and the previously implemented reserve()
> > feature to implement efficient concatenation of GrTAllocators.
> >
> > Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
> > Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
> > Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> > Reviewed-by: Robert Phillips <robertphillips@google.com>
>
> TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
>
> Change-Id: I931f2462ecf6e04d40a671336d0de7d80efd313d
> No-Presubmit: true
> No-Tree-Checks: true
> No-Try: true
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304604
> Reviewed-by: Michael Ludwig <michaelludwig@google.com>
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
# Not skipping CQ checks because this is a reland.
Change-Id: Ia793c2f0e7ab2f3fd437871fa7fb4f56979f9ceb
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304739
Auto-Submit: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Change-Id: Ic7799b3c5f4294cba9ff72f8c11a2ad285ab189f
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304738
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Change-Id: Ia92dc2be9a25f334bdbc098564cf2332496677fa
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304296
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Three TODOs, all basically the same idea: divide-by-zero is not the only
way to produce non-finite results from a division. You can also divide
by very-near-zero, and maybe some other ways.
Added is_finite() to make this clear. is_finite() is almost as cheap as
the comparisons it replaces, so performance shouldn't be affected.
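For example (illustrative only, not Skia's actual code), a nonzero divisor
is no guarantee of a finite result:

  #include <cmath>

  float num = 1e20f, den = 1e-20f;    // den != 0, yet num/den overflows float
  float q   = num / den;              // q == +inf
  bool  ok  = std::isfinite(q);       // false: checking den != 0 alone is not enough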
Change-Id: I0a803e9ab4e3286f4e10a13d3aacee370eaaa803
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304669
Commit-Queue: Mike Klein <mtklein@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Also renames mipMapped params to GrBackendTexture functions to mipmapped
GrBackendTexture::hasMipMaps -> GrBackendTexture::hasMipmaps
Misc test vars fMipMapped -> fMipmapped
Change-Id: Ic0651d14fc106c21b0ab45529875b95ed8dc2dfd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304598
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
This reverts commit 0f064041bf.
Reason for revert: unit test failures on some windows bots
Original change's description:
> Support moving blocks from one allocator to another
>
> Uses this new capability and the previously implemented reserve()
> feature to implement efficient concatenation of GrTAllocators.
>
> Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
Change-Id: I931f2462ecf6e04d40a671336d0de7d80efd313d
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304604
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
For consistency with other enums and public APIs.
Change-Id: I026da5529f11051693cae5691c7ad92fad5ed446
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304597
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Uses this new capability and the previously implemented reserve()
feature to implement efficient concatenation of GrTAllocators.
Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: Ia2cfbca8982b57399b6681cbb4501c2933ab4df7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304576
Auto-Submit: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This can optimize cases in an allocate-release loop that moves back
and forth across the end of one block and the start of another. Before,
this would malloc a new block and then delete it.
It also enables reserve() in higher-level data collections without blocking
the bytes left in the current tail block.
Change-Id: Ide16e9038384fcb188164fc9620a8295f6880b9f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303268
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Previously, these would produce a "valid" effect object, but it wouldn't
draw anything.
Bug: oss-fuzz:24070
Change-Id: I17d0ed1710196853da0694cac9f4c260312700a9
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304064
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
The current algorithm runs an exponentially-increasing number of trials
based on the number of children supported by the fragment processor and
has become a large drag on test times. This version runs a fixed number
of trials to determine which optimization bits are able to appear, and
then continues running trials until each potential optimization has been
demonstrated successfully five times.
The algorithm doesn't attempt to check interactions between the various
optimization bits (e.g. a hypothetical bug that might only occur when
two optimizations interact with one another) but hopefully the minimum
of 100 successful trials is enough to shake out most issues.
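Roughly, the new strategy looks like this (a sketch with made-up names, not
the test's actual code):

  #include <cstdint>
  #include <functional>

  // randomTrial() builds a random FP, verifies its optimizations, and returns
  // the optimization flags it demonstrated successfully in that trial.
  void run_trials(const std::function<uint32_t()>& randomTrial) {
      uint32_t reachable = 0;
      for (int i = 0; i < 100; ++i) {       // fixed pass: which bits can appear at all?
          reachable |= randomTrial();
      }
      int successes[32] = {0};
      for (bool done = false; !done;) {     // keep sampling until every reachable bit
          uint32_t flags = randomTrial();   // has been demonstrated five times
          done = true;
          for (int bit = 0; bit < 32; ++bit) {
              if (reachable & (1u << bit)) {
                  if (flags & (1u << bit)) { ++successes[bit]; }
                  if (successes[bit] < 5)  { done = false; }
              }
          }
      }
  }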
Change-Id: I4eba7ace84739027a5aea8f8f895b44c4532b816
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304059
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
To extract metadata and validate the shader, we've added several
iterations over all program elements. This CL rearranges things
to iterate once (*). Variable conversion is moved to a separate
loop later, to help with nesting and readability.
Removes hard-to-read asserts. These were validating things enforced
by both the IR generator and unit tests.
*: Technically, there are additional implicit iterations when we call
SkSL::Analysis functions. Folding all of this into a single pass
would be even better, but much more complicated.
Change-Id: I4f5aa649e74094e94c365ad20ef2ac96082285cd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303924
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Mostly this is a lot of plumbing of sk_sp around instead of const*.
This does allow the d3d and vk backends to hold refs to the GrBuffers that
are bound on a command buffer. This means that our buffer alloc pools will
not try to reuse these buffers until the GPU is done with them. Previously,
vk and d3d would sniff out if one of these buffers was being used again
while still active on the GPU, rip out the internal backend buffer, and
allocate a new one, which is not cheap. We see a lot of perf wins from
not doing this.
Change-Id: I9ffe649151ee43066dce620bd3e2763b029a9811
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303583
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
Previously, this test layered its paints by calling
`addColorFragmentProcessor` multiple times. We intend to remove support
for multiple color FPs on a GrPaint in the near future, so we use input
fragment processors instead to generate the same result.
Change-Id: I33e82ce0067183189e69b2af0fe0c228d1d60f14
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303479
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
The helper class allows us to regenerate a "random" fragment processor
with the same seed as many times as desired. This lets us check the
`compatibleWithCoverageAsAlpha` optimization without needing to clone
the input FP; instead, we can actually generate the FP under test three
times in a row, with the same random seed.
In a followup CL (http://review.skia.org/303479/) we will also leverage
this helper class to regenerate the FP under test with the same random
seed, but with a different input.
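The core trick, sketched generically (made-up names, not the helper's actual
API): hold on to the seed and re-seed a fresh RNG on every call, so every
call builds an identical "random" object:

  #include <cstdint>
  #include <random>

  template <typename Factory>
  class Regenerator {
  public:
      Regenerator(uint32_t seed, Factory factory) : fSeed(seed), fFactory(factory) {}

      // Each call re-seeds identically, so the factory sees the same random
      // sequence and produces the same object every time.
      auto make() const {
          std::mt19937 rng(fSeed);
          return fFactory(rng);
      }

  private:
      uint32_t fSeed;
      Factory  fFactory;
  };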
Change-Id: I1cd83082a949d555f7898970c8a1cc3002818286
Bug: skia:10384
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303657
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
This adds load/store ops for 64-bit values, with two load64 instructions
returning the low and high 32-bits each, and store64 taking both.
These are implemented in the interpreter and tested but not yet JIT'd
or hooked up for loading and storing 64-bit PixelFormats. Hopefully
those two CLs will follow shortly.
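Per lane, the new ops compute the following (a scalar illustration of the
semantics, not SkVM's implementation):

  #include <cstdint>

  uint32_t load64_lo(const uint64_t* ptr, int i) { return (uint32_t) ptr[i];        }
  uint32_t load64_hi(const uint64_t* ptr, int i) { return (uint32_t)(ptr[i] >> 32); }
  void     store64  (uint64_t* ptr, int i, uint32_t lo, uint32_t hi) {
      ptr[i] = (uint64_t)lo | ((uint64_t)hi << 32);
  }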
Change-Id: I7e5fc3f0ee5a421adc9fb355d0b6b661f424b505
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303380
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Everything was pretty much already plumbed through using
SkRasterPipeline, with a few key connections made here.
This support is mostly useful for differentially debugging my CLs
stacked on top of it.
Change-Id: I9c2f2ea6cd8890c057890409f21c7698857c599a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303651
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Previously, the blocks owned by a GrBlockAllocator provided a block-level
metadata field that users could set to track state within a block. Types
like GrTAllocator (and a collection coming down the pipe) need to store a
total count of allocations.
Since GrBlockAllocator is max-aligned, having each block hold on to a single
extra int adds 4 to 12 bytes of padding on 8- and 16-byte-aligned
platforms. But the old Block size already had 4 bytes of extra padding,
since it packed to 28 bytes. That padding had been allowed to be allocation
data, for requests that were 4-byte aligned or less.
In practice, I think it's better to use that space for allocator-level
metadata so that a total count, or other high-level data, can be
packed into the data already held by GrBlockAllocator. Now GrTAllocator
instances should be 8-16 bytes smaller on those same platforms. Also
no longer writing into the struct-padded nether-realm, which is probably
just a good thing on principle.
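To illustrate the padding argument (assuming a typical 64-bit build; this is
not GrBlockAllocator's actual layout): a max-aligned header whose fields pack
to 28 bytes still occupies 32, so a 4-byte metadata field fits into space
that would otherwise be padding:

  #include <cstddef>

  struct alignas(std::max_align_t) BlockSketch {
      BlockSketch* fPrev;      // 8 bytes
      BlockSketch* fNext;      // 8 bytes
      int          fSize;      // 4 bytes
      int          fCursor;    // 4 bytes
      int          fMetadata;  // 4 bytes -> 28 packed, rounded up to 32 by alignment
  };
  static_assert(sizeof(BlockSketch) == 32, "28 bytes of fields round up to 32");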
Also cleans up how GrTAllocator's push_back() and emplace_back() are
tested to make sure all options are clearly covered.
Change-Id: If1da29132f3ec8df7a4056fcd834f760eb4693f5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303267
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: I5bf9470a3d7822ea1edee7abf693037322e5d4e6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303265
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Auto-Submit: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: I00a00e62dfb2021a2c380ef4217c1acec1f07672
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303258
Commit-Queue: Greg Daniel <egdaniel@google.com>
Reviewed-by: Jim Van Verth <jvanverth@google.com>