Added vpinsrd to make it easy, and fixed a comment on vpinsrb's tests.
Nothing too tricky, just the naive implementations.
The hardest part was getting all the data to the right places.
No diffs!
Change-Id: Ie4c1f1e429abfa75ca80a93d108061287d5ace80
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306872
Commit-Queue: Herb Derby <herb@google.com>
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This noop refactor ports the lane-immediate idea to the 64-bit loads and
to 128-bit stores, and renames to simple load64, load128, store128.
The store128 part of this change is _slightly_ sneaky in that we pack
the argument pointer index and the lane both into int immz, but there's
plenty of space there: lane needs 1 bit, and the argument pointer index
needs maybe 3 or 4 max.
Change-Id: I3fa01bf31312b8a69c7e287d649470ba15a8ea40
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306810
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
More recontexting for SkImage. Chrome flag in CL 2323135.
Flutter migration landed in https://github.com/flutter/engine/pull/19962
Bug: skia:104662
Change-Id: Id725eb130310639457ba90f378ecdb334dd5f3cd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306182
Auto-Submit: Adlai Holler <adlai@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This allows us to JIT functions taking one more argument.
Our x86 JIT was the weak link: LLVM handles arbitrary numbers
of arguments, and our aarch64 JIT handles up to seven already.
I plan to pass one more source varying sometimes, and in extremely
unusual cases it we could need six pointers:
1) uniforms
2) destination
3) 8-bit coverage plane
4) 8-bit multiply plane
5) 8-bit add plane
6) source varying
Those varyings 3-5 are all indexable off the coverage pointer 3) if we
know the dimensions, so I could be convinced that we should only pass
one there, making our maximum number of arguments four. I'll be looking
at that independently, but it doesn't hurt to have our capability to go
up to six either way.
Change-Id: Id7a5b88e382a95bb560633e95c5be273b7ea67d1
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306241
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
The 3 bits that represent rsp are used as a signaling mechanism to
choose between compact base + disp encoding and full base + disp +
scale*index encoding, one byte longer. If a base is encoded as rsp in
the MOD R/M byte, an SIB byte follows.
r12 shares its 3 bottom bits with rsp, so we need to treat it like rsp
when deciding whether or not we need an SIB byte. (As usual registers
r8-15's distinguishing upper bit is carried by a REX/VEX prefix.)
rsp is also treated as a special signal in the index field of the SIB
byte, meaning essentially scale=0, and as a result it's not possible to
use rsp as an index. It _is_ possible to use r12 as an index, with a
test added here. This worked without any code change. It seems the
index=rsp -> scale=0 signal is triggered by all four bits of the index
registger, including the X upper bit from REX/VEX.
I have found these charts useful:
https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing_2
Looking at those charts I noticed that rbp/r13 are also special cases,
so I'm eyeing them warily and will avoid using them for now.
Change-Id: Id78f826a39c060b03000eae7c50c642ef44d57db
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306237
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Since subsetting may require rasterizing/resolving the generator, we may
also need access to the GrDirectContext. To simplify apis, rely on
makeSubset() for that.
Related: https://skia-review.googlesource.com/c/skia/+/305970
Change-Id: I1980e3c823fb6cf54f197c350942c2f82b03e20f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306136
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Bug: chromium:1101491
Bug: b/161896447
Found using
git grep -wiEIl \ '(he)|(she)|(his)|(hers)|(him)|(her)|(guy)|(guys)'
Change-Id: I6b91853de067fd4c2e84f7ec70275522ce6c8bfc
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306186
Commit-Queue: Leon Scroggins <scroggo@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Auto-Submit: Leon Scroggins <scroggo@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
They have been replaced with equivalent constants (and given the
requisite k prefix).
Change-Id: I70907eec234e0861cc97aac1ca03086faca42f9a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306160
Auto-Submit: John Stiles <johnstiles@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Reed <reed@google.com>
No longer needed as GrSurface and GrRenderTarget are private.
Change-Id: I2ec653b2d9daa115233bb7eaa1f2b7f880772c0a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305730
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Bug: chromium:1101491
Bug: b/161896447
Switch to more inclusive language, like "main", or remove where
simply unnecessary.
Change-Id: I36ef6ec631eb991f54f42b98887333f07c0984c2
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306060
Commit-Queue: Leon Scroggins <scroggo@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Kevin Lubick <kjlubick@google.com>
It's hard to imagine a scenario where it makes sense to recompile a
shader for every possible point or rect.
Change-Id: Ic4ca9a4826c4ae4b604705549170503c8ec580a6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/306075
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Previously, floats would be converted to int32 using a C-style cast when
added to a processor key, truncating any fractional component of the
value. This would cause two processors compiled with different floats to
generate the same key.
The float is now sk_bit_cast to a uint32 instead; this will preserve the
uniqueness of all values.
Additionally, the CPP generator will now abort if asked to add an
unsupported type to the processor key. Previously, it would do a C-style
cast to int32 for all unrecognized types and hope for the best.
Change-Id: I270ae8b3036497a9e9a77747255f763d9aad1927
Bug: skia:10486
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305572
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This should be the last batch of tests. All the remaining uses of
GrContext should be resolved when SkImage no longer requires a context.
Change-Id: I47eeb3b74c28f483c20d9bec4daecbdb6d2cb982
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305541
Reviewed-by: Adlai Holler <adlai@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
We seed the builder with the imageinfo of an image, but after it is
built we then have to "attach" it to an image (or more than one).
Check that the mip chain is compatible with the parent image we are
attaching to.
Change-Id: Iead6aee54861fdd4910f38f89a0364ed80c341d8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305680
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Fixes a bug with name conflicts in the final SkSL.
Bug: skia:10526
Change-Id: Ic238f89dd778c186e775ecbaabfbaed9e426274f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305563
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Change-Id: Iaa0829d72d0da1469df2da23102ff0e3572b641b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305556
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
This existed because GrTexture used to be public.
Change-Id: I5e507084ae12058a20481b517b9130b41c793d29
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305521
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
* Combine find and makeMRU.
* Switch from SkMutex to SkSpinLock
In the gm 'paragraph_$' this reduces acquiring the lock from 1.2% to 0.7%
as measured by instruments and nanobench.
Change-Id: I33e3af31825f175c9de42f001acf68ffe3623a8a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305564
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
GrTAllocator implies relatively limited use cases, while GrTBlockLinkedList
helps clarify the underlying data structure (and its associated advantages
and disadvantages). I am not beholden to the name, so am happy to have
a discussion on alternatives like GrTLinkedList or GrTBlockList or
GrTBlockArray.
Change-Id: I5b10801d8593991d5e804c4074a81efb1dd110ee
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304396
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Only works on raster backend for the now.
Bug: skia:10344
Change-Id: I2c82d5345ae83a2bb2744ab27e3d971c57ea989e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303271
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Did some cleanup to remove repetition that was distracting from the
thing being tested.
Change-Id: Ie385c6ec2d1325a1bd0ba5c2270e7f2ddd1d24b2
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305076
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
vpermps (added here) makes this very easy,
with an index controlling what 32-bit values go where.
A index of the form {0,2,4,6|?,?,?,?} will put the 4 low 32-bit halves
of 4 64-bit values in lanes 0,1,2,3. We can use that twice to get all 8
low halves, then our new vperm2f128 to put them together. Conveniently
vpermps can also load directly from memory:
vpermps (%rdi), {0,2,4,6|?,?,?,?}, lo
vpermps 32(%rdi), {0,2,4,6|?,?,?,?}, hi
vperm2f128 0x20, lo,hi, dst
We don't care what those top four indices are for load64_lo, so we'll
use them as the indices for load64_hi. That makes the full index
{0,2,4,6|1,3,5,7}, and load64_hi will just vpermf128 the other 128-bits
of lo/hi:
vpermps (%rdi), {?,?,?,?|1,3,5,7}, lo
vpermps 32(%rdi), {?,?,?,?|1,3,5,7}, hi
vperm2f128 0x31, lo,hi, dst
vpermps needs its index in a register, so we use a temporary for that.
Our logical lo can alias dst, and hi can alias that index, so it's just
one extra temporary register in the end.
Change-Id: Ie6a4efbf12ddada45dd09c0f580fa7350cf3019e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305171
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Add some packing instructions to make it possible.
The gist is that we've got
r(x) = {a,b,c,d|e,f,g,h}
r(y) = {i,j,k,l|m,n,o,p}
where r(x) holds each low 32-bit half of a 64-bit value,
and r(y) holds the high halves. We want to write
a,i,b,j,c,k,d,l,e,m,...
So first the vpunpck[lh]dq instructions produce
L = {a,i,b,j|e,m,f,n}
H = {c,k,d,l|g,o,h,p}
which gets us halfway there. The vperm2f128s select the low (0x20) or
high (0x31) 128-bit halves of L/H, so we end up writing to memory
dst+0: a,i,b,j,c,k,d,l
dst+32: e,m,f,n,g,o,h,p
Existing tests cover that store64 works.
Change-Id: Ic00ad9bdb448b79867584c27cf0114a42ed32379
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/305156
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Does not add kNearest as a mip map mode yet.
Bug: skia:10344
Change-Id: Ida80cbf19e2b1eed5209d0680837fb45e54b4261
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303481
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Change-Id: Ic44e24057b95bb014504f02a736fb4341afc8971
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304856
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I82276e38ea4a5524127176eb5a34066b6cb06d88
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304799
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
This reverts commit 423dd689df.
Reason for revert: had missed corner case when tail block was empty
Original change's description:
> Revert "Support moving blocks from one allocator to another"
>
> This reverts commit 0f064041bf.
>
> Reason for revert: unit test failures on some windows bots
>
> Original change's description:
> > Support moving blocks from one allocator to another
> >
> > Uses this new capability and the previously implemented reserve()
> > feature to implement efficient concatenation of GrTAllocators.
> >
> > Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
> > Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
> > Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> > Reviewed-by: Robert Phillips <robertphillips@google.com>
>
> TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
>
> Change-Id: I931f2462ecf6e04d40a671336d0de7d80efd313d
> No-Presubmit: true
> No-Tree-Checks: true
> No-Try: true
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304604
> Reviewed-by: Michael Ludwig <michaelludwig@google.com>
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
# Not skipping CQ checks because this is a reland.
Change-Id: Ia793c2f0e7ab2f3fd437871fa7fb4f56979f9ceb
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304739
Auto-Submit: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Change-Id: Ic7799b3c5f4294cba9ff72f8c11a2ad285ab189f
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304738
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Change-Id: Ia92dc2be9a25f334bdbc098564cf2332496677fa
Bug: skia:10217
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304296
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Three TODOs, all basically the same idea: divide-by-zero is not the only
way to produce non-finite results from a division. You can also divide
by very-near-zero, and maybe some other ways.
Added is_finite() to make this clear. is_finite() is almost as cheap as
the comparisons it replaces, so performance shouldn't be affected.
Change-Id: I0a803e9ab4e3286f4e10a13d3aacee370eaaa803
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304669
Commit-Queue: Mike Klein <mtklein@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Also mipMapped params to GrBackendTexture functions to mipmapped
GrBackendTexture::hasMipMaps -> GrBackendTexture::hasMipmaps
Misc test vars fMipMapped -> fMipmapped
Change-Id: Ic0651d14fc106c21b0ab45529875b95ed8dc2dfd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304598
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
This reverts commit 0f064041bf.
Reason for revert: unit test failures on some windows bots
Original change's description:
> Support moving blocks from one allocator to another
>
> Uses this new capability and the previously implemented reserve()
> feature to implement efficient concatenation of GrTAllocators.
>
> Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
> Commit-Queue: Michael Ludwig <michaelludwig@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=robertphillips@google.com,csmartdalton@google.com,michaelludwig@google.com
Change-Id: I931f2462ecf6e04d40a671336d0de7d80efd313d
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304604
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
For consistency with other enums and public APIs.
Change-Id: I026da5529f11051693cae5691c7ad92fad5ed446
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304597
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Uses this new capability and the previously implemented reserve()
feature to implement efficient concatenation of GrTAllocators.
Change-Id: I2aff42eaf75e74e3b595d3069b6a271fa7329465
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303665
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: Ia2cfbca8982b57399b6681cbb4501c2933ab4df7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/304576
Auto-Submit: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This can optimize cases in a allocate-release loop that moves back
and forth across the end of one block and the start of another. Before
this would malloc a new block and then delete it.
It also enables reserve() in higher-level data collections w/o blocking
bytes left in the current tail block.
Change-Id: Ide16e9038384fcb188164fc9620a8295f6880b9f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/303268
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>