There is a very telling FIXME in the MSAN source code:
// FIXME: detect and handle SSE maskstore/maskload
For now, just tell MSAN (correctly) that it's initialized.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-MSAN,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I6aec67b99e4d930cb72e438458b33ed116535009
Reviewed-on: https://skia-review.googlesource.com/5311
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Ia7a133f515e29e16700aabc0633c77a703425f41
Reviewed-on: https://skia-review.googlesource.com/5239
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Make the fuzzRange not crash if min == max, just set n to be min.
BUG=skia:
Change-Id: I138cefbec9b408d3b35e4258d770e6b396af0e5f
Reviewed-on: https://skia-review.googlesource.com/5305
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Kevin Lubick <kjlubick@google.com>
Previously, we forgot to use AdditiveBlitter in two places where partial
rows are blitterred. That causes SkAAClip to complain as in skia:6003.
BUG=skia:6003
Change-Id: I4f4a896072448bdb3f287a2eb61cb64b1256ea78
Reviewed-on: https://skia-review.googlesource.com/5273
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
NVIDIA's Nsight tool is not compatible with certain GL calls. This
change places them behind a #define in GrGLGpu.cpp.
BUG=skia:
Change-Id: Id80ed291b6fe4fd777c5690b25d4fbb47de3c187
Reviewed-on: https://skia-review.googlesource.com/5281
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This reverts commit cab79aadad.
Reason for revert: Bots crashing.
Original change's description:
> Use /MD for Windows builds.
>
> I think the default is /MT, static linking.
> This should make our builds smaller, and compatible with clients using /MD.
>
> Change-Id: Id8a39a029925eda2627532bbd0223c693300f5da
> Reviewed-on: https://skia-review.googlesource.com/5277
> Reviewed-by: Ben Wagner <bungeman@google.com>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
>
TBR=mtklein@chromium.org,bungeman@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: Iff644250a23175e44a4282e7aaea0e2a2adc1ce0
Reviewed-on: https://skia-review.googlesource.com/5310
Commit-Queue: Joe Gregorio <jcgregorio@google.com>
Reviewed-by: Joe Gregorio <jcgregorio@google.com>
If we use the oldy and dy directly as we did previously, the slope could
be very different from (newSnappedX - fSnappedX) / (newSnappedY -
fSnappedY) in the updateLine when the edge made a lot of updates with
small dy but large dx. That will cause bug skia:5995
BUG=skia:5995
Change-Id: If521976ed87195dfea5961afd58bedb98447c568
Reviewed-on: https://skia-review.googlesource.com/5269
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
They're really similar, so let's make them look that way.
Finally use mask load, mask store, and gather instructions for 8888.
We avoid mask load and store when tail == 0. It's faster (one memory load instead of two) and a cheap test.
For gather, the intrinsics make it look like we could do the same, but it really all boils down to the same masked instruction in the end.
There's probably a better way to implement mask() with math instead of memory loads, but this works for now.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I578f47d4562ea19d983057bf2f4c3e21d0ab9a0e
Reviewed-on: https://skia-review.googlesource.com/5234
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
I think the default is /MT, static linking.
This should make our builds smaller, and compatible with clients using /MD.
Change-Id: Id8a39a029925eda2627532bbd0223c693300f5da
Reviewed-on: https://skia-review.googlesource.com/5277
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This will hopefully not change any behavior
BUG=android:32591644
Change-Id: Ia256d39c57a97db085d1d1c4cf003948eddf0771
Reviewed-on: https://skia-review.googlesource.com/5295
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
Some memory allocators have very coarse size buckets, so for example on
Android (jemalloc) an attempt to allocate 32 KiB + 1 byte will end up
allocating 40 KiB, wasting 8 KiB.
GrMemoryPool ctor takes two arguments that specify prealloc / block sizes,
and then inflates them to accommodate some bookkeeping structures. Since
most places create GrMemoryPools with pow2 numbers (which have buckets in
most allocators) the inflation causes allocator to select next size bucket,
wasting memory.
This CL makes GrMemoryPool to stop inflating sizes it was created with, and
allocate specified amounts exactly. Part of allocated memory is then used for
bookkeeping structures. Additionally, GrObjectMemoryPool template is provided,
which takes prealloc / block object counts (instead of sizes) and guarantees
that specified number of objects will fit in prealloc / block spaces.
BUG=651872
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2525773002
Review-Url: https://codereview.chromium.org/2525773002
Was just reading the disassembly and noticed the opportunity.
Change-Id: I25d4b70802f9a9563491f3126da69829611a9b28
Reviewed-on: https://skia-review.googlesource.com/5235
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I082c34a1f484715cd2dca55a8d23101235755e6a
Reviewed-on: https://skia-review.googlesource.com/5233
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This allows us to correctly display images with both a A2B0 tag
and *XYZ/*TRC tags, instead of ignoring the A2B0 information.
BUG=skia:
Change-Id: Icd63db5a55692ef4c5b3f098d963e7e3f583f9a4
Reviewed-on: https://skia-review.googlesource.com/5230
Commit-Queue: Robert Aftias <raftias@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
For stages that have {r,g,b,a} and {dr,dg,db,da} versions, name the {r,g,b,a} one "foo" and the {dr,dg,db,da} on "foo_d". The {r,g,b,a} registers are the ones most commonly used and fastest, so they get short ordinary names, and the d-registers are less commonly used and sometimes slower, so they get a suffix.
Some stages naturally opearate on all 8 registers (the xfermodes, accumulate). These names for those look fine and aren't ambiguous.
Also, a bit more re-arrangement in _opts.h.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Ia20029247642798a60a2566e8a26b84ed101dbd0
Reviewed-on: https://skia-review.googlesource.com/5291
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
1. tons of 565 diffs in gm, so skipping that for now
2. 32bit premul/unpremul conversions are slower than existing code, so for now use the pipeline after the existing special-cases.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5152
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I6ca43b6dd24434814f8f10cdaaabbaf396914d1a
Reviewed-on: https://skia-review.googlesource.com/5152
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Matt Sarett <msarett@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
This file is getting complicated. We're still in early days and I want to keep it nimble and easy to refactor.
Simplifications:
- go back to one stage function type, packing x and tail together in one size_t
(x_tail = x*N+tail; x = x_tail/N, tail = x_tail%N, all cheap for power of 2 N);
- all stages call next(), ending in just_return;
- stop coddling MSVC with kIsTail.
These simplifications should all make things a little slower, some only when using subpar compilers.
On a positive note, this should cut code size by about half.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I7de87c5a3cb0cbbf1e0ed0588f1ccb860a498e66
Reviewed-on: https://skia-review.googlesource.com/5285
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit dd13c02079.
Reason for revert: this breaks the Chromium DEPS roll as we break the layout_tests. I'll add a flag to guard the change in the future and enable the flag while change the layout_tests.
Original change's description:
> Add the missing shift to the dy
>
> BUG=chromium:668907
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5266
>
> Change-Id: I6d3e56ffc149fbeac6f7a2df740542abbf84dac8
> Reviewed-on: https://skia-review.googlesource.com/5266
> Reviewed-by: Cary Clark <caryclark@google.com>
> Commit-Queue: Yuqian Li <liyuqian@google.com>
>
TBR=mtklein@chromium.org,caryclark@google.com,liyuqian@google.com,reed@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: Ifd5aa50f155c3ebe2f1495cbf3b8dd706211a639
Reviewed-on: https://skia-review.googlesource.com/5286
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Yuqian Li <liyuqian@google.com>
Every sRGB GM changes, none noticeably.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I632845aea0f40751639cccbcfde8fa270cae0301
Reviewed-on: https://skia-review.googlesource.com/5275
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
BUG=skia:
Change-Id: Iff27097c248a643319e930a6212c5a7155bd0064
Reviewed-on: https://skia-review.googlesource.com/5280
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Remove some unused variants of bitmap generation and a helper that
serves no purpose.
BUG=skia:
TBR=reed@google.com
Change-Id: I16022e7f0242c4511eebdc06d890f6bfdf81d1f9
Reviewed-on: https://skia-review.googlesource.com/5229
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
Apparently we had issues with this on Qualcomm over 3 years ago.
Today it works fine on a Nexus 6P. If we still need this on some devices we should add back a narrower filter.
Change-Id: I0ccbb4918e4df7a8045b9033b9f121aca3c8d8ef
Reviewed-on: https://skia-review.googlesource.com/5278
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
BUG=skia:
Change-Id: I048f54052ca66cb5d73fe9adafe33a690f9dc0d1
Reviewed-on: https://skia-review.googlesource.com/5263
Reviewed-by: Ravi Mistry <rmistry@google.com>
Commit-Queue: Eric Boren <borenet@google.com>
By stashing the scales in the context (i.e. on the stack), we can free up enough registers to really simplify how the bitmap sample stages interact.
Nearest neighbor is straightforward now: just call the appropriate gather_ function, and you're done. The source pixels end up in the source registers. If they're sRGB encoded, follow up with from_srgb_s
To bilerp, we bracket those 1 or 2 gather+from_srgb_s stages with a stage setting up each corner (x += dx, y += dy, save off scale) and a stage that accumulates into the d-registers (load saved scale, dr += scale * r, etc.). When all the samples are accumulated, copy the d-registers into the s-registers.
from_srgb_d and to_srgb are lightly sketched here and will be used in the next CL, where I apply this same factoring to non-bitmap loads and stores. This is a little tricky, because we don't actually have a float->float to_srgb yet.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I272a1f278f0ea1b29a2f07ac225f753faa8dae81
Reviewed-on: https://skia-review.googlesource.com/5271
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug(?) in the tetrahedral interpolation causes output values to go out
of range a bit (1.035/1.0) in the upper range. We will just clamp for
now as a temporary fix.
BUG=668784
Change-Id: I78dd90da7174133e647b1c6c6e914dbde5de123c
Reviewed-on: https://skia-review.googlesource.com/5228
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Robert Aftias <raftias@google.com>
The existing invert() logic explodes when a == 0.
Less terribly, invert() also does not turn 1.0f into 1.0f, so we now use a float divide. This will cause a small diff in the matrix color filter GM due to increased unpremul precision.
There's an alternative to try if this stage turns out to be speed critical:
auto scale = (a == 0.0f).thenElse(0.0f,
a.invert() * (1.0f / SkNf(1.0f).invert()));
The (1.0f / SkNf(1.0f).invert()) bit there is a constant, scaling a bit to make 1.0f produce 1.0f.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I9db72eda108d3d28583a4357f90a0dcd7e4d8a6f
Reviewed-on: https://skia-review.googlesource.com/5227
Reviewed-by: Mike Reed <reed@google.com>
Previously: GrTextureProducer, GrTextureAdjuster, and GrTextureMaker
were all in GrTextureParamsAdjuster.h. Now they're each in their own
header.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5202
Change-Id: I17fa9057b11511aa4d3e15569ea1c378cfec4c80
Reviewed-on: https://skia-review.googlesource.com/5202
Reviewed-by: Robert Phillips <robertphillips@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
BUG=666707
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5089
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: I3bebfdf635d541d92fb84236f0f6fae2da39d691
Reviewed-on: https://skia-review.googlesource.com/5089
Reviewed-by: Bruce Dawson <brucedawson@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This is a batch of little tweaks that all preserve the existing logical behavior:
- rename dst to move_dst_src to parallel move_src_dst
- remove unused swap_src_dst
- move swap_rb up with the other utility stages
- factor out from_8888() to parallel from_565() and from_4444()
- factor out gather() from the accum_* stages
This changes the order of the math in accum_8888[_srgb] ever so slightly, from (scale * C) * (1/255.0f) to scale * (1/255.0f * C). It causes a few pixel diffs, but nothing noticeable. This makes the 8888 bilerp logic consistent with the other formats, which all convert to [0,1] float first before being scaled.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Id37857b91be3086565169dcc9b1a537574e532aa
Reviewed-on: https://skia-review.googlesource.com/5226
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Split these into their own files, and actually name the files after the
classes they contain. The top three classes in the hierarchy still need
attention, but those are going to be trickier.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5195
Change-Id: I295f4d50e35748eac38a31f302e14b5b62653c55
Reviewed-on: https://skia-review.googlesource.com/5195
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>