Reading extern values meant these couldn't be compile-time constants.
math.h has INFINITY, which is a macro that is supposed to expand to float +inf.
On MSVC it seems it's natively a double, so we cast just to make sure.
There's nan(const char*) in math.h for NaN too, but I don't trust that
to be compile-time evaluated. So instead, we keep reinterpreting a bit pattern.
I did try to write
static constexpr float float_nan() { ... }
and completely failed. constexpr seems a bit too restrictive in C++11 to make
it work, but Clang kept telling me, you'll be able to do this with C++14.
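A minimal sketch of the bit-pattern approach described above, not the code from this CL. The helper names (float_from_bits, float_nan, kInfinity) are hypothetical, and it assumes float is IEEE-754 binary32. It is evaluated at run time; it does not try to solve the constexpr problem mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a raw 32-bit pattern as a float.
// memcpy avoids the aliasing problems of casting pointers.
static inline float float_from_bits(uint32_t bits) {
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Cast INFINITY to float in case the macro expands to a double.
static const float kInfinity = static_cast<float>(INFINITY);

// 0x7fc00000 is a quiet NaN in IEEE-754 binary32 (assumed float format).
static inline float float_nan() { return float_from_bits(0x7fc00000u); }
```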
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2233853002
Review-Url: https://codereview.chromium.org/2233853002
Certain Vulkan devices will return different alignment requirements for
a given allocation even if using the same heap. Thus we need to check
this alignment as well when deciding which subheap we want to use in our
memory allocation.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2232803003
Review-Url: https://codereview.chromium.org/2232803003
Reason for revert:
Erg - dumb bug
Original issue's description:
> Create blurred RRect mask on GPU (rather than uploading it)
>
> This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
>
> All blurred rrects using the "analytic" path will change slightly with this CL.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
>
> Committed: https://skia.googlesource.com/skia/+/75ccdc77a70ec2083141bf9ba98eb2f01ece2479
TBR=bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2236493002
This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
All blurred rrects using the "analytic" path will change slightly with this CL.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
Review-Url: https://codereview.chromium.org/2222083004
Instead of growing at SkTDArray's chosen rate (+4, then *1.25),
grow in additive 4K pages. This is my attempt to make realloc()
have the best chance of not copying and to keep fragmentation down.
Because we use a freelist the rate we grow doesn't affect performance
too much.
I'm not getting very reliable numbers, but this looks maybe 5-10% faster
for recording, mainly I think from inlining the allocation fast path into
push().
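One way to read "grow in additive 4K pages" is to round the required byte count up to the next multiple of 4096. The sketch below shows that reading only; kPage and grow_to_pages are hypothetical names, not taken from the CL.

```cpp
#include <cassert>
#include <cstddef>

// Page size assumed from the "4K pages" wording above.
static const size_t kPage = 4096;

// Returns the smallest multiple of kPage that is >= needed bytes, so the
// reserved size only ever grows in whole pages.
static inline size_t grow_to_pages(size_t needed) {
    return (needed + kPage - 1) / kPage * kPage;
}
```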
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2231553002
Review-Url: https://codereview.chromium.org/2231553002
SkPDFFont:
- SkPDFType1Font::populate() encode advances correctly.
- break out logically independent code into new files:
* SkPDFConvertType1FontStream
* SkPDFMakeToUnicodeCmap
SkPDFFont.cpp is now 380 lines smaller.
Expose `SkPDFAppendCmapSections()` for testing.
SkPDFFontImpl.h
- Fold into SkPDFFont.
SkPDFConvertType1FontStream:
- Now assumes it is given an SkStreamAsset
SkPDFFont:
- AdvanceMetric now hidden in an anonymous namespace.
No public API changes.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2221163002
Review-Url: https://codereview.chromium.org/2221163002
These types are ref-counted, but don't otherwise need a vtable.
This makes them good candidates for SkNVRefCnt.
Destruction can be a little more direct, and if nothing else,
sizeof(T) will get a little smaller by dropping the vptr.
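To illustrate the size argument, here is a sketch with two hypothetical base classes (VirtualRefCnt and NonVirtualRefCnt are stand-ins written for this example, not Skia's SkRefCnt / SkNVRefCnt definitions). On common ABIs the virtual version carries a vptr in every object, while the CRTP version does not.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Ref-counted base with a virtual destructor: each object stores a vptr.
class VirtualRefCnt {
public:
    VirtualRefCnt() : fRefCnt(1) {}
    virtual ~VirtualRefCnt() {}
    void ref() { fRefCnt.fetch_add(1, std::memory_order_relaxed); }
    void unref() {
        if (fRefCnt.fetch_sub(1, std::memory_order_acq_rel) == 1) { delete this; }
    }
private:
    std::atomic<int32_t> fRefCnt;
};

// CRTP ref-counted base with no virtual functions: unref() deletes through
// the statically known Derived type, so no vtable is needed.
template <typename Derived>
class NonVirtualRefCnt {
public:
    NonVirtualRefCnt() : fRefCnt(1) {}
    void ref() { fRefCnt.fetch_add(1, std::memory_order_relaxed); }
    void unref() {
        if (fRefCnt.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete static_cast<Derived*>(this);
        }
    }
private:
    std::atomic<int32_t> fRefCnt;
};

struct WithVptr : VirtualRefCnt { int32_t payload = 0; };
struct NoVptr : NonVirtualRefCnt<NoVptr> { int32_t payload = 0; };
```

The size difference is an ABI detail rather than a language guarantee, which is why the description above hedges with "a little smaller".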
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2232433002
Review-Url: https://codereview.chromium.org/2232433002
- This code is entirely private and is not being used by anything.
- In a future CL we will write a class that uses CurveMeasure to compute dash points. In order to determine whether CurveMeasure or PathMeasure should be faster, we need the dash info (the sum of the on/off intervals and how many there are)
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2187083002
Review-Url: https://codereview.chromium.org/2187083002
This simply caps the number of times a display list can be reused.
As this number goes up, the average amount of memory we cache goes up
and the expected number of mallocs per SkLiteDL::New() goes down.
This strategy does not need a hard-coded cap on how many display lists
to cache, or how big they can grow.
TBR=herb@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2226813002
Review-Url: https://codereview.chromium.org/2226813002
About 9x faster than Murmur3 for long inputs.
Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2208903002
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.
This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:
1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord. This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.
2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array. This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.
These two together mean recording is faster. Measuring against the code we use at head, SkLiteRecorder trends about 3x faster across various size pictures, matching speed at 0 draws and beating the special-case 1-draw pictures we have today. (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.) This new strategy records 10 drawRects() in about the same time the old strategy took for 2.
This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.
A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1. That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.
This code is not fully capable yet, but most of the key design points are there. The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.
You can run nanobench --match picture_overhead as a demo. Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go. I have not yet compared playback performance.
It should be simple to wrap this into an SkPicture subclass if we want.
I won't start proposing we replace anything old with anything new quite yet until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago) and I'd like to land, experiment, iterate, especially with an eye toward Android.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002
Review-Url: https://codereview.chromium.org/2213333002
SkPDFFont:
- never request kHAdvance_PerGlyphInfo from typeface.
- set_glyph_widths() fn uses a glyph cache to get advances.
- stop expecting vertical advances that are never requested.
- composeAdvanceData() now non-templated
- appendAdvance() one-line function removed
SkPDFDevice:
- use a glyph cache for getting repeated advances.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2219733004
Review-Url: https://codereview.chromium.org/2219733004
In Firefox, we use SkCanvas::saveLayer in combination with a backdrop that initializes the layer to the background. When this is blended back onto background using transparency, where the source and destination pixel colors are the same, the resulting color after the blend is not preserved due to the lost precision mentioned above. In cases where this operation is repeatedly performed, this causes substantially noticeable differences in color as evidenced in this downstream Firefox bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1200684
In the test-case in the downstream report, essentially it does blend(src=0xFF2E3338, dst=0xFF2E3338, scale=217), which gives the result 0xFF2E3237, while we would expect to get back 0xFF2E3338.
This problem goes away if the blend is instead reformulated to effectively do (src*src_scale + dst*dst_scale)>>8, which keeps the intermediate precision during the addition before shifting it off.
This modifies the blending operations thusly. The performance should remain mostly unchanged, or possibly improve slightly, so there should be no real downside to doing this, with the benefit of making the results more accurate. Without this, it is currently unsafe for Firefox to blend a layer back onto itself that was initialized with a copy of its background.
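A per-channel sketch of the precision loss described above. The function names are hypothetical, and it assumes the two scales sum to 256; Skia's real scale derivation may differ, so this illustrates the rounding effect on a single channel rather than reproducing the full packed-pixel result.

```cpp
#include <cassert>
#include <cstdint>

// Old-style blend of one 8-bit channel: each product is shifted on its own,
// so the fractional bits of both terms are discarded before the add.
static inline uint32_t blend_channel_truncating(uint32_t src, uint32_t dst,
                                                uint32_t src_scale,
                                                uint32_t dst_scale) {
    return ((src * src_scale) >> 8) + ((dst * dst_scale) >> 8);
}

// Reformulated blend: add the full-precision products, then shift once.
static inline uint32_t blend_channel_precise(uint32_t src, uint32_t dst,
                                             uint32_t src_scale,
                                             uint32_t dst_scale) {
    return (src * src_scale + dst * dst_scale) >> 8;
}
```

With src == dst == 0x33 and scales 217 and 39, the truncating form gives 0x32 while the single-shift form gives back 0x33.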
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2097883002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
[mtklein adds...]
No public API changes.
TBR=reed@google.com
Review-Url: https://codereview.chromium.org/2097883002
Change SkDataTable::NewXXX to SkDataTable::MakeXXX and return sk_sp.
This updates users of SkDataTable to sk_sp as well.
There do not appear to be any external users of these methods.
Review-Url: https://codereview.chromium.org/2211143002
With the move from SkData::NewXXX to SkData::MakeXXX most
SkAutoTUnref<SkData> were changed to sk_sp<SkData>. However,
there are still a few SkAutoTUnref<SkData> around, so clean
them up.
Review-Url: https://codereview.chromium.org/2212493002
Motivation: reduce code complexity.
SkCanon stores SkPDFShader::State next to SkPDFObject, not inside.
Many places use sk_sp<T> rather than T* to represent ownership.
SkPDFShader::State no longer holds bitmap.
SkPDFShader::State gets move constructor, no longer heap-allocated.
Classes removed:
SkPDFFunctionShader
SkPDFAlphaFunctionShader
SkPDFImageShader
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2193973002
Review-Url: https://codereview.chromium.org/2193973002
(1) Performance is better or stays the same.
(2) Code is split into functions (RasterPipeline-ish
design). IMO, it's not really more or less readable.
But I think it's now much easier to add capabilities,
apply optimizations, or do more refactors. Or to
actually use RasterPipeline. I held back from trying
any of these to try to keep this CL sane.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2194303002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2194303002
These files have been renamed and exist only as stubs for transition
reasons. Remove these now unused stubs.
CQ_INCLUDE_TRYBOTS=master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2197423003
These files are now so badly misnamed that it is causing problems.
The original files are kept as shells until Chromium and PDFium can
be updated. After Chromium and PDFium builds are updated, the old
files will be removed and the cmake and bzl builds will be updated.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2199973002
CQ_INCLUDE_TRYBOTS=master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2199973002
Lessons learned
1. ImageShader (correctly) always compresses (typically via PNG) during serialization. This has some surprising results:
- if the image was marked opaque, but has some non-opaque pixels (i.e. bug in blitter or caller), then compressing may "fix" those pixels, making the deserialized version draw differently. bug filed.
- 565 compresses/decompresses to 8888 (at least on Mac), which draws differently (esp. under some filters). bug filed.
2. BitmapShader did not enforce a copy for mutable bitmaps, but ImageShader does (since it creates an Image). Thus the former would see subsequent changes to the pixels after shader creation, while the latter does not, hence the change to the BlitRow test to avoid this modify-after-create pattern. I sure hope this prev. behavior was a bug/undefined-behavior, since this CL changes that.
BUG=skia:5595
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195893002
Review-Url: https://codereview.chromium.org/2195893002
Reason for revert:
UBSAN says we're reading a bad bool here:
bool usesDistanceVectorField() const { return fUsesDistanceVectorField; }
../../../include/gpu/GrPaint.h:83:51: runtime error: load of value 239, which is not a valid value for type 'bool'
SUMMARY: AddressSanitizer: undefined-behavior ../../../include/gpu/GrPaint.h:83:51 in
Seems likely also the root of Valgrind failure:
https://luci-milo.appspot.com/swarming/task/30522e4f2241cb10
Original issue's description:
> GrFP can express distance vector field req., program builder declares variable for it
>
> This update allows fragment processors to require a field of vectors to the nearest edge. This requirement propagates:
>
> - from child FPs to their parent
> - from parent FPs to the GrPaint
> - from GrPaint through the PipelineBuilder into GrPipeline
> - accessed from GrPipeline by GrGLSLProgramBuilder
>
> GrGLSL generates a variable for the distance vector and passes it down to the GeometryProcessor->emitCode() method.
>
> This CL's base is the CL for adding the BevelNormalSource API: https://codereview.chromium.org/2080993002
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2114993002
>
> Committed: https://skia.googlesource.com/skia/+/4ef6dfa7089c092c67b0d5ec34e89c1e319af196
TBR=egdaniel@google.com,robertphillips@google.com,bsalomon@google.com,dvonbeck@google.com
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=skia:
Review-Url: https://codereview.chromium.org/2201613002
Reason for revert:
https://luci-milo.appspot.com/swarming/task/3055149a25621b10
Not Nexus 5 specific. Reproduces on Pixel C with --gcc -t Debug -d arm_v7_neon. Not sure about other configs yet.
Original issue's description:
> Tidy up SkNx_neon.
>
> This takes advantage of the fact that all the compilers we use that
> support NEON implement it with their own vector extensions. This means
> normal things like c = a + b work on the underlying vector types already.
> Odd instructions like min or saturated add need to stay intrinsics.
>
> Also, rearrange functions to a more consistent order.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2196773002
> CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/6ad22315eb6eacfcd35497cd118440a619d05b18
TBR=msarett@google.com,mtklein@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=skia:
Review-Url: https://codereview.chromium.org/2196953002
Most visibly this adds a macro SK_RASTER_STAGE that cuts down on the boilerplate of defining a raster pipeline stage function.
Most interestingly, SK_RASTER_STAGE doesn't define a SkRasterPipeline::Fn, but rather a new type EasyFn. This function is always static and inlined, and the details of interacting with the SkRasterPipeline::Stage are taken care of for you: ctx is just passed as a void*, and st->next() is always called. All EasyFns have to do is take care of the meat of the work: update r,g,b, etc. and read and write from their context.
The really neat new feature here is that you can either add EasyFns to a pipeline with the new append() functions, _or_ call them directly yourself. This lets you use the same set of pieces to build either a pipelined version of the function or a custom, fused version. The bench shows this off.
On my desktop, the pipeline version of the bench takes about 25% more time to run than the fused one.
The old approach to creating stages still works fine. I haven't updated SkXfermode.cpp or SkArithmeticMode.cpp because they seemed just as clear using Fn directly as they would have using EasyFn.
If this looks okay to you I will rework the comments in SkRasterPipeline to explain SK_RASTER_STAGE and EasyFn a bit as I've done here in the CL description.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195853002
Review-Url: https://codereview.chromium.org/2195853002
Adds the ability for GLInstancedRendering to use
glDrawElementsInstanced when glDrawElementsIndirect is not supported.
The only remaining 3.1 dependency now is EXT_texture_buffer.
Also moves the cap for glDraw*Instanced out of GrCaps and into
GrGLCaps.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2193303002
Review-Url: https://codereview.chromium.org/2193303002
This takes advantage of the fact that all the compilers we use that
support NEON implement it with their own vector extensions. This means
normal things like c = a + b work on the underlying vector types already.
Odd instructions like min or saturated add need to stay intrinsics.
Also, rearrange functions to a more consistent order.
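The "c = a + b" point can be shown with the generic GCC/Clang vector extension. This sketch uses a hypothetical F4 typedef via vector_size so it builds off ARM too; it is not the NEON float32x4_t code from SkNx_neon, and it will not compile with compilers that lack these extensions.

```cpp
#include <cassert>

// GCC/Clang vector extension: four floats packed in one 16-byte vector.
typedef float F4 __attribute__((vector_size(16)));

// Plain operators work lane-wise on the vector type; no intrinsic needed.
static inline F4 add4(F4 a, F4 b) { return a + b; }
```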
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2196773002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2196773002
Motivation:
SkPDFStream and SkPDFSharedStream now work the same.
Also:
- move SkPDFStream into SkPDFTypes (it's a fundamental PDF type).
- minor refactor of SkPDFSharedStream
- SkPDFSharedStream takes unique_ptr to represent ownership
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2190883003
Review-Url: https://codereview.chromium.org/2190883003
This update allows fragment processors to require a field of vectors to the nearest edge. This requirement propagates:
- from child FPs to their parent
- from parent FPs to the GrPaint
- from GrPaint through the PipelineBuilder into GrPipeline
- accessed from GrPipeline by GrGLSLProgramBuilder
GrGLSL generates a variable for the distance vector and passes it down to the GeometryProcessor->emitCode() method.
This CL's base is the CL for adding the BevelNormalSource API: https://codereview.chromium.org/2080993002
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2114993002
Review-Url: https://codereview.chromium.org/2114993002
New API mirrors the form of similar APIs in SkRegion,
SkMatrix, etc.
This also fixes a bug:
SkImageInfo appears in an object that Chrome stores in
discardable memory. So when sk_sp<SkColorSpace> was added
to SkImageInfo a leak was introduced. We'll use this new
method and deserialize to store the SkColorSpace in the
discardable object.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2192903002
Review-Url: https://codereview.chromium.org/2192903002
Updates stencilRect to call drawNonAAFilledRect instead of
drawFilledRect. drawFilledRect can use coverage AA, which isn't
appropriate for stencil draws. Also modifies drawNonAAFilledRect to
take a "useHWAA" argument instead of trying to deduce whether it
should be used.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2187583002
Review-Url: https://codereview.chromium.org/2187583002
On Debug vulkan bots, running with the debug layers on seems to be adding
more than an hour to the total running time. Since we suppress any output
on the bots anyways the debug layers are serving no purpose. Thus I am
adding a gyp define to disable the layers on the bot.
With this change, by default when running vulkan in Debug, the debug_layers
will be enabled. The bots should disable the layers. Android framework
should also have them disabled by default.
TBR=djsollen@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2185953003
Review-Url: https://codereview.chromium.org/2185953003
DrawContext's isGammaCorrect now just based on presence of color space.
Next change will remove the function and flag entirely, but I wanted to
land this separately. This alters a few GMs in srgb/f16 mode, generally
those that are creating off-screen surfaces in ways that were somewhat
lossy before. No unexplained changes.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2186633002
Review-Url: https://codereview.chromium.org/2186633002
Should feel very similar to Sk4h_store4:
NEON uses its native instruction, SSE unpacks manually.
Since we'll have our F16s in 4 Sk4h by the time we're done here,
this also extracts an Sk4h->Sk4f routine from the old uint64_t->Sk4f one.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2184753002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2184753002
This gives us a little more control over instruction order, allowing
us to pipeline the muls and get better performance. Technically,
clang should be able to do this for us anyway...
Performance on HP z620 (201295.jpg):
toSRGB: 371us -> 356us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2175413002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2175413002
* do a lot less floating-point math by converting to
an integer as early as possible [faster].
* round rather than truncate.
* use 8 significant digits rather than 9 when possible.
* remove trailing zeros in fractions.
before:
0.12 ! PDFScalar nonrendering
after:
0.07 ! PDFScalar nonrendering
Accuracy guaranteed by existing unit test.
Example diffs:
-/Shading <</Function <</C0 [.321568638 .333333343 .321568638]
+/Shading <</Function <</C0 [.32156864 .33333334 .32156864]
-/C1 [.258823543 .270588248 .258823543]
+/C1 [.25882354 .27058825 .25882354]
-1 0 0 -1 20 120.394500 Tm
+1 0 0 -1 20 120.394501 Tm
-1 0 0 -1 20 184.789001 Tm
+1 0 0 -1 20 184.789 Tm
-291.503997 0 l
+291.504 0 l
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2146103004
Review-Url: https://codereview.chromium.org/2146103004
This clamps to [0,1] premul just before every store to memory.
By making the clamp a stage itself, this design makes it easy to move the clamp
around, to replace it with a debug-only assert-we're-clamped stage for certain
formats, clamp in more places, programmatically not clamp, etc. etc.
Before this change, clamping was a little haphazard: store_srgb clamped
R, G and B to [0,1], but not A, and didn't clamp the colors to A. 565
didn't clamp at all.
6 GMs draw subtly differently in sRGB, I think because we've started clamping
colors to alpha to enforce premultiplication better. No changes for 565.
My hope is that now no other stage need ever concern itself with clamping.
So we don't double-clamp, I've added a _noclamp version of sk_linear_to_srgb()
that simply asserts a clamp isn't necessary. This happens to expose the Sk4f
_needs_trunc version that might be useful for power users (*cough* Matt *cough*).
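A scalar sketch of the premul clamp described above: alpha goes to [0,1], then each color goes to [0,alpha]. PremulColor and clamp_premul are hypothetical names for illustration; the actual stage works on the pipeline's vector registers.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical scalar color type for the sketch.
struct PremulColor { float r, g, b, a; };

// Clamp alpha to [0,1] first, then clamp each color channel to [0,alpha]
// so the result is a valid premultiplied color.
static inline PremulColor clamp_premul(PremulColor c) {
    c.a = std::min(std::max(c.a, 0.0f), 1.0f);
    c.r = std::min(std::max(c.r, 0.0f), c.a);
    c.g = std::min(std::max(c.g, 0.0f), c.a);
    c.b = std::min(std::max(c.b, 0.0f), c.a);
    return c;
}
```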
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2178793002
Review-Url: https://codereview.chromium.org/2178793002
This is going to be needed in many more places as I finish connecting the
dots. Even better - I'd like to switch to a world where SkColorSpace !=
nullptr is the only signal we use for gamma-correct rendering, so I can
eliminate SkSourceGammaTreatment and SkSurfaceProps::isGammaCorrect.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2180503002
Review-Url: https://codereview.chromium.org/2180503002
These asserts are firing because of a recent change to our scissor code.
Since these asserts were added, the Vulkan spec has been updated to no
longer require that the scissor be inside the bounds of the image, just that
x + width does not overflow.
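For reference, the remaining "x + width does not overflow" condition could be checked along these lines. scissor_x_fits is a hypothetical helper, and the sketch assumes a signed 32-bit offset and unsigned 32-bit width; it covers only the overflow condition named above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if x + width can be represented as a signed 32-bit value.
// The sum is computed in 64 bits so the check itself cannot overflow.
static inline bool scissor_x_fits(int32_t x, uint32_t width) {
    const int64_t sum = static_cast<int64_t>(x) + static_cast<int64_t>(width);
    return sum <= std::numeric_limits<int32_t>::max();
}
```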
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2171283004
Review-Url: https://codereview.chromium.org/2171283004
This is an experiment / demo to have our 565 backend fold into
SkRasterPipelineBlitter as it grows more powerful. I plan to follow up with
the same for the other 8888 format.
Blur mask filters look significantly different (better) after this change.
We keep the full 13-14-13 bits of precision for mask blits, where the old code
uses 11-11-10 bit intermediates.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2172343002
Review-Url: https://codereview.chromium.org/2172343002
This CL has several parts that are intertwined:
* move pin/wrap functionality into BilerpSampler.
* remove the nearest neighbor and bilerp tilers
* create a simplified general tiler
* remove the pipeline virtual calls bilerpEdge and bilerpSpan because everything works off sample points now.
* redo all the bilerp sampling to use the new local to methods to wrap/pin.
* introduce a new medium rate sample that handles spans with 1 < |dx| < 2.
This change improves the performance as displayed below:
Most of the top 25 desktop SKPs improve or stay the same. A few are worse, but close to the noise floor. In addition, this change produces about 3% smaller code.
old time new time new/old
13274693 8414645 0.633886 top25desk_google_com_search_q_c.skp_1
4946466 3258018 0.658656 top25desk_wordpress.skp_1
6977187 5737584 0.822335 top25desk_youtube_com.skp_1
3770021 3296831 0.874486 top25desk_google_com__hl_en_q_b.skp_1
8890813 8600143 0.967307 top25desk_answers_yahoo_com.skp_1
3178974 3094300 0.973364 top25desk_facebook.skp_1
8871835 8711260 0.981901 top25desk_twitter.skp_1
838509 829290 0.989005 top25desk_blogger.skp_1
2821870 2801111 0.992644 top25desk_plus_google_com_11003.skp_1
511978 509530 0.995219 top25desk_techcrunch_com.skp_1
2408588 2397435 0.995369 top25desk_ebay_com.skp_1
4446919 4448004 1.00024 top25desk_espn.skp_1
2863241 2875696 1.00435 top25desk_google_com_calendar_.skp_1
7170086 7208447 1.00535 top25desk_booking_com.skp_1
7356109 7417776 1.00838 top25desk_pinterest.skp_1
5265591 5340392 1.01421 top25desk_weather_com.skp_1
5675244 5774144 1.01743 top25desk_sports_yahoo_com_.skp_1
1048531 1067663 1.01825 top25desk_games_yahoo_com.skp_1
2075501 2115131 1.01909 top25desk_amazon_com.skp_1
4262170 4370441 1.0254 top25desk_news_yahoo_com.skp_1
3789319 3897996 1.02868 top25desk_docs___1_open_documen.skp_1
919336 949979 1.03333 top25desk_wikipedia__1_tab_.skp_1
4274454 4489369 1.05028 top25desk_mail_google_com_mail_.skp_1
4149326 4376556 1.05476 top25desk_linkedin.skp_1
BUG=skia:5566
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2134893002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Committed: https://skia.googlesource.com/skia/+/8602ede5fdfa721dcad4dcb11db028c1c24265f1
Review-Url: https://codereview.chromium.org/2134893002
Previously, SkClipStack would call "setEmpty" on itself when an
inverse-filled difference element made the stack empty. This was
a problem because setEmpty would forget the element had an inverse
fill, yet leave the op as "difference". This change modifies it to
manually update the clip bounds and set the gen-ID to kEmptyGenID,
rather than calling setEmpty.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2175493002
Review-Url: https://codereview.chromium.org/2175493002
We're using the linear procs for sRGB destinations
and the sRGB procs for linear destinations. Fix that.
C.f. State32::getLCDProc(), which sets flags |= kDstIsSRGB_LCDFlag.
kDstIsSRGB_LCDFlag is (1<<2) == 4, so the sRGB procs must be 4-7, not 0-3.
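The indexing argument can be sketched with a flag-indexed table. Only the value kDstIsSRGB_LCDFlag == (1<<2) comes from the text above; the two low flags and the table are hypothetical placeholders for whatever the real proc table holds.

```cpp
#include <cassert>

// Only kDstIsSRGB_LCDFlag's value is taken from the commit message; the two
// low bits are placeholder flags assumed for this sketch.
enum LCDFlags {
    kPlaceholderA_LCDFlag = 1 << 0,  // assumed
    kPlaceholderB_LCDFlag = 1 << 1,  // assumed
    kDstIsSRGB_LCDFlag    = 1 << 2,
};

// Table indexed by the flag bits: since the sRGB bit is worth 4, every index
// with that bit set falls in 4-7, so those slots must hold the sRGB procs.
static const bool kProcIsSRGB[8] = {
    false, false, false, false,  // 0-3: linear destination
    true,  true,  true,  true,   // 4-7: sRGB destination
};

static inline bool proc_handles_srgb(unsigned flags) {
    return kProcIsSRGB[flags & 7];
}
```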
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2177493002
Review-Url: https://codereview.chromium.org/2177493002
Functions like GrMakeInfoFromTexture encouraged incorrect code to be
written. Similarly, the ability to construct an info from any GrSurface
was never going to be correct. Luckily, the only client of that had all
of the correct parameters much higher on the stack (and dictated or
replaced most of the properties of the returned info anyway).
With this, I can finally remove the color space as an output of the
pixel config -> color type conversion, which was never going to be
correct.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2173513002
Review-Url: https://codereview.chromium.org/2173513002
Improves performance for xforms toSRGB and to2Dot2. Seems
more optimal to save clamping until the end. That way we
don't stall the mul pipeline with a min/max.
toSRGB: 371us -> 346us
to2Dot2: 404us -> 387us
FWIW, it probably makes sense to clamp inside
sk_linear_to_srgb anyway. If not, we should potentially
provide two versions (one that clamps and one that
doesn't).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2173803002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2173803002
Reason for revert:
Crashing on Win with:
Caught exception 3221225477 EXCEPTION_ACCESS_VIOLATION, was running:
unit test GrShape
srgb gm shadertext2
srgb gm shallow_gradient_conical
srgb gm shallow_gradient_sweep
srgb gm shallow_gradient_linear_nodither
step returned non-zero exit code: -1073741819
https://status.skia.org/?commit_label=author&filter=search&search_value=Test-Win-MSVC-GCE-CPU-AVX2-x86-Release
Original issue's description:
> In the current code, tiling and bilerp sampling are strongly tied together. They can be separated by taking advantage of observation that translating a sample point into filter points in the bilerp stage the filter points will be at most 0.5 outside the tile. This allows simplified repositioning for the various tiling modes; clamp and mirror use min and max while repeat has max -> 0 and 0-> max. This allows bilerp to simply treat the filter points that fall off the tile. This allows tiling and bilerp sampling to be totally separate.
>
> This CL has several parts that are intertwined:
> * move pin/wrap functionality into BilerpSampler.
> * remove the nearest neighbor and bilerp tilers
> * create a simplified general tiler
> * remove the pipeline virtual calls bilerpEdge and bilerpSpan because everything works off sample points now.
> * redo all the bilerp sampling to use the new local to methods to wrap/pin.
> * introduce a new medium rate sample that handles spans with 1 < |dx| < 2.
>
> This change improves the performance as displayed below:
> Most of the top 25 desktop SKPs improve or stay the same. A few are worse, but close to the noise floor. In addition, this change produces about 3% smaller code.
>
> old time  new time  new/old   skp
> 13274693  8414645   0.633886  top25desk_google_com_search_q_c.skp_1
> 4946466   3258018   0.658656  top25desk_wordpress.skp_1
> 6977187   5737584   0.822335  top25desk_youtube_com.skp_1
> 3770021   3296831   0.874486  top25desk_google_com__hl_en_q_b.skp_1
> 8890813   8600143   0.967307  top25desk_answers_yahoo_com.skp_1
> 3178974   3094300   0.973364  top25desk_facebook.skp_1
> 8871835   8711260   0.981901  top25desk_twitter.skp_1
> 838509    829290    0.989005  top25desk_blogger.skp_1
> 2821870   2801111   0.992644  top25desk_plus_google_com_11003.skp_1
> 511978    509530    0.995219  top25desk_techcrunch_com.skp_1
> 2408588   2397435   0.995369  top25desk_ebay_com.skp_1
> 4446919   4448004   1.00024   top25desk_espn.skp_1
> 2863241   2875696   1.00435   top25desk_google_com_calendar_.skp_1
> 7170086   7208447   1.00535   top25desk_booking_com.skp_1
> 7356109   7417776   1.00838   top25desk_pinterest.skp_1
> 5265591   5340392   1.01421   top25desk_weather_com.skp_1
> 5675244   5774144   1.01743   top25desk_sports_yahoo_com_.skp_1
> 1048531   1067663   1.01825   top25desk_games_yahoo_com.skp_1
> 2075501   2115131   1.01909   top25desk_amazon_com.skp_1
> 4262170   4370441   1.0254    top25desk_news_yahoo_com.skp_1
> 3789319   3897996   1.02868   top25desk_docs___1_open_documen.skp_1
> 919336    949979    1.03333   top25desk_wikipedia__1_tab_.skp_1
> 4274454   4489369   1.05028   top25desk_mail_google_com_mail_.skp_1
> 4149326   4376556   1.05476   top25desk_linkedin.skp_1
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2134893002
> CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/8602ede5fdfa721dcad4dcb11db028c1c24265f1
TBR=mtklein@google.com,herb@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2174793002
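The repositioning rule from the quoted description can be sketched as follows. This is a minimal illustration, not the code from the CL: it assumes the filter point has already been turned into an integer pixel index that is at most one step past either edge of the tile, and the names TileMode and reposition are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical names for illustration; not Skia's API.
enum class TileMode { kClamp, kMirror, kRepeat };

// Reposition a pixel index that may be one step outside [0, width - 1].
// Precondition: -1 <= i <= width, which is what the "at most 0.5 outside
// the tile" observation gives for bilerp filter points.
inline int reposition(TileMode mode, int i, int width) {
    assert(width > 0 && i >= -1 && i <= width);
    const int max = width - 1;
    switch (mode) {
        case TileMode::kClamp:
        case TileMode::kMirror:
            // One step past an edge pins (clamp) or reflects (mirror)
            // back onto the edge pixel, so min/max covers both modes.
            return std::min(std::max(i, 0), max);
        case TileMode::kRepeat:
            // One step past the far edge wraps to 0, and vice versa.
            if (i > max) return 0;
            if (i < 0) return max;
            return i;
    }
    return i;  // unreachable; keeps compilers quiet
}
```

Because the index is never more than one step out of range, no modulo or general reflection arithmetic is needed in any mode.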
This CL has several parts that are intertwined:
* move pin/wrap functionality into BilerpSampler.
* remove the nearest neighbor and bilerp tilers
* create a simplified general tiler
* remove the pipeline virtual calls bilerpEdge and bilerpSpan because everything works off sample points now.
* redo all the bilerp sampling to use the new local methods to wrap/pin.
* introduce a new medium rate sample that handles spans with 1 < |dx| < 2.
This change improves performance as shown below:
Most of the top 25 desktop pages improve or stay the same. A few are worse, but close to the noise floor. In addition, this change produces about 3% smaller code.
old time  new time  new/old   skp
13274693  8414645   0.633886  top25desk_google_com_search_q_c.skp_1
4946466   3258018   0.658656  top25desk_wordpress.skp_1
6977187   5737584   0.822335  top25desk_youtube_com.skp_1
3770021   3296831   0.874486  top25desk_google_com__hl_en_q_b.skp_1
8890813   8600143   0.967307  top25desk_answers_yahoo_com.skp_1
3178974   3094300   0.973364  top25desk_facebook.skp_1
8871835   8711260   0.981901  top25desk_twitter.skp_1
838509    829290    0.989005  top25desk_blogger.skp_1
2821870   2801111   0.992644  top25desk_plus_google_com_11003.skp_1
511978    509530    0.995219  top25desk_techcrunch_com.skp_1
2408588   2397435   0.995369  top25desk_ebay_com.skp_1
4446919   4448004   1.00024   top25desk_espn.skp_1
2863241   2875696   1.00435   top25desk_google_com_calendar_.skp_1
7170086   7208447   1.00535   top25desk_booking_com.skp_1
7356109   7417776   1.00838   top25desk_pinterest.skp_1
5265591   5340392   1.01421   top25desk_weather_com.skp_1
5675244   5774144   1.01743   top25desk_sports_yahoo_com_.skp_1
1048531   1067663   1.01825   top25desk_games_yahoo_com.skp_1
2075501   2115131   1.01909   top25desk_amazon_com.skp_1
4262170   4370441   1.0254    top25desk_news_yahoo_com.skp_1
3789319   3897996   1.02868   top25desk_docs___1_open_documen.skp_1
919336    949979    1.03333   top25desk_wikipedia__1_tab_.skp_1
4274454   4489369   1.05028   top25desk_mail_google_com_mail_.skp_1
4149326   4376556   1.05476   top25desk_linkedin.skp_1
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2134893002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2134893002
If a line path is sufficiently long relative to the dash interval,
SkDashPathEffect::asPoints can produce so many points that the count
overflows an int, or otherwise yields non-finite values, e.g. a path
from (0,0) to (0,9e15) with a dash interval of 1.
This fixes that by capping the number of points at a sane limit, in this
case 1 million, since that limit is already used in utils/SkDashPath.cpp
and so has precedent.
Downstream Firefox bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1287515
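A minimal sketch of this kind of guard, assuming the point count is simply length / interval. The helper dash_point_count is hypothetical, and this sketch rejects oversized requests outright; it is not the code in SkDashPathEffect::asPoints. Only the 1 million limit comes from the description above.

```cpp
#include <cmath>

// The limit named in the commit message.
constexpr double kMaxDashPoints = 1000000.0;

// Hypothetical helper: returns true and writes the point count if the line
// can be dashed into a reasonable number of points, false otherwise.
inline bool dash_point_count(double length, double interval, int* count) {
    // Reject degenerate input before dividing.
    if (!(interval > 0) || !std::isfinite(length) || length < 0) {
        return false;
    }
    // Do the check in floating point, before any conversion to int, so a
    // huge quotient can never overflow the integer type.
    const double n = std::floor(length / interval);
    if (!std::isfinite(n) || n > kMaxDashPoints) {
        return false;
    }
    *count = static_cast<int>(n);
    return true;
}
```

With the example from the message, a length of 9e15 and an interval of 1 gives a quotient far above the cap, so the helper returns false instead of converting it to int.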
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2165013002
Review-Url: https://codereview.chromium.org/2165013002
Fix another fuzzer bug.
Some PathOps asserts only make sense if the incoming data is
well-behaved. Well-behaved tests set debugging state to
trigger these additional asserts.
Formalize this by creating macros similar to SkASSERT that
check to see if the assert should be skipped.
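A minimal sketch of such a macro, under stated assumptions: DebugState, its skipAssert flag, ASSERT_UNLESS_SKIPPED and checked_half are all hypothetical names, not the PathOps macros added by this change.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the debugging state that well-behaved tests set.
struct DebugState {
    bool skipAssert = true;  // default: input may be hostile, skip strict checks
};

// Enforce `cond` only when the state says the incoming data is well-behaved.
#define ASSERT_UNLESS_SKIPPED(state, cond)                          \
    do {                                                            \
        if (!(state).skipAssert && !(cond)) {                       \
            std::fprintf(stderr, "assert failed: %s\n", #cond);     \
            std::abort();                                           \
        }                                                           \
    } while (0)

// Example use: the evenness check only fires for well-behaved callers.
inline int checked_half(const DebugState& state, int value) {
    ASSERT_UNLESS_SKIPPED(state, value % 2 == 0);
    return value / 2;
}
```

A fuzzer-driven caller leaves skipAssert set and gets a best-effort result; a well-behaved test clears it and gets the strict check.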
TBR=reed@google.com
BUG=629962
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2169863002
Review-Url: https://codereview.chromium.org/2169863002
This trims the SkPM4fPriv methods down to just foolproof methods.
(Anything trying to build these itself is probably wrong.)
Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore,
at least not efficiently, so this refactor is somewhat more invasive
than you might think. Generally this means things using to_4f() are
also making a misstep... that's gone too.
It also does not make sense to try to play games with linear floats
with 255 bias any more. That hack can't work with real sRGB coding.
Rather than update them, I've removed a couple of L32 xfermode fast
paths. I'd even rather drop it entirely...
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2163683002
On Nexus Player and occasionally Nexus 5x we get transparent boxes around
paths. This appears to be because the dFdy call is not as accurate as
dFdx, which is the opposite of Mali 400. As Mali 400 is not supported with
Vulkan, we can go back to using dFdx in this case.
BUG=skia:5523
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163213004
Review-Url: https://codereview.chromium.org/2163213004
I basically just ran a big 5-deep for-loop over the five constants here.
This is the first set of coefficients I found that round trips all bytes.
I suspect there are many such sets.
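The round-trip property being searched for can be checked with a loop like the one below. This is a sketch under assumptions: it takes the round trip to be byte -> linear float -> byte through sRGB transfer functions (the message does not say so), and it uses the standard exact piecewise sRGB formulas in place of the five fitted coefficients, which are not shown here.

```cpp
#include <cmath>

// Standard sRGB transfer functions (exact piecewise form), standing in for
// the fitted approximation the commit is about.
inline double srgb_to_linear(double s) {
    return s <= 0.04045 ? s / 12.92 : std::pow((s + 0.055) / 1.055, 2.4);
}

inline double linear_to_srgb(double l) {
    return l <= 0.0031308 ? l * 12.92
                          : 1.055 * std::pow(l, 1.0 / 2.4) - 0.055;
}

// Returns true if every byte value survives byte -> linear -> byte.
inline bool round_trips_all_bytes() {
    for (int b = 0; b < 256; ++b) {
        const double linear = srgb_to_linear(b / 255.0);
        const int back =
            static_cast<int>(std::lround(linear_to_srgb(linear) * 255.0));
        if (back != b) {
            return false;
        }
    }
    return true;
}
```

A brute-force search would run a check of this shape once per candidate set of coefficients and stop at the first set that passes.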
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2162063003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2162063003
Make SkASSERTF output readable.
Ensure the assert predicate is stringified once.
Make the abort code consistent.
TBR=reed
This doesn't change any public API; most of this should be privatized.
Review-Url: https://codereview.chromium.org/2161103002
This should give us a good baseline to explore using SkRasterPipeline.
A particular colorxform to half float drops from 425us to 282us on my desktop.
Color Xform to Half Float (HP z620)
  Original                               425us
  Trans16 (not 32)                       355us
  Vector Trans16                         378us
  Trans16 + Keep Halfs in Vector         335us
  Vector Trans16 + Keep Halfs in Vector  282us
  Final                                  282us
Color Xform to Half Float (Nexus 5X)
  Original                               556us
  Final                                  472us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2159993003
GrTextureAccess optionally includes an instance, computed from the src
and dst color spaces. In all common cases (no color space for either src
or dst, or same color space for both), no object is allocated.
This change is orthogonal to my attempts to get color space attached to
render targets - regardless of how we choose to do that, this will give
us the source color space at all points where we are connecting src to
dst.
There are many dangling injection points where I've been inserting
nullptr, but I have a record of all of them. Additionally, there are now
three places (the most common simple paths for bitmap/image rendering)
where things are plumbed enough that I expect to have access to the dst
color space (all marked with XFORMTODO).
In addition to getting the dst color space, I need to inject shader code
and uniform uploading for appendTextureLookup and friends.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2154753003
Review-Url: https://codereview.chromium.org/2154753003
Most changes stem from working on an example bracketed
by #if DEBUG_UNDER_DEVELOPMENT // tiger
These exposed many problems with coincident curves,
as well as errors throughout the code.
Fixing these errors also fixed a number of fuzzer-inspired
bug reports.
* Line/Curve Intersections
Check to see if the end of the line nearly intersects
the curve. This was a FIXME in the old code.
* Performance
Use a central chunk allocator.
Plumb the allocator into the global variable state
so that it can be shared. (Note that 'SkGlobalState'
is allocated on the stack and is visible to child
functions but not to other threads.)
* Refactor
Let SkOpAngle grow up from a structure to a class.
Let SkCoincidentSpans grow up from a structure to a class.
Rename enum Alias to AliasMatch.
* Coincidence Rewrite
Add more debugging to coincidence detection.
Parallel debugging routines have read-only logic to report
the current coincidence state so that steps through the
logic can expose whether things got better or worse.
More functions can error-out and cause the pathops
engine to non-destructively exit.
* Accuracy
Remove code that adjusted point locations. Instead,
offset the curve part so that sorted curves all use
the same origin.
Reduce the size (and influence) of magic numbers.
* Testing
The debug suite with verify and the full release suite
./out/Debug/pathops_unittest -v -V
./out/Release/pathops_unittest -v -V -x
expose one error. That error is captured as cubics_d3.
This error exists in the checked-in code as well.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128633003
Review-Url: https://codereview.chromium.org/2128633003
Reason for revert:
Causing the roll to fail on telemetry_perf_unittests (benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.load:search:taobao (and baidu)) and browser_tests (FindInPageControllerTest.FindInPageSpecialURLS).
This is due to triggering the assert in copyFTBitmap
SkASSERT(dstMask.fBounds.width() == static_cast<int>(srcFTBitmap.width));
when called from inside the block guarded by
if (bitmapTransform.isIdentity())
Original issue's description:
> Rotate bitmap strikes with FreeType.
>
> BUG=skia:3490
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2139703002
>
> Committed: https://skia.googlesource.com/skia/+/31e0c1379e6d0ce48196183e295b929af51fa74e
TBR=mtklein@google.com,reed@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3490
Review-Url: https://codereview.chromium.org/2149253005
SkPDFUtils now has a special function (SkPDFUtils::AppendColorComponent)
just for writing out (color/255) as a decimal with three digits of
precision.
SkPDFUnion now has a type to represent a color component. It holds a
uint8_t, but calls into AppendColorComponent to serialize.
Added a unit test that tests all possible input values.
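A minimal sketch of writing (color/255) as a decimal with three digits of precision. The function color_component and its std::string return type are hypothetical, as is the choice to strip trailing zeros; this is not the signature or implementation of SkPDFUtils::AppendColorComponent.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats value/255 with at most three decimal digits.
inline std::string color_component(uint8_t value) {
    // Thousandths of value/255, rounded to nearest using integer math:
    // floor(value * 1000 / 255 + 1/2) == (value * 2000 + 255) / 510.
    const int thousandths = (value * 2000 + 255) / 510;
    if (thousandths == 0) return "0";     // only value == 0
    if (thousandths == 1000) return "1";  // only value == 255
    char buf[8];
    std::snprintf(buf, sizeof(buf), "0.%03d", thousandths);
    std::string out(buf);
    // Drop trailing zeros so "0.200" becomes "0.2". At least one fractional
    // digit is non-zero here, so the loop stops before the decimal point.
    while (out.back() == '0') {
        out.pop_back();
    }
    return out;
}
```

Since the input is a single byte, all 256 outputs can be checked exhaustively, which is what a test over all possible input values amounts to.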
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2151863003
Review-Url: https://codereview.chromium.org/2151863003