Reading extern values meant these couldn't be compile-time constants.
math.h has INFINITY, which is a macro that is supposed to expand to a float +inf.
On MSVC it seems it's natively a double, so we cast just to make sure.
There's nan(const char*) in math.h for NaN too, but I don't trust that
to be compile-time evaluated. So instead, we keep reinterpreting a bit pattern.
I did try to write
static constexpr float float_nan() { ... }
and completely failed. constexpr seems a bit too restrictive in C++11 to make
it work, but Clang kept telling me, you'll be able to do this with C++14.
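A minimal sketch of the two approaches described above. The names kInfinity, kNaN and bits_to_float are made up for illustration and are not the identifiers used in the CL. It uses memcpy for the reinterpretation because that is well defined in standard C++; the CL itself may reinterpret the bits differently.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// INFINITY comes from <cmath>/math.h. The cast covers toolchains where the
// macro does not expand to a float.
static const float kInfinity = static_cast<float>(INFINITY);

// Hypothetical helper: reinterpret a 32-bit pattern as a float.
// memcpy is not constexpr in C++11, so kNaN below is a runtime-initialized
// constant in this sketch.
static inline float bits_to_float(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// 0x7fc00000 is a quiet NaN in IEEE 754 single precision.
static const float kNaN = bits_to_float(0x7fc00000u);
```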
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2233853002
Review-Url: https://codereview.chromium.org/2233853002
Certain Vulkan devices will return different alignment requirements for
a given allocation even if using the same heap. Thus we need to check
this alignment as well when deciding which subheap we want to use in our
memory allocation.
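Roughly, the selection now looks like the sketch below. SubHeap and find_subheap are hypothetical stand-ins for illustration, not the actual types in the CL, and no Vulkan headers are needed to follow the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one subheap.
struct SubHeap {
    uint32_t memoryTypeIndex;  // memory type this subheap was created for
    uint64_t alignment;        // alignment this subheap was created with
    uint64_t freeBytes;        // space still available
};

// Returns the index of the first usable subheap, or -1 if none fits.
// Matching the memory type alone is not enough: the alignment must match too.
static int find_subheap(const std::vector<SubHeap>& heaps,
                        uint32_t memoryTypeIndex,
                        uint64_t size,
                        uint64_t alignment) {
    for (size_t i = 0; i < heaps.size(); ++i) {
        const SubHeap& h = heaps[i];
        if (h.memoryTypeIndex == memoryTypeIndex &&
            h.alignment == alignment &&
            h.freeBytes >= size) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```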
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2232803003
Review-Url: https://codereview.chromium.org/2232803003
Reason for revert:
Erg - dumb bug
Original issue's description:
> Create blurred RRect mask on GPU (rather than uploading it)
>
> This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
>
> All blurred rrects using the "analytic" path will change slightly with this CL.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
>
> Committed: https://skia.googlesource.com/skia/+/75ccdc77a70ec2083141bf9ba98eb2f01ece2479
TBR=bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2236493002
This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
All blurred rrects using the "analytic" path will change slightly with this CL.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
Review-Url: https://codereview.chromium.org/2222083004
Instead of growing at SkTDArray's chosen rate (+4, then *1.25),
grow in additive 4K pages. This is my attempt to make realloc()
have the best chance of not copying and to keep fragmentation down.
Because we use a freelist the rate we grow doesn't affect performance
too much.
I'm not getting very reliable numbers, but this looks maybe 5-10% faster
for recording, mainly I think from inlining the allocation fast path into
push().
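The growth rule amounts to rounding the needed size up to a whole number of 4K pages, something like this sketch. The names kPage and grow_to_pages are made up for illustration.

```cpp
#include <cstddef>

// Assumed page size for this sketch: 4K, as in the description above.
static const size_t kPage = 4096;

// Round the number of bytes needed up to the next multiple of kPage,
// so capacity grows additively in whole pages.
static inline size_t grow_to_pages(size_t bytesNeeded) {
    return (bytesNeeded + kPage - 1) / kPage * kPage;
}
```

The result would then be passed to realloc() as the new capacity.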
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2231553002
Review-Url: https://codereview.chromium.org/2231553002
SkPDFFont:
- SkPDFType1Font::populate() now encodes advances correctly.
- break out logically independent code into new files:
* SkPDFConvertType1FontStream
* SkPDFMakeToUnicodeCmap
SkPDFFont.cpp is now 380 lines smaller.
Expose `SkPDFAppendCmapSections()` for testing.
SkPDFFontImpl.h
- Fold into SkPDFFont.
SkPDFConvertType1FontStream:
- Now assumes it is given an SkStreamAsset
SkPDFFont:
- AdvanceMetric now hidden in an anonymous namespace.
No public API changes.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2221163002
Review-Url: https://codereview.chromium.org/2221163002
These types are ref-counted, but don't otherwise need a vtable.
This makes them good candidates for SkNVRefCnt.
Destruction can be a little more direct, and if nothing else,
sizeof(T) will get a little smaller by dropping the vptr.
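For readers unfamiliar with the pattern, here is a simplified sketch of a non-virtual ref-count base using CRTP. NVRefCnt and Blob are illustrative names and this is not Skia's SkNVRefCnt implementation. The base knows the derived type statically, so unref() can delete it without a virtual destructor.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

// Sketch only: the derived type is a template parameter, so no vtable is needed.
template <typename Derived>
class NVRefCnt {
public:
    NVRefCnt() : fRefCnt(1) {}

    void ref() const { fRefCnt.fetch_add(1, std::memory_order_relaxed); }

    void unref() const {
        // The last owner deletes the object as its concrete type.
        if (fRefCnt.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete static_cast<const Derived*>(this);
        }
    }

    bool unique() const { return fRefCnt.load(std::memory_order_acquire) == 1; }

private:
    mutable std::atomic<int32_t> fRefCnt;
};

// Hypothetical ref-counted type with no virtual methods.
struct Blob : public NVRefCnt<Blob> {
    int payload = 0;
};

static_assert(!std::is_polymorphic<Blob>::value, "Blob carries no vptr");
```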
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2232433002
Review-Url: https://codereview.chromium.org/2232433002
- This code is entirely private and is not being used by anything.
- In a future CL we will write a class that uses CurveMeasure to compute dash points. In order to determine whether CurveMeasure or PathMeasure should be faster, we need the dash info (the sum of the on/off intervals and how many there are)
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2187083002
Review-Url: https://codereview.chromium.org/2187083002
This simply caps the number of times a display list can be reused.
As this number goes up, the average amount of memory we cache goes up
and the expected number of mallocs per SkLiteDL::New() goes down.
This strategy does not need a hard-coded cap on how many display lists
to cache, or how big they can grow.
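A rough sketch of the idea, with hypothetical names (DisplayList, DisplayListPool, maxReuses) rather than the real SkLiteDL code. Each list counts how often it has gone back on the freelist and is simply freed once it reaches the cap. This sketch is not thread-safe.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct DisplayList {
    std::vector<unsigned char> bytes;  // storage kept across reuses
    int reuses = 0;                    // times this list went back on the freelist
};

class DisplayListPool {
public:
    explicit DisplayListPool(int maxReuses) : fMaxReuses(maxReuses) {}

    // Hand out a cached list if there is one, otherwise allocate.
    std::unique_ptr<DisplayList> acquire() {
        if (!fFree.empty()) {
            std::unique_ptr<DisplayList> dl = std::move(fFree.back());
            fFree.pop_back();
            return dl;
        }
        return std::unique_ptr<DisplayList>(new DisplayList);
    }

    // Cache the list for reuse until it has hit the cap, then let it die.
    void release(std::unique_ptr<DisplayList> dl) {
        if (dl->reuses++ < fMaxReuses) {
            dl->bytes.clear();  // keeps capacity, drops contents
            fFree.push_back(std::move(dl));
        }
        // Otherwise dl goes out of scope here and its memory is freed.
    }

    size_t cached() const { return fFree.size(); }

private:
    int fMaxReuses;
    std::vector<std::unique_ptr<DisplayList>> fFree;
};
```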
TBR=herb@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2226813002
Review-Url: https://codereview.chromium.org/2226813002
About 9x faster than Murmur3 for long inputs.
Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2208903002
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.
This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:
1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord. This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.
2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array. This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.
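To make point 1 concrete, here is a toy version of "one big contiguous array": every op is a small header followed by its payload, all appended to a single byte buffer. FlatOps and OpHeader are invented for this sketch and do not reflect SkLiteDL's actual internal layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct OpHeader {
    uint32_t type;  // which op this is
    uint32_t skip;  // bytes from this header to the next one
};

class FlatOps {
public:
    // Append one op: header, then a trivially copyable payload.
    template <typename T>
    void push(uint32_t type, const T& payload) {
        static_assert(std::is_trivially_copyable<T>::value, "payload must be memcpy-able");
        OpHeader h = {type, static_cast<uint32_t>(sizeof(OpHeader) + sizeof(T))};
        size_t at = fBytes.size();
        fBytes.resize(at + h.skip);
        std::memcpy(fBytes.data() + at, &h, sizeof(h));
        std::memcpy(fBytes.data() + at + sizeof(h), &payload, sizeof(payload));
    }

    // Walk the buffer header by header, as playback would.
    int count() const {
        int n = 0;
        for (size_t at = 0; at < fBytes.size();) {
            OpHeader h;
            std::memcpy(&h, fBytes.data() + at, sizeof(h));
            at += h.skip;
            ++n;
        }
        return n;
    }

    size_t bytes() const { return fBytes.size(); }

private:
    std::vector<uint8_t> fBytes;  // all ops live in this single allocation
};
```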
These two together mean recording is faster. Measuring against the code we use at head, SkLiteRecorder trends about 3x faster across pictures of various sizes, matching speed at 0 draws and beating the special-case 1-draw pictures we have today. (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.) This new strategy records 10 drawRects() in about the same time the old strategy took for 2.
This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.
A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1. That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.
This code is not fully capable yet, but most of the key design points are there. The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.
You can run nanobench --match picture_overhead as a demo. Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go. I have not yet compared playback performance.
It should be simple to wrap this into an SkPicture subclass if we want.
I won't start proposing we replace anything old with anything new quite yet until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago) and I'd like to land, experiment, iterate, especially with an eye toward Android.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002
Review-Url: https://codereview.chromium.org/2213333002
SkPDFFont:
- never request kHAdvance_PerGlyphInfo from typeface.
- set_glyph_widths() fn uses a glyph cache to get advances.
- stop expecting vertical advances that are never requested.
- composeAdvanceData() now non-templated
- appendAdvance() one-line function removed
SkPDFDevice:
- use a glyph cache for getting repeated advances.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2219733004
Review-Url: https://codereview.chromium.org/2219733004
In Firefox, we use SkCanvas::saveLayer in combination with a backdrop that initializes the layer to the background. When this is blended back onto background using transparency, where the source and destination pixel colors are the same, the resulting color after the blend is not preserved due to the lost precision mentioned above. In cases where this operation is repeatedly performed, this causes substantially noticeable differences in color as evidenced in this downstream Firefox bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1200684
In the test-case in the downstream report, essentially it does blend(src=0xFF2E3338, dst=0xFF2E3338, scale=217), which gives the result 0xFF2E3237, while we would expect to get back 0xFF2E3338.
This problem goes away if the blend is instead reformulated to effectively do (src*src_scale + dst*dst_scale)>>8, which keeps the intermediate precision during the addition before shifting it off.
This CL changes the blending operations accordingly. The performance should remain mostly unchanged, or possibly improve slightly, so there should be no real downside to doing this, with the benefit of making the results more accurate. Without this, it is currently unsafe for Firefox to blend a layer back onto itself that was initialized with a copy of its background.
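The difference can be reproduced with a per-channel sketch like the one below. The function names are invented. The weights 218 and 39 are an assumption about how scale=217 expands into a source and destination scale; they are chosen because they reproduce the values quoted above, and the real code path may derive them differently.

```cpp
#include <cstdint>

// Old formulation: each product is shifted (truncated) before the add,
// so two fractional parts are lost per channel.
static uint32_t blend_truncate_each(uint32_t src, uint32_t dst,
                                    uint32_t srcScale, uint32_t dstScale) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        uint32_t c = ((s * srcScale) >> 8) + ((d * dstScale) >> 8);
        out |= (c & 0xFF) << shift;
    }
    return out;
}

// New formulation: add at full precision, then shift once.
static uint32_t blend_sum_then_shift(uint32_t src, uint32_t dst,
                                     uint32_t srcScale, uint32_t dstScale) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        uint32_t c = (s * srcScale + d * dstScale) >> 8;
        out |= (c & 0xFF) << shift;
    }
    return out;
}
```

With src = dst = 0xFF2E3338 and the assumed weights, the first function returns 0xFF2E3237 and the second returns 0xFF2E3338.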
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2097883002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
[mtklein adds...]
No public API changes.
TBR=reed@google.com
Review-Url: https://codereview.chromium.org/2097883002
Change SkDataTable::NewXXX to SkDataTable::MakeXXX and return sk_sp.
This updates users of SkDataTable to sk_sp as well.
There do not appear to be any external users of these methods.
Review-Url: https://codereview.chromium.org/2211143002
With the move from SkData::NewXXX to SkData::MakeXXX most
SkAutoTUnref<SkData> were changed to sk_sp<SkData>. However,
there are still a few SkAutoTUnref<SkData> around, so clean
them up.
Review-Url: https://codereview.chromium.org/2212493002
Motivation: reduce code complexity.
SkCanon stores SkPDFShader::State next to SkPDFObject, not inside.
Many places use sk_sp<T> rather than T* to represent ownership.
SkPDFShader::State no longer holds bitmap.
SkPDFShader::State gets move constructor, no longer heap-allocated.
Classes removed:
SkPDFFunctionShader
SkPDFAlphaFunctionShader
SkPDFImageShader
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2193973002
Review-Url: https://codereview.chromium.org/2193973002
(1) Performance is better or stays the same.
(2) Code is split into functions (RasterPipeline-ish
design). IMO, it's not really more or less readable.
But I think it's now much easier to add capabilities,
apply optimizations, or do more refactors. Or to
actually use RasterPipeline. I held back from trying
any of these to try to keep this CL sane.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2194303002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2194303002
These files have been renamed and exist only as stubs for transition
reasons. Remove these now unused stubs.
CQ_INCLUDE_TRYBOTS=master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2197423003
These files are now so badly misnamed that it is causing problems.
The original files are kept as shells until Chromium and PDFium can
be updated. After Chromium and PDFium builds are updated, the old
files will be removed and the cmake and bzl builds will be updated.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2199973002
CQ_INCLUDE_TRYBOTS=master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2199973002
Lessons learned
1. ImageShader (correctly) always compresses (typically via PNG) during serialization. This has some surprising results:
- if the image was marked opaque, but has some non-opaque pixels (i.e. bug in blitter or caller), then compressing may "fix" those pixels, making the deserialized version draw differently. bug filed.
- 565 compresses/decompresses to 8888 (at least on Mac), which draws differently (esp. under some filters). bug filed.
2. BitmapShader did not enforce a copy for mutable bitmaps, but ImageShader does (since it creates an Image). Thus the former would see subsequent changes to the pixels after shader creation, while the latter does not, hence the change to the BlitRow test to avoid this modify-after-create pattern. I sure hope this prev. behavior was a bug/undefined-behavior, since this CL changes that.
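The behavioral difference in point 2 boils down to sharing versus snapshotting the pixels. The sketch below uses plain std::vector and made-up types (SharingShader, SnapshotShader) in place of the real bitmap, image, and shader classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Keeps a pointer to the caller's pixels: later writes show through.
struct SharingShader {
    const std::vector<uint32_t>* pixels;
    uint32_t sample(size_t i) const { return (*pixels)[i]; }
};

// Copies the pixels at creation: later writes are not seen.
struct SnapshotShader {
    std::vector<uint32_t> pixels;
    explicit SnapshotShader(const std::vector<uint32_t>& src) : pixels(src) {}
    uint32_t sample(size_t i) const { return pixels[i]; }
};
```

A test that modifies pixels after creating the shader sees the new value through the sharing version and the old value through the snapshot version, which is why the BlitRow test had to stop using the modify-after-create pattern.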
BUG=skia:5595
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195893002
Review-Url: https://codereview.chromium.org/2195893002