We used to step at a 4-pixel stride as long as possible, then run up to 3 times, one pixel at a time. Now replace those 1-at-a-time runs with a single tail stamp if there are 1-3 remaining pixels.
This style is simply more efficient: e.g. we'll blend and lerp once for 3 pixels instead of 3 times. This should make short blits significantly more efficient. It's also more future-oriented... AVX+ on Intel and SVE on ARM support masked loads and stores, so we can do the entire tail in one direct step.
This also makes it possible to re-arrange the code a bit to encapsulate each stage better. I think generally this code reads more clearly than the old code, but YMMV. I've arranged things so you write one function, but it's compiled into two specializations, one for tail=0 (Body) and one for tail>0 (Tail). It's pretty tidy.
For now I've just burned a register to pass around tail. It's 2 bits now, maybe soon 3 with AVX, and capped at 4 for even the craziest new toys, so there are plenty of places we can pack it if we want to get clever.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2717
Change-Id: I45852a3e5d4c5b5e9315302c46601aee0d32265f
Reviewed-on: https://skia-review.googlesource.com/2717
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Gradients (and other shaders) are going to end up serializing this
particular color space very frequently, so we want a shorthand way of
writing it out. I think it's also helpful to have a clearer way of
creating it (vs. NewNamed(kSRGB_Named)->makeLinearGamma()).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2377763002
Review-Url: https://codereview.chromium.org/2377763002
I've even found the code that's making this happen, just don't know why.
I've added a test to assert that it's safe to assume malloc() is 8-byte aligned.
Test should compile this time.
CQ_INCLUDE_TRYBOTS=master.client.skia.android:Test-Android-Clang-NexusPlayer-CPU-Moorefield-x86-Release-GN_Android-Trybot
Change-Id: I48714b99670c20704adf4f7f216da0d60d7d9bcd
Reviewed-on: https://skia-review.googlesource.com/2662
Reviewed-on: https://skia-review.googlesource.com/2703
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit If8a2898ab3a77571622eb125c97f676e029b902c.
Reason for revert:
../../../../../work/skia/tests/OverAlignedTest.cpp: In function 'void test_OverAligned(skiatest::Reporter*, sk_gpu_test::GrContextFactory*)':
../../../../../work/skia/tests/OverAlignedTest.cpp:19:33: error: invalid operands of types 'void*' and 'int' to binary 'operator&'
REPORTER_ASSERT(r, SkIsAlign8(p));
^
ninja: build stopped: subcommand failed.
Original issue's description:
> Focus -Wno-over-aligned to just 32-bit x86 Android.
>
> I've even found the code that's making this happen, just don't know why.
> I've added a test to assert that it's safe to assume malloc() is 8-byte aligned.
>
> BUG=skia:
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2662
>
> CQ_INCLUDE_TRYBOTS=master.client.skia.android:Test-Android-Clang-NexusPlayer-CPU-Moorefield-x86-Release-GN_Android-Trybot
>
TBR=mtklein@chromium.org,bungeman@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: Ic9b30ce980d8d5155528a6f2b4e1913e5fa95dc0
Reviewed-on: https://skia-review.googlesource.com/2702
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
I've even found the code that's making this happen, just don't know why.
I've added a test to assert that it's safe to assume malloc() is 8-byte aligned.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2662
CQ_INCLUDE_TRYBOTS=master.client.skia.android:Test-Android-Clang-NexusPlayer-CPU-Moorefield-x86-Release-GN_Android-Trybot
Change-Id: If8a2898ab3a77571622eb125c97f676e029b902c
Reviewed-on: https://skia-review.googlesource.com/2662
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Ben Wagner <bungeman@google.com>
Reason for revert:
Let's see if reverting this helps the roll.
Original issue's description:
> My take on SkAlign changes.
>
> Like the other change, it makes SkAlignN(x) macros work for pointers, and makes the macros themselves just syntax sugar for SkAlign<N>(x). We can still decide if we want to sed away the macros independently.
>
> This just does it in a somewhat less repetitive way, and adds some tests.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2368293002
>
> No public API changes.
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/e1a5f4e292384046678edc5c1e360b3e13dc118cTBR=cblume@chromium.org,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2372083002
This cleans up 3 remaining sites using , that probably meant ;
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2605
Change-Id: I5e48bcd85d72a205d2b0c860461dab1ec793dd18
Reviewed-on: https://skia-review.googlesource.com/2605
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Like the other change, it makes SkAlignN(x) macros work for pointers, and makes the macros themselves just syntax sugar for SkAlign<N>(x). We can still decide if we want to sed away the macros independently.
This just does it in a somewhat less repetitive way, and adds some tests.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2368293002
No public API changes.
TBR=reed@google.com
Review-Url: https://codereview.chromium.org/2368293002
In one case the fuzzer was switching the picture's op code to an invalid value
In the other two the fuzzer was maxing out the number of points passed to drawPoints and the number of characters passed to drawTextRSXform. In these cases the validation would fail but still return a pointer into the data stream.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2593
Change-Id: Id6d4e6b7bcbae38ace7ad1d92ffcfa5c02f9fb61
Reviewed-on: https://skia-review.googlesource.com/2593
Reviewed-by: Mike Reed <reed@google.com>
properties (color space), bounds, and (optional) alphaType.
We were being pretty inconsistent before. Raster was honoring all
components of the info. GPU was using the supplied color type, but
propagating the source's color space. All call sites were saying N32.
What we want to do is propagate the original device's color space, and
pick a good format from that. Rather than force all the clients to
jump through hoops constructing an SkImageInfo that meets our criteria,
just have them supply the few bits we care about, and do everything else
internally.
This also lets us always use RGBA on GPU, but N32 on raster.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2349373004
Committed: https://skia.googlesource.com/skia/+/53c38087949252d27cde668368a3eeb59cc2eb00
Review-Url: https://codereview.chromium.org/2349373004
Reason for revert:
DM crash and/or TSAN failure
Original issue's description:
> Change SkSpecialImage::makeSurface and makeTightSurface to take output
> properties (color space), bounds, and (optional) alphaType.
>
> We were being pretty inconsistent before. Raster was honoring all
> components of the info. GPU was using the supplied color type, but
> propagating the source's color space. All call sites were saying N32.
>
> What we want to do is propagate the original device's color space, and
> pick a good format from that. Rather than force all the clients to
> jump through hoops constructing an SkImageInfo that meets our criteria,
> just have them supply the few bits we care about, and do everything else
> internally.
>
> This also lets us always use RGBA on GPU, but N32 on raster.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2349373004
>
> Committed: https://skia.googlesource.com/skia/+/53c38087949252d27cde668368a3eeb59cc2eb00TBR=robertphillips@google.com,reed@google.com,bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2366723004
properties (color space), bounds, and (optional) alphaType.
We were being pretty inconsistent before. Raster was honoring all
components of the info. GPU was using the supplied color type, but
propagating the source's color space. All call sites were saying N32.
What we want to do is propagate the original device's color space, and
pick a good format from that. Rather than force all the clients to
jump through hoops constructing an SkImageInfo that meets our criteria,
just have them supply the few bits we care about, and do everything else
internally.
This also lets us always use RGBA on GPU, but N32 on raster.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2349373004
Review-Url: https://codereview.chromium.org/2349373004
Reason for revert:
fHeapIndex is not used in release, need to hide behind SK_DEBUG. Failing on Perf-Android-Clang-NVIDIA_Shield-GPU-TegraX1-arm64-Release-GN_Android_Vulkan.
Original issue's description:
> Some Vulkan memory fixes and cleanup
>
> * Switch back to not setting transfer_dst on all buffers
> * Add some missing unit tests
> * Add tracking of heap usage for debugging purposes
> * Fall back to non-device-local memory if device-local allocation fails
>
> BUG=skia:5031
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2356343003
>
> Committed: https://skia.googlesource.com/skia/+/c5850e9fdb62cc4ae5ed2b6af51aea92cac07455TBR=egdaniel@google.com,brianosman@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:5031
Review-Url: https://codereview.chromium.org/2358123004
For now, this is just the color space (of the original
requesting device). This is used when constructing
intermediate rendering surfaces, so that we ensure we
land in a surface that's similar/compatible to the
final consumer of the DAG's output.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2357273002
Review-Url: https://codereview.chromium.org/2357273002
For now, the only options are sRGB or WideGamutRGB
Basically, the color option to the gpu config is now of
the form: 8888|srgb[_gamut]|f16[_gamut]
color=8888 still implies legacy behavior
srgb implies 8-bit gamma-correct rendering (via sRGB format)
f16 implies 16-bit gamma-correct rendering (via F16 format)
Either of the last two options can then optionally include
a gamut specifier, either _srgb or _wide.
_srgb selects the (default) sRGB gamut
_wide selects the Adobe Wide Gamut RGB gamut, which is nice
for testing, in that it's significantly wider than sRGB,
so rendering differences are obvious.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2350003003
Review-Url: https://codereview.chromium.org/2350003003
Reason for revert:
Hitting an assert
Original issue's description:
> Support Float32 output from SkColorSpaceXform
>
> * Adds Float32 support to SkColorSpaceXform
> * Changes API to allows clients to ask for F32, updates clients to
> new API
> * Adds Sk4f_load4 and Sk4f_store4 to SkNx
> * Make use of new xform in SkGr.cpp
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003
> CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/43d6651111374b5d1e4ddd9030dcf079b448ec47TBR=brianosman@google.com,mtklein@google.com,scroggo@google.com,mtklein@chromium.org,bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2347473007
* Adds Float32 support to SkColorSpaceXform
* Changes API to allows clients to ask for F32, updates clients to
new API
* Adds Sk4f_load4 and Sk4f_store4 to SkNx
* Make use of new xform in SkGr.cpp
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2339233003
This is a step towards using SkCodec in Chromium, where progressive
decoding is necessary.
Switch from using png_read_row (which expects all the data to be
available) to png_process_data, which uses callbacks when rows are
available.
Create a new API for SkCodec, which supports progressive decoding and
scanline decoding. Future changes will switch the other clients off of
startScanlineDecode and get/skip-Scanlines to the new API.
Remove SkCodec::kNone_ScanlineOrder, which was only used for interlaced
PNG images. In the new API, interlaced PNG fits kTopDown. Also remove
updateCurrScanline(), which was only used by the old implementation for
interlaced PNG.
DMSrcSink:
- In CodecSrc::kScanline_Mode, use the new method for scanline decoding
for the supported formats (just PNG and PNG-in-ICO for now).
fuzz.cpp:
- Remove reference to kNone_ScanlineOrder
SkCodec:
- Add new APIs:
- startIncrementalDecode
- incrementalDecode
- Remove kNone_SkScanlineOrder and updateCurrScanline()
- Set fDstInfo and fOptions in getPixels(). This may not be necessary
for all implementations, but it simplifies things for SkPngCodec.
SkPngCodec:
- Implement new APIs
- Switch from sk_read_fn/png_read_row etc to png_process_data
- Expand AutoCleanPng's role to decode the header and create the
SkPngCodec
- Make the interlaced PNG decoder report how many lines were
initialized during an incomplete decode
SkIcoCodec:
- Implement the new APIs; supported for PNG in ICO
SkSampledCodec:
- Call the new method for decoding scanlines, and fall back to the old
method if the new version is unimplemented
- Remove references to kNone_SkScanlineOrder
tests/CodecPartial:
- Add a test which decodes part of an image, then finishes the decode,
and compares it to the straightforward method
tests/CodecTest:
- Add a test which decodes all scanlines using the new method
- Repurpose the Codec_stripes test to decode using the new method in
sections rather than all at once
- In the method check(), add a parameter for whether the image supports
the new method of scanline decoding, and be explicit about whether an
image supports incomplete
- Test incomplete PNG decodes. We should have been doing it anyway for
non-interlaced (except for an image that is too small - one row), but
the new method supports interlaced incomplete as well
- Make test_invalid_parameters test the new method
- Add a test to ensure that it's safe to fall back to scanline decoding without
rewinding
BUG=skia:4211
The new version was generally faster than the old version (but not significantly so).
Some raw performance differences can be found at https://docs.google.com/a/google.com/spreadsheets/d/1Gis3aRCEa72qBNDRMgGDg3jD-pMgO-FXldlNF9ejo4o/
Design doc can be found at https://docs.google.com/a/google.com/document/d/11Mn8-ePDKwVEMCjs3nWwSjxcSpJ_Cu8DF57KNtUmgLM/
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1997703003
Review-Url: https://codereview.chromium.org/1997703003
The SkFontData type is not exposed externally, so any method which uses
it can be updated to use smart pointers without affecting external
users. Updating this first will make updating the public API much
easier.
This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
appears that no one outside Skia is currently using SkStream::NewfromFile
so this is a good time to update it as well.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
Committed: https://skia.googlesource.com/skia/+/d8c2476a8b1e1e1a1771b17e8dd4db8645914f8c
Review-Url: https://codereview.chromium.org/2339273002
Reason for revert:
Killing Mac
Original issue's description:
> SkFontData to use smart pointers.
>
> The SkFontData type is not exposed externally, so any method which uses
> it can be updated to use smart pointers without affecting external
> users. Updating this first will make updating the public API much
> easier.
>
> This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
> std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
> appears that no one outside Skia is currently using SkStream::NewfromFile
> so this is a good time to update it as well.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
>
> Committed: https://skia.googlesource.com/skia/+/d8c2476a8b1e1e1a1771b17e8dd4db8645914f8cTBR=mtklein@chromium.org,halcanary@google.com,mtklein@google.com,reed@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2343933002
The SkFontData type is not exposed externally, so any method which uses
it can be updated to use smart pointers without affecting external
users. Updating this first will make updating the public API much
easier.
This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
appears that no one outside Skia is currently using SkStream::NewfromFile
so this is a good time to update it as well.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
Review-Url: https://codereview.chromium.org/2339273002
If a quad a conic intersect only where the end of one
is contained by the convex hull of the other, and the
curve contained by the hull is nearly a straight line,
treating it as a line may move the end point to the
other side of the curve.
Detect this by checking to see if the end point is in
the hull, and if so, continue to subdivide the curve
rather than treating it as a line.
This fixes several existing tests that were disabled
earlier this year.
A typo in SkDCurve::nearPoint() prevented detecting when
the end of a line was nearly touching a curve.
Also fixed concidence a bit to get the second half of
tiger further along.
All existing tests, including extended testing in
Release and the first half of tiger, work.
TBR=reed@google.com
BUG=skia:5131
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2338323002
Review-Url: https://codereview.chromium.org/2338323002
The path writer takes constructs the output path out of
curves that satisfy the pathop operation.
Curves contain lists of t/point pairs that may not be
comparable to each other. To match up curve ends in the
output path, look for adjacent curves to have a shared
membership rather than comparing point values.
Use path utilities to connect partial curve lists into
closed contours.
Share the angle code that determines if a curve has become
a degenerate line with the path writer.
Clean up some code on the way, and delete some unused
functions.
TBR=reed@google.com
BUG=5188
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2321973005
Review-Url: https://codereview.chromium.org/2321973005
Also: make sure that all SkPDF unit tests are named SkPDF_* to
make testing changes to SkPDF easier. Other cleanup.
Add test: SkPDF_pdfa_document to verify that flag in public API
works.
SkPDF_JpegIdentification test: test slightly malformed JPEGs to
verify that all code paths work.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2322133003
Review-Url: https://codereview.chromium.org/2322133003
We were effectively storing the transpose, which made all of our
operations on individual colors, and our concatenation of matrices
awkward and backwards.
I'm planning to push this further into Ganesh, where I had incorrectly
adjusted to the previous layout, treating colors as row vectors in the
shaders.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2324843003
Review-Url: https://codereview.chromium.org/2324843003
during curve intersection if their
ends are nearly the same.
Loosen conic/line intersection point
check.
Detect when coincident points are
unordered. This means that points
a/b/c on one curve may appear in
b/c/a order on the opposite curve.
Restructure addMissing to return
success and return if a coincidence
was added as a parameter.
With this, tiger part a works.
Tiger part b exposes bugs around
tight quads that are nearly coincident
with themselves, and are coincident
with something else.
The greedy coicident matcher
may cause the point order to be
out of sync.
Still working out what to do in
this case.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2321773002
Review-Url: https://codereview.chromium.org/2321773002
Reason for revert:
Broken perf bots
Original issue's description:
> Checking for valid colorType, alphaType, colorSpace in SkCodec
>
> * Refactor to share code between SkPngCodec and SkWebpCodec
> * Didn't end up sharing with SkJpegCodec but did refactor
> that code a bit
> * Disallow conversions to F16 with non-linear color spaces
> * Fail to decode if we fail to create a SkColorSpaceXform
> (should be an assert soon). We used to fallback on a
> legacy decode if we failed to create the transform.
> * A bunch of name changes
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2319293003
>
> Committed: https://skia.googlesource.com/skia/+/7a9900d6d34e437bb24beb5524a1f6488ae138c9TBR=scroggo@google.com,brianosman@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2328663002
* Refactor to share code between SkPngCodec and SkWebpCodec
* Didn't end up sharing with SkJpegCodec but did refactor
that code a bit
* Disallow conversions to F16 with non-linear color spaces
* Fail to decode if we fail to create a SkColorSpaceXform
(should be an assert soon). We used to fallback on a
legacy decode if we failed to create the transform.
* A bunch of name changes
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2319293003
Review-Url: https://codereview.chromium.org/2319293003
* Necessary to read ICC profiles
* Will be necessary to implement animation
* Requires consolidated data with length
Doesn't affect decode performance (thought I believe
our performance tests only cover SkStreams with memory
bases).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2311793004
Review-Url: https://codereview.chromium.org/2311793004
Consolidates all flush actions into GrDrawingManager and makes GrContext::flush a passthrough.
Removes the unused and untested discard flush variation.
Replaces the indirect overbudget callback mechanism of GrResourceCache with a flag set by resource cache when it wants to flush that is checked after each draw by GrDrawContext.
Modifies GrResourceCache::notifyFlushOccurred() to take a param indicating whether it triggered the
flush that just occurred.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2307053002
Committed: https://skia.googlesource.com/skia/+/1dbb207babecdae8f1f74ed9d9900c73064df744
Review-Url: https://codereview.chromium.org/2307053002
Reason for revert:
Causing assertions on bots
Original issue's description:
> Restructure flushing relationship between GrContext, GrDrawingManager, and GrResourceCache.
>
> Consolidates all flush actions into GrDrawingManager and makes GrContext::flush a passthrough.
>
> Removes the unused and untested discard flush variation.
>
> Replaces the indirect overbudget callback mechanism of GrResourceCache with a flag set by resource cache when it wants to flush that is checked after each draw by GrDrawContext.
>
> Modifies GrResourceCache::notifyFlushOccurred() to take a param indicating whether it triggered the
> flush that just occurred.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2307053002
>
> Committed: https://skia.googlesource.com/skia/+/1dbb207babecdae8f1f74ed9d9900c73064df744TBR=robertphillips@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2312123003
Consolidates all flush actions into GrDrawingManager and makes GrContext::flush a passthrough.
Removes the unused and untested discard flush variation.
Replaces the indirect overbudget callback mechanism of GrResourceCache with a flag set by resource cache when it wants to flush that is checked after each draw by GrDrawContext.
Modifies GrResourceCache::notifyFlushOccurred() to take a param indicating whether it triggered the
flush that just occurred.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2307053002
Review-Url: https://codereview.chromium.org/2307053002
Conics with very large w values can
be approximated with two straight lines.
This avoids iterating endlessly in an
attempt to create quadratics with unstable
numerics.
Check to see if the first chop generated
a pair of lines within the default
point comparison tolerance.
R=reed@google.com
BUG=643933, 643665
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2312923002
Review-Url: https://codereview.chromium.org/2312923002
Pathops makes up intersections that it doesn't detect directly,
but do exist. For instance, if a is coincident with b, and
b is coincident with c, then for where they overlap
a is coincident with c.
The intersections are made up in different ways. In a few
places, the t values that are detected are interpolated to
guess the t values that represent invented intersections.
The interpolated t is not necessarily linear, but a linear
guess is good enough if the invented t lies between known
t values.
Additionally, improve debugging.
This passes the extended release test suite and additionally
passes the first 17 levels in the tiger test suite;
previously, path ops passed 7 levels.
The tiger suite is composed of 37 levels in increasing
complexity, described by about 300K tests.
TBR=reed@google.com
BUG=skia:5131
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2300203002
Review-Url: https://codereview.chromium.org/2300203002
This is working towards fixing all bugs around simplifying the tiger.
This installment simplifies the point-t intersection list as it is built rather than doing the analysis once the intersections are complete. This avoids getting the list in an inconsistent state and makes coincident checks faster and more stable.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2237223002TBR=reed@google.com
BUG=skia:5131
Review-Url: https://codereview.chromium.org/2237223002
This seems to fix text extraction on Adobe Reader
- Registry/Ordering is now set to Skia/SkiaOrdering.
- Type3 fonts now get a FontDescriptor (force symbolic font).
- CMapName is now Skia-Identity-SkiaOrdering
- CMap behaves correctly for single-byte fonts.
Also:
- SkTestTypeface returns tounicode map for testing.
- Unit test updated
All PDFs render the same
BUG=skia:5606
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2292303004
Review-Url: https://codereview.chromium.org/2292303004
Verify the rules that we're converging on for surfaces:
- For 8888, we only support sRGB-like gamma, or no color space at all.
- For F16, we require a color space, with linear gamma.
- For all other formats, we do not support color spaces.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2270823002
Review-Url: https://codereview.chromium.org/2270823002
Instead of mapping invaid glyphIDs to zero or maxGlyphID,
don't draw them at all.
Validate glyphs when glyph is written, not ahead of time.
Don't allocate array to copy user-provided glyphs.
Easy early exit from SkPDFDevice::internalDrawText()
GlyphPositioner::flush() called ~GlyphPositioner()
SkScopeExit class now exists.
Assume SkTypeface* pointers are now never null in more
places.
precalculate alignmentFactor to clean up code.
SkPDFDevice::updateFont must be called with validated
glyphID. Skip bad glyphs to make this true.
SkPDFDevice::updateFont always succeeds.
SkPDFFont::GetFontResource always succeeds (preconditions are
asserted). If GetMetrics fails, don't call GetFontResource.
SkPDFFont::glyphsToPDFFontEncodingCount() becomes
SkPDFFont::countStretch() and is inlined.
SkPDFFont::glyphsToPDFFontEncoding now works one Glyph at a
time and is inlined.
SkPDFFont::noteGlyphUsage() operates one glyph at a time.
Add SkScopeExit.h; also a unit test for it.
SkPostConfig: Fix SK_UNUSED for Win32.
No public API changes.
TBR=reed@google.com
BUG=625995
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2278703002
Review-Url: https://codereview.chromium.org/2278703002
No more:
#include SK_SFNTLY_SUBSETTER
#include ZLIB_INCLUDE
Also, rename SK_SFNTLY_SUBSETTER to SK_PDF_USE_SFNTLY
to follow my pattern of prefixing SkPDF-specific defines
with 'SK_PDF_'.
The ZLIB_INCLUDE define is no longer is used by anyone.
TODO: rename Sfntly to something pronounceable.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2273343002
Review-Url: https://codereview.chromium.org/2273343002
This removes the notion of keeping track of every different t value
that resolves to the same or a similar point. Other fixes make
this concept unnecessary, and removing it simplifies the code.
This removes an allocation, and speeds up paths with many
overlapping curves.
As a bonus, four fuzzer tests that failed before now succeed.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2275703003
Review-Url: https://codereview.chromium.org/2275703003
SkColorSpaces may be once-ptrs, so we can reuse common color spaces.
This means that they must be thread safe. SkMatrix44 is not
thread safe because it maintains a mutable type mask.
This CL ensures that we precompute the type mask so we
can use const SkMatrix44's safely.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2277463002
Review-Url: https://codereview.chromium.org/2277463002
I think we convinced ourselves that denorms, while a good chunk of half floats,
cover a rather small fraction of the representable range, which is always
close enough to zero to flush.
This makes both paths of the conversion to or from float considerably simpler.
These functions now work for zero-or-normal half floats (excluding infinite, NaN).
I'm not aware of a term for this class so I've called them "ordinary".
A handful of GMs and SKPs draw differently in --config f16, but all imperceptibly.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2256023002
Review-Url: https://codereview.chromium.org/2256023002
PNGs may contain a gAMA chunk that specifies gamma
values as floats. If so, we will use these floats
to create an SkColorSpace.
This CL fixes Equals(), serialize(), and
Deserialize() to correctly handle SkColorSpaces
with strange gammas, where we are unable to fall
back on the profile data.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2221983002
Review-Url: https://codereview.chromium.org/2221983002
Motivation: gross code simplification, also no bitset lookups at draw time.
SkPDFFont owns its glyph useage bitset.
SkPDFSubstituteMap goes away.
SkPDFObject interface is simplified.
SkPDFDocument tracks font usage (as hash set), not glyph usage.
SkPDFFont gets a simpler constructor.
SkPDFFont has first and last glyph set in constructor, not adjusted later.
SkPDFFont implementations are simplified.
SkPDFGlyphSet is replaced with simple SkBitSet.
SkPDFFont sizes its SkBitSets based on glyph count.
SkPDFGlyphSetMap goes away.
SkBitSet is now non-copyable.
SkBitSet now how utility methods to match old SkPDFGlyphSet.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2253283004
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Win-MSVC-GCE-CPU-AVX2-x86_64-Release-GDI-Trybot,Test-Win-MSVC-GCE-CPU-AVX2-x86_64-Debug-GDI-Trybot
Review-Url: https://codereview.chromium.org/2253283004
In rare cases, floating point error causes the tesselator to add the
same Vertex to more than one Poly on the same side. This was not a big
problem when we were allocating new vertices when constructing Polys,
but after https://codereview.chromium.org/2029243002 it causes more serious
issues, since each Edge can only belong to two Polys, and violating this
condition messes up the linked list of Edges used for left & right Polys
and the associated estimated vertex count.
The fix is to simply let the first Poly win, and skip that vertex for
subsequent Polys. Since this only occurs in cases where vertices are very
close to each other, it should have little visual effect.
This is also exercised by Nebraska-StateSeal.svg.
BUG=skia:5636
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2259493002
Review-Url: https://codereview.chromium.org/2259493002
Keep isOpaque as a convenience method -- many places really only need to
know that for optimization purposes (SrcOver -> Src, etc...).
In all the places where we pull data back out or convert to another
object and need to supply an SkImageInfo, we can avoid losing information
about premulness.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2250663002
Review-Url: https://codereview.chromium.org/2250663002
GrAppliedClip was about at its limit for how many "make" functions it
could have. Window rectangles would push it over the edge. This change
makes it so GrDrawTarget supplies the original draw bounds to the
constructor, and then GrClip adds the various required clipping
techniques.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2246113002
Review-Url: https://codereview.chromium.org/2246113002
Impl Overview
(1) Keep the device clip bounds up to date. This
requires minimal additional work in a few places
throughout canvas.
(2) Keep track of if the ctm isScaleTranslate. Yes,
there's a function that does this, but it's slow
to call.
(3) Perform the src->device transform in quick reject,
then check intersection/nan.
Other Notes:
(1) NaN and intersection checks are performed
simultaneously.
(2) We no longer quick reject infinity.
(3) Affine and perspective are both handled in the slow
case.
(4) SkRasterClip::isEmpty() is handled by the intersection
check.
Performance on Nexus 6P:
93.2ms -> 59.8ms
Overall Android Jank Tests Performance Impact:
Should gain us a ms or two on some tests.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2225393002
Committed: https://skia.googlesource.com/skia/+/d22a817ff57986407facd16af36320fc86ce02da
Review-Url: https://codereview.chromium.org/2225393002
Impl Overview
(1) Keep the device clip bounds up to date. This
requires minimal additional work in a few places
throughout canvas.
(2) Keep track of if the ctm isScaleTranslate. Yes,
there's a function that does this, but it's slow
to call.
(3) Perform the src->device transform in quick reject,
then check intersection/nan.
Other Notes:
(1) NaN and intersection checks are performed
simultaneously.
(2) We no longer quick reject infinity.
(3) Affine and perspective are both handled in the slow
case.
(4) SkRasterClip::isEmpty() is handled by the intersection
check.
Performance on Nexus 6P:
93.2ms -> 59.8ms
Overall Android Jank Tests Performance Impact:
Should gain us a ms or two on some tests.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2225393002
Review-Url: https://codereview.chromium.org/2225393002
Reason for revert:
Erg - dumb bug
Original issue's description:
> Create blurred RRect mask on GPU (rather than uploading it)
>
> This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
>
> All blurred rrects using the "analytic" path will change slightly with this CL.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
>
> Committed: https://skia.googlesource.com/skia/+/75ccdc77a70ec2083141bf9ba98eb2f01ece2479TBR=bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2236493002
This CL doesn't try to resolve any of the larger issues. It just moves the computation of the blurred RRect to the gpu and sets up to start using vertex attributes for a nine patch draw (i.e., returning the texture coordinates)
All blurred rrects using the "analytic" path will change slightly with this CL.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2222083004
Review-Url: https://codereview.chromium.org/2222083004
SkPDFFont:
- SkPDFType1Font::populate() encode advances correctly.
- break out logically independent code into new files:
* SkPDFConvertType1FontStream
* SkPDFMakeToUnicodeCmap
SkPDFFont.cpp is now 380 lines smaller.
Expose `SkPDFAppendCmapSections()` for testing.
SkPDFFontImpl.h
- Fold into SkPDFFont.
SkPDFConvertType1FontStream:
- Now assume given a SkStreamAsset
SkPDFFont:
- AdvanceMetric now hidden in a anonymous namespace.
No public API changes.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2221163002
Review-Url: https://codereview.chromium.org/2221163002
About 9x faster than Murmur3 for long inputs.
Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2208903002
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.
This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:
1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord. This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.
2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array. This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.
These two together mean recording is faster. Measuring against the code we use at head, SkLiteRecorder trends about ~3x faster across various size pictures, matching speed at 0 draws and beating the special-case 1-draw pictures we have today. (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.) This new strategy records 10 drawRects() in about the same time the old strategy took for 2.
This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.
A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1. That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.
This code is not fully capable yet, but most of the key design points are there. The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.
You can run nanobench --match picture_overhead as a demo. Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go. I have not yet compared playback performance.
It should be simple to wrap this into an SkPicture subclass if we want.
I won't start proposing we replace anything old with anything new quite yet until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago) and I'd like to land, experiment, iterate, especially with an eye toward Android.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002
Review-Url: https://codereview.chromium.org/2213333002
Change SkDataTable::NewXXX to SkDataTable::MakeXXX and return sk_sp.
This updates users of SkDataTable to sk_sp as well.
There do not appear to be any external users of these methods.
Review-Url: https://codereview.chromium.org/2211143002
With the move from SkData::NewXXX to SkData::MakeXXX most
SkAutoTUnref<SkData> were changed to sk_sp<SkData>. However,
there are still a few SkAutoTUnref<SkData> around, so clean
them up.
Review-Url: https://codereview.chromium.org/2212493002
Motivation: reduce code complexity.
SkCanon stores SkPDFShader::State next to SkDFObject, not inside.
many places use sk_sp<T> rather than T* to represent ownership.
SkPDFShader::State no longer holds bitmap.
SkPDFShader::State gets move constructor, no longer heap-allocated.
Classes removed:
SkPDFFunctionShader
SkPDFAlphaFunctionShader
SkPDFImageShader
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2193973002
Review-Url: https://codereview.chromium.org/2193973002
Lessons learned
1. ImageShader (correctly) always compresses (typically via PNG) during serialization. This has the surprise results of
- if the image was marked opaque, but has some non-opaque pixels (i.e. bug in blitter or caller), then compressing may "fix" those pixels, making the deserialized version draw differently. bug filed.
- 565 compressess/decompresses to 8888 (at least on Mac), which draws differently (esp. under some filters). bug filed.
2. BitmapShader did not enforce a copy for mutable bitmaps, but ImageShader does (since it creates an Image). Thus the former would see subsequent changes to the pixels after shader creation, while the latter does not, hence the change to the BlitRow test to avoid this modify-after-create pattern. I sure hope this prev. behavior was a bug/undefined-behavior, since this CL changes that.
BUG=skia:5595
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195893002
Review-Url: https://codereview.chromium.org/2195893002
Most visibly this adds a macro SK_RASTER_STAGE that cuts down on the boilerplate of defining a raster pipeline stage function.
Most interestingly, SK_RASTER_STAGE doesn't define a SkRasterPipeline::Fn, but rather a new type EasyFn. This function is always static and inlined, and the details of interacting with the SkRasterPipeline::Stage are taken care of for you: ctx is just passed as a void*, and st->next() is always called. All EasyFns have to do is take care of the meat of the work: update r,g,b, etc. and read and write from their context.
The really neat new feature here is that you can either add EasyFns to a pipeline with the new append() functions, _or_ call them directly yourself. This lets you use the same set of pieces to build either a pipelined version of the function or a custom, fused version. The bench shows this off.
On my desktop, the pipeline version of the bench takes about 25% more time to run than the fused one.
The old approach to creating stages still works fine. I haven't updated SkXfermode.cpp or SkArithmeticMode.cpp because they seemed just as clear using Fn directly as they would have using EasyFn.
If this looks okay to you I will rework the comments in SkRasterPipeline to explain SK_RASTER_STAGE and EasyFn a bit as I've done here in the CL description.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195853002
Review-Url: https://codereview.chromium.org/2195853002
Motivation:
SkPDFStream and SkPDFSharedStream now work the same.
Also:
- move SkPDFStream into SkPDFTypes (it's a fundamental PDF type).
- minor refactor of SkPDFSharedStream
- SkPDFSharedStream takes unique_ptr to represent ownership
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2190883003
Review-Url: https://codereview.chromium.org/2190883003
New API mirrors the form of similar APIs in SkRegion,
SkMatrix, etc.
This also fixes a bug:
SkImageInfo appears in a object that Chrome stores in
discardable memory. So when sk_sp<SkColorSpace> was added
to SkImageInfo a leak was introduced. We'll use this new
method and deserialize to store the SkColorSpace in the
discardable object.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2192903002
Review-Url: https://codereview.chromium.org/2192903002
The handling of angle-gl changes with SK_ANGLE. We don't have any bots
that test this particular combination, but I see it all the time while
running DM for other things.
Previously considered changing things so that the config parsing results
are consistent, regardless of GYP_DEFINES, but this is much simpler (and
more consistent with the other code we already have for testing config
parsing).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2183323004
Review-Url: https://codereview.chromium.org/2183323004
AtomicTest was the only use of sk_atomic_add().
AtomicInc64 bench was the only use of sk_atomic_inc(int64_t*).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2183473005
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot,Test-Ubuntu-GCC-Golo-GPU-GT610-x86_64-Release-TSAN-Trybot
Review-Url: https://codereview.chromium.org/2183473005
* do a lot less floating-point math by converting to
an integer as early as possible [faster].
* round rather than truncate.
* use 8 significant digits rather than 9 when possible.
* remove trailing zeros in fractions.
before:
0.12 ! PDFScalar nonrendering
after:
0.07 ! PDFScalar nonrendering
Accuracy guaranteed by existing unit test.
Example diffs:
-/Shading <</Function <</C0 [.321568638 .333333343 .321568638]
+/Shading <</Function <</C0 [.32156864 .33333334 .32156864]
-/C1 [.258823543 .270588248 .258823543]
+/C1 [.25882354 .27058825 .25882354]
-1 0 0 -1 20 120.394500 Tm
+1 0 0 -1 20 120.394501 Tm
-1 0 0 -1 20 184.789001 Tm
+1 0 0 -1 20 184.789 Tm
-291.503997 0 l
+291.504 0 l
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2146103004
Review-Url: https://codereview.chromium.org/2146103004
Previously, SkClipStack would call "setEmpty" on itself when an
inverse-filled difference element made the stack empty. This was
a problem because setEmpty would forget the element had an inverse
fill, yet leave the op as "difference". This change modifies it to
manually update the clip bounds and set the gen-ID to kEmptyGenID,
rather than calling setEmpty.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2175493002
Review-Url: https://codereview.chromium.org/2175493002
If the length of a line path is sufficiently long relative to the dash
interval, it is possible to cause SkDashPathEffect::asPoints to produce
so many points that it overflows the amount that can fit in an int type,
or otherwise produce non-finite values, i.e. path from (0,0) to (0,9e15)
with a dash interval of 1.
This fixes that by capping the amount of points to a sane limit - in this
case, 1mil, since that limit is also used in utils/SkDashPath.cpp and has
precedent.
Downstream Firefox bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1287515
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2165013002
Review-Url: https://codereview.chromium.org/2165013002
Fix another fuzzer bug.
Some PathOps asserts only make sense if the incoming data is
well-behaved. Well-behaved tests set debugging state to
trigger these additional asserts.
Formalize this by creating macros similar to SkASSERT that
check to see if the assert should be skipped.
TBR=reed@google.com
BUG=629962
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2169863002
Review-Url: https://codereview.chromium.org/2169863002
This trims the SkPM4fPriv methods down to just foolproof methods.
(Anything trying to build these itself is probably wrong.)
Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore,
at least not efficiently, so this refactor is somewhat more invasive
than you might think. Generally this means things using to_4f() are
also making a misstep... that's gone too.
It also does not make sense to try to play games with linear floats
with 255 bias any more. That hack can't work with real sRGB coding.
Rather than update them, I've removed a couple of L32 xfermode fast
paths. I'd even rather drop it entirely...
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2163683002
I basically just ran a big 5-deep for-loop over the five constants here.
This is the first set of coefficients I found that round trips all bytes.
I suspect there are many such sets.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2162063003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2162063003
This should give us a good baseline to explore using SkRasterPipeline.
A particular colorxform to half float drops from 425us to 282us on my desktop.
Color Xform to Half Float (HP z620)
Original 425us
Trans16 (not 32) 355us
Vector Trans16 378us
Trans16 + Keep Halfs in Vector 335us
Vector Trans16 + Keep Halfs in Vector 282us
Final 282us
Color Xform to Half Float (Nexus 5X)
Original 556us
Final 472us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2159993003
GrTextureAccess optionally includes an instance, computed from the src
and dst color spaces. In all common cases (no color space for either src
or dst, or same color space for both), no object is allocated.
This change is orthogonal to my attempts to get color space attached to
render targets - regardless of how we choose to do that, this will give
us the source color space at all points where we are connecting src to
dst.
There are many dangling injection points where I've been inserting
nullptr, but I have a record of all of them. Additionally, there are now
three places (the most common simple paths for bitmap/image rendering)
where things are plumbed enough that I expect to have access to the dst
color space (all marked with XFORMTODO).
In addition to getting the dst color space, I need to inject shader code
and uniform uploading for appendTextureLookup and friends.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2154753003
Review-Url: https://codereview.chromium.org/2154753003
Most changes stem from working on an examples bracketed
by #if DEBUG_UNDER_DEVELOPMENT // tiger
These exposed many problems with coincident curves,
as well as errors throughout the code.
Fixing these errors also fixed a number of fuzzer-inspired
bug reports.
* Line/Curve Intersections
Check to see if the end of the line nearly intersects
the curve. This was a FIXME in the old code.
* Performance
Use a central chunk allocator.
Plumb the allocator into the global variable state
so that it can be shared. (Note that 'SkGlobalState'
is allocated on the stack and is visible to children
functions but not other threads.)
* Refactor
Let SkOpAngle grow up from a structure to a class.
Let SkCoincidentSpans grow up from a structure to a class.
Rename enum Alias to AliasMatch.
* Coincidence Rewrite
Add more debugging to coincidence detection.
Parallel debugging routines have read-only logic to report
the current coincidence state so that steps through the
logic can expose whether things got better or worse.
More functions can error-out and cause the pathops
engine to non-destructively exit.
* Accuracy
Remove code that adjusted point locations. Instead,
offset the curve part so that sorted curves all use
the same origin.
Reduce the size (and influence) of magic numbers.
* Testing
The debug suite with verify and the full release suite
./out/Debug/pathops_unittest -v -V
./out/Release/pathops_unittest -v -V -x
expose one error. That error is captured as cubics_d3.
This error exists in the checked in code as well.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128633003
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128633003
Review-Url: https://codereview.chromium.org/2128633003
SkPDFUtils now has a special function (SkPDFUtils::AppendColorComponent)
just for writing out (color/255) as a decimal with three digits of
precision.
SkPDFUnion now has a type to represent a color component. It holds a
utint_8, but calls into AppendColorComponent to serialize.
Added a unit test that tests all possible input values.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2151863003
Review-Url: https://codereview.chromium.org/2151863003
It's become clear we need to sometimes deal with values <0 or >1.
I'm not yet convinced we care about NaN or +-inf.
We had some fairly clever tricks and optimizations here for NEON
and SSE. I've thrown them out in favor of a single implementation.
If we find the specializations mattered, we can certainly figure out
how to extend them to this new range/domain.
This happens to add a vectorized float -> half for ARMv7, which was
missing from the _01 version. (The SSE strategy was not portable to
platforms that flush denorm floats to zero.)
I've tested the full float range for FloatToHalf on my desktop and a 5x.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
Review-Url: https://codereview.chromium.org/2145663003
Reason for revert:
Unit tests fail on Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast
Original issue's description:
> Expand _01 half<->float limitation to _finite. Simplify.
>
> It's become clear we need to sometimes deal with values <0 or >1.
> I'm not yet convinced we care about NaN or +-inf.
>
> We had some fairly clever tricks and optimizations here for NEON
> and SSE. I've thrown them out in favor of a single implementation.
> If we find the specializations mattered, we can certainly figure out
> how to extend them to this new range/domain.
>
> This happens to add a vectorized float -> half for ARMv7, which was
> missing from the _01 version. (The SSE strategy was not portable to
> platforms that flush denorm floats to zero.)
>
> I've tested the full float range for FloatToHalf on my desktop and a 5x.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
> CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90TBR=msarett@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2151023003
It's become clear we need to sometimes deal with values <0 or >1.
I'm not yet convinced we care about NaN or +-inf.
We had some fairly clever tricks and optimizations here for NEON
and SSE. I've thrown them out in favor of a single implementation.
If we find the specializations mattered, we can certainly figure out
how to extend them to this new range/domain.
This happens to add a vectorized float -> half for ARMv7, which was
missing from the _01 version. (The SSE strategy was not portable to
platforms that flush denorm floats to zero.)
I've tested the full float range for FloatToHalf on my desktop and a 5x.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2145663003
If we make sure all SkOpts functions are static, we can give the namespaces any
name we like. This lets us drop the sk_ prefix and give a real indication of
the default SIMD instruction set rather than just saying sk_default.
Both of these changes help debugger, profiler, and crash report readability.
Perhaps more importantly, keeping these functions static helps prevent
accidentally linking in unused versions of functions, as you see here with
sk_avx::srcover_srgb_srgb().
This requires we update SkBlend_opts tests and benches to call SkOpts functions
through SkOpts rather than declaring the methods externally. In practice this
drops testing of the SSE2 version on machines with SSE4. If we still really
need to test/bench the compile time best SIMD level version of this method
against the runtime detected best, we can include SkBlend_opts.h into the tests
or benches directly, similar to what we do for the trivial, brute-force, or best
non-SIMD versions.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145833002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2145833002
Now that there may be multiple font managers in a process the typeface
ids must be unique across all typefaces, not just unique within a font
manager. If two typefaces have the same id there will be issues in the
glyph cache. All existing font managers were already doing this by
calling SkFontCache::NewFontID, so centralize this in SkTypeface.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147733002
Review-Url: https://codereview.chromium.org/2147733002
Reason for revert:
Monotonicity assert is failing on ARM. (Different rsqrt() and invert() precision?) Will investigate a bit tomorrow... might reland with the test TODO.
Original issue's description:
> Move sRGB <-> linear conversion components to their own files.
>
> This makes them a little easier to use outside SkColorXform code.
>
> I've added some notes about how best to use them and their eccentricities, and added a test.
>
> Ultimately any software sRGB <-> linear conversion should funnel somehow through here.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128893002
> CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/45e58c8807179638980aae8503573b950b844e4cTBR=reed@google.com,msarett@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2131793002
This makes them a little easier to use outside SkColorXform code.
I've added some notes about how best to use them and their eccentricities, and added a test.
Ultimately any software sRGB <-> linear conversion should funnel somehow through here.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128893002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2128893002
The original caching logic for sample locations wishfully assumed that
the GPU would always use the same sample pattern for render targets
that had the same number of samples. It turns out we can't rely on
that. This change improves the caching logic to handle mismatched
simple patterns with the same count, and adds a unit test that
emulates different sample patterns observed on real hardware.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2111423002
Review-Url: https://codereview.chromium.org/2111423002