Since we're already approximating the sRGB gamma curve with a sqrt(), we might
as well approximate with it a faster approximate sqrt(). On Intel, this
.rsqrt().invert() version is 2-3x faster than .sqrt() (~3x faster on older
machines, ~2x faster on newer machines).
This should provide ~11 bits of precision, suspiciously exactly enough.
Running dm --config srgb, there are diffs, but none perceptible.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2046063002
Review-Url: https://codereview.chromium.org/2046063002
On my Mac (so, immintrin), this improves compile time, both wall and cpu,
by about 16%. To test I ran this on an SSD with files hot in their caches:
$ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
ninja -C out/Release -t clean && \
time ninja -C out/Release
Before: 159 wall / 3367 cpu
159 wall / 3368 cpu
After: 137 wall / 2860 cpu
136 wall / 2863 cpu
I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
That made no signficant difference, so I've kept immintrin for its simplicity.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
TBR=reed@google.com
No public API changes.
Review-Url: https://codereview.chromium.org/2045633002
This change is a refactorization to open up more flexable use for render passes
in the future. Once we start bundling draw calls into a single render pass we
will need to start using specific load and store ops instead of the default
currently of load and store everything.
We still will need to simply get a compatible render pass for creation of framebuffers
and pipeline objects. These render passes don't need to know load and store ops. Thus
in this change, I'm defaulting the "compatible" render pass to always be the one
which both loads and stores. All other load/store combinations will be created lazily
when requested for a specific draw (future change).
The CompatibleRPHandle is a way for us to avoid analysing the RenderTarget every time
we want to get a new GrVkRenderPass.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1977403002
Review-Url: https://codereview.chromium.org/1977403002
Reason for revert:
Still causing problems in Google3, e.g.
https://test.corp.google.com/ui#cl=124138817&flags=CAMQBQ==&id=OCL:124138817:BASE:124139560:1465227435491:219ffbdb&t=//third_party/skia/HEAD:dm
Original issue's description:
> Make SkPngCodec decode progressively.
>
> This is a step towards using SkCodec in Chromium, where progressive
> decoding is necessary.
>
> Switch from using png_read_row (which expects all the data to be
> available) to png_process_data, which uses callbacks when rows are
> available.
>
> Create a new API for SkCodec, which supports progressive decoding and
> scanline decoding. Future changes will switch the other clients off of
> startScanlineDecode and get/skip-Scanlines to the new API.
>
> Remove SkCodec::kNone_ScanlineOrder, which was only used for interlaced
> PNG images. In the new API, interlaced PNG fits kTopDown. Also remove
> updateCurrScanline(), which was only used by the old implementation for
> interlaced PNG.
>
> DMSrcSink:
> - In CodecSrc::kScanline_Mode, use the new method for scanline decoding
> for the supported formats (just PNG and PNG-in-ICO for now).
>
> fuzz.cpp:
> - Remove reference to kNone_ScanlineOrder
>
> SkCodec:
> - Add new APIs:
> - startIncrementalDecode
> - incrementalDecode
> - Remove kNone_SkScanlineOrder and updateCurrScanline()
>
> SkPngCodec:
> - Implement new APIs
> - Switch from sk_read_fn/png_read_row etc to png_process_data
> - Expand AutoCleanPng's role to decode the header and create the
> SkPngCodec
> - Make the interlaced PNG decoder report how many lines were
> initialized during an incomplete decode
> - Make initializeSwizzler return a bool instead of an SkCodec::Result
> (It only returned kSuccess or kInvalidInput anyway)
>
> SkIcoCodec:
> - Implement the new APIs; supported for PNG in ICO
>
> SkSampledCodec:
> - Call the new method for decoding scanlines, and fall back to the old
> method if the new version is unimplemented
> - Remove references to kNone_SkScanlineOrder
>
> tests/CodecPartial:
> - Add a test which decodes part of an image, then finishes the decode,
> and compares it to the straightforward method
>
> tests/CodecTest:
> - Add a test which decodes all scanlines using the new method
> - Repurpose the Codec_stripes test to decode using the new method in
> sections rather than all at once
> - In the method check(), add a parameter for whether the image supports
> the new method of scanline decoding, and be explicit about whether an
> image supports incomplete
> - Test incomplete PNG decodes. We should have been doing it anyway for
> non-interlaced (except for an image that is too small - one row), but
> the new method supports interlaced incomplete as well
> - Make test_invalid_parameters test the new method
> - Add a test to ensure that it's safe to fall back to scanline decoding without
> rewinding
>
> BUG=skia:4211
>
> The new version was generally faster than the old version (but not significantly so).
>
> Some raw performance differences can be found at https://docs.google.com/a/google.com/spreadsheets/d/1Gis3aRCEa72qBNDRMgGDg3jD-pMgO-FXldlNF9ejo4o/
>
> Design doc can be found at https://docs.google.com/a/google.com/document/d/11Mn8-ePDKwVEMCjs3nWwSjxcSpJ_Cu8DF57KNtUmgLM/
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1997703003
>
> Committed: https://skia.googlesource.com/skia/+/a4b09a117d4d1ba5dda372e6a2323e653766539e
>
> Committed: https://skia.googlesource.com/skia/+/30e78c9737ff4861dc4e3fa1e4cd010680ed6965
>
> Committed: https://skia.googlesource.com/skia/+/6fb2391b2cc83ee2160b4e994faa8128975acc1fTBR=reed@google.com,msarett@google.com,scroggo@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=skia:4211
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2044573002
Review-Url: https://codereview.chromium.org/2044573002
Crash is caused by ldxc1 instruction, which traps when double values are
not aligned on 8-byte boundaries. Problem was tracked to SkChunkAlloc which
produces pointers aligned on 4-byte boundaries leading to misalignment.
This change makes sure that SkChunkAlloc will produce pointers that are
aligned to 8 bytes.
Appropriate tests are added to tests/MemsetTest.cpp
TEST=Build Chromium with Clang and run on MIPS32r2 platform
TEST=./out/Debug/dm --match Memset
BUG=130022
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1849183004
Review-Url: https://codereview.chromium.org/1849183004
Not releasing the reference was caught by the Skia asan bot.
All remaining occurences of this pattern have been updated.
This fixes "Make GrClipMaskManager stateless and push GrPipelineBuilder construction downstack".
TBR=herb
Review-Url: https://codereview.chromium.org/2037243002
Some of the deprecated signatures are no longer used by any known
client, so remove them now to prevent future use.
TBR=reed
This doesn't add API, it just removes it.
Review-Url: https://codereview.chromium.org/2036993003
Not releasing the reference was caught by asan in the Chromium roll.
This fixes "Make GrClipMaskManager stateless and push GrPipelineBuilder construction downstack".
Review-Url: https://codereview.chromium.org/2037193002
We don't seem to require nonzero offsets for texel buffers at this
point in time, and requiring this feature greatly reduces the number
of desktop clients that can use texel buffers. If we find a use for
offsets later we can always add it back as a separate feature.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2036953002
Review-Url: https://codereview.chromium.org/2036953002
This is a step towards using SkCodec in Chromium, where progressive
decoding is necessary.
Switch from using png_read_row (which expects all the data to be
available) to png_process_data, which uses callbacks when rows are
available.
Create a new API for SkCodec, which supports progressive decoding and
scanline decoding. Future changes will switch the other clients off of
startScanlineDecode and get/skip-Scanlines to the new API.
Remove SkCodec::kNone_ScanlineOrder, which was only used for interlaced
PNG images. In the new API, interlaced PNG fits kTopDown. Also remove
updateCurrScanline(), which was only used by the old implementation for
interlaced PNG.
DMSrcSink:
- In CodecSrc::kScanline_Mode, use the new method for scanline decoding
for the supported formats (just PNG and PNG-in-ICO for now).
fuzz.cpp:
- Remove reference to kNone_ScanlineOrder
SkCodec:
- Add new APIs:
- startIncrementalDecode
- incrementalDecode
- Remove kNone_SkScanlineOrder and updateCurrScanline()
SkPngCodec:
- Implement new APIs
- Switch from sk_read_fn/png_read_row etc to png_process_data
- Expand AutoCleanPng's role to decode the header and create the
SkPngCodec
- Make the interlaced PNG decoder report how many lines were
initialized during an incomplete decode
- Make initializeSwizzler return a bool instead of an SkCodec::Result
(It only returned kSuccess or kInvalidInput anyway)
SkIcoCodec:
- Implement the new APIs; supported for PNG in ICO
SkSampledCodec:
- Call the new method for decoding scanlines, and fall back to the old
method if the new version is unimplemented
- Remove references to kNone_SkScanlineOrder
tests/CodecPartial:
- Add a test which decodes part of an image, then finishes the decode,
and compares it to the straightforward method
tests/CodecTest:
- Add a test which decodes all scanlines using the new method
- Repurpose the Codec_stripes test to decode using the new method in
sections rather than all at once
- In the method check(), add a parameter for whether the image supports
the new method of scanline decoding, and be explicit about whether an
image supports incomplete
- Test incomplete PNG decodes. We should have been doing it anyway for
non-interlaced (except for an image that is too small - one row), but
the new method supports interlaced incomplete as well
- Make test_invalid_parameters test the new method
- Add a test to ensure that it's safe to fall back to scanline decoding without
rewinding
BUG=skia:4211
The new version was generally faster than the old version (but not significantly so).
Some raw performance differences can be found at https://docs.google.com/a/google.com/spreadsheets/d/1Gis3aRCEa72qBNDRMgGDg3jD-pMgO-FXldlNF9ejo4o/
Design doc can be found at https://docs.google.com/a/google.com/document/d/11Mn8-ePDKwVEMCjs3nWwSjxcSpJ_Cu8DF57KNtUmgLM/
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1997703003
Committed: https://skia.googlesource.com/skia/+/a4b09a117d4d1ba5dda372e6a2323e653766539e
Committed: https://skia.googlesource.com/skia/+/30e78c9737ff4861dc4e3fa1e4cd010680ed6965
Review-Url: https://codereview.chromium.org/1997703003
With this change, the CMake build, which does not use DEPS to sync
external projects, is able to build and use the same version of libpng
that is used in other builds.
This will allow all platforms (including Google3 CMake build) to test on
the same version of libpng, so we do not need to make SkPngCodec support
all versions of libpng.
- Update CMakeLists.txt to use the checked in libpng.
- Check in libpng version 1.6.22rc01
- Update README.google
- Replace our old LICENSE file with the latest one from libpng
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2033063003
CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2033063003