372d65cc6e
Local timing says this 4-byte Paeth function takes about 0.3x the time the serial libpng code does, dropping from ~10 cycles per byte to ~2.9.

bpp=4 is mainly an easy demo. This approach can work for any bpp up to 16, 1 pixel at a time, at roughly the same cost per pixel. Doing more than 1 pixel at a time is a tricky math problem I have yet to attempt to solve.

Everything here can be trivially downgraded to MMX, supporting bpp up to 8. It seems to be a little slower (~3.5 cycles per byte), but it would make the code compatible with every x86 that can still power on.

I've tried four approaches:
- this way;
- doing things naively in 16-bit;
- a 16-bit version that requires division by 3 (i.e. mulhi_epu16(..., 0x5580) );
- a mostly 8-bit version of the same.

They're all fine, but this one is consistently the fastest I've measured. I'd be happy to settle on the naive 16-bit version too, which would have a very clear implementation that's only slightly slower than this version. The other two are far more complicated, and would require us to draw some serious ASCII diagrams to explain.

I have learned that the .skp serialization tests (serialize-8888) have a nice side effect of testing the correctness of these filters!

(Since writing the description above, I've bumped things up to {Paeth,Sub,Avg} x { 3 bpp, 4 bpp }.)

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1573943002
Review URL: https://codereview.chromium.org/1573943002
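For readers unfamiliar with the filter being optimized, here is a minimal scalar sketch of Paeth unfiltering as the PNG specification defines it. It is not the SIMD code from this change and not libpng's implementation; the names paeth_predict and paeth_unfilter_row are hypothetical and chosen only for illustration. It does show why the work is serial: each output byte depends on the already reconstructed byte bpp positions to its left.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Paeth predictor per the PNG specification.
 * a = byte to the left, b = byte above, c = byte above-left.
 * (Hypothetical helper name, for illustration only.) */
static uint8_t paeth_predict(uint8_t a, uint8_t b, uint8_t c) {
    int p  = (int)a + (int)b - (int)c;   /* initial estimate */
    int pa = abs(p - (int)a);            /* distance to each neighbor */
    int pb = abs(p - (int)b);
    int pc = abs(p - (int)c);
    if (pa <= pb && pa <= pc) return a;  /* ties resolve in the order a, b, c */
    if (pb <= pc) return b;
    return c;
}

/* Undo the Paeth filter in place for one row.
 * row  : filtered bytes in, reconstructed bytes out
 * prev : reconstructed previous row (all zeros for the first row)
 * bpp  : bytes per pixel, e.g. 4 for the case timed above */
static void paeth_unfilter_row(uint8_t *row, const uint8_t *prev,
                               size_t row_bytes, size_t bpp) {
    for (size_t i = 0; i < row_bytes; i++) {
        /* Bytes left of the first pixel are treated as zero. */
        uint8_t a = (i >= bpp) ? row[i - bpp]  : 0;
        uint8_t b = prev[i];
        uint8_t c = (i >= bpp) ? prev[i - bpp] : 0;
        /* Addition is modulo 256. */
        row[i] = (uint8_t)(row[i] + paeth_predict(a, b, c));
    }
}
```

With bpp=4 the four bytes of one pixel have no dependency on each other, only on the previous pixel, which is consistent with the "1 pixel at a time" description above.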

- etc1
- freetype/include/freetype-android
- giflib
- ktx
- libmicrohttpd
- libpng
- libsdl
- libwebp/webp
- lua
- yasm

README

This directory contains a set of dependencies that are needed to build various components and tools within Skia. Some of these dependencies reside within the Skia repo, while others are pulled from other repositories and placed in the third_party/externals directory. These external dependencies are defined in a DEPS file and are kept up-to-date using 'gclient sync'.