f94fa7112f
A store/load pair like this is a redundant no-op: store simd_register_a, memory_address load memory_address, simd_register_a Everyone seems to be good at removing those when using SSE, but GCC and Clang are pretty terrible at this for NEON. We end up issuing both redundant commands, usually to and from the stack. That's slow. Let's not do that. This CL unions in the native SIMD register type into SkPMFloat, so that we can assign to and from it directly, which is generating a lot better NEON code. On my Nexus 5, the benchmarks improve from 36ns to 23ns. SSE is just as fast either way, but I paralleled the NEON code for consistency. It's a little terser. And because it needed the platform headers anyway, I moved all includes into SkPMFloat.h, again only for consistency. I'd union in Sk4f too to make its conversion methods a little clearer, but MSVC won't let me (it has a copy constructor... they're apparently not up to speed with C++11 unrestricted unions). BUG=skia: Review URL: https://codereview.chromium.org/1015083004 |
||
---|---|---|
animations | ||
bench | ||
bin | ||
debugger | ||
dm | ||
docs | ||
example | ||
experimental | ||
forth | ||
gm | ||
gyp | ||
include | ||
platform_tools | ||
resources | ||
samplecode | ||
site | ||
src | ||
tests | ||
third_party | ||
tools | ||
.gitignore | ||
AUTHORS | ||
codereview.settings | ||
CONTRIBUTING | ||
CQ_COMMITTERS | ||
DEPS | ||
Doxyfile | ||
gyp_skia | ||
gyp_skia.py | ||
LICENSE | ||
make.bat | ||
make.py | ||
Makefile | ||
OWNERS | ||
PRESUBMIT.py | ||
README | ||
README.chromium | ||
skia.gyp | ||
SKP_VERSION | ||
whitespace.txt |
Skia is a complete 2D graphic library for drawing Text, Geometries, and Images. See full details, and build instructions, at https://skia.org.