SkChecksum::Compute is a very, very poorly distributed hash function.
This replaces all remaining uses with Murmur3.
The only interesting stuff is in src/gpu.
BUG=skia:
Review URL: https://codereview.chromium.org/1436973003
We were doing it on Windows, now do it everywhere.
This just changes the backend. We could think about another step to actually
replacing all our sk_atomic_... with std atomic stuff.
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot
TBR=reed@google.com
Only deleting from include/...
Review URL: https://codereview.chromium.org/1441773002
As we use this more and more, we end up with more and more inline copies.
On my desktop, this makes Skia ~16K smaller.
Boy perf trybots would be neat.
BUG=skia:
Review URL: https://codereview.chromium.org/1415133003
Passing &SkGoodHash to SkTHashMap and SkTHashSet doesn't guarantee that it's actually instantiated. Using a functor does.
BUG=skia:
Review URL: https://codereview.chromium.org/1405053002
We landed this originally with lazily-correct sequentially-consistent memory
order. It turns out that's regressed performance, we think particularly when
recording paths. We also think there's no need for anything but relaxed memory
order here.
We should see this chart go down if all goes well: https://perf.skia.org/#4329
There are also Chrome performance charts to watch in the linked bug.
BUG=chromium:537700
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot,Test-Ubuntu-GCC-Golo-GPU-GT610-x86_64-Release-TSAN
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1393833003
- use C++11 features ({} init, move constructors) to eliminate the need
for explicit constructors
- collapse RECORD0...RECORD8 into just one RECORD macro
- explicitly tag record types instead of using member detectors.
Removing member detectors makes this code significantly less fragile.
This exposes a few places where we didn't really think through what to do
with SkDrawable. I've marked them TODO for now.
BUG=skia:
Review URL: https://codereview.chromium.org/1360943003
This implementation improves performance of SkMutex acquire / release pair from 42ns -> 13 ns.
SkSharedMutex and SkSpinlock have the same performance.
It also removes specialized windows and linux/mac code.
BUG=skia:
Review URL: https://codereview.chromium.org/1359733002
SkOncePtr<T[]> is identical to SkOncePtr<T> except we'll default to delete[]
for cleanup.
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug-ASAN-Trybot
BUG=skia:
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1311893010
Reason for revert:
Breaks Chrome roll.
obj/skia/ext/skia_chrome.skia_memory_dump_provider.o
does not have -I include/private on its include path, but transitively includes SkMessageBus.h.
Original issue's description:
> Port uses of SkLazyPtr to SkOncePtr.
>
> This gives SkOncePtr a non-trivial destructor that uses std::default_delete
> by default. This is overrideable, as seen in SkColorTable.
>
> SK_DECLARE_STATIC_ONCE_PTR still just leaves its pointers hanging at EOP.
>
> BUG=skia:
>
> No public API changes.
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/a1254acdb344174e761f5061c820559dab64a74cTBR=herb@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1334523002
This gives SkOncePtr a non-trivial destructor that uses std::default_delete
by default. This is overrideable, as seen in SkColorTable.
SK_DECLARE_STATIC_ONCE_PTR still just leaves its pointers hanging at EOP.
BUG=skia:
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1322933005
This change regularizes Skia's type traits so that when <type_traits>
can finally be used the transition is easier. Various traits are
renamed to match <type_traits> and placed in the skstd namespace.
Current users of these traits are updated.
Review URL: https://codereview.chromium.org/1317593004
The motivation for this was to remove SK_OFFSETOF from SkTypes, but
this CL is mostly about cleaning up our use of offsetof generally.
SK_OFFSETOF is removed to SkTypes and added to the two places it is
actually used (for the non standard behavior of finding the offset of
fields in types which are not standard layout).
Older versions of gcc required POD for offsetof to be used without
warning. Newer versions require the more relaxed standard layout.
Now that we no longer build on older versions of gcc, remove the
old warning suppressions.
PODMatrix is renamed to AggregateMatrix. SkMatrix is already POD
(trivial and standard layout). The PODMatrix name implies that the
POD-ness is needed for the offsetof, but it is actually the aggregate
attribute which is needed for compile time constant initialization.
This makes it more obvious that this can be revisited after we can
rely on constexpr constructors.
This also adds skstd::declval since this allows removal of existing
awkward code which casts a constant to a pointer to find the size of
a field.
TBR=reed@google.com
No API change, only removes unused macro.
Review URL: https://codereview.chromium.org/1309523003
SkTemplates.h contains a number of Skia specific utilities which are
not designed for external use. In addition to reducing the external
support burden, this will allow Skia to freely refactor this file.
Review URL: https://codereview.chromium.org/1272293004
Renames Sk4pxXfermode.h to SkXfermode_opts.h,
and refactors it a tiny bit internally.
This moves xfermode optimization from being "compile-time everywhere but NEON"
to simply "runtime everywhere". I don't anticipate any effect on perf or
correctness.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1264543006
With this new arrangement, the benefits of inlining sk_memset16/32 have changed.
On x86, they're not significantly different, except for small N<=10 where the inlined code is significantly slower.
On ARMv7 with NEON, our custom code is still significantly faster for N>10 (up to 2x faster). For small N<=10 inlining is still significantly faster.
On ARMv7 without NEON, our custom code is still ridiculously faster (up to 10x) than inlining for N>10, though for small N<=10 inlining is still a little faster.
We were not using the NEON memset16 and memset32 procs on ARMv8. At first blush, that seems to be an oversight, but if so it's an extremely lucky one. The ARMv8 code generation for our memset16/32 procs is total garbage, leaving those methods ~8x slower than just inlining the memset, using the compiler's autovectorization.
So, no need to inline any more on x86, and still inline for N<=10 on ARMv7. Always inline for ARMv8.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1270573002