MSAN started failing after https://skia-review.googlesource.com/c/32722.
This should fix it.
No-Try: true
Change-Id: I8956c8c211507923f078fe96921fedaadefae8a8
Reviewed-on: https://skia-review.googlesource.com/32942
Reviewed-by: Ben Wagner <bungeman@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Adding this extra field to the CommandBufferInfo may or may not have led to a memory regression.
Remove it until it is actually needed.
Change-Id: Ibdddbeb7625f91f5199584a575289f07f6e95304
Reviewed-on: https://skia-review.googlesource.com/33280
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Also makes paint clones use cloned fragment processors.
Change-Id: I60efcfc6a46a4f8430a72f4d1ec79c7d99fbe593
Reviewed-on: https://skia-review.googlesource.com/33084
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
1. Always inline (Clang previously ignored inline and got 25% slower)
2. SIMD everywhere other than x86 gcc:
non-SIMD is faster only on my desktop with gcc;
with Clang on the same desktop, SIMD is 50% faster than non-SIMD.
3. Allocate 4x memory instead of 2x when running out of space:
on old Android devices with Linux kernel 3.10 (e.g., Nexus 6P, 5X),
the alloc/memcpy triggers a major bottleneck in the kernel (30% of
the running time). That bottleneck goes away in Linux kernel 3.18
(e.g., Pixel), where the kernel no longer misbehaves during
alloc/memcpy, and that's why DAA is much faster on Pixel than on
Nexus 6P.
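To illustrate item 3, here is a minimal sketch of why a larger growth factor reduces the number of alloc/memcpy rounds. GrowBuffer and countReallocs are hypothetical names made up for this example and are not the types touched by this change; the 64-byte starting capacity is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical growable byte buffer (illustration only, not a Skia type).
struct GrowBuffer {
    char*  data     = nullptr;
    size_t size     = 0;
    size_t capacity = 0;
    int    reallocs = 0;  // number of alloc + memcpy rounds so far

    ~GrowBuffer() { std::free(data); }

    // Append n bytes, multiplying the capacity by growthFactor when full.
    void append(const void* src, size_t n, size_t growthFactor) {
        assert(growthFactor >= 2);
        if (size + n > capacity) {
            size_t newCap = capacity ? capacity : 64;  // assumed initial capacity
            while (newCap < size + n) {
                newCap *= growthFactor;
            }
            char* bigger = static_cast<char*>(std::malloc(newCap));
            assert(bigger != nullptr);
            if (size > 0) {
                std::memcpy(bigger, data, size);  // the copy that gets expensive
            }
            std::free(data);
            data     = bigger;
            capacity = newCap;
            ++reallocs;
        }
        std::memcpy(data + size, src, n);
        size += n;
    }
};

// Count the alloc/memcpy rounds needed to append `total` bytes one at a time.
inline int countReallocs(size_t total, size_t growthFactor) {
    GrowBuffer buf;
    const char byte = 0;
    for (size_t i = 0; i < total; ++i) {
        buf.append(&byte, 1, growthFactor);
    }
    return buf.reallocs;
}
```

Under these assumptions, filling 1 MiB takes roughly half as many reallocations with a 4x factor as with a 2x factor, at the cost of more slack memory.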
I think maybe I should adopt SkRasterPipeline for device-specific
optimizations.
Bug: skia:
Change-Id: I0408aa7671a5f1b39aad3bec25f8fc994ff5a1bb
Reviewed-on: https://skia-review.googlesource.com/30820
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit a5a69cfb48.
Bug: skia:
Change-Id: I08475d96255b9df13e5c86e1ef9c7f4739e51459
Reviewed-on: https://skia-review.googlesource.com/33202
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Since both armv7-a-neon and 32-bit armv8-a have NEON, we can treat them
the same in Android.bp.
Bug: b/62895439
Corresponds to https://android-review.googlesource.com/c/423660/3
This change will generate the change to Android.bp described there.
Change-Id: Icae9b5b79093d6f2886da39771d4fbe901be237a
Reviewed-on: https://skia-review.googlesource.com/33000
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
Bug: skia:
Change-Id: I880e3d5a668743ac12fb0101baca637443e920b4
Reviewed-on: https://skia-review.googlesource.com/33082
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 0f450acd76.
Bug: skia:
Change-Id: I97428fbbc6d82bf8b186ec5fdbf1a939c00e4126
Reviewed-on: https://skia-review.googlesource.com/32726
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
I tried to follow exactly the same strategy as a start.
(Though I did fix the off-by-one dimensions.)
It does rather look like we only need 3D and 4D now
that I've looked at the call sites.
Looks like about a 20% speedup.
Change-Id: I8b1af64750ad1750716ee1ab0767e64591c7206a
Reviewed-on: https://skia-review.googlesource.com/32842
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Since recipes are now versioned with the code, this is unnecessary and
could mask recipe bugs.
Change-Id: Ic5aafbd3a7e9ccd3fd529c71b282cf6b037b78df
Reviewed-on: https://skia-review.googlesource.com/32722
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Valgrind 3.13.0 only supports up to AVX2, so I used SK_CPU_LIMIT_SSE41
to avoid failing with "Unrecognised instruction." (The existing Valgrind
tasks run on a ShuttleA machine, whose CPU only supports AVX.)
Needed to enable verbose output on all Valgrind tasks to avoid Swarming
I/O timeout. Opportunistically removed verbose output for Linux Intel
bots that are no longer failing.
Bug: skia:6881
Change-Id: I2ffa6efe901c97bd2e0bbc9b26632aafbb3cf9a6
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/31143
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
The new Delta AA scan converter does not need edges to be updated
with monotonic Y, so chopping cubics at y extrema is not necessary.
Removing that chopping brings a ~10% performance increase on
chalkboard.svg, which has tons of small cubics (the same is true for
many SVGs I saw).
We didn't remove the chopping for quads because that does not bring
a significant speedup. Moreover, dropping those y extrema would make
our strokecircle animation look a little more wobbly (because we would
have fewer divisions for the quads at the top and bottom of the circle).
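For context, the y extrema of a cubic are the parameter values where dy/dt = 0, which is a quadratic in t. The following is a minimal sketch of finding them; cubicYExtrema is a hypothetical helper written for this note and is not the Skia routine that performs the chopping.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns the t values strictly inside (0, 1) where a cubic Bezier with
// control-point y coordinates y0..y3 has dy/dt == 0.
// Hypothetical helper for illustration only.
inline std::vector<float> cubicYExtrema(float y0, float y1, float y2, float y3) {
    // dy/dt / 3 = a*t^2 + b*t + c
    const float a = y3 - 3 * y2 + 3 * y1 - y0;
    const float b = 2 * (y2 - 2 * y1 + y0);
    const float c = y1 - y0;

    std::vector<float> ts;
    auto keep = [&ts](float t) {
        if (t > 0.0f && t < 1.0f) {
            ts.push_back(t);
        }
    };

    if (std::fabs(a) < 1e-12f) {
        // Degenerates to a linear equation b*t + c = 0.
        if (b != 0.0f) {
            keep(-c / b);
        }
        return ts;
    }

    const float disc = b * b - 4 * a * c;
    if (disc < 0.0f) {
        return ts;  // no real roots: y is monotonic in t
    }
    const float s = std::sqrt(disc);
    keep((-b - s) / (2 * a));
    if (s > 0.0f) {
        keep((-b + s) / (2 * a));
    }
    return ts;
}
```

A curve that is monotonic in y yields no roots, so there is nothing to chop; an arch-shaped curve yields one root, and an S-shaped curve can yield two.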
Bug: skia:
Change-Id: I3984d2619f9f77269ed24e8cbfa9f1429ebca4a8
Reviewed-on: https://skia-review.googlesource.com/31940
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit 175af0d011.
Reason for revert: Chrome doesn't know about portable format specifiers. Sigh.
Original change's description:
> GrContext::dump that produces JSON formatted output
>
> Includes caps, GL strings, and extensions
>
> Bug: skia:
> Change-Id: I1e8b3dd50fb68357f9de8ca6149cf65443d027ef
> Reviewed-on: https://skia-review.googlesource.com/32340
> Commit-Queue: Brian Osman <brianosman@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com,brianosman@google.com
Change-Id: Ie280b25275725f0661da7541f54ed62897abb82f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/32861
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 6a7d56fa0f.
Reason for revert: Earlier commit needs to be reverted for Chrome roll.
Original change's description:
> Support single line objects and arrays
>
> This is just a formatting nicety. The new caps dump has several large
> arrays of structs, and keeping each object on one line makes them much
> more readable. (It also limits the total length of the output, which
> helps when scanning through.)
>
> Example of the output, before and after this change:
> https://gist.github.com/brianosman/872f33be9af49031023b791e7db0b1fb
>
> Bug: skia:
> Change-Id: I0fe0c2241b0c7f451b0837500e554d0491126d5e
> Reviewed-on: https://skia-review.googlesource.com/32820
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Brian Osman <brianosman@google.com>
TBR=bsalomon@google.com,brianosman@google.com
Change-Id: I2b05cf79ca4804e5944f2eb3e17fe4be4d5af290
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/32860
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
It looks like our recursive approach is faster than interp3D(),
and we'd prefer trilinear interpolation over tetrahedral for quality.
Change-Id: I1019254b9ecf24b2f4feff17ed8ae1b48fcc281e
Reviewed-on: https://skia-review.googlesource.com/32800
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit b681a0f1b0.
Reason for revert: Seems to be messing up some MacMini & Nexus7 bots
Original change's description:
> Store discard request on the opList and remove GrDiscardOp
>
> Change-Id: Ic1f76bb91c16b23df1fe71c07a4d5ad5abf1dc26
> Reviewed-on: https://skia-review.googlesource.com/32640
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Robert Phillips <robertphillips@google.com>
TBR=egdaniel@google.com,bsalomon@google.com,robertphillips@google.com
Change-Id: I8a89fae7bb11791bd023d7444a074bb34d006fd0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/32704
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
This is just a formatting nicety. The new caps dump has several large
arrays of structs, and keeping each object on one line makes them much
more readable. (It also limits the total length of the output, which
helps when scanning through.)
Example of the output, before and after this change:
https://gist.github.com/brianosman/872f33be9af49031023b791e7db0b1fb
Bug: skia:
Change-Id: I0fe0c2241b0c7f451b0837500e554d0491126d5e
Reviewed-on: https://skia-review.googlesource.com/32820
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Includes caps, GL strings, and extensions
Bug: skia:
Change-Id: I1e8b3dd50fb68357f9de8ca6149cf65443d027ef
Reviewed-on: https://skia-review.googlesource.com/32340
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Until now we've been using 3 separate parametric stages to apply
gamma to r,g,b. That works fine, but is kind of unnecessarily
slow, and again less clear in a stack trace than seeing "gamma".
The new bench runs in about 60% of the time the old one does
on my Trashcan.
BUG=skia:6939
Change-Id: I079698d3009b081f1c23a2e27fc26e373b439610
Reviewed-on: https://skia-review.googlesource.com/32721
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:
Change-Id: I487930955f75048ea27a1bcc61f7e0849c63759b
Reviewed-on: https://skia-review.googlesource.com/32681
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
This code simulates the integer-based ordered-dither using step/mod
with only floating point values. Produces similar results.
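As an illustration of the idea (not the shader code from this change), here is a sketch that computes an 8x8 ordered-dither matrix index two ways: with integer bit operations, and with only step/mod-style float arithmetic. The bit-interleaved formulation and all function names are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer reference: bit-interleaved 8x8 ordered-dither index in [0, 63].
// (An assumed, commonly used formulation.)
inline int ditherIndexInt(uint32_t x, uint32_t y) {
    const uint32_t X = x;
    const uint32_t Y = y ^ x;
    return static_cast<int>((Y & 1) << 5 | (X & 1) << 4 |
                            (Y & 2) << 2 | (X & 2) << 1 |
                            (Y & 4) >> 1 | (X & 4) >> 2);
}

// GLSL-style helpers written with floats only.
inline float glslStep(float edge, float v) { return v < edge ? 0.0f : 1.0f; }
inline float glslMod(float a, float b) { return a - b * std::floor(a / b); }

// Extracts the bit with weight `pow2` (1, 2, 4, ...) from a non-negative,
// integer-valued float, returning 0.0 or 1.0.
inline float bitOf(float v, float pow2) {
    return glslStep(pow2, glslMod(v, 2.0f * pow2));
}

// Float-only version: XOR of two 0/1 values is |a - b|.
inline float ditherIndexFloat(float x, float y) {
    const float x0 = bitOf(x, 1.0f), x1 = bitOf(x, 2.0f), x2 = bitOf(x, 4.0f);
    const float y0 = std::fabs(bitOf(y, 1.0f) - x0);
    const float y1 = std::fabs(bitOf(y, 2.0f) - x1);
    const float y2 = std::fabs(bitOf(y, 4.0f) - x2);
    return y0 * 32.0f + x0 * 16.0f + y1 * 8.0f + x1 * 4.0f + y2 * 2.0f + x2;
}
```

For small integer-valued coordinates the float arithmetic is exact, so the two versions agree; a real shader working with interpolated coordinates may only produce similar results, as noted above.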
R=bsalomon@google.com
Bug: skia:4430
Change-Id: I1406f751f0ddd6bfd14e532dfb4efc0bb5784992
Reviewed-on: https://skia-review.googlesource.com/28942
Commit-Queue: Eric Karl <ericrk@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Rects are already auto-vectorized, so there is no need to explicitly write a 4f version of SkRect::round().
Bug: skia:
Change-Id: I098945767bfcaa7093d770c376bd17ff3bdc9983
Reviewed-on: https://skia-review.googlesource.com/32060
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
A later clear call was nuking the stencil clear load setting.
Bug: skia:6936
Change-Id: Ib2c5cd930273cd6e613ca7191f8b7806abe6c218
Reviewed-on: https://skia-review.googlesource.com/32541
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
This is a stand-alone helper class for writing properly
structured JSON to an SkWStream. It currently solves two
problems (although this CL only uses it in one context):
1) Performance. Writing out JSON this way is about 10x
faster than using JSONCPP. For the large amounts of data
generated by the tracing system, that's a big win.
2) Makes it easy to emit structured JSON from code that's
not fully centralized. We'd like to spit out JSON that
describes a GrContext, GrGpu, GrCaps, etc... Doing that
with simple string manipulation is complex, and spreads
this logic over all those functions. Using JSONCPP adds
yet another (large) third party library dependency (that
we only build into our own tools right now).
This went through several revisions. I originally planned
it as a stateful SkString wrapper, so the user could just
build their JSON as a string. That's O(N^2), though,
because SkString grows by a (small) constant amount. Even
using a better growth strategy still means needing RAM
for all the resulting text, which is usually pointless.
This version has a constant memory cost, so writing huge
amounts of JSON to disk (tracing a long DM run can emit
hundreds of MBs) doesn't stress resources.
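The streaming approach described above can be sketched roughly as follows. StreamingJsonWriter is a hypothetical class written against std::ostream for this note; it is not the API added by this CL, and its method names and escaping rules are assumptions. The only state it keeps is one flag per open scope, so memory use depends on nesting depth rather than output size.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical streaming JSON writer (illustration only).
class StreamingJsonWriter {
public:
    explicit StreamingJsonWriter(std::ostream& out) : fOut(out) {}

    void beginObject(const char* name = nullptr) { this->open(name, '{'); }
    void endObject() { this->close('}'); }
    void beginArray(const char* name = nullptr) { this->open(name, '['); }
    void endArray() { this->close(']'); }

    void appendInt(const char* name, int value) {
        this->beginValue(name);
        fOut << value;
    }
    void appendString(const char* name, const std::string& value) {
        this->beginValue(name);
        this->writeQuoted(value);
    }

private:
    void open(const char* name, char bracket) {
        this->beginValue(name);
        fOut << bracket;
        fScopeIsEmpty.push_back(1);
    }
    void close(char bracket) {
        assert(!fScopeIsEmpty.empty());
        fScopeIsEmpty.pop_back();
        fOut << bracket;
    }
    // Emits the separating comma (if needed) and the optional member name.
    void beginValue(const char* name) {
        if (!fScopeIsEmpty.empty()) {
            if (!fScopeIsEmpty.back()) {
                fOut << ',';
            }
            fScopeIsEmpty.back() = 0;
        }
        if (name) {
            this->writeQuoted(name);
            fOut << ':';
        }
    }
    // Minimal escaping: quotes, backslashes, and newlines only.
    void writeQuoted(const std::string& s) {
        fOut << '"';
        for (char ch : s) {
            if (ch == '"' || ch == '\\') {
                fOut << '\\' << ch;
            } else if (ch == '\n') {
                fOut << "\\n";
            } else {
                fOut << ch;
            }
        }
        fOut << '"';
    }

    std::ostream&     fOut;
    std::vector<char> fScopeIsEmpty;  // one flag per open object/array
};
```

Because each value is written straight to the stream, code in different places can each emit its own fragment without building an intermediate string.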
Bug: skia:
Change-Id: Ia716524b246db0f97d332da60d2ce9903069e748
Reviewed-on: https://skia-review.googlesource.com/31204
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
- Avoid calling floor() and ceil(), which are real
external calls on most platforms.
- Interpolate all output channels in parallel.
- Simplify recursion, allow the compiler to unroll.
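A minimal sketch of the three points above, assuming a dense grid with n points per axis, inputs in [0, 1], and three output channels. The names interpolate, Color, and kChannels are hypothetical and do not come from this change.

```cpp
#include <array>
#include <cassert>

constexpr int kChannels = 3;  // assumed number of output channels
using Color = std::array<float, kChannels>;

template <int Dims>
Color interpolate(const Color* table, int n, const float* in);

// Base case: zero remaining dimensions means we are at a single table entry.
template <>
inline Color interpolate<0>(const Color* table, int, const float*) {
    return *table;
}

// Interpolates a Dims-dimensional table with n grid points per axis.
// The first axis varies slowest. Dims is a template parameter so the
// compiler can unroll the recursion.
template <int Dims>
Color interpolate(const Color* table, int n, const float* in) {
    assert(n >= 1 && in[0] >= 0.0f && in[0] <= 1.0f);

    int stride = 1;  // n^(Dims-1) entries per step along this axis
    for (int d = 1; d < Dims; ++d) {
        stride *= n;
    }

    // For non-negative x, truncation equals floor(), so no libm call is needed.
    const float x  = in[0] * (n - 1);
    const int   lo = static_cast<int>(x);
    const int   hi = lo + 1 < n ? lo + 1 : n - 1;
    const float t  = x - lo;

    const Color a = interpolate<Dims - 1>(table + lo * stride, n, in + 1);
    const Color b = interpolate<Dims - 1>(table + hi * stride, n, in + 1);

    // Blend every output channel in the same pass.
    Color out;
    for (int c = 0; c < kChannels; ++c) {
        out[c] = a[c] + (b[c] - a[c]) * t;
    }
    return out;
}
```

With Dims = 3 this is plain trilinear interpolation; the same template covers the 4D case.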
Change-Id: I9ef814e91b18c5775292ca20e9ec01222b6a89cf
Reviewed-on: https://skia-review.googlesource.com/32182
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
The Lua typeface.getStyle now returns SkFontStyle.
Dumping a glyph cache entry is now more accurate.
SkTypeface::MakeFromTypeface now does a more accurate check.
Change-Id: I6150636c8c674353bd0eed4d95aa0794d3919c39
Reviewed-on: https://skia-review.googlesource.com/32200
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
This will be faster, but maybe more importantly it helps make debugging
a stack trace clearer. It's confusing to see "parametric transfer
function" stages followed by table transfer function stages.
This leads to a little bit of cleanup in SkColorSpaceXform_A2B.
I am uncertain whether we still need parametric_a. I need to do some
more tracing through the code before I'd say it's impossible to reach in
addTransferFn().
Change-Id: I52e85019f92d012a3086fc94cf64ae6c9307ea94
Reviewed-on: https://skia-review.googlesource.com/32040
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Uses the same branchless method as the raster pipeline.
Change-Id: Iaaa36330dbf49961bdfc288cad031d891d8ff589
Reviewed-on: https://skia-review.googlesource.com/31280
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>