- GCC -> Clang;
- Add a Perf bot.
This -Fast mode is a stand-in for "how clients should/do really release code". Today everyone uses Clang, so we should probably switch this bot too.
I also got interested in how it compares in performance to our ordinary Release build, so I've added a Perf bot.
I did try turning on link-time optimization (much slower build, somewhat faster binary), but that failed at the link step. Going to save that for later.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Ubuntu-Clang-x86_64-Release-Fast,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-Fast,Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-Fast
Change-Id: I58e6109f8a89799d80e769316902549131b619cf
Reviewed-on: https://skia-review.googlesource.com/6595
Reviewed-by: Ben Wagner <benjaminwagner@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This code involves Skia having knowledge of HWUI internals and
causes problems with various build systems. It is also not
currently being used and is therefore expendable.
Change-Id: I7b6a37fa4c9afcefbc6a957b49e7735da872ff14
Reviewed-on: https://skia-review.googlesource.com/6597
Commit-Queue: Derek Sollenberger <djsollen@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Use the pipeline implementation to handle F32.
This refactor has been broken away from the following mega-CL.
https://skia-review.googlesource.com/c/6260/
Change-Id: I79d8992f8c4a7e4dbf674a78484057a6b7f4f123
Reviewed-on: https://skia-review.googlesource.com/6585
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Matt Sarett <msarett@google.com>
Full error text:
error: enums in the Microsoft ABI are signed integers by default;
consider giving the enum Properties an unsigned underlying type
to make this code portable [-Werror,-Wsigned-enum-bitfield]
BUG=skia:
Change-Id: I1491dd94c894e383fed401880fc04562140f7a66
Reviewed-on: https://skia-review.googlesource.com/6594
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Matt Sarett <msarett@google.com>
Prefer VertexList to a bare Vertex*. Also fix some 100-col issues.
This should have no user-visible impact.
Change-Id: I8fa260d5417c9832256529c232f532e69238fca0
Reviewed-on: https://skia-review.googlesource.com/6502
Commit-Queue: Stephan White <senorblanco@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This time, with manual program memory management instead of std::vector<void*>.
Using STL types from SkOpts_hsw.cpp is not safe. Things like std::vector<void*>
are inlined but not anonymous, so they're deduped by the linker arbitrarily. This
is bad when we pick the version compiled with AVX instructions on a machine that
doesn't support AVX...
std::vector<Stage> was safe before because Stage itself was anonymous. While not anonymous, std::vector<Stage> is unique to the compilation unit, because you can only refer to the anonymous Stage in the compilation unit.
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_chromium_asan_rel_ng;skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I015e27583b6b6ff06b5e9f63e3f40ee6b27d6dbd
Reviewed-on: https://skia-review.googlesource.com/6550
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Trying to get better information about what's failing (and make this
easier in the future).
BUG=skia:6086
Change-Id: Iedb1269abb4527170b919bd90bce625a7f78f05a
Reviewed-on: https://skia-review.googlesource.com/6584
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This also includes the removal of an old example whose instructions
are not compatible with GN.
BUG=skia:6009
Change-Id: I2807829ca12c19292ae0f5a7ea250ed453f9a182
Reviewed-on: https://skia-review.googlesource.com/5620
Commit-Queue: Derek Sollenberger <djsollen@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Removes the feedback to GrDrawOp via GrPipelineOptimizations.
Change-Id: I3cb17cad41779af292a92385fcd5ac23ae5a1ffd
Reviewed-on: https://skia-review.googlesource.com/6561
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Textures that we sample on the GPU are always premul, so we should
actually clamp to alpha.
Colors that are xformed on the CPU are always unpremul, so clamping to
[0,1] is correct. Add a comment explaining that.
BUG=skia:
Change-Id: I180f2d410f24afc78bd03ab8636a83fb443d68e2
Reviewed-on: https://skia-review.googlesource.com/6581
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
BUG=skia:
Change-Id: I8d4594fcf0eeebf598871bfe9203ed52460c98ce
Reviewed-on: https://skia-review.googlesource.com/6558
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
BUG=skia:
Change-Id: I744af0efd4d48a8932b834092ed2dbad13008c1d
Reviewed-on: https://skia-review.googlesource.com/6556
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
Stage foo_d should always be the same logic as stage foo swapping r and dr, g
and dg, b and db, a and da. This means we can infer their definitions.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Ia0a3abb29a201c647d9ec1860211abfbc19b56ae
Reviewed-on: https://skia-review.googlesource.com/6555
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
BUG=skia:
Change-Id: I1d2629ef954d8ac7211cfbb4707125ed175af63c
Reviewed-on: https://skia-review.googlesource.com/6553
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
All the SkFontMgr factories currently return bare pointers and sometimes
even document the ownership rules. Since such factories can be
implemented by external ports, the ownership rules should be explicit in
order to prevent simple reference counting issues.
Change-Id: I25b598ce0954cd473a3fb1f8adc0cb86331583ca
Reviewed-on: https://skia-review.googlesource.com/6533
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
It's turned out to be surprisingly subtle to make the optimization colorspace
correct (it's not today, but with this CL the affected images become colorspace
correct).
Before implementing a colorspace correct color shader, let's check we really need this.
Update: I should say, it's hard to do this optimization and make the resulting color shader be both colorspace correct and "legacy correct". SkColorShader and SkColor4fShader are both colorspace correct. It's just challenging to create one shader that's correct for both modes from an image.
Change-Id: I6593b6348175a10d7cbaaf3b531d7a7e2bf2f578
Reviewed-on: https://skia-review.googlesource.com/6548
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
When the path was "large" (as defined by ScaleFactor(...)), the computed
bounds would not be adjusted to the correct space. Make sure to scale
the result in those cases.
BUG=chromium:678162
Change-Id: Ia2eb94050c4620286e9abb69976dbc0202ecc307
Reviewed-on: https://skia-review.googlesource.com/6501
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Cary Clark <caryclark@google.com>
This reverts commit e61c40707e.
Reason for revert: this and the ODR caused operations on ContiguousContainerBase::elements_, another std::vector<void*> in Chrome, to start using AVX2 instructions. Boy this is annoying...
Change-Id: I2c4837ad70fdef8096db904022b0703b88c6fd6c
Reviewed-on: https://skia-review.googlesource.com/6549
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This sets the source and executable character set to utf-8. This avoids
issues with local code pages and avoids adding an unwanted BOM.
Change-Id: If854c0001c2363f3262d20e28dce30c1e733536a
Reviewed-on: https://skia-review.googlesource.com/6547
Commit-Queue: Ben Wagner <bungeman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Use quads rather than triangles for the edge geometry. This allows
us to perform a simpler edge categorization (see below). It also
improves performance by reducing the number of edges processed during
the simplify and tessellate steps.
Label AA edges as three types: inner, outer, and connector. This
results in correct alpha values for intersected edges, even when
the top or bottom vertex has been merged with a vertex on edges
of different types.
Changed the "collinear edges" sample from the concavepaths GM for a
"fast-foward" shape, which more clearly shows the problem being fixed
here. (The collinearity from the "collinear edges" was actually being
removed earlier up the stack, causing the path to become convex and
not exercise the concave path renderers anyway.)
NOTE: this will cause changes in the "concavepaths" GM results, and
minor pixel diffs in a number of other tests.
Change-Id: I6c2b0cdb35cda42b01cf1100621271fef5be35b0
Reviewed-on: https://skia-review.googlesource.com/6430
Reviewed-by: Stephan White <senorblanco@chromium.org>
Commit-Queue: Stephan White <senorblanco@chromium.org>
This reverts commit d4b2155248.
Reason for revert: accidentally added some unwanted changes
Original change's description:
> Quality and performance fixes for AA tessellating path renderer.
>
> Use quads rather than triangles for the edge geometry. This allows
> us to perform a simpler edge categorization (see below). It also
> improves performance by reducing the number of edges processed during
> the simplify and tessellate steps.
>
> Label AA edges as three types: inner, outer, and connector. This
> results in correct alpha values for intersected edges, even when
> the top or bottom vertex has been merged with a vertex on edges
> of different types.
>
> Changed the "collinear edges" sample from the concavepaths GM for a
> "fast-foward" shape, which more clearly shows the problem being fixed
> here. (The collinearity from the "collinear edges" was actually being
> removed earlier up the stack, causing the path to become convex and
> not exercise the concave path renderers anyway.)
>
> NOTE: this will cause changes in the "concavepaths" GM results, and
> minor pixel diffs in a number of other tests.
>
> BUG=660893
>
> Change-Id: Ide49374d6d173404c7223f7316dd439df1435787
> Reviewed-on: https://skia-review.googlesource.com/6427
> Commit-Queue: Stephan White <senorblanco@chromium.org>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
>
TBR=bsalomon@google.com,senorblanco@chromium.org,reviews@skia.org
BUG=660893
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I06a36e397645bfc42442a5a9e7c27328f6048ab9
Reviewed-on: https://skia-review.googlesource.com/6428
Commit-Queue: Stephan White <senorblanco@chromium.org>
Reviewed-by: Stephan White <senorblanco@chromium.org>
Use quads rather than triangles for the edge geometry. This allows
us to perform a simpler edge categorization (see below). It also
improves performance by reducing the number of edges processed during
the simplify and tessellate steps.
Label AA edges as three types: inner, outer, and connector. This
results in correct alpha values for intersected edges, even when
the top or bottom vertex has been merged with a vertex on edges
of different types.
Changed the "collinear edges" sample from the concavepaths GM for a
"fast-foward" shape, which more clearly shows the problem being fixed
here. (The collinearity from the "collinear edges" was actually being
removed earlier up the stack, causing the path to become convex and
not exercise the concave path renderers anyway.)
NOTE: this will cause changes in the "concavepaths" GM results, and
minor pixel diffs in a number of other tests.
BUG=660893
Change-Id: Ide49374d6d173404c7223f7316dd439df1435787
Reviewed-on: https://skia-review.googlesource.com/6427
Commit-Queue: Stephan White <senorblanco@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>
BUG=skia:
Change-Id: I04301313132df170a16995b4830b13ffbddbed3b
Reviewed-on: https://skia-review.googlesource.com/6469
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
If a frame does not have a valid transparent index and it covers the
prior frame, it does not really depend on that frame. Instead, it
depends on the frame that the prior frame depends on.
Determine this once we have parsed the local color map (if any), so a
transparent index out of range of the color map is not considered
valid.
Share code that determines whether a frame has a transparent pixel.
Add a test that we compute the dependencies correctly. randPixelsAnim.gif
has 13 frames. After the first, the frames cover all combinations of
- Whether the prior frame was keep, restoreBG or restoreToPrevious
- Whether the new frame covers the prior frame
- Whether the new frame has a transparent pixel
(It only does so when using a global color table. It may make sense to
expand the test to also cover using local color tables.)
The test caught a bug where we incorrectly reused an existing
SkColorTable for a different frame. Fix that bug by keeping track of
the transparent index associated with the current SkColorTable.
Change-Id: I3cf6be7f612990fa7a00d9e74d116d31bd227526
Reviewed-on: https://skia-review.googlesource.com/6402
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
atof is locale dependent, which can lead to bugs when a decimal separator is something other than a dot.
BUG=skia:
Change-Id: I6f161d686ddea86ce9968e46b632bc79a99ef656
Reviewed-on: https://skia-review.googlesource.com/6532
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
The overhead of a stage today is 3 x86 instructions, typically looking something like this:
- movq (%rdi), %rax // Load the next stage function pointer.
- addq $0x10, %rdi // Step our progress ahead 16 bytes to that next stage.
- jmpq *%rax // Transfer control to that stage.
But if we make sure the pointer's in esi/rsi, we can use lodsd/lodsq to do those first two steps in one instruction:
- lodsq (%rsi), %rax (≈ movq (%rdi), %rax; addq $0x8, %rsi).
- jmpq *%rax
This CL rearranges things so that we can take advantage of this and generally trim off an instruction of overhead. Instead of a vector of {Fn, ctx} pairs, we'll flatten it down into a single interlaced program vector of void*, basically just ommitting any null context pointers. We pass the pointer to program as the second argument to Fn, putting it in rsi. These two changes together make getting the next Fn to call or the current context the same cheap lodsq instruction, encapsulated as load_and_increment().
Here's how the simple "modulate" blend stage changes:
vmulps %ymm4, %ymm0, %ymm0
vmulps %ymm5, %ymm1, %ymm1
vmulps %ymm6, %ymm2, %ymm2
vmulps %ymm7, %ymm3, %ymm3
movq (%rdi), %rax
addq $0x10, %rdi
jmpq *%rax
~~~~~~~~>
vmulps %ymm4, %ymm0, %ymm0
vmulps %ymm5, %ymm1, %ymm1
vmulps %ymm6, %ymm2, %ymm2
vmulps %ymm7, %ymm3, %ymm3
lodsq (%rsi), %rax
jmpq *%rax
This does make getting the current context a one-time, destructive operation. It's switched from referring to ctx as a void* directly to using ctx() as a thunk that returns a void*. No stage so far has ever referred to ctx twice, and it all appears to inline, so this seems harmless. "matrix_2x3" is a good example of what stages that use context pointers end up looking like:
lodsq (%rsi), %rax
vbroadcastss (%rax), %ymm9
vbroadcastss 0x8(%rax), %ymm10
vbroadcastss 0x10(%rax), %ymm8
vfmadd231ps %ymm10, %ymm1, %ymm8
vfmadd231ps %ymm9, %ymm0, %ymm8
vbroadcastss 0x4(%rax), %ymm10
vbroadcastss 0xc(%rax), %ymm11
vbroadcastss 0x14(%rax), %ymm9
vfmadd231ps %ymm11, %ymm1, %ymm9
vfmadd231ps %ymm10, %ymm0, %ymm9
lodsq (%rsi), %rax
vmovaps %ymm8, %ymm0
vmovaps %ymm9, %ymm1
jmpq *%rax
We can't do this with MSVC, as there's no intrinsic for it I can find, and they disallow inline assembly, and rsi is not used to pass arguments to functions there anyway. ARM doesn't need it... it does this in two instructions naturally anyway. We could do this for 32-bit x86 but I'd just rather focus on x86-64.
It's unclear to me that this makes things any faster, but doesn't appear to make things any slower, and makes I think both the code and disassembly simpler.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Ia7b543a6718c75a33095371924003c5402b3445a
Reviewed-on: https://skia-review.googlesource.com/6271
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reenable test on GTX10700 bot.
BUG=skia:6080
Change-Id: Ieb4292e88fc337c226dad7ac82c6da84879e9522
Reviewed-on: https://skia-review.googlesource.com/6523
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Ben Wagner <benjaminwagner@google.com>
abort if incoming data is out of range
TBR=reed@google.com
BUG=676866
Change-Id: I7d4850611654a399e32ea2012b23ca369dc53e70
Reviewed-on: https://skia-review.googlesource.com/6525
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Cary Clark <caryclark@google.com>
Also includes a few additional cleanups.
BUG=skia:
Change-Id: I50899bfef964a3391cc9ddf42c3c5a939e01ceae
Reviewed-on: https://skia-review.googlesource.com/6497
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
Instead use std::unique_ptr to manage GrOp lifetime.
Change-Id: Ic1dc1e0ffd7254c3994221f498677af5bbf66a71
Reviewed-on: https://skia-review.googlesource.com/6479
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
I was using 'source' where I should have said 'input'. Also, to be
consistent with other image filters, ensure that the input is in the
destination gamut before we start blurring.
BUG=skia:
Change-Id: I751961b42a2a5d110ee8ea8916279c8fe0d5248e
Reviewed-on: https://skia-review.googlesource.com/6486
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
BUG=skia:
Change-Id: Icf616bec73e81aad97815b519566ff5b9db611e3
Reviewed-on: https://skia-review.googlesource.com/6495
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
The BCP47FromLanguageID and UnicodeFromMacRoman arrays are logically
const but were not marked as such. Marking them as const lets the
compiler/linker store them in the read-only data segment, which is
strictly better than being in read/write memory. This change moves about
3,000 bytes from the .data to .rdata segment in both chrome.dll and
chrome_child.dll.
BUG=677351
Change-Id: I85ff44d61aa232cf29178833fd2bb2e004032b9e
Reviewed-on: https://skia-review.googlesource.com/6424
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Bruce Dawson <brucedawson@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>