We don't call the tail code nearly as often as the body code, but when we do, and it calls memcpy(), we first have to vzeroupper back into the non-AVX world. That seems to slow things down considerably. You wouldn't think it, but this gives a nice speedup (tested on Windows).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3783
Change-Id: I40cbe1e529f2431825edec7638265601b64e7ec5
Reviewed-on: https://skia-review.googlesource.com/3783
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Even with a modest cache, we're going to get nearly 100% hit rate
for typical usage scenarios. I'm hoping to avoid the special case
caching of sRGB -> destination, and just rely on the more general
mechanism.
Yes, this is yet another cache class. I wanted to use one of the many
that are lying around, but couldn't find a good fit. On the plus
side, it's not much code.
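As an illustration of how little code a modest cache needs, here is a minimal sketch of a fixed-capacity cache with least-recently-used eviction. It is not the class added in this CL: the name TinyLRUCache, its interface, and the linear-scan design are all assumptions made for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity cache (illustration only, not Skia's class).
// Key and Value must be default-constructible and copyable; Key needs ==.
template <typename Key, typename Value, std::size_t N>
class TinyLRUCache {
public:
    // On a hit, copies the value to *out, marks the entry as recently used,
    // and returns true. On a miss, returns false and leaves *out untouched.
    bool find(const Key& key, Value* out) {
        for (Entry& e : fEntries) {
            if (e.fUsed && e.fKey == key) {
                e.fStamp = ++fClock;
                *out = e.fValue;
                return true;
            }
        }
        return false;
    }

    // Overwrites an entry with the same key if there is one; otherwise takes
    // the first empty slot, or evicts the least recently used entry.
    void insert(const Key& key, const Value& value) {
        Entry* slot = nullptr;
        for (Entry& e : fEntries) {
            if (e.fUsed && e.fKey == key) { slot = &e; break; }
        }
        if (!slot) {
            slot = &fEntries[0];
            for (Entry& e : fEntries) {
                if (!e.fUsed) { slot = &e; break; }
                if (e.fStamp < slot->fStamp) { slot = &e; }
            }
        }
        slot->fKey   = key;
        slot->fValue = value;
        slot->fStamp = ++fClock;
        slot->fUsed  = true;
    }

private:
    struct Entry {
        Key           fKey{};
        Value         fValue{};
        std::uint64_t fStamp = 0;      // larger means more recently used
        bool          fUsed  = false;
    };

    std::array<Entry, N> fEntries{};
    std::uint64_t        fClock = 0;
};
```

A linear scan is reasonable only because the capacity is assumed to be small; a larger cache would want a hash lookup instead.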
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3726
Change-Id: I943be5c99f0d691a87ffe8c5bc3067a8eb491fc2
Reviewed-on: https://skia-review.googlesource.com/3726
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Also going to use this to allow caching of GrColorSpaceXforms
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3670
Change-Id: I56ed2dcbdddc22046263f56d68f2d6aea55547c8
Reviewed-on: https://skia-review.googlesource.com/3670
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
I'm seeing /GS's _security_check_cookie() show up as a significant piece of time when profiling. That's mostly just annoying noise. We generally use our Release builds for performance testing and Debug for correctness, so it seems like a fair thing to disable in Release builds... it's a sort of ASAN thing, which we only do in Debug on other platforms.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3782
Change-Id: I9b3cf4c5cf943fc2549f5bf91a1f6f7e41733e2c
Reviewed-on: https://skia-review.googlesource.com/3782
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Ben Wagner <bungeman@google.com>
These two together shave another 5MB off dm.exe, from 16MB -> 11MB.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3738
Change-Id: Id216867e0ad5bc115fbd4006095860dff9204947
Reviewed-on: https://skia-review.googlesource.com/3738
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Ben Wagner <bungeman@google.com>
By default, MSVC generates standalone versions of all functions, including static inline functions that are only inlined. Those standalone versions are dead code. This /Zc:inline flag makes MSVC behave like all the other compilers, omitting those standalone functions. Chrome builds with this flag.
This CL cuts dm.exe and nanobench.exe each down by about 3MB, 19MB->16MB for DM and 15MB->12MB for nanobench. This shouldn't affect runtime speed, and didn't significantly change clean build time on my Z840 (~90s either way).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3735
Change-Id: Ibd2a80337fcefc3f4eaf4335ea4e95a80bb4fddb
Reviewed-on: https://skia-review.googlesource.com/3735
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Ben Wagner <bungeman@google.com>
(1) The transformation code *should* support any src SkColorSpace
that we successfully parse. This is agreed upon internally and
by clients. The fact that we currently don't is just a bug...
(2) We cannot and will not support all SkColorSpaces as dsts.
So if we fail to make a SkColorSpaceXform, we should assume that
it was caused by a bad dst color space. The correct response in
this case is to return kInvalidConversion. I've rewritten the CL
to do this.
The fact that weird src spaces will sometimes trigger a
kInvalidConversion is just a bug that is being actively worked on.
TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3661
Change-Id: Iac2b45120507ec71b1b3d555c61931f7348dad9e
Reviewed-on: https://skia-review.googlesource.com/3661
Commit-Queue: Matt Sarett <msarett@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Reviewed-by: Robert Aftias <raftias@google.com>
Unlike -fomit-frame-pointer, this doesn't make debugging or profiling any more difficult, as it only applies to leaves. It will make our code (negligibly) smaller and (negligibly) faster.
Mostly I just find it easier to read the disassembly without all the rbp gymnastics getting in the way.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3700
Change-Id: I4b96aee7619791d5980de7f46e82836ca08a6456
Reviewed-on: https://skia-review.googlesource.com/3700
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
- Give body and tail functions separate types. This frees a register in body functions, especially important for Windows.
- Fill out default, SSE4.1, and HSW versions of all functions. This means we don't have to mess around with SkNf_abi... all functions come from the same compilation unit where SkNf is a single consistent type.
- Move Stage::next() into SkRasterPipeline_opts.h as a static inline function.
- Remove Stage::ctx() entirely... fCtx is literally the same thing.
This is a step along the way toward building the entire pipeline in src/opts, removing the need for all the stages to be functions living in SkOpts.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3680
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-ASAN-Trybot
Change-Id: I7de78ffebc15b9bad4eda187c9f50369cd7e5e42
Reviewed-on: https://skia-review.googlesource.com/3680
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
The bug was the raster version didn't correctly handle the CTM.
This CL also adds a way to test the behavior (by translating the
reveal GM around in SampleApp)
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3729
Change-Id: Iaacc905167d20b453203307e5ef840f552fdbb38
Reviewed-on: https://skia-review.googlesource.com/3729
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Win-MSVC-ShuttleC-GPU-GTX960-x86_64-Debug-ANGLE-Trybot
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3724
Change-Id: Ic3d6efcb331ac3947026476e357e76214f2ccdf8
Reviewed-on: https://skia-review.googlesource.com/3724
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Many old pathops-related fuzz failures have built up while
the codebase was in a state of flux. Now that the code
is stable, address these failures.
Most of the CL plumbs the debug global state to downstream
routines so that, if the data is not trusted (e.g. fuzzed),
the function can safely exit without asserting.
TBR=reed@google.com
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2426173002
Review-Url: https://chromiumcodereview.appspot.com/2426173002
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3727
Change-Id: Ic07ea6bd2756f1be08e80075c236a70ce6c08a3b
TBR=mtklein
Reviewed-on: https://skia-review.googlesource.com/3727
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
- Support ObjC / ObjC++
- Build SDL on Mac.
- Build viewer on Mac.
Patched from Jim's CL.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3760
Change-Id: I12663f2ed2969e22f51aefed560fbc22b2524167
Reviewed-on: https://skia-review.googlesource.com/3760
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
The Android builders don't need the clang_linux asset.
I don't think anything needs the android_sdk asset.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3740
Change-Id: I3e61ba23ed661a998f9dc0eb4c46cc8eb1cf3226
Reviewed-on: https://skia-review.googlesource.com/3740
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
SkRRectsGaussianEdgeShader will be removed once the usage of the
MaskFilter flavor has been propagated to Android.
I will complete the raster implementation in a follow-up CL.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3632
Change-Id: I42470b17308582b040a5db1a7283c3d717405345
Reviewed-on: https://skia-review.googlesource.com/3632
Commit-Queue: Robert Phillips <robertphillips@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Some classes directly call global operator new to reserve space in
addition to the space the class will occupy. These classes must be
deleted with the unsized global operator delete. If a build is configured
such that sized global operator delete is called from a delete expression,
this must be overridden by such classes.
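The pattern described above can be sketched as follows. TrailingBuffer is a hypothetical class invented for this example, not the SkData code touched by the CL; it over-allocates with global operator new and declares a class-level unsized operator delete so that a delete expression never hands sizeof(TrailingBuffer) to a sized global deallocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical class (illustration only) whose payload lives in extra bytes
// allocated directly after the object itself.
class TrailingBuffer {
public:
    static TrailingBuffer* Make(const void* src, std::size_t size) {
        assert(src || size == 0);
        // One allocation holds the object plus `size` trailing bytes.
        void* storage = ::operator new(sizeof(TrailingBuffer) + size);
        TrailingBuffer* buf = ::new (storage) TrailingBuffer(size);
        if (size) {
            std::memcpy(buf->data(), src, size);
        }
        return buf;
    }

    // `delete ptr` looks up operator delete in the class first and finds
    // this unsized overload, which forwards to the unsized global one.
    // The true allocation size is larger than sizeof(TrailingBuffer), so
    // a sized deallocation with sizeof(TrailingBuffer) would be wrong.
    static void operator delete(void* p) { ::operator delete(p); }

    std::size_t size() const { return fSize; }
    char* data() { return reinterpret_cast<char*>(this + 1); }

private:
    explicit TrailingBuffer(std::size_t size) : fSize(size) {}

    std::size_t fSize;
};
```

Callers would create an instance with TrailingBuffer::Make(bytes, length) and release it with a plain delete expression.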
TBR=reed
Only affects private bits of SkData.
Change-Id: I797935db17a37aa8c2ca7b562a4ea65a7978a9f0
Reviewed-on: https://skia-review.googlesource.com/3678
Reviewed-by: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Ben Wagner <bungeman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
MSVC was not picking up that tail==0 in the non-_tail versions of the kernel functions in SkRasterPipeline_opts.h. This passes through a template bool parameter to convey the same message.
This makes the body code a bit smaller and faster on MSVC now by removing the tail>0 check and code.
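A rough sketch of the idea, with names that are assumptions for illustration (store_scaled, scale_span, kLanes) rather than the actual kernels in SkRasterPipeline_opts.h: the bool template parameter is a compile-time constant, so the body instantiation has a fixed trip count and no tail handling for the compiler to reason away.

```cpp
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width for this sketch

// kIsTail is known at compile time. In the <false> instantiation the loop
// bound is the constant kLanes and `tail` is never read.
template <bool kIsTail>
static inline void store_scaled(float* dst, const float* src, float scale,
                                std::size_t tail) {
    const std::size_t n = kIsTail ? tail : kLanes;
    for (std::size_t i = 0; i < n; i++) {
        dst[i] = src[i] * scale;
    }
}

// Runs the body version over full groups of kLanes elements, then the tail
// version once for any remainder.
void scale_span(float* dst, const float* src, float scale, std::size_t count) {
    while (count >= kLanes) {
        store_scaled<false>(dst, src, scale, 0);
        dst   += kLanes;
        src   += kLanes;
        count -= kLanes;
    }
    if (count > 0) {
        store_scaled<true>(dst, src, scale, count);
    }
}
```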
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3669
Change-Id: I8bf81717a83f216eb1eb28a75dac41779dc508c1
Reviewed-on: https://skia-review.googlesource.com/3669
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit 4f02ce7995.
Reason for revert: missed a stage
TBR=mtklein@chromium.org,msarett@google.com,herb@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I1dc1229183d67fe72977e492977a97b19dc630d2
Reviewed-on: https://skia-review.googlesource.com/3675
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
All SkNx are now in anonymous namespaces and all their methods are force-inlined. We should not have any ODR problems.
This is still a near 2x speedup, more so for f16.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3667
Change-Id: I6db9a46f7164f49827ab4d7983e80bf8cea99995
Reviewed-on: https://skia-review.googlesource.com/3667
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
crrev.com/2420843003 (DIFFERENT ISSUE) introduced some changes in
sampled images. (I already corrected the problem for interlaced PNGs
in crrev.com/2424353003.)
When deciding whether a row is needed, we need to subtract the starting
coordinate, similar to how we subtracted fFirstRow in SkPngCodec.
This should "fix" the remaining untriaged images in Gold (i.e. we will
go back to producing the original image).
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2440563002
Review-Url: https://chromiumcodereview.appspot.com/2440563002
We check this define to know which intrinsics we can call safely. The -msse flags set it for us on non-MSVC, but MSVC has no such switch. We do this in GYP (and Chrome's GN) too. No need for any defines on :avx or :hsw targets... the /arch:AVX and /arch:AVX2 do set SK_CPU_SSE_LEVEL for us.
Most directly, this means things like Sk4f::thenElse() will now use blendps when compiled into SkOpts_sse41.cpp.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3666
Change-Id: Ie80a8b8e5544250b45cfe51c40604fade06b3ef9
Reviewed-on: https://skia-review.googlesource.com/3666
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Matt Sarett <msarett@google.com>
All small stuff:
- printf doesn't go to the Visual Studio console, SkDebugf does;
- the Windows console can't show an ellipsis;
- overwriting the console line is a little different on Windows.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3664
Change-Id: I0175afd6d0147feaff8ff6edae2b35a7435c25f5
Reviewed-on: https://skia-review.googlesource.com/3664
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
In this function, when count is 0, the dst point is mapped to start, when
it should really be mapped to stop. A test case is also added.
The test case should draw three lines. Without the change in the
SkPath class, it draws only two lines, with the top horizontal line
missing: the dst point is mapped to the start point, and hence
the horizontal line is not drawn.
BUG=640031
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2409983004
Review-Url: https://chromiumcodereview.appspot.com/2409983004
Was just perusing DEPS and I realized shaderc is probably no longer needed.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3561
Change-Id: I054a424b26e51dbfee77dbe79e1e175399627902
Reviewed-on: https://skia-review.googlesource.com/3561
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Greg Daniel <egdaniel@google.com>
I think the Ninja [nnn/mmm] counts started going off when we landed this.
I'd rather have the [nnn/mmm] be correct than have the timestamps.
Change-Id: I96d24664789393056f94202f2b549ed5a4fe4bdb
Reviewed-on: https://skia-review.googlesource.com/3604
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reduces copy-paste and eases maintenance. I'll be adding another field to
AsFPArgs soon, and this is going to streamline that change.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3639
Change-Id: I6372ed5dce50a5ba9d73039bd4714e34502a1f75
Reviewed-on: https://skia-review.googlesource.com/3639
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
MSVC's not so good at inlining. So tell it where to. It won't hurt the others.
This has nothing directly to do with ODR safety. The anonymous namespaces and 'static' on freestanding functions provide the correctness we need there. But this change can help to mechanically prevent the sort of problems ODR violations can lead to.
I may follow up by extending this strategy further to Sk4px, which is used to implement a lot of the legacy xfermodes.
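For illustration, the combination described above might look like the sketch below. The macro name ALWAYS_INLINE and the Vec4 type are assumptions for this example, not the Skia definitions; __forceinline and __attribute__((always_inline)) are the MSVC and GCC/Clang spellings respectively.

```cpp
// Hypothetical portability macro (illustration only).
#if defined(_MSC_VER)
    #define ALWAYS_INLINE __forceinline
#else
    #define ALWAYS_INLINE __attribute__((always_inline)) inline
#endif

namespace {  // internal linkage: each translation unit gets its own Vec4

struct Vec4 {
    float v[4];

    // Forcing inlining means no standalone copy of the method should need
    // to be emitted, whatever instruction set this file is compiled for.
    ALWAYS_INLINE Vec4 operator+(const Vec4& o) const {
        return Vec4{{v[0] + o.v[0], v[1] + o.v[1], v[2] + o.v[2], v[3] + o.v[3]}};
    }

    ALWAYS_INLINE float operator[](int i) const { return v[i]; }
};

}  // namespace
```

The anonymous namespace is what provides the correctness guarantee; the forced inlining only removes the out-of-line copies that could otherwise be mixed up between translation units built with different flags.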
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3608
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: I927334c40910ce43da1fbabdf243c9cd5438bea6
Reviewed-on: https://skia-review.googlesource.com/3608
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>