1) Disable SampleApp. Seems like it's totally horked? SkOSFile_iOS.mm is missing about half the functions needed, and SkOSFile_stdio.cpp is double-providing the others.
2) Drop armv6.
3) Switch from putting headers in sources to putting the corresponding directories in includes.
4) Force cast the type of glShaderSource. Something to do with GR_GL_USE_NEW_SHADER_SOURCE_SIGNATURE?
After all this,
env CC=clang CXX=clang++ GYP_DEFINES=skia_os=ios make
builds for me.
BUG=skia:2363
R=bsalomon@google.com, epoger@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/226413005
git-svn-id: http://skia.googlecode.com/svn/trunk@14069 2bbb7eff-a529-9590-31e7-b0007b416f81
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13980R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@14025 2bbb7eff-a529-9590-31e7-b0007b416f81
Record performance as measured by bench_record (out/Release/bench_record --skr) improves by at least 1.9x, at most 6.7x, arithmetic mean 2.6x, geometric mean 3.0x. So, good.
Correctness as measured by DM (out/Debug/dm --skr) is ~ok. One GM (shadertext2) fails because we're assuming all paint effects are immutable, but SkShaders are still mutable.
To do after this CL:
- measure playback speed
- catch up feature-wise to SkPicture
- match today's playback speed
BUG=skia:
R=robertphillips@google.com, bsalomon@google.com, reed@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/206313003
git-svn-id: http://skia.googlecode.com/svn/trunk@14010 2bbb7eff-a529-9590-31e7-b0007b416f81
This patch implements basics for Xfermode SSE optimization. Based on
these basics, SSE2 implementation of multiply_modeproc is provided. SSE2
implementation for other modes will come in future. With this patch
performance of Xfermode_Multiply will improve about 45%. Here are the
data on desktop i7-3770.
before:
Xfermode_Multiply 8888: cmsecs = 33.30 565: cmsecs = 45.65
after:
Xfermode_Multiply 8888: cmsecs = 17.18 565: cmsecs = 24.87
BUG=
R=mtklein@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/202903004
git-svn-id: http://skia.googlecode.com/svn/trunk@14006 2bbb7eff-a529-9590-31e7-b0007b416f81
Reason for revert:
GYP's failing on most (all?) bots.
Original issue's description:
> ARM Skia NEON patches - 35 - First AArch64 support
>
> Aarch64 support
>
> This change contains the necessary modifications to have Skia build and
> run properly on an ARMv8 processor in aarch64 execution state.
>
> Here's a list of the changes:
>
> - add an arm64 target to the build system + SK_CPU_ARM64 flag
>
> - MatrixTest was failing when built in Release mode. Fused MAC
> instructions were generated which made some intermediate results
> more accurate. As the test relies on result comparison, the more
> precise results when compared to others led to a gap bigger than
> what was tolerated. As I don't know if some actual skia code relies
> on results being comparable, I've disabled fused MAC instruction
> with -ffp-contract=off for arm64.
>
> - Modify include/core/SkOnce.h to have barriers work.
>
> - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
>
> - use existing Xfermode optimisations with modifications that can be
> removed in the future when toolchains are ready. Also save a few
> instructions is two Xfermodes (will apply to ARM too).
>
> - use existing SkBoxBlur and SkMorphology optimisations.
>
> - use existing SkBlitMask optimisations
>
> - use existing BitmapProcState and Convolution optimisations.
>
> Future changes will include:
>
> - Blitters (only partialy merged upstream)
>
> - SkUtils (there's little value in sending asm optimisations without
> having them benchmarked on real hardware).
>
> Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
>
> BUG=skia:
>
> Committed: http://code.google.com/p/skia/source/detail?r=13980R=djsollen@google.com, reed@google.com, halcanary@google.com, kevin.petit@arm.comTBR=djsollen@google.com, halcanary@google.com, kevin.petit@arm.com, reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/216113005
git-svn-id: http://skia.googlecode.com/svn/trunk@13983 2bbb7eff-a529-9590-31e7-b0007b416f81
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions is two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partialy merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@13980 2bbb7eff-a529-9590-31e7-b0007b416f81
Generate SkUserConfig.
Include arm64 as another build flavor.
Add tests.
gyp/common_conditions.gypi:
Add conditions for Android framework. These will get written into the generated SkUserConfig.
include/core/SkUserConfig.h:
Generated version that will ultimately be checked into Android (but not here).
platform_tools/android/bin/gyp_to_android.py:
Generate SkUserConfig.
Add arm64 (note that arm64 is not currently respected by our gyp files, so it results in use _none.cpp for the various opts).
Reset the common defines, which are now passed to the generated SkUserConfig.
platform_tools/android/gyp_gen/generate_user_config.py:
New script to generate SkUserConfig.h.
platform_tools/android/gyp_gen/gypd_parser.py:
Fix a lint error (unused import).
platform_tools/android/gyp_gen/makefile_writer.py:
Append any remaining DEFINES to LOCAL_CFLAGS (previously this was done during parsing).
Add a warning for arm64 (corresponds to downstream Android.mk).
platform_tools/android/gyp_gen/vars_dict_lib.py:
Add OrderedSet.reset().
Add DEFINES to VarsDict.
platform_tools/android/tests/expectations/:
Add and update expectations files.
platform_tools/android/tests/generate_user_config_tests.py:
New test for generate_user_config.py
platform_tools/android/tests/inputs/SkUserConfig.h:
Input to the new test, so we don't have to update the expectations each time the real SkUserConfig.h changes.
platform_tools/android/tests/makefile_writer_tests.py:
Add a way to rebaseline test_write_local_vars, which has changed.
Refactor EXPECTATIONS_DIR and compare_files into a separate file for sharing with generate_user_config_tests.py.
platform_tools/android/tests/utils.py:
Common code for tests.
platform_tools/android/tests/var_dict_tests.py:
Use a for loop to test the new key (DEFINES) and future proof this test to test any new keys in the future.
BUG=skia:1975
R=djsollen@google.com, halcanary@google.com
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/198063002
git-svn-id: http://skia.googlecode.com/svn/trunk@13975 2bbb7eff-a529-9590-31e7-b0007b416f81
Use path rendering to render the text from outlines if supported by the
GPU. Implement this in GrStencilAndCoverTextContext by copying large
chunks of code from GrBitmapTextContext (drawText) and
GrDistanceFieldTextContext (drawPosText).
The drawing is implemented with "instanced" path drawing
functions.
Moves the creation of the "main" text context from SkGpuDevice to the
GrContext::createTextContext. This is done because the decision of which
text renderer is optimal can be made only with the internal
implementation-specific information of the context.
R=jvanverth@google.com, bsalomon@google.com
Author: kkinnunen@nvidia.com
Review URL: https://codereview.chromium.org/196133014
git-svn-id: http://skia.googlecode.com/svn/trunk@13962 2bbb7eff-a529-9590-31e7-b0007b416f81
This will be used in Blink to accommodate matrices that contain
rotation or shearing. This is a generalization of SkResizeImageFilter,
so I've replaced all uses of SkResizeImageFilter in Skia. (It might be
easier to review by diffing it with SkResizeImageFilter, too.)
R=reed@google.com
Review URL: https://codereview.chromium.org/211103006
git-svn-id: http://skia.googlecode.com/svn/trunk@13941 2bbb7eff-a529-9590-31e7-b0007b416f81
Fix GPU displacement with expanding crop rects, and re-enable the
imagefilterscropexpand GM. There were two bugs: the result texture was
being created at input color bitmap size, not the cropped bounds size,
and the matrix in GrContext was not being set to identity before draw.
R=junov@chromium.org
Review URL: https://codereview.chromium.org/195973007
git-svn-id: http://skia.googlecode.com/svn/trunk@13844 2bbb7eff-a529-9590-31e7-b0007b416f81
NOTE: this patch set is based on https://codereview.chromium.org/189913021/,
and needs that patch to land first.
Until now, crop rects in Skia have only been able to reduce
the size of the destination bounds, but not expand them.
SVG semantics require the latter as well. The heart of
the change is in applyCropRect(), which now assigns each
edge, instead of doing an intersection with the crop rect.
In order to support this (and still work well with tiled
drawing) we need to clip the resulting crop rect to the
clipping region of the filters. This uses the Context struct
previously landed from https://codereview.chromium.org/189913021/.
Many of the pixel loops are not yet ready to handle a
destination rect larger than the source rect. So we provide
a convenience version of applyCropRect() which creates an
offscreen and pads it out with transparent black. Once the
pixel loops and shaders have been fixed to support larger
destination bounds, they should be switched back to the
non-drawing version of applyCropRect().
BUG=skia:
R=bsalomon@google.com, reed@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13805
Review URL: https://codereview.chromium.org/198003008
git-svn-id: http://skia.googlecode.com/svn/trunk@13809 2bbb7eff-a529-9590-31e7-b0007b416f81
NOTE: this patch set is based on https://codereview.chromium.org/189913021/,
and needs that patch to land first.
Until now, crop rects in Skia have only been able to reduce
the size of the destination bounds, but not expand them.
SVG semantics require the latter as well. The heart of
the change is in applyCropRect(), which now assigns each
edge, instead of doing an intersection with the crop rect.
In order to support this (and still work well with tiled
drawing) we need to clip the resulting crop rect to the
clipping region of the filters. This uses the Context struct
previously landed from https://codereview.chromium.org/189913021/.
Many of the pixel loops are not yet ready to handle a
destination rect larger than the source rect. So we provide
a convenience version of applyCropRect() which creates an
offscreen and pads it out with transparent black. Once the
pixel loops and shaders have been fixed to support larger
destination bounds, they should be switched back to the
non-drawing version of applyCropRect().
BUG=skia:
R=bsalomon@google.com, reed@google.com
Review URL: https://codereview.chromium.org/198003008
git-svn-id: http://skia.googlecode.com/svn/trunk@13805 2bbb7eff-a529-9590-31e7-b0007b416f81
Add SkSmallAllocator, a template for allocating small (as defined by the
instantiation) objects without extra calls to new. Add a helper macro to
make using it simple.
Remove SkTemplatesPriv.h, whose behavior is replaced by SkSmallAllocator.
The old SK_PLACEMENT_NEW had the following drawbacks:
- Easily confused with SkNEW_PLACEMENT.
- Requires passing around lots of void*s along with the storageSize.
- Requires using a separate class for deleting it.
- We had multiple ways Auto objects for deleting in different places.
- It always did a straight heap allocation on Windows, meaning Windows
did not get any advantages from the confusing code.
The new SkSmallAllocator simplifies things:
- It is clear about what it does.
- It takes care of the deletion in one place that is automatically
handled.
Further, the new class can be used to create more than one object. This
is in preparation for BUG=skia:1976, for which we would like to create
a new object without extra heap allocations. The plan is to create both
the blitter and the new object on the stack using the SkSmallAllocator.
Add a new test for SkSmallAllocator.
SkShader.h:
Move the private version of CreateBitmapShader to SkBitmapProcShader
(which already has the implementation) and remove the friend class
(which was only used to call this private function). This allows
SkSmallAllocator to reside in the private src/ directory.
SkBitmapProcShader:
Move CreateBitmapShader and the macro for the storage size here. With
the macro in a (private) header, the (private) headers with function
declarations (which now depend on the storage size used) can see the
macro.
Use SkSmallAllocator in CreateBitmapShader.
Change the macro to kBlitterStorageByteCount, since SkSmallAllocator
takes a byte count as its template parameter.
SkBlitter:
Use the SkSmallAllocator.
Remove Sk3DShader::fKillProc and SkAutoCallProc. Both of their
behaviors have been moved into SkSmallAllocator (SkAutoCallProc was
unnecessary anyway, because the only time we ever used it we also
called detach(), so its auto behavior never happened).
Create the Sk3DShader on the stack, if there's room.
Remove the helper version of Choose, which was unused.
SmallAllocatorTest:
Test for the new class.
The rest:
Use SkSmallAllocator.
BUG=skia:1976
R=reed@google.com, mtklein@google.com
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/179343005
git-svn-id: http://skia.googlecode.com/svn/trunk@13696 2bbb7eff-a529-9590-31e7-b0007b416f81
The main meat of things is in SkThreadPool. We can now give SkThreadPool a
type for each thread to create and destroy on its local stack. It's TLS
without going through SkTLS.
I've split the DM tasks into CpuTasks that run on threads with no TLS, and
GpuTasks that run on threads with a thread local GrContextFactory.
The old CpuTask and GpuTask have been renamed to CpuGMTask and GpuGMTask.
Upshot: default run of out/Debug/dm goes from ~45 seconds to ~20 seconds.
BUG=skia:
R=bsalomon@google.com, mtklein@google.com, reed@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/179233005
git-svn-id: http://skia.googlecode.com/svn/trunk@13632 2bbb7eff-a529-9590-31e7-b0007b416f81
Also:
- make GrMemoryPoolBenches threadsafe
- some tweaks to various DM code
- rename GM::shortName() to getName() to match benches and tests
On my desktop, (289 GMs, 617 benches) x 4 configs, 227 tests takes 46s in Debug, 14s in Release. (Still minutes faster than running tests && bench && gm.) GPU singlethreading is definitely the limiting factor again; going to reexamine whether that's helpful to thread it again.
BUG=skia:
R=reed@google.com, bsalomon@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/178473006
git-svn-id: http://skia.googlecode.com/svn/trunk@13603 2bbb7eff-a529-9590-31e7-b0007b416f81