skia2

Author	SHA1	Message	Date
mtklein	406654be7a	SkThreadPool ~~> SkTaskGroup SkTaskGroup is like SkThreadPool except the threads stay in one global pool. Each SkTaskGroup itself is tiny (4 bytes) and its wait() method applies only to tasks add()ed to that instance, not the whole thread pool. This means we don't need to bring up new thread pools when tests themselves want to use multithreading (e.g. pathops, quilt). We just create a new SkTaskGroup and wait for that to complete. This should be more efficient, and allow us to expand where we use threads to really latency sensitive places. E.g. we can probably now use these in nanobench for CPU .skp rendering. Now that all threads are sharing the same pool, I think we can remove most of the custom mechanism pathops tests use to control threading. They'll just ride on the global pool with all other tests now. This (temporarily?) removes the GPU multithreading feature from DM, which we don't use. On my desktop, DM runs a little faster (57s -> 55s) in Debug, and a lot faster in Release (36s -> 24s). The bots show speedups of similar proportions, cutting more than a minute off the N4/Release and Win7/Debug runtimes. BUG=skia: Committed: https://skia.googlesource.com/skia/+/9c7207b5dc71dc5a96a2eb107d401133333d5b6f R=caryclark@google.com, bsalomon@google.com, bungeman@google.com, mtklein@google.com, reed@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/531653002	2014-09-03 15:34:37 -07:00
mtklein	2460bbdfbb	Revert of SkThreadPool ~~> SkTaskGroup (patchset #4 id:60001 of https://codereview.chromium.org/531653002/) Reason for revert: Leaks, leaks, leaks. Original issue's description: > SkThreadPool ~~> SkTaskGroup > > SkTaskGroup is like SkThreadPool except the threads stay in > one global pool. Each SkTaskGroup itself is tiny (4 bytes) > and its wait() method applies only to tasks add()ed to that > instance, not the whole thread pool. > > This means we don't need to bring up new thread pools when > tests themselves want to use multithreading (e.g. pathops, > quilt). We just create a new SkTaskGroup and wait for that > to complete. This should be more efficient, and allow us > to expand where we use threads to really latency sensitive > places. E.g. we can probably now use these in nanobench > for CPU .skp rendering. > > Now that all threads are sharing the same pool, I think we > can remove most of the custom mechanism pathops tests use > to control threading. They'll just ride on the global pool > with all other tests now. > > This (temporarily?) removes the GPU multithreading feature > from DM, which we don't use. > > On my desktop, DM runs a little faster (57s -> 55s) in > Debug, and a lot faster in Release (36s -> 24s). The bots > show speedups of similar proportions, cutting more than a > minute off the N4/Release and Win7/Debug runtimes. > > BUG=skia: > > Committed: https://skia.googlesource.com/skia/+/9c7207b5dc71dc5a96a2eb107d401133333d5b6f R=caryclark@google.com, bsalomon@google.com, bungeman@google.com, reed@google.com, mtklein@chromium.org TBR=bsalomon@google.com, bungeman@google.com, caryclark@google.com, mtklein@chromium.org, reed@google.com NOTREECHECKS=true NOTRY=true BUG=skia: Author: mtklein@google.com Review URL: https://codereview.chromium.org/533393002	2014-09-03 14:17:48 -07:00
mtklein	9c7207b5dc	SkThreadPool ~~> SkTaskGroup SkTaskGroup is like SkThreadPool except the threads stay in one global pool. Each SkTaskGroup itself is tiny (4 bytes) and its wait() method applies only to tasks add()ed to that instance, not the whole thread pool. This means we don't need to bring up new thread pools when tests themselves want to use multithreading (e.g. pathops, quilt). We just create a new SkTaskGroup and wait for that to complete. This should be more efficient, and allow us to expand where we use threads to really latency sensitive places. E.g. we can probably now use these in nanobench for CPU .skp rendering. Now that all threads are sharing the same pool, I think we can remove most of the custom mechanism pathops tests use to control threading. They'll just ride on the global pool with all other tests now. This (temporarily?) removes the GPU multithreading feature from DM, which we don't use. On my desktop, DM runs a little faster (57s -> 55s) in Debug, and a lot faster in Release (36s -> 24s). The bots show speedups of similar proportions, cutting more than a minute off the N4/Release and Win7/Debug runtimes. BUG=skia: R=caryclark@google.com, bsalomon@google.com, bungeman@google.com, mtklein@google.com, reed@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/531653002	2014-09-03 14:06:48 -07:00
commit-bot@chromium.org	39e8d93337	DM tweaks - Don't print status updates for skipped tasks or count them as pending tasks. - Refactor DMReporter a bit for better symmetry, be more explicit about how we read atomics (that is, approximately) in printStatus() (née finish()). - Remove mutex locking from printStatus(). BUG=skia: R=halcanary@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/309483003 git-svn-id: http://skia.googlecode.com/svn/trunk@14977 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-05-29 20:14:48 +00:00
commit-bot@chromium.org	3f032156c8	DM: Push GPU-parent child tasks to the front of the queue. Like yesterday's change to run CPU-parent child tasks serially in thread, this reduces peak memory usage by improving the temporaly locality of the bitmaps we create. E.g. Let's say we start with tasks A B C and D Queue: [ A B C D ] Running A creates A' and A", which depend on a bitmap created by A. Queue: [ B C D A' A" * ] That bitmap now needs sit around in RAM while B C and D run pointlessly and can only be destroyed at . If instead we do this and push dependent child tasks to the front of the queue, the queue and bitmap lifetime looks like this: Queue: [ A' A" B C D ] This is much, much worse in practice because the queue is often several thousand tasks long. 100s of megs of bitmaps can pile up for 10s of seconds pointlessly. To make this work we add addNext() to SkThreadPool and its cousin DMTaskRunner. I also took the opportunity to swap head and tail in the threadpool implementation so it matches the comments and intuition better: we always pop the head, add() puts it at the tail, addNext() at the head. Before Debug: 49s, 1403352k peak Release: 16s, 2064008k peak After Debug: 49s, 1234788k peak Release: 15s, 1903424k peak BUG=skia:2478 R=bsalomon@google.com, borenet@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/263803003 git-svn-id: http://skia.googlecode.com/svn/trunk@14506 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-05-01 17:41:32 +00:00
commit-bot@chromium.org	b0c7156d5b	DM: run child tasks that are already on the CPU threadpool serially These tasks tend to do similar things with similar sized bitmaps, so running them serially means we tend to hold 2x bitmaps at a time (golden and comparison) instead of (1+k)x bitmaps (golden and k concurrent comparisons). We still migrate GPU task's children over to the main CPU thread pool, because they'll run faster there and free up capacity on the GPU thread. Before Debug: 54s, 2.9G peak Release: 13s, 2.4G peak After Debug: 48s, 1.5G peak Release: 15s, 2.0G peak BUG=skia:2478 R=borenet@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/261593008 git-svn-id: http://skia.googlecode.com/svn/trunk@14486 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-04-30 20:47:30 +00:00
commit-bot@chromium.org	787227d356	Let DM work without a GPU. Testing: /m/s/skia (dm) $ d dm; and env GYP_DEFINES=skia_gpu=0 d dm ninja: Entering directory `out/Debug' ninja: no work to do. (294 GMs, 620 benches) x 4 configs, 245 tests 4507 tasks leftUnsupported vertex-color/texture xfer mode. Unsupported vertex-color/texture xfer mode. 0 tasks left 416.53user 9.86system 0:47.43elapsed 898%CPU (0avgtext+0avgdata 13353376maxresident)k 0inputs+0outputs (0major+3579906minor)pagefaults 0swaps ninja: Entering directory `out/Debug' [909/909] LINK dm (287 GMs, 612 benches) x 4 configs, 227 tests 0 tasks left 365.24user 7.71system 0:14.55elapsed 2562%CPU (0avgtext+0avgdata 14718912maxresident)k 0inputs+0outputs (0major+3328269minor)pagefaults 0swaps BUG=skia: R=bsalomon@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/213093004 git-svn-id: http://skia.googlecode.com/svn/trunk@13960 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-03-26 21:26:15 +00:00
commit-bot@chromium.org	a39874b636	DM: tweak output. Show task name in verbose mode only, and add task runtime. BUG=skia: R=reed@google.com, bsalomon@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/185233004 git-svn-id: http://skia.googlecode.com/svn/trunk@13639 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-03-03 15:44:56 +00:00
commit-bot@chromium.org	ef57b7e653	DM: make GPU tasks multithreaded again. Big refactor. The main meat of things is in SkThreadPool. We can now give SkThreadPool a type for each thread to create and destroy on its local stack. It's TLS without going through SkTLS. I've split the DM tasks into CpuTasks that run on threads with no TLS, and GpuTasks that run on threads with a thread local GrContextFactory. The old CpuTask and GpuTask have been renamed to CpuGMTask and GpuGMTask. Upshot: default run of out/Debug/dm goes from ~45 seconds to ~20 seconds. BUG=skia: R=bsalomon@google.com, mtklein@google.com, reed@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/179233005 git-svn-id: http://skia.googlecode.com/svn/trunk@13632 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-02-28 20:31:31 +00:00
commit-bot@chromium.org	38aeb0fd7a	DM: also run benches once. Also: - make GrMemoryPoolBenches threadsafe - some tweaks to various DM code - rename GM::shortName() to getName() to match benches and tests On my desktop, (289 GMs, 617 benches) x 4 configs, 227 tests takes 46s in Debug, 14s in Release. (Still minutes faster than running tests && bench && gm.) GPU singlethreading is definitely the limiting factor again; going to reexamine whether that's helpful to thread it again. BUG=skia: R=reed@google.com, bsalomon@google.com, mtklein@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/178473006 git-svn-id: http://skia.googlecode.com/svn/trunk@13603 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-02-26 23:01:57 +00:00
commit-bot@chromium.org	0dc5bd149a	Let DM run unit tests. - refactor GYPs and a few flags - make GPU tests grab a thread-local GrContextFactory when needed as we do in DM for GMs - add a few more UI features to make DM more like tests I believe this makes the program 'tests' obsolete. It should be somewhat faster to run the two sets together than running the old binaries serially: - serial: tests 20s (3m18s CPU), dm 21s (3m01s CPU) - together: 27s (6m21s CPU) Next up is to incorporate benches. I'm only planning there on a single-pass sanity check, so that won't obsolete the program 'bench' just yet. Tested: out/Debug/tests && out/Debug/dm && echo ok BUG=skia: Committed: http://code.google.com/p/skia/source/detail?r=13586 R=reed@google.com, bsalomon@google.com, mtklein@google.com, tfarina@chromium.org Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/178273002 git-svn-id: http://skia.googlecode.com/svn/trunk@13592 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-02-26 16:31:22 +00:00
commit-bot@chromium.org	79e13260cf	Revert of Let DM run unit tests. (https://codereview.chromium.org/178273002/) Reason for revert: broke tests Original issue's description: > Let DM run unit tests. > - refactor GYPs and a few flags > - make GPU tests grab a thread-local GrContextFactory when needed as we do in DM for GMs > - add a few more UI features to make DM more like tests > > I believe this makes the program 'tests' obsolete. > > It should be somewhat faster to run the two sets together than running the old binaries serially: > - serial: tests 20s (3m18s CPU), dm 21s (3m01s CPU) > - together: 27s (6m21s CPU) > > Next up is to incorporate benches. I'm only planning there on a single-pass sanity check, so that won't obsolete the program 'bench' just yet. > > Tested: out/Debug/tests && out/Debug/dm && echo ok > BUG=skia: > > Committed: http://code.google.com/p/skia/source/detail?r=13586 R=bsalomon@google.com, mtklein@google.com, tfarina@chromium.org, mtklein@chromium.org TBR=bsalomon@google.com, mtklein@chromium.org, mtklein@google.com, tfarina@chromium.org NOTREECHECKS=true NOTRY=true BUG=skia: Author: reed@google.com Review URL: https://codereview.chromium.org/179403010 git-svn-id: http://skia.googlecode.com/svn/trunk@13587 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-02-25 20:02:09 +00:00
commit-bot@chromium.org	6bd250a2a3	Let DM run unit tests. - refactor GYPs and a few flags - make GPU tests grab a thread-local GrContextFactory when needed as we do in DM for GMs - add a few more UI features to make DM more like tests I believe this makes the program 'tests' obsolete. It should be somewhat faster to run the two sets together than running the old binaries serially: - serial: tests 20s (3m18s CPU), dm 21s (3m01s CPU) - together: 27s (6m21s CPU) Next up is to incorporate benches. I'm only planning there on a single-pass sanity check, so that won't obsolete the program 'bench' just yet. Tested: out/Debug/tests && out/Debug/dm && echo ok BUG=skia: R=reed@google.com, bsalomon@google.com, mtklein@google.com, tfarina@chromium.org Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/178273002 git-svn-id: http://skia.googlecode.com/svn/trunk@13586 2bbb7eff-a529-9590-31e7-b0007b416f81	2014-02-25 19:32:15 +00:00
rmistry@google.com	d6bab02386	Reverting r12427 git-svn-id: http://skia.googlecode.com/svn/trunk@12428 2bbb7eff-a529-9590-31e7-b0007b416f81	2013-12-02 13:50:38 +00:00
skia.committer@gmail.com	5b39f5ba9c	Sanitizing source files in Housekeeper-Nightly git-svn-id: http://skia.googlecode.com/svn/trunk@12427 2bbb7eff-a529-9590-31e7-b0007b416f81	2013-12-02 13:36:22 +00:00
mtklein@google.com	ca5bb87a31	DM: write failed comparison mode .pngs one more level deep in the tree. E.g. instead of having to compare /tmp/dm/565/optimizations.png vs. /tmp/dm/replay/optimizations_565.png it's now /tmp/dm/565/optimizations.png vs. /tmp/dm/replay/565/optimizations.png This lets working with skdiff go a lot more smoothly. BUG= R=bsalomon@google.com Review URL: https://codereview.chromium.org/88773002 git-svn-id: http://skia.googlecode.com/svn/trunk@12402 2bbb7eff-a529-9590-31e7-b0007b416f81	2013-11-26 22:06:12 +00:00
scroggo@google.com	1ecd9cf379	Fix a warning building DM using ninja on Mac. Here is the warning: ../../dm/DMTask.cpp: In copy constructor ‘DM::Task::Task(const DM::Task&)’: ../../dm/DMTask.cpp:17: warning: base class ‘class SkRunnable’ should be explicitly initialized in the copy constructor Also add an SK_OVERRIDE. R=mtklein@google.com Review URL: https://codereview.chromium.org/76903002 git-svn-id: http://skia.googlecode.com/svn/trunk@12317 2bbb7eff-a529-9590-31e7-b0007b416f81	2013-11-20 16:44:59 +00:00
mtklein@google.com	d36522d12d	dm is like gm, but faster and with fewer features. This is sort of the near-minimal proof-of-concept skeleton. - It can run existing GMs. - It supports most configs (just not PDF). - --replay is the only "fancy" feature it currently supports Hopefully you will be disturbed by its speed. BUG= R=epoger@google.com Review URL: https://codereview.chromium.org/22839016 git-svn-id: http://skia.googlecode.com/svn/trunk@11802 2bbb7eff-a529-9590-31e7-b0007b416f81	2013-10-16 13:02:15 +00:00

18 Commits