3f032156c8
Like yesterday's change to run CPU-parent child tasks serially in thread, this reduces peak memory usage by improving the temporal locality of the bitmaps we create.

For example, say we start with tasks A, B, C and D:
    Queue: [ A B C D ]
Running A creates A' and A", which depend on a bitmap created by A:
    Queue: [ B C D A' A" * ]
That bitmap now needs to sit around in RAM, pointlessly, while B, C and D run, and it can only be destroyed at *. If instead we push dependent child tasks to the front of the queue, the queue and bitmap lifetime look like this:
    Queue: [ A' A" * B C D ]

The problem is much, much worse in practice because the queue is often several thousand tasks long. 100s of megs of bitmaps can pile up for 10s of seconds pointlessly.

To make this work we add addNext() to SkThreadPool and its cousin DMTaskRunner. I also took the opportunity to swap head and tail in the threadpool implementation so it matches the comments and intuition better: we always pop the head, add() puts a task at the tail, addNext() puts it at the head.

Before
    Debug:   49s, 1403352k peak
    Release: 16s, 2064008k peak

After
    Debug:   49s, 1234788k peak
    Release: 15s, 1903424k peak

BUG=skia:2478
R=bsalomon@google.com, borenet@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/263803003
git-svn-id: http://skia.googlecode.com/svn/trunk@14506 2bbb7eff-a529-9590-31e7-b0007b416f81
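
The queue behaviour described in the commit message can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `TaskQueue` and `runOrder` are hypothetical names, the queue is single-threaded, and tasks are plain strings; none of this is SkThreadPool's or DMTaskRunner's actual interface. Only the add()/addNext() semantics (pop the head, add() at the tail, addNext() at the head) come from the message itself.

```cpp
#include <deque>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical single-threaded stand-in for the work queue.
// It is NOT SkThreadPool's real interface.
class TaskQueue {
public:
    void add(const std::string& task)     { fTasks.push_back(task); }   // goes to the tail
    void addNext(const std::string& task) { fTasks.push_front(task); }  // goes to the head
    bool empty() const { return fTasks.empty(); }

    // We always pop the head.
    std::string pop() {
        std::string task = fTasks.front();
        fTasks.pop_front();
        return task;
    }

private:
    std::deque<std::string> fTasks;
};

// Runs A B C D. Running A enqueues its children A' and A", then "*",
// which marks the point where A's bitmap can be destroyed.
std::vector<std::string> runOrder(bool childrenFirst) {
    TaskQueue queue;
    for (const char* name : {"A", "B", "C", "D"}) {
        queue.add(name);
    }

    std::vector<std::string> order;
    while (!queue.empty()) {
        std::string task = queue.pop();
        order.push_back(task);
        if (task != "A") {
            continue;
        }
        if (childrenFirst) {
            // Pushed in reverse so the head reads A' A" *.
            queue.addNext("*");
            queue.addNext("A\"");
            queue.addNext("A'");
        } else {
            queue.add("A'");
            queue.add("A\"");
            queue.add("*");
        }
    }
    return order;
}
```

With `runOrder(false)` the cleanup marker runs last, after B, C and D. With `runOrder(true)` it runs right after A's two children, so the bitmap lives only as long as the tasks that need it.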

DM.cpp
DMBenchTask.cpp
DMBenchTask.h
DMCpuGMTask.cpp
DMCpuGMTask.h
DMExpectations.h
DMExpectationsTask.cpp
DMExpectationsTask.h
DMGpuGMTask.cpp
DMGpuGMTask.h
DMGpuSupport.h
DMPipeTask.cpp
DMPipeTask.h
DMQuiltTask.cpp
DMQuiltTask.h
DMRecordTask.cpp
DMRecordTask.h
DMReplayTask.cpp
DMReplayTask.h
DMReporter.cpp
DMReporter.h
DMSerializeTask.cpp
DMSerializeTask.h
DMTask.cpp
DMTask.h
DMTaskRunner.cpp
DMTaskRunner.h
DMTestTask.cpp
DMTestTask.h
DMUtil.cpp
DMUtil.h
DMWriteTask.cpp
DMWriteTask.h
README

DM is like GM, but multithreaded. It doesn't do everything GM does yet.

Current approximate list of missing features:
  --config pdf
  --mismatchPath
  --missingExpectationsPath
  --writePicturePath
  --deferred

DM's design is based around Tasks and a TaskRunner.

A Task represents an independent unit of work that might fail. We make a task for each GM/configuration pair we want to run. Tasks can kick off new tasks themselves. For example, a CpuTask can kick off a ReplayTask to make sure recording and playing back an SkPicture gives the same result as direct rendering.

The TaskRunner runs all tasks on one of two threadpools, whose sizes are configurable by --cpuThreads and --gpuThreads. Ideally we'd run these on a single threadpool, but it can swamp the GPU if we shove too much work into it at once. --cpuThreads defaults to the number of cores on the machine. --gpuThreads defaults to 1, but you may find 2 or 4 runs a little faster.

So the main flow of DM is:

    for each GM:
        for each configuration:
            kick off a new task
    < tasks run, maybe fail, and maybe kick off new tasks >
    wait for all tasks to finish
    report failures
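
The main flow above can be sketched as a small C++ program. This is a rough illustration, not DM's code: the `TaskRunner` class here, its `add`/`wait`/`fail` methods, the `runAll` helper, the fixed thread counts, and the "broken" GM name are all hypothetical. It only shows the shape of the design: two pools, tasks that can kick off child tasks, a wait for everything to finish, and a failure report.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch; names and interfaces are illustrative, not DM's classes.
class TaskRunner {
public:
    enum Kind { kCpu = 0, kGpu = 1 };

    TaskRunner(int cpuThreads, int gpuThreads) {
        for (int i = 0; i < cpuThreads; i++) fThreads.emplace_back([this] { this->loop(kCpu); });
        for (int i = 0; i < gpuThreads; i++) fThreads.emplace_back([this] { this->loop(kGpu); });
    }

    ~TaskRunner() {
        { std::lock_guard<std::mutex> lock(fMutex); fShutdown = true; }
        fWork.notify_all();
        for (std::thread& t : fThreads) t.join();
    }

    // Queue a task on one pool. Tasks may call add() themselves to kick off children.
    void add(Kind kind, std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(fMutex);
            fQueues[kind].push_back(std::move(task));
            fPending++;
        }
        fWork.notify_all();
    }

    // Block until every task, including children, has finished.
    void wait() {
        std::unique_lock<std::mutex> lock(fMutex);
        fIdle.wait(lock, [this] { return fPending == 0; });
    }

    void fail(const std::string& name) {
        std::lock_guard<std::mutex> lock(fMutex);
        fFailures.push_back(name);
    }

    std::vector<std::string> failures() {
        std::lock_guard<std::mutex> lock(fMutex);
        return fFailures;
    }

private:
    void loop(Kind kind) {
        std::unique_lock<std::mutex> lock(fMutex);
        for (;;) {
            fWork.wait(lock, [&] { return fShutdown || !fQueues[kind].empty(); });
            if (fQueues[kind].empty()) return;  // shutting down with nothing left to run
            std::function<void()> task = std::move(fQueues[kind].front());
            fQueues[kind].pop_front();
            lock.unlock();
            task();  // may call add(), which bumps fPending before we decrement it
            lock.lock();
            if (--fPending == 0) fIdle.notify_all();
        }
    }

    std::mutex fMutex;
    std::condition_variable fWork, fIdle;
    std::deque<std::function<void()>> fQueues[2];
    std::vector<std::string> fFailures;
    std::vector<std::thread> fThreads;
    int fPending = 0;
    bool fShutdown = false;
};

// For each GM, kick off a CPU task and a GPU task, then wait and report failures.
std::vector<std::string> runAll(const std::vector<std::string>& gms) {
    TaskRunner runner(/*cpuThreads=*/4, /*gpuThreads=*/1);  // illustrative sizes
    for (const std::string& gm : gms) {
        runner.add(TaskRunner::kCpu, [&runner, gm] {
            // Child task, standing in for a replay check after direct rendering.
            runner.add(TaskRunner::kCpu, [&runner, gm] {
                if (gm == "broken") runner.fail(gm + "/cpu/replay");
            });
        });
        runner.add(TaskRunner::kGpu, [] { /* GPU rendering would go here */ });
    }
    runner.wait();
    return runner.failures();
}
```

The pending count is incremented when a child is queued and decremented only after its parent returns, so `wait()` cannot wake up while a parent is still about to spawn work.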