Commit Graph

1552 Commits

Author SHA1 Message Date
hstern
0446a3c8e2 Add initial CurveMeasure code
- This code is entirely private and is not being used by anything.

- In a future CL we will write a class that uses CurveMeasure to compute dash points. In order to determine whether CurveMeasure or PathMeasure should be faster, we need the dash info (the sum of the on/off intervals and how many there are)

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2187083002

Review-Url: https://codereview.chromium.org/2187083002
2016-08-08 12:28:13 -07:00
mtklein
4e97607d9a Use sse4.2 CRC32 instructions to hash when available.
About 9x faster than Murmur3 for long inputs.

Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot

Review-Url: https://codereview.chromium.org/2208903002
2016-08-08 09:06:28 -07:00
mtklein
8c1a4f80d9 update picture recording benchmarks to allow comparison with SkLiteRecorder
Here's a demo.  The new code is still looking 2-3x faster.

~/skia (bench) $ r nanobench --match nytimes --config nonrendering --ms 2000
curr/maxrss	loops	min	median	mean	max	stddev	samples	config	bench
  19/26  MB	2	146µs	147µs	151µs	422µs	9%	6615	nonrendering	desk_nytimes.skp
  20/26  MB	4	46.6µs	46.9µs	48.2µs	204µs	10%	10370	nonrendering	keymobi_nytimes_com_.skp

~/skia (bench) $ r nanobench --match nytimes --config nonrendering --ms 2000  --lite
curr/maxrss	loops	min	median	mean	max	stddev	samples	config	bench
  19/26  MB	2	73.8µs	76.9µs	78.7µs	417µs	14%	12702	nonrendering	desk_nytimes.skp
  20/26  MB	5	18.5µs	18.7µs	19.3µs	137µs	12%	20713	nonrendering	keymobi_nytimes_com_.skp

Here's a quick performance diff, where <1x means --lite is faster:

    top25desk_wikipedia__1_tab_.skp	 285us ->  364us	1.27x
      top25desk_games_yahoo_com.skp	 302us ->  329us	1.09x
                   tabl_mozilla.skp	 241us ->  260us	1.08x
                desk_chalkboard.skp	 321us ->  313us	0.98x
               tabl_gamedeksiam.skp	 383us ->  367us	0.96x
            top25desk_pinterest.skp	 375us ->  281us	0.75x
keymobi_reddit_com_r_programmin.skp	 258us ->  142us	0.55x
                   desk_nytimes.skp	 149us -> 77.9us	0.52x
      keymobi_worldjournal_com_.skp	 201us ->  104us	0.52x
              top25desk_blogger.skp	 112us ->   55us	0.49x
    top25desk_sports_yahoo_com_.skp	 186us -> 89.6us	0.48x
         desk_googlespreadsheet.skp	 206us -> 97.5us	0.47x
top25desk_google_com_search_q_c.skp	 192us -> 89.8us	0.47x
      keymobi_wikipedia__1_tab_.skp	 170us -> 79.3us	0.47x
keymobi_wikipedia__1_tab____del.skp	 170us -> 78.2us	0.46x
              desk_unicodetable.skp	6.25ms -> 2.87ms	0.46x
                    desk_carsvg.skp	 138us -> 63.3us	0.46x
    top25desk_answers_yahoo_com.skp	 133us -> 60.7us	0.46x
                 top25desk_espn.skp	 108us -> 49.2us	0.45x
top25desk_plus_google_com_11003.skp	 361us ->  162us	0.45x
                      desk_espn.skp	99.4us -> 44.5us	0.45x
              tabl_worldjournal.skp	 103us -> 45.6us	0.44x
             desk_ugamsolutions.skp	56.2us -> 24.8us	0.44x
             top25desk_facebook.skp	82.7us -> 35.7us	0.43x
       keymobi_cuteoverload_com.skp	 213us -> 91.9us	0.43x
             top25desk_linkedin.skp	61.3us -> 26.3us	0.43x
       top25desk_news_yahoo_com.skp	 153us -> 65.6us	0.43x
               desk_gmailthread.skp	64.9us -> 27.8us	0.43x
keymobi_androidpolice_com_2012_.skp	 167us -> 71.3us	0.43x
           top25desk_amazon_com.skp	77.5us -> 33.1us	0.43x
                   desk_wowwiki.skp	 129us -> 54.1us	0.42x
          top25desk_weather_com.skp	 113us -> 47.1us	0.42x
keymobi_facebook_com_barackobam.skp	95.2us -> 39.6us	0.42x
keymobi_shop_mobileweb_ebay_com.skp	31.5us -> 13.1us	0.42x
keymobi_amazon_com_gp_aw_s_ref_.skp	46.1us -> 18.9us	0.41x
keymobi_mobile_news_sandbox_goo.skp	90.7us ->   37us	0.41x
top25desk_google_com__hl_en_q_b.skp	52.4us -> 21.4us	0.41x
keymobi_answers_yahoo_com_quest.skp	96.5us -> 39.3us	0.41x
                    tabl_pravda.skp	 126us -> 51.2us	0.41x
           keymobi_nytimes_com_.skp	46.9us ->   19us	0.4x
keymobi_ftw_usatoday_com_2014_0.skp	 119us -> 48.2us	0.4x
          top25desk_youtube_com.skp	 162us -> 65.3us	0.4x
         keymobi_news_yahoo_com.skp	58.1us -> 23.2us	0.4x
         keymobi_boingboing_net.skp	58.8us -> 23.4us	0.4x
         keymobi_techcrunch_com.skp	26.3us -> 10.4us	0.39x
keymobi_plus_google_com_app_bas.skp	26.9us -> 10.4us	0.38x
keymobi_google_co_uk_search_hl_.skp	35.1us -> 13.4us	0.38x
              keymobi_pinterest.skp	26.2us ->   10us	0.38x
        keymobi_deviantart_com_.skp	67.1us -> 25.4us	0.38x
                     tabl_gmail.skp	10.3us -> 3.86us	0.38x
             top25desk_ebay_com.skp	65.6us -> 24.5us	0.37x
keymobi_m_youtube_com_watch_v_9.skp	57.9us -> 21.6us	0.37x
            top25desk_wordpress.skp	 138us -> 51.3us	0.37x
                 keymobi_gsp_ro.skp	  17us -> 6.34us	0.37x
       top25desk_techcrunch_com.skp	93.6us -> 34.7us	0.37x
keymobi_cnn_com_2012_10_03_poli.skp	 232us -> 85.5us	0.37x
                keymobi_cnn_com.skp	30.5us -> 11.1us	0.37x
keymobi_baidu_com_s_wd_barack_o.skp	39.3us -> 14.3us	0.36x
keymobi_online_wsj_com_home_pag.skp	50.3us -> 18.3us	0.36x
               keymobi_digg_com.skp	54.8us -> 19.5us	0.36x
keymobi_wowwiki_com_world_of_wa.skp	39.4us ->   14us	0.36x
keymobi_theverge_com_2012_10_28.skp	 102us -> 36.4us	0.36x
                      tabl_digg.skp	 105us -> 37.4us	0.36x
 top25desk_google_com_calendar_.skp	67.2us -> 23.7us	0.35x
              keymobi_wordpress.skp	65.3us ->   23us	0.35x
             desk_css3gradients.skp	56.4us -> 19.8us	0.35x
top25desk_mail_google_com_mail_.skp	 119us -> 41.6us	0.35x
                desk_googlehome.skp	 8.2us -> 2.85us	0.35x
top25desk_docs___1_open_documen.skp	23.8us -> 8.22us	0.35x
               keymobi_mlb_com_.skp	18.6us ->  6.3us	0.34x
          keymobi_slashdot_org_.skp	  33us ->   11us	0.33x
                 desk_tiger8svg.skp	96.2us ->   32us	0.33x
              top25desk_twitter.skp	 124us -> 40.7us	0.33x
keymobi_bing_com_search_q_sloth.skp	17.3us -> 5.55us	0.32x
               keymobi_linkedin.skp	6.78us -> 1.99us	0.29x
          top25desk_booking_com.skp	 291us -> 83.2us	0.29x
                keymobi_blogger.skp	19.3us -> 5.47us	0.28x
            keymobi_sfgate_com_.skp	83.3us ->   23us	0.28x
            desk_jsfiddlebigcar.skp	10.8us -> 2.95us	0.27x
           keymobi_theverge_com.skp	  22us -> 5.27us	0.24x
                    desk_mapsvg.skp	1.15us ->  216ns	0.19x
keymobi_iphone_capitolvolkswage.skp	 121us -> 22.3us	0.18x
                 desk_wikipedia.skp	1.36us ->  244ns	0.18x
               desk_pokemonwiki.skp	1.35us ->  243ns	0.18x
                  desk_samoasvg.skp	1.39us ->  241ns	0.17x
                  desk_tigersvg.skp	1.41us ->  241ns	0.17x
keymobi_booking_com_searchresul.skp	 129us -> 19.7us	0.15x

Some spot testing makes it look like everything that's not a giant speedup can be made so by tweaking my (arbitrarily set) maximum size for the free list.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2220273002

Review-Url: https://codereview.chromium.org/2220273002
2016-08-08 06:56:22 -07:00
mtklein
9c5052f16b SkLite*
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.

This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:

  1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord.  This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.

  2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array.  This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.

These two together mean recording is faster.  Measuring against the code we use at head, SkLiteRecorder trends about ~3x faster across various size pictures, matching speed at 0 draws and beating the special-case 1-draw pictures we have today.  (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.)  This new strategy records 10 drawRects() in about the same time the old strategy took for 2.

This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.

A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1.  That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.

This code is not fully capable yet, but most of the key design points are there.  The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.

You can run nanobench --match picture_overhead as a demo.  Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go.  I have not yet compared playback performance.

It should be simple to wrap this into an SkPicture subclass if we want.

I won't start proposing we replace anything old with anything new quite yet until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago) and I'd like to land, experiment, iterate, especially with an eye toward Android.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002

Review-Url: https://codereview.chromium.org/2213333002
2016-08-06 12:51:51 -07:00
robertphillips
276d3286b3 Add new bench for occluded blurmaskfilter draws
w/ occluders
  44/44  MB     6       497us   500us   500us   502us   0%      .oOOooooOO      gpu     bluroccludedrrect

w/o occluders
  41/41  MB     5       1.08ms  1.09ms  1.12ms  1.47ms  11%     .........O      gpu     bluroccludedrrect

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2203153002

Review-Url: https://codereview.chromium.org/2203153002
2016-08-04 09:03:20 -07:00
bungeman
ffae30db4a Convert SkAutoTUnref<SkData> to sk_sp<SkData>.
With the move from SkData::NewXXX to SkData::MakeXXX most
SkAutoTUnref<SkData> were changed to sk_sp<SkData>. However,
there are still a few SkAutoTUnref<SkData> around, so clean
them up.

Review-Url: https://codereview.chromium.org/2212493002
2016-08-03 13:32:32 -07:00
msarett
d1ec89b1ac Perform color correction on png decodes
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2184543003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2184543003
2016-08-03 12:59:27 -07:00
fmenozzi
e57b8c9a79 Add new benchmark for testing special hard stop gradient cases
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2206713002

Review-Url: https://codereview.chromium.org/2206713002
2016-08-03 12:12:19 -07:00
bungeman
38d909ec28 Move off SK_SUPPORT_LEGACY_DATA_FACTORIES.
This moves Skia code off of SK_SUPPORT_LEGACY_DATA_FACTORIES.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2206633004

Review-Url: https://codereview.chromium.org/2206633004
2016-08-02 14:40:46 -07:00
msarett
c573a40ed5 Add drawImageLattice() and drawBitmapLattice() APIs
The specified image/bitmap is divided into rects, which
can be draw stretched, shrunk, or at a fixed size.  Will be
used by Android to draw 9patch (which are acutally N-patch)
images.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1992283002

Review-Url: https://codereview.chromium.org/1992283002
2016-08-02 08:05:56 -07:00
mtklein
fe2042e60f SkRasterPipeline: new APIs for fusion
Most visibly this adds a macro SK_RASTER_STAGE that cuts down on the boilerplate of defining a raster pipeline stage function.

Most interestingly, SK_RASTER_STAGE doesn't define a SkRasterPipeline::Fn, but rather a new type EasyFn.  This function is always static and inlined, and the details of interacting with the SkRasterPipeline::Stage are taken care of for you: ctx is just passed as a void*, and st->next() is always called.  All EasyFns have to do is take care of the meat of the work: update r,g,b, etc. and read and write from their context.

The really neat new feature here is that you can either add EasyFns to a pipeline with the new append() functions, _or_ call them directly yourself.  This lets you use the same set of pieces to build either a pipelined version of the function or a custom, fused version.  The bench shows this off.

On my desktop, the pipeline version of the bench takes about 25% more time to run than the fused one.

The old approach to creating stages still works fine.  I haven't updated SkXfermode.cpp or SkArithmeticMode.cpp because they seemed just as clear using Fn directly as they would have using EasyFn.

If this looks okay to you I will rework the comments in SkRasterPipeline to explain SK_RASTER_STAGE and EasyFn a bit as I've done here in the CL description.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195853002

Review-Url: https://codereview.chromium.org/2195853002
2016-07-29 14:27:41 -07:00
halcanary
fa25106f02 SkPDF: PDFStream has-a not is-a PDFDict
Motivation:
SkPDFStream and SkPDFSharedStream now work the same.

Also:
- move SkPDFStream into SkPDFTypes (it's a fundamental PDF type).
- minor refactor of SkPDFSharedStream
- SkPDFSharedStream takes unique_ptr to represent ownership

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2190883003

Review-Url: https://codereview.chromium.org/2190883003
2016-07-29 10:13:18 -07:00
msarett
a714bc3929 Fix various SkColorSpace bugs
(1) Fixes serialization/deserialization of wacky SkColorSpaces
(2) Fix gamma equals checking

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2194903002

Review-Url: https://codereview.chromium.org/2194903002
2016-07-29 08:58:33 -07:00
csmartdalton
e0d362929d Add test configs for instanced rendering
Adds the following configs and enables them on select bots:

  glinst, glinst4, glinstdit4, glinst16, glinstdit16,
  esinst, esinst4, esinstdit4

Makes general changes to GrContextOptions, GrCaps, etc. to facilitate
this.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2182783004

Review-Url: https://codereview.chromium.org/2182783004
2016-07-29 08:14:20 -07:00
msarett
50ce1f28ff Add color space xform support to SkJpegCodec (includes F16!)
Also changes SkColorXform to support:
RGBA->RGBA
RGBA->BGRA

Instead of:
RGBA->SkPMColor

TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b
Review-Url: https://codereview.chromium.org/2174493002
2016-07-29 06:23:33 -07:00
msarett
39979d8c6b Revert of Add color space xform support to SkJpegCodec (includes F16!) (patchset #9 id:260001 of https://codereview.chromium.org/2174493002/ )
Reason for revert:
Breaking MSAN

Original issue's description:
> Add color space xform support to SkJpegCodec (includes F16!)
>
> Also changes SkColorXform to support:
> RGBA->RGBA
> RGBA->BGRA
>
> Instead of:
> RGBA->SkPMColor
>
> TBR=reed@google.com
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
> CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b

TBR=mtklein@google.com,reed@google.com,herb@google.com,brianosman@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review-Url: https://codereview.chromium.org/2195523002
2016-07-28 17:11:18 -07:00
msarett
73d55332e2 Add color space xform support to SkJpegCodec (includes F16!)
Also changes SkColorXform to support:
RGBA->RGBA
RGBA->BGRA

Instead of:
RGBA->SkPMColor

TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2174493002
2016-07-28 15:06:16 -07:00
fmenozzi
17e829794d Add HardStopGradientBench_ScaleNumHardStops.cpp
Rename HardStopGradientBench.cpp to HardStopGradientBench_ScaleNumColors.cpp

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2178913003

Review-Url: https://codereview.chromium.org/2178913003
2016-07-28 10:59:49 -07:00
mtklein
570c868b38 Clean up some unused atomic routines.
AtomicTest was the only use of sk_atomic_add().
AtomicInc64 bench was the only use of sk_atomic_inc(int64_t*).

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2183473005
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot,Test-Ubuntu-GCC-Golo-GPU-GT610-x86_64-Release-TSAN-Trybot

Review-Url: https://codereview.chromium.org/2183473005
2016-07-27 08:40:45 -07:00
brianosman
efded51cd8 Always supply a color space (sRGB for now) with F16
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2177193004

Review-Url: https://codereview.chromium.org/2177193004
2016-07-26 08:11:50 -07:00
msarett
530c844d25 Remove unnecessary getColorSpace() API from SkCodec
Not needed since now we can get it from the SkImageInfo.

TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2170793004

Review-Url: https://codereview.chromium.org/2170793004
2016-07-21 11:57:49 -07:00
mtklein
0c902473d6 Correct sRGB <-> linear everywhere.
This trims the SkPM4fPriv methods down to just foolproof methods.
(Anything trying to build these itself is probably wrong.)

Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore,
at least not efficiently, so this refactor is somewhat more invasive
than you might think.  Generally this means things using to_4f() are
also making a misstep... that's gone too.

It also does not make sense to try to play games with linear floats
with 255 bias any more.  That hack can't work with real sRGB coding.

Rather than update them, I've removed a couple of L32 xfermode fast
paths.  I'd even rather drop it entirely...

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2163683002
2016-07-20 18:10:07 -07:00
mtklein
566ea9b9fc Tune linear->sRGB constants to round-trip all bytes.
I basically just ran a big 5-deep for-loop over the five constants here.
This is the first set of coefficients I found that round trips all bytes.
I suspect there are many such sets.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2162063003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2162063003
2016-07-20 12:10:11 -07:00
msarett
575b2a3bb9 Fix master-skia build
TBR=djsollen@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159223003

Review-Url: https://codereview.chromium.org/2159223003
2016-07-19 13:00:35 -07:00
msarett
6bdbf4412b Improve naive SkColorXform to half floats
This should give us a good baseline to explore using SkRasterPipeline.

A particular colorxform to half float drops from 425us to 282us on my desktop.

Color Xform to Half Float (HP z620)
Original                              425us
Trans16 (not 32)                      355us
Vector Trans16                        378us
Trans16 + Keep Halfs in Vector        335us
Vector Trans16 + Keep Halfs in Vector 282us
Final                                 282us

Color Xform to Half Float (Nexus 5X)
Original                              556us
Final                                 472us

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2159993003
2016-07-19 09:07:55 -07:00
msarett
9ce3a543c9 Add capability for SkColorXform to output half floats
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147763002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2147763002
2016-07-15 13:54:38 -07:00
halcanary
eb92cb3e84 SkPdf: smaller color serialization
SkPDFUtils now has a special function (SkPDFUtils::AppendColorComponent)
just for writing out (color/255) as a decimal with three digits of
precision.

SkPDFUnion now has a type to represent a color component.  It holds a
utint_8, but calls into AppendColorComponent to serialize.

Added a unit test that tests all possible input values.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2151863003

Review-Url: https://codereview.chromium.org/2151863003
2016-07-15 13:41:28 -07:00
mtklein
036e1831e0 Add a bench to measure the best way to pack from int to uint16_t with SSE.
I measured relative runtimes on my laptop:

   pack_int_uint16_t_ss…
   1036  …e41 1x  …se3 1.01x  …e2_b 3.01x  …e2_a 3.02x

I've run into Clang problems with the actual _mm_packus_epi32 instruction, I think,
so I'm going to exercise a little cowardice and leave that option disabled for now.

The ssse3 version probably looks a little faster than it will be in practice.
We'll usually need to load its mask, which here is hoisted out of the bench loop.

The two sse2 variants are close enough in speed that I'm tie breaking them on other
concerns: the <<16, >>16 version doesn't need any scratch registers or to load any
constants, so it wins.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2150343002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot

Review-Url: https://codereview.chromium.org/2150343002
2016-07-15 07:45:53 -07:00
mtklein
05c73b7ed5 Remove bulk float <-> half routines. These are dead code.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2152583002

Review-Url: https://codereview.chromium.org/2152583002
2016-07-13 13:30:49 -07:00
robertphillips
dda54455a2 Remove GrLayerHoister
This relies on https://codereview.chromium.org/1944013002/ (Add legacy flag to allow Skia to remove Ganesh layer hoister) landing first so as to not break the DEPS roll.

GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1950523002

Review-Url: https://codereview.chromium.org/1950523002
2016-07-13 13:27:16 -07:00
mtklein
0358a6ac00 Update SkOpts namespaces.
If we make sure all SkOpts functions are static, we can give the namespaces any
name we like.  This lets us drop the sk_ prefix and give a real indication of
the default SIMD instruction set rather than just saying sk_default.

Both of these changes help debugger, profiler, and crash report readability.
Perhaps more importantly, keeping these functions static helps prevent
accidentally linking in unused versions of functions, as you see here with
sk_avx::srcover_srgb_srgb().

This requires we update SkBlend_opts tests and benches to call SkOpts functions
through SkOpts rather than declaring the methods externally.  In practice this
drops testing of the SSE2 version on machines with SSE4.  If we still really
need to test/bench the compile time best SIMD level version of this method
against the runtime detected best, we can include SkBlend_opts.h into the tests
or benches directly, similar to what we do for the trivial, brute-force, or best
non-SIMD versions.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145833002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2145833002
2016-07-13 08:02:20 -07:00
mtklein
281b33fdd9 SkRasterPipeline preliminaries
Re-uploading to see if I can get a CL number < 2^31.
    patch from issue 2147533002 at patchset 240001 (http://crrev.com/2147533002#ps240001)

Already reviewed at the other crrev link.
TBR=

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147533002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2144573004
2016-07-12 15:01:26 -07:00
bungeman
7438bfc080 Factor code to rotate a canvas about a point.
SkMatrix::scale and ::rotate take a point around which to scale or rotate.
Canvas lacks these helpers, so the code to rotate a canvas around a
point has been duplicated many times. Factor all of these
implementations into SkCanvas::rotate.

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2142033002

Review-Url: https://codereview.chromium.org/2142033002
2016-07-12 15:01:19 -07:00
herb
2edf0c6a71 Remove bloat from SkBlend_opts.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2130183003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2130183003
2016-07-12 15:00:46 -07:00
fmenozzi
54d500f90c Add benchmarks for 3 and 4 colors (most common)
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2143653002

Review-Url: https://codereview.chromium.org/2143653002
2016-07-12 14:45:32 -07:00
msarett
afb8539f62 Revert of try to speed-up maprect + round2i + contains (patchset #8 id:140001 of https://codereview.chromium.org/2133413002/ )
Reason for revert:
Breaking the roll...
https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/253294/steps/compile%20%28with%20patch%29/logs/stdio

Original issue's description:
> try to speed-up maprect + round2i + contains
>
> We call roundOut in a few places. If we can get SkNx::Ceil we could efficiently implement that as well.
>
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2133413002
> CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/b42b785d1cbc98bd34aceae338060831b974f9c5

TBR=mtklein@google.com,reed@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review-Url: https://codereview.chromium.org/2136343002
2016-07-11 14:57:26 -07:00
reed
b42b785d1c try to speed-up maprect + round2i + contains
We call roundOut in a few places. If we can get SkNx::Ceil we could efficiently implement that as well.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2133413002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2133413002
2016-07-11 13:17:35 -07:00
tomhudson
63d14413be Remove obsolete bench analysis scripts
R=bungeman@google.com
BUG=skia:5459
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2125953002

Review-Url: https://codereview.chromium.org/2125953002
2016-07-11 10:26:56 -07:00
fmenozzi
d6767aa2d5 Add HardStopGradientBench.cpp
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2131643002

Committed: https://skia.googlesource.com/skia/+/d0d652e0e0cb4bcecb7ccac9f4f4831fcaccdc45
Review-Url: https://codereview.chromium.org/2131643002
2016-07-11 09:56:31 -07:00
fmenozzi
ffa185d3cf Revert of Add HardStopGradientBench.cpp (patchset #3 id:40001 of https://codereview.chromium.org/2131643002/ )
Reason for revert:
Segfaulting on HardStopGradientBench; didn't allocate enough stack memory for color/position arrays.

Original issue's description:
> Add HardStopGradientBench.cpp
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2131643002
>
> Committed: https://skia.googlesource.com/skia/+/d0d652e0e0cb4bcecb7ccac9f4f4831fcaccdc45

TBR=bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review-Url: https://codereview.chromium.org/2140603003
2016-07-11 09:30:25 -07:00
fmenozzi
d0d652e0e0 Add HardStopGradientBench.cpp
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2131643002

Review-Url: https://codereview.chromium.org/2131643002
2016-07-11 08:31:23 -07:00
reed
6092b6e0e5 Revert of change mapRectScaleTranslate to pass args/ret by value (patchset #1 id:1 of https://codereview.chromium.org/2137853002/ )
Reason for revert:
Triggered a compiler bug on Android?

FAILED: /usr/bin/ccache /b/work/skia/platform_tools/android/bin/../toolchains/arm-r11c-14/bin/arm-linux-androideabi-g++ -MMD -MF obj/src/core/core.SkMatrix.o.d -DSK_INTERNAL -DSK_GAMMA_APPLY_TO_A8 -DQT_NO_KEYWORDS -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=0 -DSK_SUPPORT_GPU=1 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_HAS_GIF_LIBRARY -DSK_HAS_JPEG_LIBRARY -DSK_HAS_PNG_LIBRARY -DSK_HAS_WEBP_LIBRARY -DSK_TEST_QCMS -DSK_IS_BOT -DSK_CODEC_DECODES_RAW -DSK_ARM_HAS_NEON -DSK_BUILD_FOR_ANDROID -DSK_GAMMA_EXPONENT=1.4 -DSK_GAMMA_CONTRAST=0.0 -DSKIA_DLL -DSKIA_IMPLEMENTATION=1 -DSK_SUPPORT_LEGACY_CLIPTOLAYERFLAG -DNDEBUG -I../../../include/c -I../../../include/config -I../../../include/core -I../../../include/pathops -I../../../include/ports -I../../../include/private -I../../../include/utils -I../../../include/images -I../../../src/core -I../../../src/sfnt -I../../../src/image -I../../../src/opts -I../../../src/utils -I../../../include/gpu -I../../../src/gpu -I../../../platform_tools/android/third_party/cpufeatures -fPIC -g -fno-exceptions -fstrict-aliasing -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wvla -Wno-unused-parameter -Werror -march=armv7-a -mthumb -mfpu=neon -mfloat-abi=softfp -fuse-ld=gold -O2 -std=c++11 -fno-rtti -fno-threadsafe-statics -Wnon-virtual-dtor  -c ../../../src/core/SkMatrix.cpp -o obj/src/core/core.SkMatrix.o
../../../src/core/SkMatrix.cpp: In member function 'SkRect SkMatrix::mapRectScaleTranslate(SkRect) const':
../../../src/core/SkMatrix.cpp:1120:1: error: unrecognizable insn:
 }
 ^
(insn 32 31 33 2 (set (reg:V4SF 115 [ D.83077 ])
        (unspec:V4SF [
                (mem/c:V4SF (reg/f:SI 104 virtual-incoming-args) [0 MEM[(const __builtin_neon_sf[4] *)&src]+0 S16 A32])
            ] UNSPEC_VLD1)) /b/work/skia/platform_tools/android/toolchains/arm-r11c-14/lib/gcc/arm-linux-androideabi/4.9/include/arm_neon.h:8693 -1
     (nil))
../../../src/core/SkMatrix.cpp:1120:1: internal compiler error: in extract_insn, at recog.c:2202
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://source.android.com/source/report-bugs.html> for instructions.

Original issue's description:
> change mapRectScaleTranslate to pass args/ret by value
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2137853002
>
> TBR=mtklein
>
> Committed: https://skia.googlesource.com/skia/+/14dce6ed5934d7a6e1fac79f8e76e12f5670aae2

TBR=msarett@google.com,mtklein@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review-Url: https://codereview.chromium.org/2139523002
2016-07-10 11:45:35 -07:00
reed
14dce6ed59 change mapRectScaleTranslate to pass args/ret by value
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2137853002

TBR=mtklein

Review-Url: https://codereview.chromium.org/2137853002
2016-07-10 11:35:03 -07:00
csmartdalton
a7f29640f6 Begin instanced rendering for simple shapes
Adds a module that performs instanced rendering and starts using it
for a select subset of draws on Mac GL platforms. The instance
processor can currently handle rects, ovals, round rects, and double
round rects. It can generalize shapes as round rects in order to
improve batching. The instance processor also employs new drawing
algorithms, irrespective of instanced rendering, that improve GPU-side
performance (e.g. sample mask, different triangle layouts, etc.).

This change only scratches the surface of instanced rendering. The
majority of draws still only have one instance. Future work may
include:

 * Passing coord transforms through the texel buffer.
 * Sending FP uniforms through instanced vertex attribs.
 * Using instanced rendering for more draws (stencil writes,
   drawAtlas, etc.).
 * Adding more shapes to the instance processor’s repertoire.
 * Batching draws that have mismatched scissors (analyzing draw
   bounds, inserting clip planes, etc.).
 * Bindless textures.
 * Uber shaders.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2066993003

Committed: https://skia.googlesource.com/skia/+/42eafa4bc00354b132ad114d22ed6b95d8849891
Review-Url: https://codereview.chromium.org/2066993003
2016-07-07 08:49:11 -07:00
reed
47df89ebfd speed up maprect for scale+trans case
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2111703002

Review-Url: https://codereview.chromium.org/2111703002
2016-06-30 06:38:54 -07:00
halcanary
ee41b7556d SkPDF: alloc less memory for strings
before:
        micros   	bench
        250.98  	WritePDFText	nonrendering
    after:
        micros   	bench
        107.10  	WritePDFText	nonrendering

Also, be slightly more space-efficient in encoding strings.

Also, add a bench.

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2099463002

Review-Url: https://codereview.chromium.org/2099463002
2016-06-23 14:08:11 -07:00
reed
dabe5d3780 update callers to not use SkColorProfileType
Requires https://codereview.chromium.org/2087833002/ to land first.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2086583002

Review-Url: https://codereview.chromium.org/2086583002
2016-06-21 10:28:14 -07:00
msarett
db197a54b5 Fix Mac clang linker
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2077353006

Review-Url: https://codereview.chromium.org/2077353006
2016-06-21 09:44:29 -07:00
msarett
67cb666481 Add --simpleCodec to nanobench flags to run a smaller set of benchmarks
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2084683002

Review-Url: https://codereview.chromium.org/2084683002
2016-06-21 08:49:26 -07:00
liyuqian
a27f669f46 Benchmark rotated rect with AA/noAA
Using this benchmark, we verify that AA is about 4x slower than noAA in
path_fill_big_rotated_rect. This is what I aim to improve in the next
CL.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2087453003

Review-Url: https://codereview.chromium.org/2087453003
2016-06-20 14:05:27 -07:00