protobuf/docs/performance.md
2018-08-22 11:55:30 -07:00

Protobuf Performance

These benchmark results were measured on a workstation with an Intel® Xeon® Processor E5-2630 and 32GB RAM.

The tables below contain results for three languages:

  • C++ - Three parsing modes are measured:
    • new - Creates each message instance with the new operator.
    • new arena - Creates each message instance on an arena.
    • reuse - Reuses the same message instance for every parse.
  • Java - Three parsing/serialization modes are measured:
    • byte[] - Parses from a byte array.
    • ByteString - Parses from a com.google.protobuf.ByteString.
    • InputStream - Parses from an InputStream.
  • Python - Three Python protobuf implementations are measured:
    • C++-generated-code - Uses the C++ generated code for the proto file, loaded as a dynamically linked library.
    • C++-reflection - Uses C++ reflection with no generated code, but still loads the C++ protobuf library as a dynamically linked library.
    • pure-Python - Uses the pure Python implementation, which does not link against any C++ protobuf library.
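When reproducing the Python rows, it helps to confirm which implementation the interpreter actually loaded. The following is a minimal sketch, assuming the `protobuf` package may or may not be installed. The helper name `active_python_backend` is hypothetical. `google.protobuf.internal.api_implementation` is an internal module, so the values it reports (such as "python" or "cpp", and "upb" in newer releases) depend on the installed version, and it does not distinguish generated code from reflection.

```python
import os


def active_python_backend():
    """Return the name of the loaded protobuf Python backend.

    Returns None when the protobuf package is not installed.
    The helper name is illustrative and not part of protobuf.
    """
    try:
        # Internal module: its behavior may change between releases.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


if __name__ == "__main__":
    # This environment variable requests a backend before protobuf is imported.
    requested = os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "(not set)")
    print("requested backend:", requested)
    print("active backend:", active_python_backend())
```

The environment variable only expresses a request. The value returned by `Type()` is what was actually loaded, so it is the safer one to record next to a benchmark result.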

Parsing performance

| Dataset | C++ new | C++ new arena | C++ reuse | C++ with tcmalloc new | C++ with tcmalloc new arena | C++ with tcmalloc reuse | Java byte[] | Java ByteString | Java InputStream | Python C++-generated-code | Python C++-reflection | Python pure-Python |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| google_message1_proto2 | 368.717MB/s | 261.847MB/s | 799.403MB/s | 645.183MB/s | 441.023MB/s | 1.122GB/s | 425.437MB/s | 425.937MB/s | 251.018MB/s | 82.8314MB/s | 47.6763MB/s | 3.76299MB/s |
| google_message1_proto3 | 294.517MB/s | 229.116MB/s | 469.982MB/s | 434.510MB/s | 394.701MB/s | 591.931MB/s | 357.597MB/s | 378.568MB/s | 221.676MB/s | 82.0498MB/s | 39.9467MB/s | 3.77751MB/s |
| google_message2 | 277.242MB/s | 347.611MB/s | 793.67MB/s | 503.721MB/s | 596.333MB/s | 922.533MB/s | 416.778MB/s | 419.543MB/s | 367.145MB/s | 241.46MB/s | 71.5723MB/s | 2.73538MB/s |
| google_message3_1 | 213.478MB/s | 291.58MB/s | 543.398MB/s | 539.704MB/s | 717.300MB/s | 927.333MB/s | 684.241MB/s | 704.47MB/s | 648.624MB/s | 209.036MB/s | 142.356MB/s | 15.3324MB/s |
| google_message3_2 | 672.685MB/s | 802.767MB/s | 1.21505GB/s | 985.790MB/s | 1.136GB/s | 1.367GB/s | 1.54439GB/s | 1.60603GB/s | 1.33443GB/s | 573.835MB/s | 314.33MB/s | 15.0169MB/s |
| google_message3_3 | 207.681MB/s | 140.591MB/s | 535.181MB/s | 369.743MB/s | 262.301MB/s | 556.644MB/s | 279.385MB/s | 304.853MB/s | 107.575MB/s | 32.248MB/s | 26.1431MB/s | 2.63541MB/s |
| google_message3_4 | 7.96091GB/s | 7.10024GB/s | 9.3013GB/s | 8.518GB/s | 8.171GB/s | 9.917GB/s | 5.78006GB/s | 5.85198GB/s | 4.62609GB/s | 2.49631GB/s | 2.35442GB/s | 802.061MB/s |
| google_message3_5 | 76.0072MB/s | 51.6769MB/s | 237.856MB/s | 178.495MB/s | 111.751MB/s | 329.569MB/s | 121.038MB/s | 132.866MB/s | 36.9197MB/s | 10.3962MB/s | 8.84659MB/s | 1.25203MB/s |
| google_message4 | 331.46MB/s | 404.862MB/s | 427.99MB/s | 589.887MB/s | 720.367MB/s | 705.373MB/s | 606.228MB/s | 589.13MB/s | 530.692MB/s | 305.543MB/s | 174.834MB/s | 7.86485MB/s |
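Because the parsing results mix MB/s and GB/s, comparing columns takes a unit conversion first. The sketch below normalizes the figures to MB/s and computes the tcmalloc speedup for the C++ columns of the google_message1_proto2 row. The function names are hypothetical, and the factor of 1024 MB per GB is an assumption about how the benchmark tool formats its output, so it is exposed as a parameter.

```python
import re


def to_mb_per_s(text, mb_per_gb=1024.0):
    """Convert a figure such as '1.122GB/s' or '645.183MB/s' to MB/s.

    mb_per_gb is an assumption (1024); pass 1000.0 if the tool uses decimal units.
    """
    match = re.fullmatch(r"([0-9.]+)(MB|GB)/s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized throughput: {text!r}")
    value = float(match.group(1))
    return value * mb_per_gb if match.group(2) == "GB" else value


# C++ parsing figures for google_message1_proto2, copied from the parsing results.
DEFAULT_ALLOC = {"new": "368.717MB/s", "new arena": "261.847MB/s", "reuse": "799.403MB/s"}
TCMALLOC = {"new": "645.183MB/s", "new arena": "441.023MB/s", "reuse": "1.122GB/s"}


def speedups(before, after):
    """Return the after/before throughput ratio for each parsing mode."""
    return {mode: to_mb_per_s(after[mode]) / to_mb_per_s(before[mode]) for mode in before}


if __name__ == "__main__":
    for mode, ratio in speedups(DEFAULT_ALLOC, TCMALLOC).items():
        print(f"{mode}: {ratio:.2f}x")
```

For the "new" mode this gives a ratio of roughly 1.75x, which does not depend on the GB assumption because both figures are in MB/s. The "reuse" ratio does depend on it, since the tcmalloc figure is reported in GB/s.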

Serialization performance

| Dataset | C++ | C++ with tcmalloc | Java byte[] | Java ByteString | Java InputStream | Python C++-generated-code | Python C++-reflection | Python pure-Python |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| google_message1_proto2 | 1.39698GB/s | 1.701GB/s | 1.12915GB/s | 1.13589GB/s | 758.609MB/s | 260.911MB/s | 58.4815MB/s | 5.77824MB/s |
| google_message1_proto3 | 959.305MB/s | 939.404MB/s | 1.15372GB/s | 1.07824GB/s | 802.337MB/s | 239.4MB/s | 33.6336MB/s | 5.80524MB/s |
| google_message2 | 1.27429GB/s | 1.402GB/s | 1.01039GB/s | 1022.99MB/s | 798.736MB/s | 996.755MB/s | 57.9601MB/s | 4.09246MB/s |
| google_message3_1 | 1.31916GB/s | 2.049GB/s | 991.496MB/s | 860.332MB/s | 662.88MB/s | 1.48625GB/s | 421.287MB/s | 18.002MB/s |
| google_message3_2 | 2.15676GB/s | 2.632GB/s | 2.14736GB/s | 2.08136GB/s | 1.55997GB/s | 2.39597GB/s | 326.777MB/s | 16.0527MB/s |
| google_message3_3 | 650.456MB/s | 1.040GB/s | 593.52MB/s | 580.667MB/s | 346.839MB/s | 123.978MB/s | 35.893MB/s | 2.32834MB/s |
| google_message3_4 | 8.70154GB/s | 9.825GB/s | 5.88645GB/s | 5.93946GB/s | 2.44388GB/s | 5.9241GB/s | 4.05837GB/s | 876.87MB/s |
| google_message3_5 | 246.33MB/s | 443.993MB/s | 283.278MB/s | 259.167MB/s | 206.37MB/s | 37.0285MB/s | 12.2228MB/s | 1.1979MB/s |
| google_message4 | 1.56674GB/s | 2.19601GB/s | 776.907MB/s | 770.707MB/s | 702.931MB/s | 1.49623GB/s | 205.116MB/s | 8.93428MB/s |

* C++ performance can be improved by using tcmalloc. Follow the [instructions](https://github.com/protocolbuffers/protobuf/blob/master/benchmarks/README.md) to link with tcmalloc and get the faster results.