
Protocol Buffers Benchmarks

This directory contains benchmarking schemas and data sets that you can use to test a variety of performance scenarios against your protobuf language runtime. If you are looking for performance numbers of officially supported languages, see Protobuf Performance.

Prerequisite

First, follow the instructions in the root directory's README to build your language's protobuf runtime, then:

CPP

You need to install cmake before building the benchmark.

We use google/benchmark as the benchmark framework for the C++ tests. It is built automatically when you build the C++ benchmark.

C++ protobuf performance can be improved by linking with TCMalloc.

Java

We use Maven to build the Java benchmarks, just as for the Java protobuf runtime; no other tools need to be installed. We use google/caliper as the benchmark framework, which Maven pulls in automatically.

Python

We use the Python C++ API to test the C++-backed version of Python protobuf; it is also a prerequisite for the Python protobuf cpp implementation. You need to install the Python C++ extension headers for the appropriate Python version before running the C++-backed benchmarks. For example, under Ubuntu, you need to

$ sudo apt-get install python-dev
$ sudo apt-get install python3-dev

You also need to make sure pkg-config is installed.

Go

Go protobufs are maintained at github.com/golang/protobuf. If you have not done so already, install the Go toolchain and the protoc-gen-go plugin for protoc.

To install protoc-gen-go, run:

$ go get -u github.com/golang/protobuf/protoc-gen-go
$ export PATH=$PATH:$(go env GOPATH)/bin

The first command installs protoc-gen-go into the bin directory in your local GOPATH. The second command adds the bin directory to your PATH so that protoc can locate the plugin later.

PHP

The PHP benchmark has the same requirements as PHP protobuf. The benchmark automatically includes PHP protobuf's src directory and builds the C extension if required.

Node.js

The Node.js benchmark requires node (newer than v6) and the npm package manager. It uses the benchmark framework for testing and also depends on protobuf.js; neither needs to be installed manually.

C#

The C# benchmark code is built as part of the main Google.Protobuf solution. It requires the .NET Core SDK, and depends on BenchmarkDotNet, which will be downloaded automatically.

Run instructions

To run the benchmarks against all datasets:

Java:

First build the Java binary in the usual way with Maven:

$ cd java
$ mvn install

Assuming that completes successfully,

$ cd ../benchmarks
$ make java

CPP:

$ make cpp

For linking with tcmalloc:

$ env LD_PRELOAD={directory to libtcmalloc.so} make cpp

Python:

There are three versions of the Python protobuf implementation: pure Python, C++ reflection, and C++ generated code. To benchmark each version:

Pure Python:

$ make python-pure-python

CPP reflection:

$ make python-cpp-reflection

CPP generated code:

$ make python-cpp-generated-code
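Under the hood, the protobuf Python package selects between the pure-Python and C++-backed runtimes by reading the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable at import time. A minimal sketch of how a harness might pin the backend (the helper name here is ours, not part of the benchmark suite):

```python
import os

def select_backend(backend: str) -> None:
    """Choose the protobuf Python backend: "python" (pure Python) or "cpp".

    Must be called before the first `import google.protobuf` — the runtime
    reads this environment variable once, when the package is imported.
    """
    assert backend in ("python", "cpp")
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = backend

select_backend("python")
# After importing protobuf, the active backend can be confirmed with:
#   from google.protobuf.internal import api_implementation
#   print(api_implementation.Type())
print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```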

Go

$ make go

PHP

There are two versions of the PHP protobuf implementation: pure PHP, and PHP with the C extension. To benchmark each version:

Pure PHP

$ make php

PHP with c extension

$ make php_c

Node.js

$ make js

To run a specific dataset or run with specific options:

Java:

$ make java-benchmark
$ ./java-benchmark $(specific generated dataset file name) [$(caliper options)]

CPP:

$ make cpp-benchmark
$ ./cpp-benchmark $(specific generated dataset file name) [$(benchmark options)]

Python:

For the Python benchmarks, the --json flag outputs the results in JSON format.

Pure Python:

$ make python-pure-python-benchmark
$ ./python-pure-python-benchmark [--json] $(specific generated dataset file name)

CPP reflection:

$ make python-cpp-reflection-benchmark
$ ./python-cpp-reflection-benchmark [--json] $(specific generated dataset file name)

CPP generated code:

$ make python-cpp-generated-code-benchmark
$ ./python-cpp-generated-code-benchmark [--json] $(specific generated dataset file name)

Go:

$ make go-benchmark
$ ./go-benchmark $(specific generated dataset file name) [go testing options]

PHP

Pure PHP

$ make php-benchmark
$ ./php-benchmark $(specific generated dataset file name)

PHP with c extension

$ make php-c-benchmark
$ ./php-c-benchmark $(specific generated dataset file name)

Node.js

$ make js-benchmark
$ ./js-benchmark $(specific generated dataset file name)

C#

From csharp/src/Google.Protobuf.Benchmarks, run:

$ dotnet run -c Release

We intend to add support for this within the makefile in due course.

Benchmark datasets

Each data set conforms to the schema defined in benchmarks.proto:

  1. name is the benchmark dataset's name.
  2. message_name is the full name of the benchmark's message type (including the package name).
  3. payload is the list of raw serialized messages.
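As a sketch of what that wire layout looks like (assuming the three fields above carry field numbers 1, 2, and 3 and are all length-delimited, per benchmarks.proto), here is a hand-rolled parser for a serialized BenchmarkDataset — purely illustrative; real consumers should use the generated protobuf classes:

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def decode_varint(buf: bytes, pos: int):
    """Decode a varint starting at pos; return (value, new_pos)."""
    shift = result = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def parse_dataset(buf: bytes):
    """Parse a serialized BenchmarkDataset into (name, message_name, payloads)."""
    name = message_name = ""
    payloads = []
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        field, wire_type = tag >> 3, tag & 7
        assert wire_type == 2  # all three fields are length-delimited
        length, pos = decode_varint(buf, pos)
        value = buf[pos:pos + length]
        pos += length
        if field == 1:
            name = value.decode()
        elif field == 2:
            message_name = value.decode()
        elif field == 3:
            payloads.append(value)
    return name, message_name, payloads

# Build a tiny dataset by hand: tag byte (field << 3 | wire type 2),
# then a varint length, then the bytes themselves.
def field(num: int, data: bytes) -> bytes:
    return bytes([num << 3 | 2]) + encode_varint(len(data)) + data

buf = (field(1, b"demo")
       + field(2, b"benchmarks.proto3.GoogleMessage1")
       + field(3, b"\x08\x01"))
print(parse_dataset(buf))
```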

Benchmark harnesses will likely want to run several benchmarks against each data set (parse, serialize, possibly JSON, possibly using different APIs, etc.).
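A minimal sketch of such a harness — a timing loop over a dataset's payloads, throughput reported in MB/s (the message class and runner names in the comment are hypothetical; the real runners live in the per-language subdirectories):

```python
import time

def bench(label, fn, payloads, iters=1000):
    """Apply fn to every payload, iters times, and report rough throughput."""
    total_bytes = sum(len(p) for p in payloads) * iters
    start = time.perf_counter()
    for _ in range(iters):
        for p in payloads:
            fn(p)
    elapsed = time.perf_counter() - start
    mbps = total_bytes / elapsed / 1e6
    print(f"{label}: {mbps:.1f} MB/s")
    return mbps

# With a real dataset one would run something like (hypothetical names):
#   msg = GoogleMessage1()
#   bench("parse", msg.ParseFromString, payloads)
#   bench("serialize", lambda _: msg.SerializeToString(), payloads)
bench("noop-parse", len, [b"\x08\x01", b"\x08\x02"])
```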

We would like to add more data sets. In general we will favor data sets that make the overall suite diverse without being too large or having too many similar tests. Ideally everyone can run through the entire suite without the test run getting too long.