# Protocol Buffers Benchmarks
This directory contains benchmarking schemas and data sets that you
can use to test a variety of performance scenarios against your
protobuf language runtime.
## Prerequisite
First, follow the instructions in the root directory's README to
build the protobuf runtime for your language, then:
### CPP
You need to install [cmake](https://cmake.org/) before building the benchmark.
We use [google/benchmark](https://github.com/google/benchmark) as the
benchmark tool for testing C++. It is built automatically when you build the
C++ benchmark.
### Java
We use Maven to build the Java benchmarks, the same way the Java protobuf is
built. No other tools need to be installed. We use
[google/caliper](https://github.com/google/caliper) as the benchmark tool, which
Maven includes automatically.
### Python
We use the Python C++ API to test the generated CPP proto version of Python
protobuf; it is also a prerequisite for the Python protobuf cpp implementation.
You need to install the correct version of the Python C++ extension package
before running the benchmark for the generated CPP proto version of Python
protobuf. For example, under Ubuntu, run:
```
$ sudo apt-get install python-dev
$ sudo apt-get install python3-dev
```
You also need to make sure `pkg-config` is installed.
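Before building, it can help to confirm that the tools named in this section are on your `PATH`. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper, and `mvn` is assumed to be the name of the Maven executable on your system.

```shell
#!/bin/sh
# check_tools TOOL...
# Prints the names of the given tools that are NOT found on PATH,
# separated by spaces. Prints an empty line if all are present.
# (Hypothetical helper for illustration; not part of the benchmarks.)
check_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the single leading space left by the concatenation above.
  echo "${missing# }"
}

# Tools mentioned in the prerequisites; adjust to the languages you benchmark.
check_tools cmake pkg-config mvn
```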
### Big data
Some optional large test data sets are not included in the directory
initially. Run the following command to download them:
```
$ ./download_data.sh
```
After this, the big data files will be generated automatically in the
benchmark directory.
## Run instructions
To run all the benchmark datasets:
### Java:
```
$ make java
```
### CPP:
```
$ make cpp
```
### Python:
There are three implementations of Python protobuf: pure Python, cpp
reflection, and cpp generated code. To run the benchmark for each version:
#### Pure Python:
```
$ make python-pure-python
```
#### CPP reflection:
```
$ make python-cpp-reflection
```
#### CPP generated code:
```
$ make python-cpp-generated-code
```
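The three Python targets above can also be run one after another. This is a minimal sketch, assuming it is run from the benchmarks directory; `run_targets` is a hypothetical helper and is not part of the repository.

```shell
#!/bin/sh
# run_targets TARGET...
# Runs each make target in order and stops at the first failure.
# Set MAKE to override the command (for example MAKE=echo for a dry run).
# (Hypothetical helper for illustration; not part of the benchmarks.)
run_targets() {
  for target in "$@"; do
    echo "== make $target"
    "${MAKE:-make}" "$target" || return 1
  done
}

# Usage, from the benchmarks directory:
#   run_targets python-pure-python python-cpp-reflection python-cpp-generated-code
```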
To run a specific dataset:
### Java:
```
$ make java-benchmark
$ ./java-benchmark $(specific generated dataset file name) [-- $(caliper option)]
```
### CPP:
```
$ make cpp-benchmark
$ ./cpp-benchmark $(specific generated dataset file name)
```
### Python:
#### Pure Python:
```
$ make python-pure-python-benchmark
$ ./python-pure-python-benchmark $(specific generated dataset file name)
```
#### CPP reflection:
```
$ make python-cpp-reflection-benchmark
$ ./python-cpp-reflection-benchmark $(specific generated dataset file name)
```
#### CPP generated code:
```
$ make python-cpp-generated-code-benchmark
$ ./python-cpp-generated-code-benchmark $(specific generated dataset file name)
```
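Each benchmark binary above takes a single dataset file, so running a chosen subset of datasets means invoking it once per file. This is a minimal sketch of such a loop; `run_on_datasets` is a hypothetical helper, and the dataset paths in the usage comment are placeholders rather than real file names.

```shell
#!/bin/sh
# run_on_datasets BINARY FILE...
# Invokes BINARY once per dataset file, skipping files that do not exist.
# Returns non-zero if any invocation failed.
# (Hypothetical helper for illustration; not part of the benchmarks.)
run_on_datasets() {
  bin=$1
  shift
  status=0
  for dataset in "$@"; do
    if [ ! -f "$dataset" ]; then
      echo "skipping missing file: $dataset" >&2
      continue
    fi
    echo "== $dataset"
    "$bin" "$dataset" || status=1
  done
  return "$status"
}

# Usage (placeholder paths), after `make cpp-benchmark`:
#   run_on_datasets ./cpp-benchmark path/to/first_dataset path/to/second_dataset
```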
## Benchmark datasets
Each data set is stored in the format defined in `benchmarks.proto`:
1. `name` is the benchmark dataset's name.
2. `message_name` is the full name of the benchmark's message type (including package and message name).
3. `payload` is the list of raw data.

The schema for the datasets is described in `benchmarks.proto`.
Benchmarks will likely want to run several tests against each data set (parse,
serialize, possibly JSON, possibly using different APIs, etc.).

We would like to add more data sets. In general we will favor data sets
that make the overall suite diverse without being too large or having
too many similar tests. Ideally everyone can run through the entire
suite without the test run getting too long.