commit | c038bf3f83c56230c6e621373160d0215af0e256 | |
---|---|---|
author | Paul Lin <paullin3280@gmail.com> | Fri Apr 28 14:29:52 2023 +0800 |
committer | MartijnVisser <martijn@2symbols.com> | Tue May 09 08:51:48 2023 +0200 |
tree | 68916ebf652a2ff0f1cbbf3ea4d0eabc40778498 | |
parent | 9f56e893205bd71e5c357be814ffbbbeaed03628 | |
[FLINK-31965] Fix ClassNotFoundException in benchmarks
This repository contains a set of microbenchmarks designed to run on a single machine to help Apache Flink developers assess the performance implications of their changes.
The main methods defined in the various classes (test cases) use the JMH microbenchmark suite to define runners that execute those test cases. You can execute the default benchmark suite (which takes ~1 hour) at once:
mvn clean install exec:exec
There is also a separate benchmark suite for state backends, which you can execute (it takes ~1 hour) using the command below:
mvn clean package exec:exec \
    -Dbenchmarks="org.apache.flink.state.benchmark.*"
If you want to execute just one benchmark, the best approach is to run the selected main function manually. There are three main ways:
1. From your IDE (hint: there is a plugin for IntelliJ IDEA). In this case, remember to select flink.version; the default value for the property is defined in pom.xml.
2. From the command line, using a command like:
mvn -Dflink.version=<FLINK_VERSION> clean package exec:exec \
    -Dbenchmarks="<benchmark_class>"
An example Flink version is -Dflink.version=1.12-SNAPSHOT.
3. By running the uber jar directly:
java -jar target/benchmarks.jar -rf csv "<benchmark_class>"
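The command-line variant above can be wrapped in a small helper so that the Flink version and the benchmark class are passed as arguments. The following is only a sketch: the function name bench_cmd is hypothetical and not part of this repository, and it only prints the Maven command instead of running it.

```shell
# Hypothetical helper (not part of this repository): print the Maven
# invocation for one benchmark class. Nothing is executed here.
bench_cmd() {
    flink_version="$1"    # e.g. 1.12-SNAPSHOT
    benchmark_class="$2"  # fully qualified class name or pattern
    printf 'mvn -Dflink.version=%s clean package exec:exec -Dbenchmarks="%s"\n' \
        "$flink_version" "$benchmark_class"
}

# Print the command for the state backend suite against 1.12-SNAPSHOT.
bench_cmd 1.12-SNAPSHOT 'org.apache.flink.state.benchmark.*'
```

Once the printed command looks right, it can be piped to sh to actually run it.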
We also support running each benchmark once (with only one fork and one iteration) for testing, with the command below:
mvn test -P test
There are some built-in parameters for running different benchmarks; these can be shown or overridden from the command line.
# show all parameter combinations for the <benchmark_class>
java -jar target/benchmarks.jar "<benchmark_class>" -lp
# run the benchmarks for the rocksdb state backend type
java -jar target/benchmarks.jar "org.apache.flink.state.benchmark.*" -p "backendType=ROCKSDB"
Besides the parameters, there is also a benchmark config file, benchmark-conf.yaml, to tune some basic parameters. For example, we can change the state data dir by putting benchmark.state.data-dir: /data in the config file. For more options, you can refer to the code in the org.apache.flink.config package.
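As a sketch of how such an option could be set from a script, the helper below writes a key: value line into a config file, replacing any existing line for the same key. The function name set_conf_option is hypothetical, and the path to benchmark-conf.yaml is passed in by the caller because its location depends on your checkout.

```shell
# Hypothetical helper (not part of this repository): set one option in a
# flat "key: value" config file, replacing an existing line for that key.
set_conf_option() {
    conf_file="$1"  # path to benchmark-conf.yaml (supplied by the caller)
    key="$2"        # e.g. benchmark.state.data-dir
    value="$3"      # e.g. /data
    touch "$conf_file"
    # Keep every line that does not start with "<key>:" (literal match).
    awk -v k="$key:" 'index($0, k) != 1' "$conf_file" > "$conf_file.tmp"
    # Append the new value and swap the file into place.
    printf '%s: %s\n' "$key" "$value" >> "$conf_file.tmp"
    mv "$conf_file.tmp" "$conf_file"
}
```

For example, set_conf_option path/to/benchmark-conf.yaml benchmark.state.data-dir /data would produce the line from the example above.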
The recently added OpenSSL-based benchmarks require one of two modes to be active:
The static flavor can be activated via mvn -Dnetty-tcnative.flavor=static, but it requires flink-shaded-netty-tcnative-static in the version from pom.xml. This module is not provided by Apache Flink by default due to licensing issues (see https://issues.apache.org/jira/browse/LEGAL-393), but it can be generated from inside a corresponding flink-shaded source via:
mvn clean install -Pinclude-netty-tcnative-static -pl flink-shaded-netty-tcnative-static
If neither option works, the OpenSSL benchmarks will fail, but that should not affect any other benchmarks.
To avoid compatibility problems and compilation errors, benchmarks defined in this repository should use the stable @Public Flink API. If this is not possible, the benchmarking code should be defined in the Apache Flink repository, and only a very thin executor class that executes the benchmark should be committed to this repository. For an example of this latter pattern, take a look at CreateSchedulerBenchmarkExecutor and how it uses CreateSchedulerBenchmark (defined in flink-runtime). Note that the benchmark class should also be tested, just as CreateSchedulerBenchmarkTest is tested in flink-runtime.
This code structure is due to using the GPL2-licensed JMH library for the actual execution of the benchmarks. Ideally, we would prefer to have all of the code moved to Apache Flink.
There is one important thing to note about naming the benchmark methods. When uploading the results to the codespeed web UI, the uploader uses just the benchmark's method name combined with the parameters to generate the visible name of the benchmark in the UI. Because of that, it is important to:
Avoid generic method names (benchmark, test, ...).
Good examples of benchmark method names are:
networkThroughput
sessionWindow
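To illustrate why generic names are a problem, the sketch below strips a fully qualified benchmark identifier down to its method name, which is the part the text says the uploader keeps. It is only an illustration and not the uploader's actual code; the function name method_name and the org.example class names are hypothetical.

```shell
# Hypothetical illustration (not the uploader's code): keep only the
# method name from a fully qualified benchmark identifier.
method_name() {
    # Remove the longest prefix ending in a dot, leaving the last segment.
    printf '%s\n' "${1##*.}"
}

# Two hypothetical classes with a generic method name collapse to one name.
method_name 'org.example.WindowBenchmarks.benchmark'
method_name 'org.example.NetworkBenchmarks.benchmark'

# Descriptive method names stay distinguishable without the class name.
method_name 'org.example.WindowBenchmarks.sessionWindow'
method_name 'org.example.NetworkBenchmarks.networkThroughput'
```

The first two calls print the same name, so the two benchmarks could not be told apart in the UI, while the last two remain distinct.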
Please attach the results of your benchmarks.