| Oak Benchmark Jar |
| ================= |
| |
This jar is runnable and contains test-related run modes.

The following run modes are currently available:
| |
| * benchmark : Run benchmark tests against different Oak repository fixtures. |
| * scalability : Run scalability tests against different Oak repository fixtures. |
| |
| See the subsections below for more details on how to use these modes. |
| |
| Benchmark mode |
| -------------- |
| |
| The benchmark mode is used for executing various micro-benchmarks. It can |
| be invoked like this: |
| |
| $ java -jar oak-benchmarks-*.jar benchmark [options] [testcases] [fixtures] |
| |
| The following benchmark options (with default values) are currently supported: |
| |
--azure - Azure Connection String (default:
          DefaultEndpointsProtocol=http;
          AccountName=devstoreaccount1;
          AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;
          BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;)
| --azureContainerName - Azure container name (default: oak) |
| --azureRootPath - Azure root path (default: /oak) |
| --host localhost - MongoDB host |
--port 27017 - MongoDB port
| --db <name> - MongoDB database (default is a generated name) |
| --mongouri - MongoDB URI (takes precedence over host, port and db) |
| --dropDBAfterTest true - Whether to drop the MongoDB database after the test |
--base target - Path to the base file (Tar setup)
| --mmap <64bit?> - TarMK memory mapping (the default on 64 bit JVMs) |
| --cache 100 - cache size (in MB) |
| --wikipedia <file> - Wikipedia dump |
| --runAsAdmin false - Run test as admin session |
| --itemsToRead 1000 - Number of items to read |
| --report false - Whether to output intermediate results |
| --csvFile <file> - Optional csv file to report the benchmark results |
| --concurrency <levels> - Comma separated list of concurrency levels |
--metrics false - Enable metrics-based stats collection
| --rdbjdbcuri - JDBC URL for RDB persistence (defaults to local file-based H2) |
| --rdbjdbcuser - JDBC username (defaults to "") |
| --rdbjdbcpasswd - JDBC password (defaults to "") |
| --rdbjdbctableprefix - for RDB persistence: prefix for table names (defaults to "") |
| --vgcMaxAge - Continuous DocumentNodeStore VersionGC max age in sec (RDB only) |
| |
Run the jar with `--help` to list all available options.
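
For example, a run that executes two of the bundled micro-benchmarks against
the Segment Tar fixture with a larger cache might look like this (the option
values are illustrative only):

    $ java -jar oak-benchmarks-*.jar benchmark \
          --cache 200 --itemsToRead 10000 \
          ReadPropertyTest SmallFileReadTest Oak-Segment-Tar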
| |
These options are passed to the test cases and repository fixtures
that need them. For example, the Wikipedia dump option is needed by the
WikipediaImport test case and the MongoDB address information by the
MongoMK- and SegmentMK-based repository fixtures. The cache setting
controls the bundle cache size in Jackrabbit, the NodeState
cache size in MongoMK, and the segment cache size in SegmentMK.
| |
The `--concurrency` levels can be specified as a comma-separated list of values,
e.g. `--concurrency 1,4,8`, which will execute the same test with each of the
given numbers of threads. Note that `beforeSuite()` and `afterSuite()` are executed
only once, before and after the whole concurrency loop; in the example above the
execution order is: `beforeSuite()`, 1x `runTest()`, 4x `runTest()`, 8x `runTest()`,
`afterSuite()`. Tests that create their own background threads should be executed
with `--concurrency 1`, which is the default.
| |
| You can use extra JVM options like `-Xmx` settings to better control the |
| benchmark environment. It's also possible to attach the JVM to a |
| profiler to better understand benchmark results. For example, I'm |
| using `-agentlib:hprof=cpu=samples,depth=100` as a basic profiling |
| tool, whose results can be processed with `perl analyze-hprof.pl |
java.hprof.txt` to produce somewhat easier-to-read top-down and
| bottom-up summaries of how the execution time is distributed across |
| the benchmarked codebase. |
| |
| Some system properties are also used to control the benchmarks. For example: |
| |
| -Dwarmup=5 - warmup time (in seconds) |
| -Druntime=60 - how long a single benchmark should run (in seconds) |
| -Dprofile=true - to collect and print profiling data |
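
For example, to lengthen both the warmup and the measurement window and collect
profiling data in a single run (an illustrative invocation):

    $ java -Dwarmup=10 -Druntime=120 -Dprofile=true \
          -jar oak-benchmarks-*.jar benchmark ReadPropertyTest Oak-Memory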
| |
| The test case names like `ReadPropertyTest`, `SmallFileReadTest` and |
| `SmallFileWriteTest` indicate the specific test case being run. You can |
| specify one or more test cases in the benchmark command line, and |
| oak-run will execute each benchmark in sequence. The benchmark code is |
| located under `org.apache.jackrabbit.oak.benchmark` in the oak-run |
| component. Each test case tries to exercise some tightly scoped aspect |
| of the repository. You might remember many of these tests from the |
| Jackrabbit benchmark reports like |
| http://people.apache.org/~jukka/jackrabbit/report-2011-09-27/report.html |
| that we used to produce earlier. |
| |
Finally, the benchmark runner supports the following repository fixtures:
| |
| | Fixture | Description | |
| |------------------------------|----------------------------------------------------------------| |
| | Jackrabbit | Jackrabbit with the default embedded Derby bundle PM | |
| | Oak-Memory | Oak with default in-memory storage | |
| | Oak-MemoryNS | Oak with default in-memory NodeStore | |
| | Oak-Mongo | Oak with the default Mongo backend | |
| | Oak-Mongo-DS | Oak with the default Mongo backend and DataStore | |
| | Oak-MongoNS | Oak with the Mongo NodeStore | |
| | Oak-Segment-Tar | Oak with the Segment Tar backend | |
| | Oak-Segment-Tar-DS | Oak with the Segment Tar backend and DataStore | |
| | Oak-Segment-Azure | Oak with the Azure Segment backend | |
| | Oak-RDB | Oak with the DocumentMK/RDB persistence | |
| | Oak-RDB-DS | Oak with the DocumentMK/RDB persistence and DataStore | |
| | Oak-Composite-Store | Oak with the Composite Node store with Segment Tar backend | |
| | Oak-Composite-Memory-Store | Oak with the Composite Node store with in-memory NodeStore | |
| | Oak-Composite-Mongo-Store | Oak with the Composite Node store with Mongo backend | |
| |
| |
(Note that for Oak-RDB, the required JDBC drivers either need to be embedded
into oak-run, or be specified separately in the class path. Furthermore,
`--dropDBAfterTest` is interpreted to drop the *tables*, not the database
itself, if and only if they have been auto-created.)
| |
| Once started, the benchmark runner will execute each listed test case |
| against all the listed repository fixtures. After starting up the |
| repository and preparing the test environment, the test case is first |
| executed a few times to warm up caches before measurements are |
| started. Then the test case is run repeatedly for one minute |
| and the number of milliseconds used by each execution |
| is recorded. Once done, the following statistics are computed and |
| reported: |
| |
| Column      | Description                                                      |
|-------------|------------------------------------------------------------------|
| C           | concurrency level                                                |
| min         | minimum time (in ms) taken by a test run                         |
| 10%         | time (in ms) within which the fastest 10% of test runs completed |
| 50%         | time (in ms) taken by the median test run                        |
| 90%         | time (in ms) within which the fastest 90% of test runs completed |
| max         | maximum time (in ms) taken by a test run                         |
| N           | total number of test runs in one minute (or more)                |
| |
| The most useful of these numbers is probably the 90% figure, as it |
| shows the time under which the majority of test runs completed and |
| thus what kind of performance could reasonably be expected in a normal |
| usage scenario. However, the reason why all these different numbers |
| are reported, instead of just the 90% one, is that often seeing the |
| distribution of time across test runs can be helpful in identifying |
| things like whether a bigger cache might help. |
| |
| Finally, and most importantly, like in all benchmarking, the numbers |
| produced by these tests should be taken with a large dose of salt. |
| They DO NOT directly indicate the kind of application performance you |
| could expect with (the current state of) Oak. Instead they are |
| designed to isolate implementation-level bottlenecks and to help |
| measure and profile the performance of specific, isolated features. |
| |
| How to add a new benchmark |
| -------------------------- |
| |
| To add a new test case to this benchmark suite, you'll need to implement |
| the `Benchmark` interface and add an instance of the new test to the |
| `allBenchmarks` array in the `BenchmarkRunner` class in the |
| `org.apache.jackrabbit.oak.benchmark` package. |
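
Registration is then a one-line change to that array (a sketch; `MyTest` is the
example class outlined below, and the existing entries are elided):

    Benchmark[] allBenchmarks = new Benchmark[] {
        // ... existing benchmarks ...
        new MyTest()
    };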
| |
| The best way to implement the `Benchmark` interface is to extend the |
| `AbstractTest` base class that takes care of most of the benchmarking |
| details. The outline of such a benchmark is: |
| |
class MyTest extends AbstractTest {
| @Override |
| protected void beforeSuite() throws Exception { |
| // optional, run once before all the iterations, |
| // not included in the performance measurements |
| } |
| @Override |
| protected void beforeTest() throws Exception { |
| // optional, run before runTest() on each iteration, |
| // but not included in the performance measurements |
| } |
| @Override |
| protected void runTest() throws Exception { |
| // required, run repeatedly during the benchmark, |
| // and the time of each iteration is measured. |
| // The ideal execution time of this method is |
| // from a few hundred to a few thousand milliseconds. |
| // Use a loop if the operation you're hoping to measure |
| // is faster than that. |
| } |
| @Override |
| protected void afterTest() throws Exception { |
| // optional, run after runTest() on each iteration, |
| // but not included in the performance measurements |
| } |
| @Override |
| protected void afterSuite() throws Exception { |
| // optional, run once after all the iterations, |
| // not included in the performance measurements |
| } |
| } |
| |
| The rough outline of how the benchmark will be run is: |
| |
| test.beforeSuite(); |
| for (...) { |
| test.beforeTest(); |
| recordStartTime(); |
| test.runTest(); |
| recordEndTime(); |
| test.afterTest(); |
| } |
| test.afterSuite(); |
| |
| You can use the `loginWriter()` and `loginReader()` methods to create admin |
and anonymous sessions. There's no need to log out those sessions (unless doing
| so is relevant to the benchmark) as they will automatically be closed after |
| the benchmark is completed and the `afterSuite()` method has been called. |
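
For instance, a read-oriented test might obtain one anonymous session up front
and reuse it on every iteration (a minimal sketch; the class name and the loop
count are hypothetical):

    class ReadRootTest extends AbstractTest {
        private Session session; // javax.jcr.Session

        @Override
        protected void beforeSuite() throws Exception {
            session = loginReader(); // closed automatically after afterSuite()
        }

        @Override
        protected void runTest() throws Exception {
            // repeat the read so that one iteration takes long enough to measure
            for (int i = 0; i < 10000; i++) {
                session.getRootNode();
            }
        }
    }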
| |
| Similarly, you can use the `addBackgroundJob(Runnable)` method to add |
| background tasks that will be run concurrently while the main benchmark is |
| executing. The relevant background thread works like this: |
| |
| while (running) { |
| runnable.run(); |
| Thread.yield(); |
| } |
| |
| As you can see, the `run()` method of the background task gets invoked |
| repeatedly. Such threads will automatically close once all test iterations |
| are done, before the `afterSuite()` method is called. |
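
For example, a test could register a background writer in `beforeSuite()` to
generate concurrent load while `runTest()` is being measured (a sketch; the
node name prefix is hypothetical):

    @Override
    protected void beforeSuite() throws Exception {
        final Session writer = loginWriter(); // admin session
        addBackgroundJob(new Runnable() {
            @Override
            public void run() {
                try {
                    // concurrent write noise; not part of the measured time
                    writer.getRootNode().addNode("noise" + System.nanoTime());
                    writer.save();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        });
    }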
| |
| Scalability mode |
| -------------- |
| |
The scalability mode is used for executing scalability suites that measure the
performance of the tests associated with each suite. It can be invoked like this:
| |
| $ java -jar oak-benchmarks-*.jar scalability [options] [suites] [fixtures] |
| |
| The following scalability options (with default values) are currently supported: |
| |
| --host localhost - MongoDB host |
--port 27017 - MongoDB port
| --db <name> - MongoDB database (default is a generated name) |
| --dropDBAfterTest true - Whether to drop the MongoDB database after the test |
--base target - Path to the base file (Tar setup)
| --mmap <64bit?> - TarMK memory mapping (the default on 64 bit JVMs) |
| --cache 100 - cache size (in MB) |
| --csvFile <file> - Optional csv file to report the benchmark results |
| --rdbjdbcuri - JDBC URL for RDB persistence (defaults to local file-based H2) |
| --rdbjdbcuser - JDBC username (defaults to "") |
| --rdbjdbcpasswd - JDBC password (defaults to "") |
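
For example, a run against a local MongoDB instance with an explicit database
name might look like this (the option values are illustrative only):

    $ java -jar oak-benchmarks-*.jar scalability \
          --db scalability-test --dropDBAfterTest true \
          ScalabilityBlobSearchSuite Oak-Mongo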
| |
These options are passed to the various suites and repository fixtures
that need them. For example, the MongoDB address information is needed by the
MongoMK- and SegmentMK-based repository fixtures. The cache setting
controls the NodeState cache size in MongoMK, and the segment cache
size in SegmentMK.
| |
| You can use extra JVM options like `-Xmx` settings to better control the |
| scalability suite test environment. It's also possible to attach the JVM to a |
| profiler to better understand benchmark results. For example, I'm |
| using `-agentlib:hprof=cpu=samples,depth=100` as a basic profiling |
| tool, whose results can be processed with `perl analyze-hprof.pl |
java.hprof.txt` to produce somewhat easier-to-read top-down and
| bottom-up summaries of how the execution time is distributed across |
| the benchmarked codebase. |
| |
| The scalability suite creates the relevant repository load before starting the tests. |
| Each test case tries to benchmark and profile a specific aspect of the repository. |
| |
Each scalability suite is configured to run a number of related tests which require the
same base load to be available in the repository.
Either an entire suite or individual tests within a suite can be executed.
If a suite name is specified on its own, like `ScalabilityBlobSearchSuite`, then all the
tests configured for that suite are executed. To execute only particular tests in a
suite, append the test names to the suite name in the form `suite:test1,test2`, for
example `ScalabilityBlobSearchSuite:FormatSearcher,NodeTypeSearcher` (see the example
below). You can specify one or more suites in the scalability command line, and
oak-run will execute each suite in sequence.
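
For example, to run only the `FormatSearcher` and `NodeTypeSearcher` tests of
the blob search suite against the Segment Tar fixture:

    $ java -jar oak-benchmarks-*.jar scalability \
          ScalabilityBlobSearchSuite:FormatSearcher,NodeTypeSearcher Oak-Segment-Tar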
| |
Finally, the scalability runner supports the following repository fixtures:
| |
| | Fixture | Description | |
| |-----------------------|----------------------------------------------------------------| |
| | Oak-Memory | Oak with default in-memory storage | |
| | Oak-MemoryNS | Oak with default in-memory NodeStore | |
| | Oak-Mongo | Oak with the default Mongo backend | |
| | Oak-Mongo-DS | Oak with the default Mongo backend and DataStore | |
| | Oak-MongoNS | Oak with the Mongo NodeStore | |
| | Oak-Segment-Tar | Oak with the Tar backend (aka Segment NodeStore) | |
| | Oak-Segment-Tar-DS | Oak with the Tar backend (aka Segment NodeStore) and DataStore | |
| | Oak-RDB | Oak with the DocumentMK/RDB persistence | |
| | Oak-RDB-DS | Oak with the DocumentMK/RDB persistence and DataStore | |
| |
| (Note that for Oak-RDB, the required JDBC drivers either need to be embedded |
| into oak-run, or be specified separately in the class path.) |
| |
| Once started, the scalability runner will execute each listed suite against all the listed |
| repository fixtures. After starting up the repository and preparing the test environment, |
| the scalability suite executes all the configured tests to warm up caches before measurements |
are started. Then each configured test within the suite is run and the number of
| milliseconds used by each execution is recorded. Once done, the following statistics are |
| computed and reported: |
| |
| Column      | Description                                                      |
|-------------|------------------------------------------------------------------|
| min         | minimum time (in ms) taken by a test run                         |
| 10%         | time (in ms) within which the fastest 10% of test runs completed |
| 50%         | time (in ms) taken by the median test run                        |
| 90%         | time (in ms) within which the fastest 90% of test runs completed |
| max         | maximum time (in ms) taken by a test run                         |
| N           | total number of test runs in one minute (or more)                |
| |
| Also, for each test, the execution times are reported for each iteration/load configured. |
| |
| Column      | Description                                            |
|-------------|--------------------------------------------------------|
| Load        | time (in ms) taken by a test run at the given load     |

The latter is the more useful of these numbers, as it shows how the individual
execution times scale with each load.
| |
| How to add a new scalability suite |
| -------------------------- |
| The scalability code is |
located under `org.apache.jackrabbit.oak.scalability` in the oak-run
| component. |
| |
| To add a new scalability suite, you'll need to implement |
| the `ScalabilitySuite` interface and add an instance of the new suite to the |
| `allSuites` array in the `ScalabilityRunner` class, along with the test benchmarks, |
| in the `org.apache.jackrabbit.oak.scalability` package. |
To implement the test benchmarks, extend the `ScalabilityBenchmark`
abstract class and implement the `execute()` method.
In addition, the methods `beforeExecute()` and `afterExecute()` can be overridden
to do processing before and after the benchmark executes.
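
A minimal benchmark implementation might look like this (a sketch: the class
name is hypothetical, and the exact `execute()` signature should be taken from
the `ScalabilityBenchmark` class):

    class MySearcher extends ScalabilityBenchmark {
        @Override
        public void execute(Repository repository, Credentials credentials,
                ExecutionContext context) throws Exception {
            // run the operation being measured against the load
            // that the suite created for this iteration
        }
    }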
| |
| The best way to implement the `ScalabilitySuite` interface is to extend the |
| `ScalabilityAbstractSuite` base class that takes care of most of the benchmarking |
| details. The outline of such a suite is: |
| |
| class MyTestSuite extends ScalabilityAbstractSuite { |
| @Override |
| protected void beforeSuite() throws Exception { |
| // optional, run once before all the iterations, |
| // not included in the performance measurements |
| } |
| @Override |
protected void beforeIteration(ExecutionContext context) throws Exception {
    // optional, run before each test iteration begins;
    // typically used to create additional load for the iteration
| } |
| |
| @Override |
| protected void executeBenchmark(ScalabilityBenchmark benchmark, |
| ExecutionContext context) throws Exception { |
| // required, executes the specified benchmark |
| } |
| |
| @Override |
| protected void afterIteration() throws Exception { |
// optional, executed after each iteration,
| // but not included in the performance measurements |
| } |
| @Override |
| protected void afterSuite() throws Exception { |
| // optional, run once after all the iterations are complete, |
| // not included in the performance measurements |
| } |
| } |
| |
| The rough outline of how the individual suite will be run is: |
| |
| test.beforeSuite(); |
| for (iteration...) { |
| test.beforeIteration(); |
| for (benchmarks...) { |
| recordStartTime(); |
| test.executeBenchmark(); |
| recordEndTime(); |
| } |
| test.afterIteration(); |
| } |
| test.afterSuite(); |
| |
You can pass context information to the test benchmarks using the `ExecutionContext`
object given as a parameter to the `beforeIteration()` and `executeBenchmark()` methods.
`ExecutionContext` exposes two methods, `getMap()` and `setMap()`, which can be used to
share such information.
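
For example, a suite could publish per-iteration data to its benchmarks like
this (a sketch; the map key is hypothetical):

    @Override
    protected void beforeIteration(ExecutionContext context) throws Exception {
        // java.util.Map / java.util.HashMap
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("nodesInIteration", 1000); // hypothetical key
        context.setMap(map);
    }

A benchmark can then read the value back via
`context.getMap().get("nodesInIteration")` inside its `execute()` method.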
| |
| You can use the `loginWriter()` and `loginReader()` methods to create admin |
and anonymous sessions. There's no need to log out those sessions (unless doing
| so is relevant to the test) as they will automatically be closed after |
| the suite is complete and the `afterSuite()` method has been called. |
| |
| Similarly, you can use the `addBackgroundJob(Runnable)` method to add |
| background tasks that will be run concurrently while the test benchmark is |
| executing. The relevant background thread works like this: |
| |
| while (running) { |
| runnable.run(); |
| Thread.yield(); |
| } |
| |
| As you can see, the `run()` method of the background task gets invoked |
| repeatedly. Such threads will automatically close once all test iterations |
| are done, before the `afterSuite()` method is called. |
| |
| `ScalabilityAbstractSuite` defines some system properties which are used to control the |
suites that extend it:
| |
-Dincrements=10,100,1000,10000 - defines the varying loads for each test iteration
-Dprofile=true - to collect and print profiling data
-Ddebug=true - to output any intermediate results during the suite run
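
An illustrative run that shrinks the load increments and enables profiling:

    $ java -Dincrements=100,1000 -Dprofile=true \
          -jar oak-benchmarks-*.jar scalability ScalabilityBlobSearchSuite Oak-Mongo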
| |
| License |
| ------- |
| |
| (see the top-level [LICENSE.txt](../LICENSE.txt) for full license details) |
| |
| Collective work: Copyright 2012 The Apache Software Foundation. |
| |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |