This repository contains the default workload specifications for Apache Solr Orbit. This document is a guide on best practices for contributing to this repository.
This repository uses major version branches named after the Solr major version number (e.g. 9, 10). The main branch is the default.
When running a benchmark, solr-orbit automatically selects the workload branch that matches the Solr version being tested. For example, benchmarking a Solr 10.X.X cluster will use the 10 branch if it exists, falling back to main otherwise. When deciding which branch to cherry-pick your workload changes to, use the major version of the cluster you intend to test against.
Use --workload-revision to pin a specific branch explicitly, regardless of the Solr version.
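The branch selection rule described above can be sketched in a few lines. This is only an illustration of the stated behavior (major version branch if present, otherwise main), not solr-orbit's actual implementation; the function name `select_workload_branch` and the example branch set are assumptions made for the sketch.

```python
from typing import Collection


def select_workload_branch(solr_version: str, available_branches: Collection[str]) -> str:
    """Pick the branch named after the Solr major version, else fall back to main."""
    # "10.1.0" -> "10"; only the major component matters for branch selection.
    major = solr_version.split(".")[0]
    return major if major in available_branches else "main"


# Hypothetical set of branches present in the workload repository.
branches = {"main", "9", "10"}
print(select_workload_branch("10.1.0", branches))  # 10
print(select_workload_branch("11.0.0", branches))  # main (no 11 branch exists)
```

Passing --workload-revision bypasses this lookup entirely, which is why it is the right tool when you need a specific branch regardless of the cluster version.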
By submitting a contribution to this repository you certify that you have the legal right to submit it under the Apache License 2.0 — for example, that it is your own original work, or that you have the necessary rights from your employer or from any third-party rights-holders whose work is included. You agree that your contribution may be distributed under the terms of the Apache License 2.0.
For significant new features or design changes, it is recommended to first raise a discussion on the dev@solr.apache.org mailing list or open a GitHub issue so the community can provide early feedback.
Before making a change, fork this repository and make the change on a feature branch.
10 for Solr 10.x).
main and backport as needed once merged.
After making changes in your feature branch, test them locally and, optionally, via GitHub Actions integration tests in your forked solr-orbit repository.
solr-orbit pointing at your modified workload using --workload-path or --workloads-repository. Use --test-mode for a quick sanity-check run:

    solr-orbit run \
      --pipeline=benchmark-only \
      --target-host=localhost:8983 \
      --workload-path=/path/to/your/fork/nyc_taxis \
      --test-mode
Additional tips:
--test-mode reduces the corpus size and iteration counts so the run finishes quickly.
--workloads-repository=https://github.com/<YOUR USERNAME>/solr-orbit-workloads and --distribution-version=X.Y.Z to pin solr-orbit to the matching branch.
To catch regressions across the full suite, run integration tests from your forked solr-orbit repository.
One-time setup:
test-forked-workloads based off main.

    [workloads]
    default.url = https://github.com/<YOUR GITHUB USERNAME>/solr-orbit-workloads
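If you prefer to script the `[workloads]` setting rather than edit it by hand, a minimal sketch with Python's standard `configparser` follows. The helper name `set_default_workloads_url` is hypothetical, and the path of the ini file is deliberately left to the caller because this guide does not state it here. Note that `configparser` drops comments when it rewrites a file.

```python
import configparser


def set_default_workloads_url(ini_path: str, url: str) -> None:
    """Set default.url in the [workloads] section of an ini file."""
    # interpolation=None keeps any '%' characters in values literal.
    config = configparser.ConfigParser(interpolation=None)
    config.read(ini_path)  # a missing file is skipped, leaving an empty config
    if not config.has_section("workloads"):
        config.add_section("workloads")
    config.set("workloads", "default.url", url)
    with open(ini_path, "w", encoding="utf-8") as handle:
        config.write(handle)
```

Call it with the path to the configuration file used by your forked test setup and the URL of your fork, for example `https://github.com/<YOUR GITHUB USERNAME>/solr-orbit-workloads`.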
Running the tests:
solr-orbit repository, go to GitHub Actions → Run Integration Tests, select the test-forked-workloads branch, and click Run workflow.
Before opening a pull request, make sure you have addressed the following:
main only or also backported to one or more version branches (e.g. 9, 10).
Create a pull request from your fork to the main branch of this repository.
Reviewers and maintainers should:
If the workload repository has Solr-version branches, changes should be cherry-picked from main to the most recent supported branch and backward from there. For example:
main → 10 → 9
In the event of a merge conflict during backporting, open a separate pull request that applies the change directly to the target branch. Ensure only the changes from the original PR are included in the backport PR.
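The backport order described above (main first, then version branches from newest to oldest) can be computed as in the sketch below. The function name `backport_order` is hypothetical, and the sketch assumes every branch passed in is a supported, purely numeric version branch.

```python
from typing import Iterable, List


def backport_order(version_branches: Iterable[str]) -> List[str]:
    """Return the cherry-pick order: main, then version branches newest first."""
    # Compare as integers: a plain string sort would place "9" ahead of "10".
    newest_first = sorted(version_branches, key=int, reverse=True)
    return ["main", *newest_first]


print(" -> ".join(backport_order(["9", "10"])))  # main -> 10 -> 9
```

Sorting numerically matters once the repository has both single-digit and double-digit version branches.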
For a step-by-step guide to creating a new workload from scratch or migrating one from OpenSearch Benchmark, see CREATE_WORKLOAD_GUIDE.md in the solr-orbit repository. If you are migrating an existing OSB workload, the converter tool can automate much of the mechanical translation.
See the Apache Solr Orbit documentation site for the full workload specification reference, including operation types, Jinja2 templating, and test procedure format.
Before contributing a workload, confirm that:
README.md.
A new workload must provide:
workload.json: defines collections, corpora, operations, and test_procedures
configsets/<name>/: a valid Solr configset (schema.xml + solrconfig.xml). If no configset is provided, Apache Solr Orbit will attempt to auto-generate a basic schema from the document structure, but an explicit configset is strongly recommended for benchmarking accuracy.
operations/default.json: the named operations referenced by test procedures
test_procedures/default.json: at least one test procedure (mark one "default": true)
README.md: see README.md contents below
files.txt: list of corpus data files
The workload may also include an optional workload.py to add dynamic functionality.
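Before opening a pull request, it can help to check a workload directory against the file list above. The following is a minimal sketch, not a tool shipped with solr-orbit: the names `REQUIRED_FILES`, `missing_workload_files`, and `has_configset` are hypothetical, and the configset check only looks for a subdirectory under `configsets/` because the exact layout inside a configset is not specified here.

```python
import sys
from pathlib import Path
from typing import List

# Files listed above as required for a new workload.
REQUIRED_FILES = [
    "workload.json",
    "operations/default.json",
    "test_procedures/default.json",
    "README.md",
    "files.txt",
]


def missing_workload_files(workload_dir: str) -> List[str]:
    """Return the required files that are absent from workload_dir."""
    root = Path(workload_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


def has_configset(workload_dir: str) -> bool:
    """True if configsets/ holds at least one subdirectory (assumed to be a configset)."""
    configsets = Path(workload_dir) / "configsets"
    return configsets.is_dir() and any(p.is_dir() for p in configsets.iterdir())


if __name__ == "__main__":
    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel in missing_workload_files(target):
        print(f"missing: {rel}")
    if not has_configset(target):
        print("warning: no configset found; an explicit configset is strongly recommended")
```

The configset is reported as a warning rather than an error because a schema can be auto-generated when none is provided.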
Reuse the shared common_operations/ snippets for collection lifecycle and optimize steps rather than duplicating those definitions inside each workload.
Provide a detailed README.md that includes:
For an example, see the nyc_taxis README.
All test runs used to produce example output must target a live Apache Solr cluster.
Run with --test-mode against at least one supported Solr version to confirm a clean end-to-end pass:
    solr-orbit run \
      --pipeline=benchmark-only \
      --target-host=localhost:8983 \
      --workload-path=/path/to/your/workload \
      --test-mode
Run a full (non-test-mode) benchmark without errors and include the result summary in your pull request description.
Optionally, run the integration suite using the steps in Testing changes with integration tests.
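If you test several workloads or Solr versions, the test-mode and full runs above can be scripted. The sketch below only assembles the argument list from the flags already shown in this guide; the helper name `build_run_command` is hypothetical, and actually executing the command (for example with `subprocess.run`) assumes solr-orbit is installed and a live Solr cluster is reachable at the target host.

```python
import shlex
from typing import List


def build_run_command(
    workload_path: str,
    target_host: str = "localhost:8983",
    test_mode: bool = True,
) -> List[str]:
    """Assemble the solr-orbit invocation used in this guide as an argument list."""
    cmd = [
        "solr-orbit",
        "run",
        "--pipeline=benchmark-only",
        f"--target-host={target_host}",
        f"--workload-path={workload_path}",
    ]
    if test_mode:
        # Omit this flag for the full benchmark run.
        cmd.append("--test-mode")
    return cmd


# Print the command in a form that can be pasted into a shell.
print(shlex.join(build_run_command("/path/to/your/workload")))
```

Building a list rather than a single string avoids shell quoting problems when the workload path contains spaces.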
Once the PR is approved, coordinate with the maintainers about hosting the data corpora so that other users can download them.
For questions, reach out on the dev@solr.apache.org mailing list or open a GitHub issue.