CASSANDRASC-134: Detect out of range data and cleanup using nodetool (#125)

The patch adds a step to check the data ownership before importing sstable. When fully out of range sstables are found, the sstables are removed before importing. When partially out of range sstables are found, running nodetool cleanup is requested on job completion, including both success and failure cases.

Patch by Yifan Cai; Reviewed by Bernardo Botella and Francisco Guerrero for CASSANDRASC-134
34 files changed
tree: d116d2b6e4b664401731544cbdc31fa46554b8e7
README.md

Apache Cassandra Sidecar [WIP]

This is a Sidecar for the highly scalable Apache Cassandra database. For more information, see the Apache Cassandra web site and CIP-1.

This project is still WIP.

Requirements

  1. Java >= 1.8 (OpenJDK or Oracle), or Java 11
  2. Apache Cassandra 4.0. We depend on virtual tables, which are a 4.0-only feature.
  3. Docker for running integration tests.

Build Prerequisites

We depend on the Cassandra in-jvm dtest framework for testing. Because these jars are not published, you must manually build the dtest jars before you can build the project.

./scripts/build-dtest-jars.sh

The build script supports two parameters:

  • REPO - the Cassandra git repository to use for the source files. This is helpful if you need to test with a fork of the Cassandra codebase.
    • default: git@github.com:apache/cassandra.git
  • BRANCHES - a space-delimited list of branches to build.
    • default: "cassandra-4.1 trunk"

Remove any versions you may not want to test with. We recommend at least the latest (released) 4.X series and trunk. See Testing for more details on how to choose which Cassandra versions to use while testing.
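As a rough sketch of overriding the two parameters, the snippet below assumes the script reads REPO and BRANCHES from the environment (this README does not say how they are passed, so check the script before relying on it). The single-branch value is only an illustration.

```shell
# Assumption: build-dtest-jars.sh picks up REPO and BRANCHES from the environment.
export REPO="git@github.com:apache/cassandra.git"   # the documented default repository
export BRANCHES="cassandra-4.1"                      # illustrative: build one branch only

# Run the script only if it is present, i.e. when invoked from the repository root.
if [ -x ./scripts/build-dtest-jars.sh ]; then
  ./scripts/build-dtest-jars.sh
else
  echo "build-dtest-jars.sh not found; run this from the repository root" >&2
fi
```

To test against a fork, point REPO at the fork's URL instead of the default.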

For multi-node in-jvm dtests, network aliases need to be set up for each Cassandra node. The tests assume each node's IP address is 127.0.0.x, where x is the node id.

For example, if you populate your cluster with 3 nodes, create interfaces for 127.0.0.2 and 127.0.0.3 (the first node uses 127.0.0.1).

macOS network aliases

To get up and running, create a temporary alias for every node except the first:

 for i in {2..20}; do sudo ifconfig lo0 alias "127.0.0.${i}"; done
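If you prefer to create aliases only for the nodes you actually use, the loop can be parameterized by cluster size. The helper below is a hypothetical sketch (alias_addresses is not part of this repository); it prints the addresses a cluster of N nodes needs and echoes the ifconfig commands instead of running them.

```shell
# Hypothetical helper: print the loopback addresses that need an alias
# for a cluster of N nodes. Node 1 uses 127.0.0.1, so aliases start at node 2.
alias_addresses() (
  nodes="$1"
  i=2
  while [ "$i" -le "$nodes" ]; do
    echo "127.0.0.${i}"
    i=$((i + 1))
  done
)

# Dry run for a 3-node cluster: prints the commands for 127.0.0.2 and 127.0.0.3.
# Remove "echo" to actually create the aliases (requires sudo on macOS).
for ip in $(alias_addresses 3); do
  echo sudo ifconfig lo0 alias "$ip"
done
```

On macOS, an alias can be removed again with `sudo ifconfig lo0 -alias <address>`.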

Getting started: Running The Sidecar

After you clone the git repo, you can use the gradle wrapper to build and run the project. Make sure you have Apache Cassandra running on the host & port specified in conf/sidecar.yaml.

$ ./gradlew run

Configuring Cassandra Instance

When setting up a Cassandra instance, make sure Cassandra's data directories match the paths stored in the sidecar.yaml file; otherwise, modify the data directory paths to point to the correct directories so that the stream APIs work.

Testing

The test framework is set up to run 4.1 and 5.1 (Trunk) tests (see TestVersionSupplier.java) by default. You can change this via the Java property cassandra.sidecar.versions_to_test by supplying a comma-delimited string. For example, -Dcassandra.sidecar.versions_to_test=4.0,4.1,5.1.
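A minimal sketch of building that property from a list of versions follows. It assumes the Gradle `test` task is the one to run and that the build forwards the system property to the test JVM; neither is confirmed by this README. The command is echoed rather than executed.

```shell
# Space-separated list of Cassandra versions to test against (illustrative values).
versions="4.1 5.1"

# Convert the list to the comma-delimited form the property expects.
prop="-Dcassandra.sidecar.versions_to_test=$(printf '%s' "$versions" | tr ' ' ',')"

# Dry run: prints "./gradlew test -Dcassandra.sidecar.versions_to_test=4.1,5.1".
# Assumption: "test" is the relevant task; remove "echo" to run it.
echo ./gradlew test "$prop"
```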

CircleCI Testing

You will need to use the “Add Projects” function of CircleCI to set up CircleCI on your fork. When prompted to create a branch, do not replace the CircleCI config; choose the option to do it manually. CircleCI will pick up the in-project configuration.

Contributing

We warmly welcome and appreciate contributions from the community. Please see CONTRIBUTING.md if you wish to submit pull requests.

Wondering where to go from here?