commit	690101840d4d8f9c656bb0ca114f6619af80e1cf
author	Francisco Guerrero <frankgh@apache.org>	Mon Apr 08 14:33:50 2024 -0700
committer	GitHub <noreply@github.com>	Mon Apr 08 14:33:50 2024 -0700
tree	319f8c08ce57ffdc397279e750d83f08cfacf536
parent	47fdb6448b6956249790d5dc7bb76b699d35c079

CASSANDRA-19526: Optionally enable TLS in the server and client for Analytics testing

All integration tests today run without TLS, which is generally fine because they run locally. However, it is helpful to be able to start up the sidecar with TLS enabled in the integration test framework so that third-party tests could connect via secure connections for testing purposes.

Co-authored-by: Doug Rohrer <drohrer@apple.com>
Co-authored-by: Francisco Guerrero <frankgh@apache.org>

Patch by Doug Rohrer, Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19526

README.md

Cassandra Analytics

Cassandra Spark Bulk Reader

The open-source repository for the Cassandra Spark Bulk Reader. This library integrates Cassandra with Spark, allowing users to run arbitrary Spark jobs against a Cassandra cluster securely and consistently.

This project contains the necessary open-source implementations to connect to a Cassandra cluster and read the data into Spark.

For example usage, see the example repository. A minimal read looks like this:

import org.apache.cassandra.spark.sparksql.CassandraDataSource
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder.getOrCreate()
val df = sparkSession.read.format("org.apache.cassandra.spark.sparksql.CassandraDataSource")
                          .option("sidecar_instances", "localhost,localhost2,localhost3")
                          .option("keyspace", "sbr_tests")
                          .option("table", "basic_test")
                          .option("DC", "datacenter1")
                          .option("createSnapshot", true)
                          .option("numCores", 4)
                          .load()
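
The options in the snippet above can also be collected into a single map and handed to Spark in one call, which makes it easier to share the same settings across jobs. The following is only a sketch: `ReaderOptions` and its `build` method are hypothetical names that are not part of this library, and the option keys are simply the ones shown in the example above. It relies on the fact that Spark stores boolean and numeric reader options as strings.

```scala
// Hypothetical helper, NOT part of the Cassandra Analytics library.
// It only assembles the option keys used in the reader example above.
object ReaderOptions {
  def build(sidecarInstances: Seq[String],
            keyspace: String,
            table: String,
            dc: String,
            createSnapshot: Boolean = true,
            numCores: Int = 4): Map[String, String] = {
    // Fail early on values that cannot describe a usable read.
    require(sidecarInstances.nonEmpty, "at least one Sidecar instance is required")
    require(numCores > 0, "numCores must be positive")

    Map(
      "sidecar_instances" -> sidecarInstances.mkString(","),
      "keyspace"          -> keyspace,
      "table"             -> table,
      "DC"                -> dc,
      // Spark converts Boolean and numeric option values to strings internally,
      // so string values are equivalent here.
      "createSnapshot"    -> createSnapshot.toString,
      "numCores"          -> numCores.toString
    )
  }
}
```

With Spark on the classpath, the resulting map could be passed as `sparkSession.read.format("org.apache.cassandra.spark.sparksql.CassandraDataSource").options(ReaderOptions.build(Seq("localhost", "localhost2", "localhost3"), "sbr_tests", "basic_test", "datacenter1")).load()`.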

Cassandra Spark Bulk Writer

The Cassandra Spark Bulk Writer allows for high-speed data ingest to Cassandra clusters running Cassandra 3.0 and 4.0.

Developers interested in contributing to the Analytics library should see the DEV-README.

Getting Started

For example usage, see the example repository. This example covers setting up Cassandra 4.0 and Apache Sidecar, and running a Spark Bulk Reader and Spark Bulk Writer job.