What | Where |
---|---|
Community | Chat with us at Apache Cassandra |
Scala Docs | Most Recent Release (3.5.1): Connector API docs, Connector Driver docs |
Latest Production Release | 3.5.1 |
This library lets you expose Cassandra tables as Spark RDDs and Datasets/DataFrames, write Spark RDDs and Datasets/DataFrames to Cassandra tables, and execute arbitrary CQL queries in your Spark applications.
- Saves RDDs back to Cassandra via the `saveToCassandra` call
- Deletes rows and columns from Cassandra via the `deleteFromCassandra` call
- Joins with a subset of Cassandra data using the `joinWithCassandraTable` call for RDDs, and optimizes joins with data in Cassandra when using Datasets/DataFrames
- Partitions RDDs according to Cassandra replication using the `repartitionByCassandraReplica` call
- Filters rows on the server side via the CQL `WHERE` clause

The connector project has several branches, each of which maps to a different supported version of Spark and Cassandra. For previous releases the branch is named “bX.Y”, where X.Y is the major+minor version; for example, the “b1.6” branch corresponds to the 1.6 release. The “master” branch normally contains development for the next connector release in progress.
Currently, the following branches are actively supported: 3.5.x (master), 3.4.x (b3.4), 3.3.x (b3.3), 3.2.x (b3.2), 3.1.x (b3.1), 3.0.x (b3.0) and 2.5.x (b2.5).
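The branch naming rule is mechanical enough to write down as a function. The following is a minimal sketch, not part of the connector: `BranchNames`, `branchFor` and the `masterLine` parameter are hypothetical names, and the default of `"3.5"` only reflects the statement above that 3.5.x currently lives on master.

```scala
object BranchNames {
  // Maps a connector version such as "3.4.1" to its branch under the
  // "bX.Y" convention. `masterLine` is the major.minor line currently
  // developed on master; it is a parameter because it changes per release.
  def branchFor(version: String, masterLine: String = "3.5"): String = {
    val parts = version.split('.')
    require(parts.length >= 2, s"expected at least major.minor, got '$version'")
    val line = s"${parts(0)}.${parts(1)}"
    if (line == masterLine) "master" else s"b$line"
  }

  def main(args: Array[String]): Unit =
    Seq("3.5.1", "3.4.0", "2.5.2", "1.6.0").foreach { v =>
      println(s"$v -> ${branchFor(v)}")
    }
}
```

Once a newer line moves to master, passing that line as `masterLine` makes 3.5.x resolve to `b3.5` like any other previous release.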
Connector | Spark | Cassandra | Cassandra Java Driver | Minimum Java Version | Supported Scala Versions |
---|---|---|---|---|---|
3.5.1 | 3.5 | 2.1.5*, 2.2, 3.x, 4.x, 5.0 | 4.18.1 | 8 | 2.12, 2.13 |
3.5 | 3.5 | 2.1.5*, 2.2, 3.x, 4.x | 4.13 | 8 | 2.12, 2.13 |
3.4 | 3.4 | 2.1.5*, 2.2, 3.x, 4.x | 4.13 | 8 | 2.12, 2.13 |
3.3 | 3.3 | 2.1.5*, 2.2, 3.x, 4.x | 4.13 | 8 | 2.12 |
3.2 | 3.2 | 2.1.5*, 2.2, 3.x, 4.0 | 4.13 | 8 | 2.12 |
3.1 | 3.1 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.12 |
3.0 | 3.0 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.12 |
2.5 | 2.4 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.11, 2.12 |
2.4.2 | 2.4 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11, 2.12 |
2.4 | 2.4 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11 |
2.3 | 2.3 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11 |
2.0 | 2.0, 2.1, 2.2 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.10, 2.11 |
1.6 | 1.6 | 2.1.5*, 2.2, 3.0 | 3.0 | 7 | 2.10, 2.11 |
1.5 | 1.5, 1.6 | 2.1.5*, 2.2, 3.0 | 3.0 | 7 | 2.10, 2.11 |
1.4 | 1.4 | 2.1.5* | 2.1 | 7 | 2.10, 2.11 |
1.3 | 1.3 | 2.1.5* | 2.1 | 7 | 2.10, 2.11 |
1.2 | 1.2 | 2.1, 2.0 | 2.1 | 7 | 2.10, 2.11 |
1.1 | 1.1, 1.0 | 2.1, 2.0 | 2.1 | 7 | 2.10, 2.11 |
1.0 | 1.0, 0.9 | 2.0 | 2.0 | 7 | 2.10, 2.11 |
*Compatible with 2.1.X where X >= 5
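One way to read the table is as a lookup from connector version to the Spark lines and Scala versions it supports. The sketch below copies four rows from the table into a plain Scala map and checks a combination against them. The names `Compatibility`, `Entry` and `supported` are illustrative only; the connector does not ship such a helper.

```scala
object Compatibility {
  // One table row: the Spark lines and Scala binary versions a connector supports.
  final case class Entry(sparkLines: Set[String], scalaVersions: Set[String])

  // A few rows copied from the compatibility table; extend as needed.
  val table: Map[String, Entry] = Map(
    "3.5.1" -> Entry(Set("3.5"), Set("2.12", "2.13")),
    "3.3"   -> Entry(Set("3.3"), Set("2.12")),
    "2.5"   -> Entry(Set("2.4"), Set("2.11", "2.12")),
    "2.0"   -> Entry(Set("2.0", "2.1", "2.2"), Set("2.10", "2.11"))
  )

  // True only if the connector row lists both the Spark line and the Scala version.
  def supported(connector: String, spark: String, scalaVersion: String): Boolean =
    table.get(connector).exists { e =>
      e.sparkLines(spark) && e.scalaVersions(scalaVersion)
    }

  def main(args: Array[String]): Unit = {
    println(supported("3.5.1", "3.5", "2.13")) // prints true
    println(supported("3.3", "3.3", "2.13"))   // prints false: 3.3 lists only Scala 2.12
  }
}
```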
API documentation for the Scala and Java interfaces is available online; see the Scala Docs row in the table at the top of this document.
This project is available on the Maven Central Repository. For SBT to download the connector binaries, sources and javadoc, put this in your project SBT config:
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "3.5.1"
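With that dependency on the classpath, a first program might look like the sketch below. It writes a few rows with `saveToCassandra` and reads them back with a server-side `WHERE` filter. Several things here are assumptions rather than facts from this document: a Cassandra node reachable at `127.0.0.1`, Spark running in local mode, and a pre-created table `test.words (word text PRIMARY KEY, count int)`. The names `WordCount` and `ConnectorSketch` are hypothetical.

```scala
import com.datastax.spark.connector._ // adds cassandraTable and saveToCassandra
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical row type matching the assumed table:
//   CREATE TABLE test.words (word text PRIMARY KEY, count int);
case class WordCount(word: String, count: Int)

object ConnectorSketch {
  // Illustrative sample data for the write.
  val sample: Seq[WordCount] = Seq(WordCount("cat", 30), WordCount("fox", 40))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("connector-sketch")
      .setMaster("local[*]")                               // assumption: local mode
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumption: local node
    val sc = new SparkContext(conf)
    try {
      // Write: case class fields map to the columns of the same name.
      sc.parallelize(sample)
        .saveToCassandra("test", "words", SomeColumns("word", "count"))

      // Read: the where() predicate is sent to Cassandra as a CQL WHERE clause.
      val rows = sc.cassandraTable[WordCount]("test", "words")
        .where("word = ?", "cat")
        .collect()
      rows.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The case class is declared at the top level rather than inside `main` so that the connector's object mapper can resolve it; Spark itself must also be on the classpath, typically as a `provided` dependency.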
In DS320: Analytics with Spark, you will learn how to effectively and efficiently solve analytical problems with Apache Spark, Apache Cassandra, and DataStax Enterprise. You will learn about Spark API, Spark-Cassandra Connector, Spark SQL, Spark Streaming, and crucial performance optimization techniques.
New issues may be reported using JIRA. Please include all relevant details including versions of Spark, Spark Cassandra Connector, Cassandra and/or DSE. A minimal reproducible case with sample code is ideal.
Questions and requests for help may be submitted to the user mailing list.
For community help see https://cassandra.apache.org/_/community.html
To protect the community, all contributors are required to sign the Apache Software Foundation's Contributor License Agreement.
Tips for Developing the Spark Cassandra Connector
Checklist for contributing changes to the project:
To run unit and integration tests:
    ./sbt/sbt test
    ./sbt/sbt it:test
Note that the integration tests require CCM to be installed on your machine. See Tips for Developing the Spark Cassandra Connector for details.
By default, integration tests start up a separate, single Cassandra instance and run Spark in local mode. It is possible to run integration tests with your own Spark cluster. First, prepare a jar with testing code:
./sbt/sbt test:package
Then copy the generated test jar to your Spark nodes and run:
    export IT_TEST_SPARK_MASTER=<Spark Master URL>
    ./sbt/sbt it:test
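The effect of the environment variable can be pictured as a simple fallback: use `IT_TEST_SPARK_MASTER` when it is set, otherwise run in local mode. The sketch below illustrates that idea only. It is not the connector's actual test harness code, `TestMaster` and `resolveMaster` are hypothetical names, and the `local[*]` default is an assumption.

```scala
object TestMaster {
  // Picks the Spark master URL for a test run. The environment is passed in
  // as a map (defaulting to the real one) so the logic is easy to exercise.
  // Assumption: a blank value is treated the same as an unset variable.
  def resolveMaster(env: Map[String, String] = sys.env): String =
    env.get("IT_TEST_SPARK_MASTER")
      .map(_.trim)
      .filter(_.nonEmpty)
      .getOrElse("local[*]") // illustrative local-mode default

  def main(args: Array[String]): Unit =
    println(s"Spark master for integration tests: ${resolveMaster()}")
}
```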
To generate the Reference Document, run:
./sbt/sbt spark-cassandra-connector-unshaded/run (outputLocation)
outputLocation defaults to doc/reference.md
Copyright 2014, Apache Software Foundation
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Apache Cassandra, Apache Spark, Apache, Tomcat, Lucene, Solr, Hadoop, Spark, TinkerPop, and Cassandra are trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.