Here's a short program that connects to Cassandra and executes a query:
Cluster cluster = null;
try {
    cluster = Cluster.builder()                                                  // (1)
            .addContactPoint("127.0.0.1")
            .build();
    Session session = cluster.connect();                                         // (2)
    ResultSet rs = session.execute("select release_version from system.local");  // (3)
    Row row = rs.one();
    System.out.println(row.getString("release_version"));                        // (4)
} finally {
    if (cluster != null) cluster.close();                                        // (5)
}
At (3), we use execute to send a query to Cassandra. This returns a ResultSet, which is essentially a collection of Row objects. On the next line, we extract the first row (which is the only one in this case).
Note: this example uses the synchronous API. Most methods have asynchronous equivalents.
The simplest way to create a Cluster instance is programmatically, with Cluster.Builder, which provides a fluent API:
Cluster cluster = Cluster.builder()
        .withClusterName("myCluster")
        .addContactPoint("127.0.0.1")
        .build();
Alternatively, you might want to retrieve the settings from an external source (like a properties file or a web service). You'll need to provide an implementation of Initializer that loads these settings:
Initializer myInitializer = ... // your implementation
Cluster cluster = Cluster.buildFrom(myInitializer);
The only required option is the list of contact points, i.e. the hosts that the driver will initially contact to discover the cluster topology. You can provide a single contact point, but it is usually a good idea to provide more, so that the driver can fall back to another one if the first is down.
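As a rough illustration of loading contact points from an external source, here is a minimal sketch that reads them from a properties object using only the JDK. The class name `ContactPointsLoader` and the property key `cassandra.contactPoints` are hypothetical, not part of the driver; the resulting host list is the kind of value you would then hand to the builder or to your own Initializer implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: not part of the driver.
final class ContactPointsLoader {

    // Assumed property key; pick whatever name fits your configuration.
    static final String KEY = "cassandra.contactPoints";

    // Parses a comma-separated list of hosts, ignoring blanks and empty entries.
    static List<String> parse(Properties props) {
        String raw = props.getProperty(KEY, "");
        List<String> hosts = new ArrayList<>();
        for (String entry : raw.split(",")) {
            String host = entry.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        // Contact points are the only required option, so fail fast if none is given.
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("At least one contact point is required in " + KEY);
        }
        return hosts;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Several contact points, so the driver has alternatives if one host is down.
        props.setProperty(KEY, "10.0.0.1, 10.0.0.2, 10.0.0.3");
        for (String host : parse(props)) {
            System.out.println(host);
        }
    }
}
```

In a real application the Properties object would typically be loaded from a file or fetched from a configuration service before being parsed.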
The other aspects that you can configure on the Cluster are:
In addition, you can register various types of listeners to be notified of cluster events; see Host.StateListener, LatencyTracker, and SchemaChangeListener.
A freshly-built Cluster instance does not initialize automatically; that will be triggered by one of the following actions:
- cluster.init();
- cluster.getMetadata();
- cluster.connect() or one of its variants;
- session.init() on a session that was created with cluster.newSession().
The initialization sequence is the following:
Note that, at this stage, only the control connection has been established. Connections to other hosts will only be opened when a session gets created.
By default, a session isn't tied to any specific keyspace. You'll need to prefix table names in your queries:
Session session = cluster.connect();
session.execute("select * from myKeyspace.myTable where id = 1");
You can also specify a keyspace name at construction time; it will be used as the default when table names are not qualified:
Session session = cluster.connect("myKeyspace");
session.execute("select * from myTable where id = 1");
session.execute("select * from otherKeyspace.otherTable where id = 1");
You might be tempted to open a separate session for each keyspace used in your application; however, note that connection pools are created at the session level, so each new session will consume additional system resources:
// Warning: creating two sessions doubles the number of TCP connections opened by the driver
Session session1 = cluster.connect("ks1");
Session session2 = cluster.connect("ks2");
Also, there is currently a known limitation with named sessions (sessions created with a keyspace) that causes the driver to unexpectedly block the calling thread in certain circumstances; if you use a fully asynchronous model, you should use a session with no keyspace.
Finally, if you issue a USE statement, it will change the default keyspace on that session:
Session session = cluster.connect();
// No default keyspace set, need to prefix:
session.execute("select * from myKeyspace.myTable where id = 1");
session.execute("USE myKeyspace");
// Now the keyspace is set, unqualified query works:
session.execute("select * from myTable where id = 1");
Be very careful though: if the session is shared by multiple threads, switching the keyspace at runtime could easily cause unexpected query failures.
Generally, the recommended approach is to use a single session with no keyspace, and prefix all your queries.
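To make the risk concrete, here is a toy model of the keyspace rules described in this section. It is not driver code: `KeyspaceModel` is a hypothetical stand-in whose only state is a mutable default keyspace, and the two "callers" are simulated sequentially so the interleaving is deterministic.

```java
// Toy model of the keyspace rules described above; this is NOT driver code.
final class KeyspaceModel {

    // Stand-in for a session's mutable default keyspace (null means "none").
    private String defaultKeyspace;

    KeyspaceModel(String defaultKeyspace) {
        this.defaultKeyspace = defaultKeyspace;
    }

    // Models the effect of executing "USE <keyspace>" on the session.
    void use(String keyspace) {
        this.defaultKeyspace = keyspace;
    }

    // Returns the fully qualified table that a query would target.
    String resolve(String table) {
        if (table.contains(".")) {
            return table; // already qualified: session state is irrelevant
        }
        if (defaultKeyspace == null) {
            throw new IllegalStateException("No keyspace set, table must be qualified: " + table);
        }
        return defaultKeyspace + "." + table;
    }

    public static void main(String[] args) {
        KeyspaceModel shared = new KeyspaceModel(null);
        shared.use("ks1");                               // caller A switches to ks1
        shared.use("ks2");                               // caller B switches to ks2 before A runs its query
        System.out.println(shared.resolve("users"));     // A's unqualified query now targets ks2
        System.out.println(shared.resolve("ks1.users")); // a qualified name is unaffected
    }
}
```

The qualified name resolves the same way regardless of what other callers did to the shared state, which is the reasoning behind prefixing every query.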
You run queries with the session's execute method:
ResultSet rs = session.execute("select release_version from system.local");
As shown here, the simplest form is to pass a query string directly. You can also pass an instance of Statement.
Executing a query produces a ResultSet, which is an iterable of Row. The basic way to process all rows is to use Java's for-each loop:
for (Row row : rs) {
    // process the row
}
Note that this will return all results without limit (even though the driver might use multiple queries in the background). To handle large result sets, you might want to use a LIMIT clause in your CQL query, or use one of the techniques described in the paging documentation.
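The idea that a single iteration can hide several background queries can be pictured with a small, self-contained model. This is only a sketch of the general pattern, not the driver's paging implementation: `PagedIterator`, its `fetchPage` function and the `fetchCount` field are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

// Illustrative model only; it does not reproduce the driver's actual paging logic.
final class PagedIterator<T> implements Iterator<T> {

    private final IntFunction<List<T>> fetchPage; // hypothetical: offset -> next page of rows
    private final int pageSize;
    private List<T> current = new ArrayList<>();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    int fetchCount = 0; // exposed so the example can report how many "queries" ran

    PagedIterator(IntFunction<List<T>> fetchPage, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one has been fully consumed.
        while (indexInPage >= current.size() && !exhausted) {
            current = fetchPage.apply(offset);
            fetchCount++;
            indexInPage = 0;
            offset += current.size();
            // A short page means there is nothing left to fetch.
            if (current.size() < pageSize) {
                exhausted = true;
            }
        }
        return indexInPage < current.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.get(indexInPage++);
    }

    // Builds an iterator over an in-memory list, served in pages of the given size.
    static <T> PagedIterator<T> over(List<T> source, int pageSize) {
        return new PagedIterator<>(
                offset -> source.subList(offset, Math.min(offset + pageSize, source.size())),
                pageSize);
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            rows.add(i);
        }
        PagedIterator<Integer> it = over(rows, 3);
        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        System.out.println(seen + " rows in " + it.fetchCount + " fetches");
    }
}
```

The caller's loop never sees page boundaries, which is why an unbounded query can quietly pull a very large result set unless you cap it with LIMIT or control paging explicitly.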
When you know that there is only one row (or are only interested in the first one), the driver provides a convenience method:
Row row = rs.one();
Row provides getters to extract column values; they can be either positional or named:
Row row = session.execute("select first_name, last_name from users where id = 1").one();
// The two are equivalent:
String firstNameByIndex = row.getString(0);
String firstNameByName = row.getString("first_name");
In addition to these default mappings, you can register your own types with custom codecs.
For performance reasons, the driver uses primitive Java types wherever possible (boolean, int...); the CQL value NULL is encoded as the type's default value (false, 0...), which can be ambiguous. To distinguish NULL from actual values, use isNull:
Integer age = row.isNull("age") ? null : row.getInt("age");
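The ambiguity is easy to reproduce without a database. The sketch below uses a hypothetical `FakeRow` class, written only for this illustration and unrelated to the driver's Row, that mimics the behaviour described above: a NULL column comes back as 0 from the primitive getter, and only the isNull check tells the two cases apart.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a row, written only to illustrate the ambiguity; NOT the driver's Row.
final class FakeRow {

    private final Map<String, Integer> columns = new HashMap<>();

    FakeRow with(String name, Integer value) {
        columns.put(name, value);
        return this;
    }

    // Mirrors the behaviour described above: a NULL column is returned as the default 0.
    int getInt(String name) {
        Integer value = columns.get(name);
        return value == null ? 0 : value;
    }

    boolean isNull(String name) {
        return columns.get(name) == null;
    }

    // Hypothetical convenience wrapper around the isNull check.
    static Integer getNullableInt(FakeRow row, String name) {
        return row.isNull(name) ? null : Integer.valueOf(row.getInt(name));
    }

    public static void main(String[] args) {
        FakeRow unknownAge = new FakeRow().with("age", null);
        FakeRow newborn = new FakeRow().with("age", 0);
        // Both rows look identical through the primitive getter...
        System.out.println(unknownAge.getInt("age") == newborn.getInt("age"));
        // ...but the isNull-based helper tells them apart.
        System.out.println(getNullableInt(unknownAge, "age") + " vs " + getNullableInt(newborn, "age"));
    }
}
```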
To ensure type safety, collection getters are generic. You need to provide type parameters matching your CQL type when calling the methods:
// Assuming given_names is a list<text>:
List<String> givenNames = row.getList("given_names", String.class);
For nested collections, element types are generic and cannot be expressed as Java Class instances. We use Guava's TypeToken instead:
// Assuming teams is a set<list<text>>:
TypeToken<List<String>> listOfStrings = new TypeToken<List<String>>() {};
Set<List<String>> teams = row.getSet("teams", listOfStrings);
Since type tokens are anonymous inner classes, it's recommended to store them as constants in a utility class instead of re-creating them each time.
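For readers unfamiliar with the pattern, the following sketch shows why an anonymous subclass is needed and what "storing tokens as constants" can look like. It uses a deliberately stripped-down, hypothetical `TypeRef` class built on JDK reflection rather than Guava's TypeToken, which is far more complete; the `TypeRefs` holder is likewise an illustrative name.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical, stripped-down type token; Guava's TypeToken offers much more.
abstract class TypeRef<T> {

    private final Type type;

    protected TypeRef() {
        // The anonymous subclass records its generic superclass, e.g. TypeRef<List<String>>,
        // which is how the element type survives erasure.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            throw new IllegalStateException("TypeRef must be created with a type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

// Constants holder, so that each token is created once and reused.
final class TypeRefs {

    static final TypeRef<List<String>> LIST_OF_STRINGS = new TypeRef<List<String>>() {};

    private TypeRefs() {
    }

    public static void main(String[] args) {
        System.out.println(LIST_OF_STRINGS.getType().getTypeName());
    }
}
```

Each `new TypeRef<...>() {}` expression defines a new anonymous class, which is why keeping tokens in a shared constants class avoids scattering such classes across the code base.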
Row exposes an API to explore the column metadata at runtime:
for (ColumnDefinitions.Definition definition : row.getColumnDefinitions()) {
    System.out.printf("Column %s has type %s%n",
            definition.getName(),
            definition.getType());
}
Besides working explicitly with queries and rows, you can also use the Object Mapper to simplify retrieving and storing your data.
If you're reading this from the generated HTML documentation on github.io, use the "Contents" menu on the left-hand side to navigate sub-sections. If you're browsing the source files on github.com, simply navigate to each sub-directory.