Each node in the Cassandra cluster is uniquely identified by an IP address that the driver will use to establish connections.
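* for contact points, addresses are provided as part of configuring the CqlSession object;
* for other nodes, addresses are discovered dynamically, either by inspecting system.peers on already connected nodes, or via push notifications received on the control connection when new nodes are discovered by gossip.

The address that each Cassandra node shares with clients is the broadcast RPC address; it is controlled by various properties in cassandra.yaml:

* rpc_address or rpc_interface is the address that the Cassandra process binds to. Set one or the other, not both (for more details, see the inline comments in the default cassandra.yaml that came with your installation);
* broadcast_rpc_address is the address that the node advertises to clients. If broadcast_rpc_address is not set, it defaults to rpc_address/rpc_interface. If rpc_address/rpc_interface is 0.0.0.0 (all interfaces), then broadcast_rpc_address must be set.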
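The defaulting rule for the broadcast RPC address can be summarized in a few lines of Java. This is only an illustrative sketch of the rule as described here, not Cassandra's actual code; the class and method names are hypothetical.

```java
public class BroadcastRpcAddressRule {

  /**
   * Returns the address a node would advertise to clients.
   *
   * @param rpcAddress the configured rpc_address (or the address of rpc_interface)
   * @param broadcastRpcAddress the configured broadcast_rpc_address, or null if unset
   */
  public static String effectiveBroadcastRpcAddress(String rpcAddress, String broadcastRpcAddress) {
    if (broadcastRpcAddress != null) {
      return broadcastRpcAddress; // an explicit setting always wins
    }
    if ("0.0.0.0".equals(rpcAddress)) {
      // Binding to all interfaces gives clients nothing usable to connect to.
      throw new IllegalArgumentException(
          "broadcast_rpc_address must be set when rpc_address is 0.0.0.0");
    }
    return rpcAddress; // default: advertise the bind address
  }

  public static void main(String[] args) {
    System.out.println(effectiveBroadcastRpcAddress("10.0.0.5", null)); // prints 10.0.0.5
    System.out.println(effectiveBroadcastRpcAddress("0.0.0.0", "1.2.3.4")); // prints 1.2.3.4
  }
}
```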
If you're not sure which address a Cassandra node is broadcasting, launch cqlsh locally on the node, execute the following query, and take note of the result:
cqlsh> select broadcast_address from system.local;

 broadcast_address
-------------------
         172.1.2.3
Then connect to another node in the cluster and run the following query, injecting the previous result:
cqlsh> select rpc_address from system.peers where peer = '172.1.2.3';

 rpc_address
-------------
     1.2.3.4
That last result is the broadcast RPC address. Ensure that it is accessible from the client machine where the driver will run.
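One way to check accessibility from the client machine is to attempt a plain TCP connection to the broadcast RPC address. The sketch below uses only the JDK; the class name is hypothetical, and port 9042 is an assumption (the usual native transport port) that you should adjust to your setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BroadcastAddressCheck {

  /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
  public static boolean isReachable(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Covers connection refused, timeouts, and unresolvable hosts.
      return false;
    }
  }

  public static void main(String[] args) {
    // 1.2.3.4 is the example broadcast RPC address returned by the query above.
    System.out.println(isReachable("1.2.3.4", 9042, 2000));
  }
}
```

A successful TCP connection only shows that the port is reachable; it does not prove that the driver can authenticate or speak the native protocol with the node.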
Sometimes it's not possible for Cassandra nodes to broadcast addresses that will work for each and every client; for instance, they might broadcast private IPs because most clients are in the same network, but a particular client could be on another network and go through a router.
For such cases, you can register a driver-side component that will perform additional address translation. Write a class that implements AddressTranslator with the following constructor:
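import com.datastax.oss.driver.api.core.addresstranslation.AddressTranslator;
import com.datastax.oss.driver.api.core.config.DriverOption;
import com.datastax.oss.driver.api.core.context.DriverContext;
import java.net.InetSocketAddress;

public class MyAddressTranslator implements AddressTranslator {

  public MyAddressTranslator(DriverContext context, DriverOption configRoot) {
    // retrieve any required dependency or extra configuration option, otherwise can stay empty
  }

  @Override
  public InetSocketAddress translate(InetSocketAddress address) {
    // placeholder: replace with your custom translation logic
    return address;
  }

  @Override
  public void close() {
    // free any resources if needed, otherwise can stay empty
  }
}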
Then reference this class from the configuration:
datastax-java-driver.advanced.address-translator.class = com.mycompany.MyAddressTranslator
Note: the contact points provided while creating the CqlSession are not translated; only addresses retrieved from or sent by Cassandra nodes are.
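As an illustration of what custom translation logic might look like, the sketch below maps known private IPs to addresses that a remote client can reach. It deliberately does not implement the driver's AddressTranslator interface so that it stays self-contained; the class name and the address mapping are hypothetical, and the body of translate is the part you would adapt.

```java
import java.net.InetSocketAddress;
import java.util.Map;

public class StaticMapTranslation {

  // Hypothetical mapping from broadcast (private) IPs to addresses reachable by this client.
  private static final Map<String, String> PRIVATE_TO_PUBLIC =
      Map.of("172.1.2.3", "1.2.3.4", "172.1.2.4", "1.2.3.5");

  /** Translates a known private address to its public counterpart, keeping the port. */
  public static InetSocketAddress translate(InetSocketAddress address) {
    // Use the textual IP when the address is resolved, the raw host string otherwise.
    String host =
        address.getAddress() == null
            ? address.getHostString()
            : address.getAddress().getHostAddress();
    String translatedHost = PRIVATE_TO_PUBLIC.get(host);
    if (translatedHost == null) {
      return address; // unknown address: leave it untouched
    }
    return new InetSocketAddress(translatedHost, address.getPort());
  }

  public static void main(String[] args) {
    System.out.println(translate(new InetSocketAddress("172.1.2.3", 9042)));
  }
}
```

Returning the input unchanged for unknown addresses keeps the translator safe to use when some nodes are already directly reachable.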
If your client applications access Cassandra through some kind of proxy (e.g. with AWS PrivateLink, when all Cassandra nodes are exposed via one hostname pointing to an AWS endpoint), you can configure the driver with FixedHostNameAddressTranslator to always translate all node addresses to that same proxy hostname, regardless of each node's IP address, while still using the node's native transport port.
To use it, specify the following in the configuration:
datastax-java-driver.advanced.address-translator {
  class = FixedHostNameAddressTranslator
  advertised-hostname = proxyhostname
}
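Conceptually, the fixed-hostname translation boils down to swapping the host and keeping the port. The following is a minimal sketch of that idea using only the JDK, not the driver's actual implementation; the class name is hypothetical, and the use of an unresolved address (to avoid a DNS lookup in the example) is an assumption made for illustration.

```java
import java.net.InetSocketAddress;

public class FixedHostNameSketch {

  private final String advertisedHostname;

  public FixedHostNameSketch(String advertisedHostname) {
    this.advertisedHostname = advertisedHostname;
  }

  /** Replaces the host with the proxy hostname and keeps the node's native transport port. */
  public InetSocketAddress translate(InetSocketAddress address) {
    // createUnresolved defers DNS resolution, so this sketch runs without network access.
    return InetSocketAddress.createUnresolved(advertisedHostname, address.getPort());
  }

  public static void main(String[] args) {
    FixedHostNameSketch translator = new FixedHostNameSketch("proxyhostname");
    InetSocketAddress translated = translator.translate(new InetSocketAddress("172.1.2.3", 9042));
    // prints proxyhostname:9042
    System.out.println(translated.getHostString() + ":" + translated.getPort());
  }
}
```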
When running Cassandra in a private network and accessing it from outside that network through a proxy, FixedHostNameAddressTranslator is one option. But multi-datacenter Cassandra deployments often need more control over routing queries to a specific datacenter (e.g. to optimize latency), which requires setting up a separate proxy per datacenter.
Normally, the nodes of each Cassandra datacenter are deployed to a different subnet to support internode communication in the cluster and avoid IP address collisions. So when Cassandra broadcasts a node's IP address, we can determine which datacenter the node belongs to by checking that address against each datacenter's subnet.
For such scenarios you can use SubnetAddressTranslator to translate each node IP to the proxy address of the datacenter it belongs to.
To use it, specify the following in the configuration:
datastax-java-driver.advanced.address-translator {
  class = SubnetAddressTranslator
  subnet-addresses {
    "100.64.0.0/15" = "cassandra.datacenter1.com:9042"
    "100.66.0.0/15" = "cassandra.datacenter2.com:9042"
    # IPv6 example:
    # "::ffff:6440:0/111" = "cassandra.datacenter1.com:9042"
    # "::ffff:6442:0/111" = "cassandra.datacenter2.com:9042"
  }
  # Optional. When configured, addresses not matching the configured subnets are translated to this address.
  default-address = "cassandra.datacenter1.com:9042"
  # Whether to resolve the addresses once on initialization (if true) or on each node (re-)connection (if false).
  # If not configured, defaults to false.
  resolve-addresses = false
}
Such a setup is common for running Cassandra on Kubernetes with k8ssandra.
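To make the subnet check concrete, here is a minimal sketch of CIDR matching with the JDK only. It is not the driver's implementation: the class and method names are hypothetical, it compares the first prefix-length bits of the raw address bytes, and it is only exercised here with the IPv4 subnets from the configuration above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SubnetMatchSketch {

  /** Returns true if ip falls inside the subnet given in CIDR notation, e.g. "100.64.0.0/15". */
  public static boolean contains(String cidr, InetAddress ip) throws UnknownHostException {
    String[] parts = cidr.split("/");
    byte[] network = InetAddress.getByName(parts[0]).getAddress();
    byte[] candidate = ip.getAddress();
    int prefixLength = Integer.parseInt(parts[1]);
    if (prefixLength < 0 || prefixLength > network.length * 8) {
      throw new IllegalArgumentException("Invalid prefix length in " + cidr);
    }
    if (network.length != candidate.length) {
      return false; // IPv4 vs IPv6 mismatch
    }
    // Compare the leading prefixLength bits, most significant bit first.
    for (int bit = 0; bit < prefixLength; bit++) {
      int mask = 0x80 >> (bit % 8);
      if ((network[bit / 8] & mask) != (candidate[bit / 8] & mask)) {
        return false;
      }
    }
    return true;
  }

  /** Returns the proxy address of the first matching subnet, or defaultAddress if none match. */
  public static String translate(
      Map<String, String> subnetAddresses, String defaultAddress, String nodeIp)
      throws UnknownHostException {
    InetAddress ip = InetAddress.getByName(nodeIp);
    for (Map.Entry<String, String> entry : subnetAddresses.entrySet()) {
      if (contains(entry.getKey(), ip)) {
        return entry.getValue();
      }
    }
    return defaultAddress;
  }

  public static void main(String[] args) throws UnknownHostException {
    Map<String, String> subnets = new LinkedHashMap<>();
    subnets.put("100.64.0.0/15", "cassandra.datacenter1.com:9042");
    subnets.put("100.66.0.0/15", "cassandra.datacenter2.com:9042");
    String fallback = "cassandra.datacenter1.com:9042";
    System.out.println(translate(subnets, fallback, "100.65.3.4")); // cassandra.datacenter1.com:9042
    System.out.println(translate(subnets, fallback, "100.67.0.9")); // cassandra.datacenter2.com:9042
  }
}
```

A /15 prefix covers two adjacent /16 ranges, which is why 100.65.x.x still maps to the first datacenter and 100.67.x.x to the second.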
If you deploy both Cassandra and client applications on Amazon EC2, and your cluster spans multiple regions, you'll have to configure your Cassandra nodes to broadcast public RPC addresses.
However, this is not always the most cost-effective: if a client and a node are in the same region, it would be cheaper to connect over the private IP. Ideally, you'd want to pick the best address in each case.
The driver provides Ec2MultiRegionAddressTranslator, which does exactly that. To use it, specify the following in the configuration:
datastax-java-driver.advanced.address-translator.class = Ec2MultiRegionAddressTranslator
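With this configuration, you keep broadcasting public RPC addresses. But each time the driver connects to a new Cassandra node:

* if the node is in the same EC2 region, the public IP is translated to the intra-region private IP;
* otherwise, it is not translated.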
(To achieve this, Ec2MultiRegionAddressTranslator performs a reverse DNS lookup of the origin address to find the domain name of the target instance. Then it performs a forward DNS lookup of the domain name; the EC2 DNS does the private/public switch automatically based on location.)
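The reverse-then-forward lookup can be sketched with the JDK's InetAddress class. This is an assumption made for brevity and not necessarily how the driver performs its DNS queries; the class name is hypothetical, and outside EC2 the round trip will usually just return the original IP.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class ReverseForwardLookupSketch {

  /** Resolves the address back to a name, then resolves that name again from this machine. */
  public static InetSocketAddress translate(InetSocketAddress address) {
    InetAddress original = address.getAddress();
    if (original == null) {
      return address; // unresolved address: nothing to look up
    }
    try {
      // Reverse lookup: IP -> domain name (falls back to the textual IP if none is found).
      String domainName = original.getCanonicalHostName();
      // Forward lookup: domain name -> IP, as seen from the machine running this code.
      InetAddress resolved = InetAddress.getByName(domainName);
      return new InetSocketAddress(resolved, address.getPort());
    } catch (UnknownHostException e) {
      return address; // lookup failed: keep the original address
    }
  }

  public static void main(String[] args) throws UnknownHostException {
    // 1.2.3.4 stands in for a node's public broadcast RPC address.
    InetSocketAddress publicAddress = new InetSocketAddress(InetAddress.getByName("1.2.3.4"), 9042);
    System.out.println(translate(publicAddress));
  }
}
```

Falling back to the original address on any lookup failure means a DNS problem degrades to the untranslated (public) address rather than to a connection error.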