# Camel-Kafka-connector CQL Sink
## Introduction
This is an example of the Camel-Kafka-connector CQL sink, which writes records from a Kafka topic into an Apache Cassandra table.
## What is needed
- A Cassandra instance
## Running Kafka
```
$KAFKA_HOME/bin/zookeeper-server-start.sh $KAFKA_HOME/config/zookeeper.properties
$KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties
$KAFKA_HOME/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic mytopic
```
## Setting up the needed bits and running the example
You'll need to set up the `plugin.path` property in your Kafka installation.
Open `$KAFKA_HOME/config/connect-standalone.properties`
and set the `plugin.path` property to your chosen location.
In this example we'll use `/home/oscerd/connectors/`:
```
> cd /home/oscerd/connectors/
> wget https://repo1.maven.org/maven2/org/apache/camel/kafkaconnector/camel-cql-kafka-connector/0.10.1/camel-cql-kafka-connector-0.10.1-package.tar.gz
> tar -xzf camel-cql-kafka-connector-0.10.1-package.tar.gz
```
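The `plugin.path` edit described above can also be scripted. The following is a minimal sketch, not part of the original example: `set_plugin_path` is a hypothetical helper name, and it assumes the worker file holds at most simple `plugin.path=` lines, which it replaces with a single new entry.

```shell
# Hypothetical helper (not part of Kafka): set plugin.path in a
# Kafka Connect worker properties file.
# Usage: set_plugin_path <properties-file> <connector-directory>
set_plugin_path() {
    props_file=$1
    plugin_dir=$2
    tmp_file="${props_file}.tmp"
    # Copy the file without any active plugin.path line
    # (commented-out lines are left untouched) ...
    sed '/^plugin\.path=/d' "$props_file" > "$tmp_file"
    # ... then append the new value and replace the original file.
    printf 'plugin.path=%s\n' "$plugin_dir" >> "$tmp_file"
    mv "$tmp_file" "$props_file"
}

# Example, using the location chosen in this README:
#   set_plugin_path "$KAFKA_HOME/config/connect-standalone.properties" /home/oscerd/connectors
```

Appending a fresh line rather than editing in place keeps the helper correct whether or not the property was already set.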
## Setting up Apache Cassandra
This example requires a running Cassandra instance. For simplicity, the steps below show how to start Cassandra using Docker. First you'll need to run a Cassandra instance:
[source,bash]
----
docker run --name master_node --env MAX_HEAP_SIZE='800M' -dt oscerd/cassandra
----
Next, check and make sure Cassandra is running:
[source,bash]
----
docker exec -ti master_node /opt/cassandra/bin/nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.17.0.2 251.32 KiB 256 100.0% 5126aaad-f143-43e9-920a-0f9540a93967 rack1
----
To populate the database using the `cqlsh` tool, you'll need a local installation of Cassandra. Download and extract the Apache Cassandra distribution to a directory. We reference the Cassandra installation directory with `LOCAL_CASSANDRA_HOME`. Here we use version 3.11.4 to connect to the Cassandra instance we started using Docker.
[source,bash]
----
<LOCAL_CASSANDRA_HOME>/bin/cqlsh $(docker inspect --format='{{ .NetworkSettings.IPAddress }}' master_node)
----
Next, execute the following script to create keyspace `test`, the table `users` and insert one row into it.
[source,bash]
----
create keyspace test with replication = {'class':'SimpleStrategy', 'replication_factor':3};
use test;
create table users ( id timeuuid primary key, name text );
insert into users (id,name) values (now(), 'oscerd');
quit;
----
The `.properties` file to use is shown in the command line that runs the example. In that file, the IP address of the Cassandra master node needs to be configured: replace the value `172.17.0.2` in the `camel.sink.path.hosts` configuration property with the IP of the master node obtained from Docker:
[source,bash]
----
docker inspect --format='{{ .NetworkSettings.IPAddress }}' master_node
----
Now it's time to set up the connector.
Open the CQL Sink configuration file:
```
name=CamelCassandraQLSinkConnector
topics=mytopic
tasks.max=1
connector.class=org.apache.camel.kafkaconnector.cql.CamelCqlSinkConnector
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
camel.sink.path.hosts=172.17.0.2
camel.sink.path.port=9042
camel.sink.path.keyspace=test
camel.sink.endpoint.cql=insert into users(id, name) values (now(), ?)
```
Set the correct options in the file.
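Setting the host can be scripted as well. The sketch below is an illustration under assumptions: `set_cassandra_host` is a hypothetical helper name, and the properties file is assumed to already contain a `camel.sink.path.hosts=` line, as in the configuration shown above.

```shell
# Hypothetical helper: replace the value of camel.sink.path.hosts
# in a connector properties file.
# Usage: set_cassandra_host <properties-file> <ip-or-hostname>
set_cassandra_host() {
    props_file=$1
    host=$2
    tmp_file="${props_file}.tmp"
    # Rewrite only the hosts line; every other line is copied unchanged.
    sed "s|^camel\.sink\.path\.hosts=.*|camel.sink.path.hosts=${host}|" \
        "$props_file" > "$tmp_file" &&
        mv "$tmp_file" "$props_file"
}

# Example (assumes the master_node container started earlier is running):
#   CASSANDRA_IP=$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' master_node)
#   set_cassandra_host config/CamelCassandraQLSinkConnector.properties "$CASSANDRA_IP"
```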
Now you can run the example
```
$KAFKA_HOME/bin/connect-standalone.sh $KAFKA_HOME/config/connect-standalone.properties config/CamelCassandraQLSinkConnector.properties
```
In a different terminal, run the Kafka console producer and send a message; each message you send is inserted as a row into the `users` table of the Cassandra `test` keyspace:
```
$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic mytopic
>message
```
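Because the console producer reads records from standard input, messages can also be piped in rather than typed. The following sketch assumes nothing beyond the producer command above; `gen_messages` is a hypothetical helper that just prints numbered test messages.

```shell
# Hypothetical helper: print <count> test messages, one per line
# (message1, message2, ...).
gen_messages() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'message%s\n' "$i"
        i=$((i + 1))
    done
}

# Example (assumes the broker from "Running Kafka" listens on localhost:9092):
#   gen_messages 3 | $KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic mytopic
```

With the sink configuration shown earlier, each generated line should end up as one row in `users`.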
You can verify the behavior through the following command
[source,bash]
----
<LOCAL_CASSANDRA_HOME>/bin/cqlsh $(docker inspect --format='{{ .NetworkSettings.IPAddress }}' master_node)
----
Next, execute the following script to query the `users` table in the `test` keyspace:
[source,bash]
----
use test;
select * from users;
----
You should see:
[source,bash]
----
cqlsh:test> select * from users;
 id                                   | name
--------------------------------------+----------
 6cbe74a0-96a6-11ea-a8ff-09d03512038e | message
 fc2c66c0-96a5-11ea-a8ff-09d03512038e | oscerd
(2 rows)
----
## OpenShift
### What is needed
- An OpenShift instance
### Running Kafka using Strimzi Operator
First we install the Strimzi operator and use it to deploy the Kafka broker and Kafka Connect into our OpenShift project.
We need to create security objects as part of the installation, so it is necessary to switch to the admin user.
If you use Minishift, you can do it with the following command:
[source,bash,options="nowrap"]
----
oc login -u system:admin
----
We will use OpenShift project `myproject`.
If it doesn't exist yet, you can create it using the following command:
[source,bash,options="nowrap"]
----
oc new-project myproject
----
If the project already exists, you can switch to it with:
[source,bash,options="nowrap"]
----
oc project myproject
----
We can now install the Strimzi operator into this project:
[source,bash,options="nowrap",subs="attributes"]
----
oc apply -f https://github.com/strimzi/strimzi-kafka-operator/releases/download/0.20.1/strimzi-cluster-operator-0.20.1.yaml
----
Next we will deploy a Kafka broker cluster and a Kafka Connect cluster; the Kafka Connect image with the Camel Kafka connectors installed is built in the next section:
[source,bash,options="nowrap",subs="attributes"]
----
# Deploy a single node Kafka broker
oc apply -f https://github.com/strimzi/strimzi-kafka-operator/raw/0.20.1/examples/kafka/kafka-persistent-single.yaml
# Deploy a single instance of Kafka Connect with no plug-in installed
oc apply -f https://github.com/strimzi/strimzi-kafka-operator/raw/0.20.1/examples/connect/kafka-connect-s2i-single-node-kafka.yaml
----
Optionally, enable the creation of Kafka connectors through a dedicated custom resource:
[source,bash,options="nowrap"]
----
oc annotate kafkaconnects2is my-connect-cluster strimzi.io/use-connector-resources=true
----
### Add Camel Kafka connector binaries
Strimzi uses `Source2Image` builds to allow users to add their own connectors to the existing Strimzi Docker images.
We now need to build the connectors and add them to the image.
If you have built the whole project (`mvn clean package`), decompress the connectors you need into a folder (e.g. `my-connectors/`)
so that each one is in its own subfolder.
Alternatively, you can download the latest officially released and packaged connectors from Maven Central, like this:
```
> cd my-connectors/
> wget https://repo1.maven.org/maven2/org/apache/camel/kafkaconnector/camel-cql-kafka-connector/0.10.1/camel-cql-kafka-connector-0.10.1-package.tar.gz
> tar -xzf camel-cql-kafka-connector-0.10.1-package.tar.gz
```
Now we can start the build
[source,bash,options="nowrap"]
----
oc start-build my-connect-cluster-connect --from-dir=./my-connectors/ --follow
----
We should now wait for the rollout of the new image to finish and the replica set with the new connector to become ready.
Once it is done, we can check that the connectors are available in our Kafka Connect cluster.
Strimzi is running Kafka Connect in a distributed mode.
To check the available connector plugins, you can run the following command:
[source,bash,options="nowrap"]
----
oc exec -i `oc get pods --field-selector status.phase=Running -l strimzi.io/name=my-connect-cluster-connect -o=jsonpath='{.items[0].metadata.name}'` -- curl -s http://my-connect-cluster-connect-api:8083/connector-plugins | jq .
----
You should see something like this:
[source,json,options="nowrap"]
----
[
{
"class": "org.apache.camel.kafkaconnector.CamelSinkConnector",
"type": "sink",
"version": "0.10.1"
},
{
"class": "org.apache.camel.kafkaconnector.CamelSourceConnector",
"type": "source",
"version": "0.10.1"
},
{
"class": "org.apache.camel.kafkaconnector.cql.CamelCqlSinkConnector",
"type": "sink",
"version": "0.10.1"
},
{
"class": "org.apache.camel.kafkaconnector.cql.CamelCqlSourceConnector",
"type": "source",
"version": "0.10.1"
},
{
"class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
"type": "sink",
"version": "2.5.0"
},
{
"class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
"type": "source",
"version": "2.5.0"
},
{
"class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
"type": "source",
"version": "1"
},
{
"class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
"type": "source",
"version": "1"
},
{
"class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
"type": "source",
"version": "1"
}
]
----
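Rather than reading the plugin list by eye, the check can be automated. This is a rough sketch: `has_connector_plugin` is a hypothetical helper that does a plain text match on the JSON (it is not a JSON parser) and assumes the response has the shape shown above.

```shell
# Hypothetical helper: succeed if the JSON on stdin lists a connector
# plugin with exactly the given class name.
# Whitespace is stripped first, so pretty-printed (jq) and compact
# (raw curl) output are matched the same way.
has_connector_plugin() {
    tr -d ' \t\n' | grep -F -q "\"class\":\"$1\""
}

# Example (assumes the Kafka Connect cluster deployed above):
#   oc exec -i <connect-pod> -- curl -s http://my-connect-cluster-connect-api:8083/connector-plugins \
#     | has_connector_plugin org.apache.camel.kafkaconnector.cql.CamelCqlSinkConnector \
#     && echo "CQL sink connector is installed"
```

Here `<connect-pod>` stands for the pod name that the `oc get pods` expression in the command above resolves to.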
### Deploy the Cassandra instance
Next, we need to deploy a Cassandra instance:
[source,bash,options="nowrap"]
----
oc create -f config/openshift/cassandra.yaml
----
This will create a Cassandra deployment and a service that will allow other pods to connect to it.
We then create the table in Cassandra using the following command:
----
cat config/openshift/cql-init | oc run -i --restart=Never --attach --rm --image centos/cassandra-311-centos7 cassandra-client --command bash -- -c 'cqlsh -u admin -p admin cassandra'
----
### Create connector instance
Now we can create an instance of the CQL sink connector:
[source,bash,options="nowrap"]
----
oc exec -i `oc get pods --field-selector status.phase=Running -l strimzi.io/name=my-connect-cluster-connect -o=jsonpath='{.items[0].metadata.name}'` -- curl -X POST \
-H "Accept:application/json" \
-H "Content-Type:application/json" \
http://my-connect-cluster-connect-api:8083/connectors -d @- <<'EOF'
{
"name": "cql-sink-connector",
"config": {
"connector.class": "org.apache.camel.kafkaconnector.cql.CamelCqlSinkConnector",
"tasks.max": "1",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.storage.StringConverter",
"topics": "mytopic",
"camel.sink.path.hosts": "cassandra",
"camel.sink.path.port": "9042",
"camel.sink.path.keyspace": "test",
"camel.sink.endpoint.cql": "insert into users(id, name) values (now(), ?)",
"camel.sink.endpoint.username": "admin",
"camel.sink.endpoint.password": "admin"
}
}
EOF
----
Alternatively, if you have enabled `use-connector-resources`, you can create the connector instance by creating a specific custom resource:
[source,bash,options="nowrap"]
----
oc create -f config/openshift/cql-sink-connector.yaml
----
You can check the status of the connector using:
[source,bash,options="nowrap"]
----
oc exec -i `oc get pods --field-selector status.phase=Running -l strimzi.io/name=my-connect-cluster-connect -o=jsonpath='{.items[0].metadata.name}'` -- curl -s http://my-connect-cluster-connect-api:8083/connectors/cql-sink-connector/status
----
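The status endpoint returns JSON with a `state` field for the connector and for each task. As a minimal sketch, assuming that response shape, the hypothetical helper `connector_is_healthy` below treats the connector as healthy when something is `RUNNING` and nothing is `FAILED`. It is a plain text match, not a full JSON parse.

```shell
# Hypothetical helper: read a Kafka Connect status response on stdin and
# succeed only if it reports a RUNNING state and no FAILED state for the
# connector or any of its tasks.
connector_is_healthy() {
    # Strip whitespace so formatting differences do not matter.
    status=$(tr -d ' \t\n')
    case $status in
        *'"state":"FAILED"'*) return 1 ;;
        *'"state":"RUNNING"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (assumes the cql-sink-connector instance created above):
#   oc exec -i <connect-pod> -- curl -s http://my-connect-cluster-connect-api:8083/connectors/cql-sink-connector/status \
#     | connector_is_healthy && echo "connector is running"
```

Here `<connect-pod>` stands for the pod name that the `oc get pods` expression in the command above resolves to.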
Run the following command and send some messages to the broker:
```
oc exec -i -c kafka my-cluster-kafka-0 -- bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic mytopic
>message1
>message2
```
### Verify the data in Cassandra
Run the following command to get an interactive cqlsh session:
----
oc run -ti --restart=Never --attach --rm --image centos/cassandra-311-centos7 cassandra-client --command bash -- -c 'cqlsh -u admin -p admin cassandra'
If you don't see a command prompt, try pressing enter.
Connected to Test Cluster at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
admin@cqlsh> select * from test.users;
id | name
--------------------------------------+----------
4e4dfda0-19d3-11eb-9012-47ac9a308b13 | message1
4f84a8e0-19d3-11eb-9012-47ac9a308b13 | message2
----