You can configure the Presto Pulsar connector and deploy a cluster by following the instructions below.
You can configure the Presto Pulsar connector in the ${project.root}/conf/presto/catalog/pulsar.properties file. The connector settings and their default values are as follows.
# name of the connector to be displayed in the catalog
connector.name=pulsar

# the url of Pulsar broker service
pulsar.broker-service-url=http://localhost:8080

# URI of Zookeeper cluster
pulsar.zookeeper-uri=localhost:2181

# minimum number of entries to read at a single time
pulsar.entry-read-batch-size=100

# default number of splits to use per query
pulsar.target-num-splits=4
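As a quick sanity check, the settings above can be read back programmatically. The following is a minimal Python sketch and is not part of Pulsar or Presto. `parse_properties` is a hypothetical helper that handles only simple `key=value` lines and `#` comments, not the full Java properties syntax (no line continuations or escapes).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as URLs stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


# Default connector settings, copied from the listing above.
DEFAULTS = """\
# name of the connector to be displayed in the catalog
connector.name=pulsar
# the url of Pulsar broker service
pulsar.broker-service-url=http://localhost:8080
# URI of Zookeeper cluster
pulsar.zookeeper-uri=localhost:2181
# minimum number of entries to read at a single time
pulsar.entry-read-batch-size=100
# default number of splits to use per query
pulsar.target-num-splits=4
"""

if __name__ == "__main__":
    for key, value in parse_properties(DEFAULTS).items():
        print(f"{key} = {value}")
```

To check a real deployment, you would pass the contents of your own pulsar.properties file instead of `DEFAULTS`.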
You can connect Presto to a Pulsar cluster with multiple hosts. To configure multiple hosts for brokers, add multiple URLs to pulsar.broker-service-url. To configure multiple hosts for ZooKeeper, add multiple URIs to pulsar.zookeeper-uri. The following is an example.
pulsar.broker-service-url=http://localhost:8080,localhost:8081,localhost:8082
pulsar.zookeeper-uri=localhost1,localhost2:2181
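If you keep host lists in a script or inventory, the two values can be assembled from them. The sketch below only mirrors the format of the example above (a single scheme prefix followed by comma-separated hosts). The function names `broker_service_url` and `zookeeper_uri` are hypothetical and are not part of any Pulsar API.

```python
def broker_service_url(hosts, scheme="http"):
    """Join broker hosts into one value, with the scheme on the first host only."""
    if not hosts:
        raise ValueError("at least one broker host is required")
    return f"{scheme}://" + ",".join(hosts)


def zookeeper_uri(hosts):
    """Join ZooKeeper hosts into one comma-separated value."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return ",".join(hosts)


if __name__ == "__main__":
    brokers = ["localhost:8080", "localhost:8081", "localhost:8082"]
    zookeepers = ["localhost1", "localhost2:2181"]
    print(f"pulsar.broker-service-url={broker_service_url(brokers)}")
    print(f"pulsar.zookeeper-uri={zookeeper_uri(zookeepers)}")
```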
If you already have a Presto cluster, you can copy the Presto Pulsar connector plugin to your existing cluster. Download the archived plugin package with the following command.
$ wget pulsar:binary_release_url
Since Pulsar SQL is powered by Presto, the configuration for deployment is the same for the Pulsar SQL worker.
Note
To set up a standalone single-node environment, refer to Query data.
The sql-worker command accepts the same CLI arguments as the Presto launcher.
$ ./bin/pulsar sql-worker --help
Usage: launcher [options] command

Commands: run, start, stop, restart, kill, status

Options:
  -h, --help                  show this help message and exit
  -v, --verbose               Run verbosely
  --etc-dir=DIR               Defaults to INSTALL_PATH/etc
  --launcher-config=FILE      Defaults to INSTALL_PATH/bin/launcher.properties
  --node-config=FILE          Defaults to ETC_DIR/node.properties
  --jvm-config=FILE           Defaults to ETC_DIR/jvm.config
  --config=FILE               Defaults to ETC_DIR/config.properties
  --log-levels-file=FILE      Defaults to ETC_DIR/log.properties
  --data-dir=DIR              Defaults to INSTALL_PATH
  --pid-file=FILE             Defaults to DATA_DIR/var/run/launcher.pid
  --launcher-log-file=FILE    Defaults to DATA_DIR/var/log/launcher.log (only in daemon mode)
  --server-log-file=FILE      Defaults to DATA_DIR/var/log/server.log (only in daemon mode)
  -D NAME=VALUE               Set a Java system property
The default configuration for the cluster is located in ${project.root}/conf/presto. You can customize your deployment by modifying the default configuration.
You can set the worker to read from a different configuration directory, or set a different directory to write data.
$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
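When several workers run on one host with separate directories, it can help to build the command line in a script. The following is a minimal sketch under stated assumptions: `sql_worker_command` is a hypothetical helper, and it uses only the launcher commands and the `--etc-dir` and `--data-dir` options shown above. It builds the argument list but does not start anything.

```python
import shlex

# Launcher commands listed in the --help output above.
LAUNCHER_COMMANDS = {"run", "start", "stop", "restart", "kill", "status"}


def sql_worker_command(command, etc_dir=None, data_dir=None, pulsar_bin="./bin/pulsar"):
    """Build the argument list for `pulsar sql-worker` without executing it."""
    if command not in LAUNCHER_COMMANDS:
        raise ValueError(f"unknown launcher command: {command!r}")
    argv = [pulsar_bin, "sql-worker", command]
    if etc_dir is not None:
        argv += ["--etc-dir", etc_dir]
    if data_dir is not None:
        argv += ["--data-dir", data_dir]
    return argv


if __name__ == "__main__":
    argv = sql_worker_command(
        "run",
        etc_dir="/tmp/incubator-pulsar/conf/presto",
        data_dir="/tmp/presto-1",
    )
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run` if you want the script to launch the worker itself.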
You can start the worker as a daemon process.
$ ./bin/pulsar sql-worker start
You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a three-node cluster.
The first node runs as the Presto coordinator. The minimal configuration required in the ${project.root}/conf/presto/config.properties file is as follows.
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=<coordinator-url>
The other two nodes serve as worker nodes. You can use the following configuration for the worker nodes.
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=<coordinator-url>
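The coordinator and worker files differ only in the coordinator flag and two coordinator-only settings, so both can be generated from one template. The following is a minimal sketch that reproduces the two listings above. `render_config` is a hypothetical helper, and `http://coordinator.example:8080` is a placeholder address standing in for your coordinator URL.

```python
def render_config(coordinator, discovery_uri, http_port=8080,
                  max_memory="50GB", max_memory_per_node="1GB"):
    """Render a minimal config.properties for a coordinator or a worker node."""
    lines = [f"coordinator={'true' if coordinator else 'false'}"]
    if coordinator:
        # Coordinator-only: let the coordinator also schedule work on itself.
        lines.append("node-scheduler.include-coordinator=true")
    lines += [
        f"http-server.http.port={http_port}",
        f"query.max-memory={max_memory}",
        f"query.max-memory-per-node={max_memory_per_node}",
    ]
    if coordinator:
        # Coordinator-only: run the embedded discovery server.
        lines.append("discovery-server.enabled=true")
    lines.append(f"discovery.uri={discovery_uri}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    uri = "http://coordinator.example:8080"  # placeholder, replace with your coordinator URL
    print("# coordinator")
    print(render_config(True, uri))
    print("# worker")
    print(render_config(False, uri))
```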
Modify the pulsar.broker-service-url and pulsar.zookeeper-uri settings in the ${project.root}/conf/presto/catalog/pulsar.properties file accordingly on each of the three nodes.
Start the coordinator node.
$ ./bin/pulsar sql-worker run
Start the worker nodes.
$ ./bin/pulsar sql-worker run
Connect to the cluster with the SQL client and check the status of the nodes.
$ ./bin/pulsar sql --server <coordinator-url>
presto> SELECT * FROM system.runtime.nodes;
 node_id |        http_uri         | node_version | coordinator | state
---------+-------------------------+--------------+-------------+--------
 1       | http://192.168.2.1:8081 | testversion  | true        | active
 3       | http://192.168.2.2:8081 | testversion  | false       | active
 2       | http://192.168.2.3:8081 | testversion  | false       | active
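The node listing can also be checked by a script, for example in a deployment smoke test. The sketch below parses the plain-text table as the CLI printed it above. It assumes that exact layout (a header row, a dashed separator, and `|`-delimited cells). `parse_nodes` and `inactive_nodes` are hypothetical helpers, not part of Presto or Pulsar.

```python
def parse_nodes(table_text):
    """Parse the CLI's '|'-delimited table into a list of dicts keyed by column name."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        # Skip the separator row, which consists only of '-' and '+'.
        if set(ln.strip()) <= set("-+"):
            continue
        cells = [cell.strip() for cell in ln.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def inactive_nodes(rows):
    """Return the http_uri of every node whose state is not 'active'."""
    return [row["http_uri"] for row in rows if row["state"] != "active"]


# Sample output copied from the query above.
SAMPLE = """\
 node_id |        http_uri         | node_version | coordinator | state
---------+-------------------------+--------------+-------------+--------
 1       | http://192.168.2.1:8081 | testversion  | true        | active
 3       | http://192.168.2.2:8081 | testversion  | false       | active
 2       | http://192.168.2.3:8081 | testversion  | false       | active
"""

if __name__ == "__main__":
    nodes = parse_nodes(SAMPLE)
    print(f"{len(nodes)} nodes, inactive: {inactive_nodes(nodes)}")
```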
For more information about deployment in Presto, refer to Presto deployment.