| --- |
| title: "Deploying Distributed Cluster" |
| sidebar_position: 3 |
| --- |
| |
| # Deploying Distributed Cluster |
| |
| This page describes how to deploy a *distributed cluster* for Fluss on bare-metal machines. |
| |
| |
| ## Requirements |
| |
| ### Hardware Requirements |
| |
| Fluss runs on all *UNIX-like environments*, such as **Linux** and **macOS**. |
| To build a distributed cluster, you need at least two nodes. |
| This guide walks through a simple example that deploys a distributed cluster on four nodes. |
| |
| ### Software Requirements |
| |
| Before you start to set up the system, make sure you have installed **Java 17** or higher **on each node** in your cluster. |
| Java 8 and Java 11 are not recommended. |
| |
| Additionally, you need a running **ZooKeeper** cluster with version 3.6.0 or higher. |
| We do not recommend using ZooKeeper versions below 3.6.0. |
| For further information on how to deploy a distributed ZooKeeper cluster, see [Running Replicated ZooKeeper](https://zookeeper.apache.org/doc/r3.6.0/zookeeperStarted.html#sc_RunningReplicatedZooKeeper). |
| |
| If your cluster does not fulfill these software requirements, install or upgrade the missing components before continuing. |
| |
| ### `JAVA_HOME` Configuration |
| |
| Fluss requires the `JAVA_HOME` environment variable to be set on all nodes and point to the directory of your Java installation. |
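
The requirement above can be checked on each node before starting any server. The following is a minimal sketch, not part of the Fluss distribution: `check_java_home` is a hypothetical helper name, and the check only verifies that `JAVA_HOME` is set and contains an executable `bin/java`.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not shipped with Fluss): returns 0 if JAVA_HOME is set
# and contains an executable bin/java, and 1 otherwise.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "${JAVA_HOME}/bin/java" ]; then
    echo "JAVA_HOME=${JAVA_HOME} does not contain an executable bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME looks valid: ${JAVA_HOME}"
}
```

If the check fails, export `JAVA_HOME` in the shell profile of the user that runs Fluss, pointing it at the root directory of your Java installation, and run `check_java_home` again.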
| |
| ## Fluss Setup |
| |
| This section describes how to set up a Fluss cluster consisting of one CoordinatorServer and multiple TabletServers |
| across four machines. Suppose you have four nodes in a `192.168.10.0/24` subnet with the following IP address assignment: |
| - Node0: `192.168.10.100` |
| - Node1: `192.168.10.101` |
| - Node2: `192.168.10.102` |
| - Node3: `192.168.10.103` |
| |
| Node0 runs the CoordinatorServer instance. Node1, Node2, and Node3 each run one TabletServer instance. |
| |
| ### Preparation |
| |
| 1. Make sure ZooKeeper has been deployed. We assume that ZooKeeper listens on `192.168.10.199:2181`. |
| |
| 2. Download Fluss. |
| |
| Go to the [downloads page](/downloads) and download the latest Fluss release. Then copy the archive to all nodes and extract it on each of them: |
| |
| ```shell |
| tar -xzf fluss-$FLUSS_VERSION$-bin.tgz |
| cd fluss-$FLUSS_VERSION$/ |
| ``` |
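
Copying and extracting the archive on four nodes by hand is repetitive, so the step can be scripted. The sketch below assumes passwordless SSH from your workstation to every node and an existing target directory on each of them. The names `run`, `distribute_archive`, `NODES`, and `DRY_RUN` are hypothetical and not part of Fluss. By default the script only prints the `scp` and `ssh` commands it would run.

```shell
#!/usr/bin/env sh
# Hypothetical helper: copy a Fluss archive to every node and extract it there.
# Node addresses are the ones used in this example.
NODES="192.168.10.100 192.168.10.101 192.168.10.102 192.168.10.103"
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

distribute_archive() {
  archive="$1"      # local path to the downloaded .tgz
  target_dir="$2"   # directory on each node, assumed to exist
  for node in $NODES; do
    run scp "$archive" "${node}:${target_dir}/"
    run ssh "$node" "tar -xzf ${target_dir}/$(basename "$archive") -C ${target_dir}"
  done
}
```

Call `distribute_archive <archive> <target-dir>` first in dry-run mode to review the commands, then set `DRY_RUN=0` to execute them.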
| |
| ### Configuring Fluss |
| |
| After extracting the archive, you need to configure Fluss for a distributed deployment. |
| We will use the _default config file_ (`conf/server.yaml`) to configure Fluss. |
| Adapt the `server.yaml` on each node as follows. |
| |
| **Node0** |
| |
| ```yaml title="server.yaml" |
| # coordinator server |
| bind.listeners: FLUSS://192.168.10.100:9123 |
| |
| zookeeper.address: 192.168.10.199:2181 |
| zookeeper.path.root: /fluss |
| |
| # When running in distributed mode, be sure to point to a remote path— |
| # e.g. oss://bucket/path for OSS or hdfs://namenode:port/path for HDFS. |
| # Otherwise, queries will fail with a “No such file or directory” error. |
| remote.data.dir: hdfs://namenode:port/tmp/fluss-remote-data |
| ``` |
| |
| **Node1** |
| |
| ```yaml title="server.yaml" |
| # tablet server |
| bind.listeners: FLUSS://192.168.10.101:9123 # alternatively, setting the port to 0 assigns a random port |
| tablet-server.id: 1 |
| |
| zookeeper.address: 192.168.10.199:2181 |
| zookeeper.path.root: /fluss |
| |
| # When running in distributed mode, be sure to point to a remote path— |
| # e.g. oss://bucket/path for OSS or hdfs://namenode:port/path for HDFS. |
| # Otherwise, queries will fail with a “No such file or directory” error. |
| remote.data.dir: hdfs://namenode:port/tmp/fluss-remote-data |
| ``` |
| |
| **Node2** |
| |
| ```yaml title="server.yaml" |
| # tablet server |
| bind.listeners: FLUSS://192.168.10.102:9123 # alternatively, setting the port to 0 assigns a random port |
| tablet-server.id: 2 |
| |
| zookeeper.address: 192.168.10.199:2181 |
| zookeeper.path.root: /fluss |
| |
| # When running in distributed mode, be sure to point to a remote path— |
| # e.g. oss://bucket/path for OSS or hdfs://namenode:port/path for HDFS. |
| # Otherwise, queries will fail with a “No such file or directory” error. |
| remote.data.dir: hdfs://namenode:port/tmp/fluss-remote-data |
| ``` |
| |
| **Node3** |
| |
| ```yaml title="server.yaml" |
| # tablet server |
| bind.listeners: FLUSS://192.168.10.103:9123 # alternatively, setting the port to 0 assigns a random port |
| tablet-server.id: 3 |
| |
| zookeeper.address: 192.168.10.199:2181 |
| zookeeper.path.root: /fluss |
| |
| # When running in distributed mode, be sure to point to a remote path— |
| # e.g. oss://bucket/path for OSS or hdfs://namenode:port/path for HDFS. |
| # Otherwise, queries will fail with a “No such file or directory” error. |
| remote.data.dir: hdfs://namenode:port/tmp/fluss-remote-data |
| ``` |
| |
| :::note |
| - `tablet-server.id` is the unique id of the TabletServer. If you have multiple TabletServers, you should set a different id for each TabletServer. |
| - In this example, we only set the mandatory properties. For additional properties, refer to [Configuration](maintenance/configuration.md). |
| ::: |
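
The three TabletServer configurations differ only in `bind.listeners` and `tablet-server.id`, so they can be generated from one template. The sketch below is one way to do that. `render_tablet_server_yaml` and the `server-node<N>.yaml` output file names are hypothetical, and the ZooKeeper address and `remote.data.dir` value are the example values from this page, which you would adapt to your environment.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print a TabletServer server.yaml for one node.
# Arguments: <tablet-server id> <node IP>
render_tablet_server_yaml() {
  ts_id="$1"
  ts_host="$2"
  cat <<EOF
# tablet server
bind.listeners: FLUSS://${ts_host}:9123
tablet-server.id: ${ts_id}

zookeeper.address: 192.168.10.199:2181
zookeeper.path.root: /fluss

remote.data.dir: hdfs://namenode:port/tmp/fluss-remote-data
EOF
}

# Write one file per TabletServer node (Node1 to Node3) into the current directory.
for id in 1 2 3; do
  render_tablet_server_yaml "$id" "192.168.10.10${id}" > "server-node${id}.yaml"
done
```

Each generated file would then be copied to `conf/server.yaml` on the matching node. Deriving the id and the IP address from the same loop variable keeps every `tablet-server.id` unique.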
| |
| ### Starting Fluss |
| |
| To deploy a distributed Fluss cluster, you should first start a CoordinatorServer instance on **Node0**. |
| Then, start a TabletServer instance on **Node1**, **Node2**, and **Node3**, respectively. |
| |
| **CoordinatorServer** |
| |
| On **Node0**, start a CoordinatorServer as follows. |
| ```shell |
| ./bin/coordinator-server.sh start |
| ``` |
| |
| **TabletServer** |
| |
| On **Node1**, **Node2** and **Node3**, start a TabletServer as follows. |
| ```shell |
| ./bin/tablet-server.sh start |
| ``` |
| |
| After that, you have successfully deployed a distributed Fluss cluster. |
| |
| ## Interacting with Fluss |
| |
| After the Fluss cluster is started, you can use a **Fluss client** (e.g., the Flink SQL Client) to interact with Fluss. |
| The following subsections show how to use the Flink SQL Client to interact with Fluss. |
| |
| ### Flink SQL Client |
| |
| This section uses the Flink SQL Client to interact with Fluss. |
| |
| #### Preparation |
| |
| To start a Flink standalone cluster, refer to [Flink Environment Preparation](engine-flink/getting-started.md#preparation-when-using-flink-sql-client). |
| |
| **Note**: Make sure the [Fluss connector jar](/downloads/) has already been copied to the `lib` directory of your Flink home. |
| |
| #### Add catalog |
| |
| In the Flink SQL Client, create and name a catalog by executing the following statement: |
| ```sql title="Flink SQL" |
| CREATE CATALOG fluss_catalog WITH ( |
| 'type' = 'fluss', |
| 'bootstrap.servers' = '192.168.10.100:9123' |
| ); |
| ``` |
| |
| #### Do more with Fluss |
| |
| After the catalog is created, you can use the Flink SQL Client to do more with Fluss, for example, create a table, insert data, or query data. |
| For more details, refer to [Flink Getting Started](engine-flink/getting-started.md). |