---
title: "Standalone Cluster"
nav-parent_id: deployment
nav-pos: 1
---

This page provides instructions on how to run Flink in a fully distributed fashion on a static (but possibly heterogeneous) cluster.

* This will be replaced by the TOC
{:toc}

## Requirements

### Software Requirements

Flink runs on all UNIX-like environments, e.g. Linux, Mac OS X, Cygwin (for Windows), and Windows Subsystem for Linux (for Windows 10), and expects the cluster to consist of one master node and one or more worker nodes. Before you start to set up the system, make sure you have the following software installed on each node:

- Java 1.8.x or higher
- `ssh` (`sshd` must be running to use the Flink scripts that manage remote components)

If your cluster does not fulfill these software requirements, you will need to install/upgrade it.
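
A quick way to verify both requirements on a node is sketched below (how `sshd` is managed varies by distribution, so adapt the check as needed):

{% highlight bash %}
# Java must be 1.8.x or higher
java -version

# an SSH daemon must be running for the Flink management scripts
pgrep -x sshd && echo "sshd is running"
{% endhighlight %}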

Having passwordless SSH and the same directory structure on all your cluster nodes will allow you to use our scripts to control everything.
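
One common way to set up passwordless SSH is an SSH key pair, as in this minimal sketch (run on the master node; `worker1` and `worker2` are placeholder worker hostnames):

{% highlight bash %}
# generate a key pair on the master (default path, empty passphrase)
ssh-keygen -t rsa

# install the public key on each worker node (placeholder hostnames)
ssh-copy-id worker1
ssh-copy-id worker2
{% endhighlight %}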

{% top %}

### `JAVA_HOME` Configuration

Flink requires the `JAVA_HOME` environment variable to be set on the master and all worker nodes, pointing to the directory of your Java installation.

You can set this variable in `conf/flink-conf.yaml` via the `env.java.home` key.
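
For example, a sketch with a placeholder installation path:

{% highlight yaml %}
# conf/flink-conf.yaml
env.java.home: /usr/lib/jvm/java-8-openjdk   # placeholder path to your Java installation
{% endhighlight %}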

{% top %}

## Flink Setup

Go to the [downloads page]({{ site.download_url }}) and get the ready-to-run package. Make sure to pick the Flink package matching your Hadoop version. If you don't plan to use Hadoop, pick any version.

After downloading the latest release, copy the archive to your master node and extract it:

{% highlight bash %}
tar xzf flink-*.tgz
cd flink-*
{% endhighlight %}

### Configuring Flink

After extracting the system files, you need to configure Flink for the cluster by editing `conf/flink-conf.yaml`.

Set the `jobmanager.rpc.address` key to point to your master node. You should also define the maximum amount of main memory the JVM is allowed to allocate on each node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys.
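
For example, a minimal sketch (the hostname and heap sizes are placeholders to adjust to your cluster):

{% highlight yaml %}
# conf/flink-conf.yaml -- example values, adjust to your cluster
jobmanager.rpc.address: master   # hostname or IP of the master node
jobmanager.heap.mb: 1024         # JobManager JVM heap size, in MB
taskmanager.heap.mb: 2048        # TaskManager JVM heap size, in MB
{% endhighlight %}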

Currently, all TaskManagers are started with the same resource specification and JVM arguments (`-Xms`, `-Xmx`, `-Xmn`, `-XX:MaxDirectMemorySize`, etc.). See the TaskManager Resource documentation for how the TaskManager resources are configured. If some worker nodes have more main memory and you want to override the heap memory settings there, note that the `FLINK_TM_HEAP` environment variable will not take effect; we recommend starting multiple TaskManager instances on those specific nodes instead.

Finally, you must provide a list of all nodes in your cluster that shall be used as worker nodes. Similar to the HDFS configuration, edit the file `conf/slaves` and enter the IP/host name of each worker node. Each worker node will later run a TaskManager.

The following example illustrates the setup with three nodes (with IP addresses `10.0.0.1` to `10.0.0.3` and hostnames `master`, `worker1`, `worker2`) and shows the contents of the configuration files (which need to be accessible at the same path on all machines):
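
For that setup, the two configuration files would contain the following (a minimal sketch; `/path/to/flink/` is a placeholder for your installation path):

`/path/to/flink/conf/flink-conf.yaml`:

{% highlight yaml %}
jobmanager.rpc.address: 10.0.0.1
{% endhighlight %}

`/path/to/flink/conf/slaves`:

{% highlight bash %}
10.0.0.2
10.0.0.3
{% endhighlight %}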

The Flink directory must be available on every worker under the same path. You can use a shared NFS directory, or copy the entire Flink directory to every worker node.
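
If you do not use a shared directory, one way to distribute the installation is `rsync`, sketched here with the placeholder hosts and path from above:

{% highlight bash %}
# copy the Flink directory from the master to every worker (placeholders)
for host in worker1 worker2; do
  rsync -a /path/to/flink/ "$host:/path/to/flink/"
done
{% endhighlight %}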

Please see the configuration page for details and additional configuration options.

In particular,

- the amount of available memory per JobManager (`jobmanager.heap.mb`),
- the amount of available memory per TaskManager (`taskmanager.heap.mb`),
- the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`),
- the total number of CPUs in the cluster (`parallelism.default`), and
- the temporary directories (`taskmanager.tmp.dirs`)

are very important configuration values.
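
Putting these together, a `conf/flink-conf.yaml` fragment might look like the following (a sketch with placeholder sizing; adjust the values to your machines):

{% highlight yaml %}
# conf/flink-conf.yaml -- example sizing, adjust to your hardware
jobmanager.heap.mb: 1024           # memory per JobManager, in MB
taskmanager.heap.mb: 2048          # memory per TaskManager, in MB
taskmanager.numberOfTaskSlots: 4   # typically the number of CPU cores per machine
parallelism.default: 8             # e.g. the total number of CPUs in the cluster
taskmanager.tmp.dirs: /tmp/flink   # placeholder temporary directory
{% endhighlight %}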

{% top %}

### Starting Flink

The following script starts a JobManager on the local node and connects via SSH to all worker nodes listed in the `conf/slaves` file to start a TaskManager on each node. Afterwards, your Flink system is up and running, and the JobManager running on the local node will accept jobs at the configured RPC port.

Assuming that you are on the master node and inside the Flink directory:

{% highlight bash %}
bin/start-cluster.sh
{% endhighlight %}

To stop Flink, there is also a `stop-cluster.sh` script.
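
Run it on the master node, again from inside the Flink directory:

{% highlight bash %}
bin/stop-cluster.sh
{% endhighlight %}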

{% top %}

### Adding JobManager/TaskManager Instances to a Cluster

You can add both JobManager and TaskManager instances to your running cluster with the `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts.

#### Adding a JobManager

{% highlight bash %}
bin/jobmanager.sh ((start|start-foreground) cluster)|stop|stop-all
{% endhighlight %}

#### Adding a TaskManager

{% highlight bash %}
bin/taskmanager.sh start|start-foreground|stop|stop-all
{% endhighlight %}

Make sure to call these scripts on the hosts on which you want to start/stop the respective instance.
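
For example, to add a TaskManager on a new worker node, either log in and start it there, or trigger it remotely over SSH (a sketch; `worker3` and the path are placeholders):

{% highlight bash %}
# on the new worker itself, from the Flink directory
bin/taskmanager.sh start

# or remotely from the master (placeholder hostname and path)
ssh worker3 "cd /path/to/flink && bin/taskmanager.sh start"
{% endhighlight %}

If you want `bin/start-cluster.sh` and `bin/stop-cluster.sh` to manage the new node as well, also add it to `conf/slaves`.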

{% top %}