This document shows you how to install Heron on Kubernetes in a step-by-step, “by hand” fashion. An easier way to install Heron on Kubernetes is to use the Helm package manager. For instructions on doing so, see Heron on Kubernetes with Helm.
Heron supports deployment on Kubernetes (sometimes called k8s). Heron deployments on Kubernetes use Docker as the containerization format for Heron topologies and use the Kubernetes API for scheduling.
You can use Heron on Kubernetes in multiple environments: locally on Minikube, on Google Container Engine (GKE), or on any other Kubernetes cluster.
In order to run Heron on Kubernetes, you will need:
The kubectl CLI tool, installed and set up to communicate with your cluster
The heron CLI tool
Any additional requirements will depend on where you're running Heron on Kubernetes.
When deploying to Kubernetes, each Heron container is deployed as a Kubernetes pod inside of a Docker container. If a topology is deployed with 20 containers, for example, then 20 pods will be deployed to your Kubernetes cluster for that topology.
Minikube enables you to run a Kubernetes cluster locally on a single machine.
To run Heron on Minikube you'll need to install Minikube in addition to the other requirements listed above.
First you'll need to start up Minikube using the minikube start command. We recommend starting Minikube with 7 GB of memory, 5 CPUs, and 20 GB of disk space. This command will accomplish precisely that:
$ minikube start \
  --memory=7168 \
  --cpus=5 \
  --disk-size=20G
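As a sanity check on the numbers: a plain value passed to --memory is interpreted by Minikube as megabytes, so 7 GB becomes 7 × 1024 = 7168. The sketch below rebuilds the command from sizes given in gigabytes. The function name minikube_start_cmd is made up for illustration, and it only prints the command rather than running it.

```shell
# Print a `minikube start` command for the given resources.
# $1 = memory in GB, $2 = CPU count, $3 = disk size in GB
# (hypothetical helper, not part of Minikube or Heron)
minikube_start_cmd() {
  # --memory takes megabytes when no unit is given, so convert GB to MB.
  echo "minikube start --memory=$(( $1 * 1024 )) --cpus=$2 --disk-size=${3}G"
}

minikube_start_cmd 7 5 20
```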
There are a variety of Heron components that you'll need to start up separately and in order. Make sure that the necessary pods are up and in the Running state before moving on to the next step. You can track the progress of the pods using this command:
$ kubectl get pods -w
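If you'd rather script the wait than watch the output, one option is to poll kubectl get pods and count the pods that aren't yet running. This is a minimal sketch, not part of Heron: the helper names count_not_running and wait_for_running are hypothetical, and it assumes the default column layout of kubectl get pods --no-headers, where STATUS is the third column.

```shell
# Count pods whose STATUS column (third field) is not "Running".
# Reads `kubectl get pods --no-headers` output on stdin.
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Poll every 5 seconds until no pod in the current namespace
# reports a status other than Running.
wait_for_running() {
  while [ "$(kubectl get pods --no-headers | count_not_running)" -gt 0 ]; do
    sleep 5
  done
}
```

Calling wait_for_running between steps blocks until every pod in the current namespace reports Running. It also returns immediately if kubectl prints nothing, so errors from kubectl itself need to be checked separately.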
Heron uses ZooKeeper for a variety of coordination- and configuration-related tasks. To start up ZooKeeper on Minikube:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/minikube/zookeeper.yaml
When running Heron on Kubernetes, Apache BookKeeper is used for things like topology artifact storage. You can start up BookKeeper using this command:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/minikube/bookkeeper.yaml
The so-called “Heron tools” include the Heron UI and the Heron Tracker. To start up the Heron tools:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/minikube/tools.yaml
The Heron API server is the endpoint that the Heron CLI client uses to interact with the other components of Heron. To start up the Heron API server on Minikube:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/minikube/apiserver.yaml
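The four Minikube manifests above live under one base URL and are applied in a fixed order. As a rough sketch, a small helper can print them in that order. The names heron_manifests and HERON_K8S_BASE are illustrative, and the helper assumes the platform directory contains all four files, which holds for the minikube and general directories referenced in this document but not for gcp, where the API server manifests are named differently.

```shell
# Base URL of the manifests referenced in this document.
HERON_K8S_BASE="https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes"

# Print the manifest URLs for one platform directory in startup order.
# $1 = platform directory, e.g. "minikube" or "general"
# (hypothetical helper; it prints URLs and does not contact the cluster)
heron_manifests() {
  for component in zookeeper bookkeeper tools apiserver; do
    echo "${HERON_K8S_BASE}/$1/${component}.yaml"
  done
}

heron_manifests minikube
```

Each printed URL can then be passed to kubectl create -f, waiting for the new pods to come up before moving on to the next one.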
Once all of the components have been successfully started up, you need to open up a proxy port to your Minikube Kubernetes cluster using the kubectl proxy command:
$ kubectl proxy -p 8001
Note: All of the following Kubernetes-specific URLs are valid as of the Kubernetes 1.10.0 release.
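The proxy URLs used in this document all follow the same pattern: the local proxy port, the namespace, then the service name and port, followed by /proxy. A hypothetical helper makes the pattern explicit. Note that proxy_url is not a Heron or Kubernetes command, and the defaults for namespace and port are simply the values used in this document.

```shell
# Build the kubectl-proxy URL for a Kubernetes service.
# $1 = service name, $2 = service port,
# $3 = namespace (defaults to "default"), $4 = local proxy port (defaults to 8001)
proxy_url() {
  echo "http://localhost:${4:-8001}/api/v1/namespaces/${3:-default}/services/${1}:${2}/proxy"
}

proxy_url heron-apiserver 9000
echo "$(proxy_url heron-ui 8889)/topologies"
```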
Now, verify that the Heron API server running on Minikube is available using curl:
$ curl http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy/api/v1/version
You should get a JSON response like this:
{ "heron.build.git.revision" : "ddbb98bbf173fb082c6fd575caaa35205abe34df", "heron.build.git.status" : "Clean", "heron.build.host" : "ci-server-01", "heron.build.time" : "Sat Mar 31 09:27:19 UTC 2018", "heron.build.timestamp" : "1522488439000", "heron.build.user" : "release-agent", "heron.build.version" : "0.17.8" }
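To pull just the version out of that response in a script, a sed expression is enough for this flat JSON. This is a sketch under the assumption that the response keeps the key and value layout shown above. The name heron_version is made up, and a real JSON parser would be more robust.

```shell
# Extract the value of "heron.build.version" from the version response.
# Reads the JSON on stdin and prints the version string, or nothing if the key is absent.
heron_version() {
  sed -n 's/.*"heron\.build\.version" *: *"\([^"]*\)".*/\1/p'
}

# Example with a shortened copy of the response shown above.
echo '{ "heron.build.git.status" : "Clean", "heron.build.version" : "0.17.8" }' | heron_version
```

Piping the curl command above into heron_version would print only the version string.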
Success! You can now manage Heron topologies on your Minikube Kubernetes installation. To submit an example topology to the cluster:
$ heron submit kubernetes \
  --service-url=http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy \
  ~/.heron/examples/heron-api-examples.jar \
  org.apache.heron.examples.api.AckingTopology acking
You can also track the progress of the Kubernetes pods that make up the topology. When you run kubectl get pods, you should see pods with names like acking-0 and acking-1.
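To list only the pods that belong to one topology, you can filter the kubectl get pods output by name. The sketch below assumes pods are named after the topology followed by a numeric index, as in the acking-0 and acking-1 examples. The helper name topology_pods is hypothetical.

```shell
# Keep only the pod names that belong to one topology.
# Reads `kubectl get pods --no-headers` output on stdin; $1 = topology name.
# Assumes pod names have the form <topology>-<index>.
topology_pods() {
  awk -v name="$1" '$1 ~ ("^" name "-[0-9]+$") { print $1 }'
}
```

For example, kubectl get pods --no-headers | topology_pods acking would print one pod name per line for the acking topology and skip the ZooKeeper, BookKeeper, and tools pods.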
Another option is to set the service URL for Heron using the heron config command:
$ heron config kubernetes set service_url \
  http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy
That would enable you to manage topologies without setting the --service-url flag.
The Heron UI is an in-browser dashboard that you can use to monitor your Heron topologies. It should already be running in Minikube.
You can access Heron UI in your browser by navigating to http://localhost:8001/api/v1/namespaces/default/services/heron-ui:8889/proxy/topologies.
You can use Google Kubernetes Engine (GKE, formerly named Google Container Engine) to run Kubernetes clusters on Google Cloud Platform.
To run Heron on GKE, you'll need to create a Kubernetes cluster with at least three nodes. This command would create a three-node cluster in your default Google Cloud Platform zone and project:
$ gcloud container clusters create heron-gke-cluster \
  --machine-type=n1-standard-4 \
  --num-nodes=3
You can specify a non-default zone and/or project using the --zone and --project flags, respectively.
Once the cluster is up and running, enable your local kubectl to interact with the cluster by fetching your GKE cluster's credentials:
$ gcloud container clusters get-credentials heron-gke-cluster
Fetching cluster endpoint and auth data.
kubeconfig entry generated for heron-gke-cluster.
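Before moving on, it can be worth confirming that kubectl now sees at least three ready nodes. The sketch below counts them from kubectl get nodes --no-headers output. The helper count_ready_nodes is hypothetical, and it assumes STATUS is the second column, as in kubectl's default output.

```shell
# Count nodes whose STATUS column (second field) is exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

Running kubectl get nodes --no-headers | count_ready_nodes against the three-node cluster created above should report 3 once all nodes are up.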
Finally, you need to create a Kubernetes secret that specifies the Cloud Platform connection credentials for your service account. First, download your Cloud Platform credentials as a JSON file, say key.json. This command creates a new key for your service account and saves it as key.json:
$ gcloud iam service-accounts keys create key.json \
  --iam-account=YOUR-ACCOUNT
Heron on Google Container Engine supports two static file storage options for topology artifacts: Google Cloud Storage and Apache BookKeeper.
If you're running Heron on GKE, you can use either Google Cloud Storage or Apache BookKeeper for topology artifact storage.
If you'd like to use BookKeeper instead of Google Cloud Storage, skip to the BookKeeper section below.
To use Google Cloud Storage for artifact storage, you'll need to create a Google Cloud Storage bucket. Here's an example bucket creation command using gsutil:
$ gsutil mb gs://my-heron-bucket
Cloud Storage bucket names must be globally unique, so choose your bucket name carefully. Once you've created a bucket, you need to create a Kubernetes ConfigMap that specifies the bucket name. Here's an example, in which BUCKET-NAME stands for the name of your bucket:
$ kubectl create configmap heron-apiserver-config \
  --from-literal=gcs.bucket=BUCKET-NAME
You can list your current service accounts using the gcloud iam service-accounts list command.
Then you can create the secret like this:
$ kubectl create secret generic heron-gcs-key \
  --from-file=key.json=key.json
Once you've created a bucket, a ConfigMap, and a secret, you can move on to starting up the various components of your Heron installation.
There are a variety of Heron components that you'll need to start up separately and in order. Make sure that the necessary pods are up and in the Running state before moving on to the next step. You can track the progress of the pods using this command:
$ kubectl get pods -w
Heron uses ZooKeeper for a variety of coordination- and configuration-related tasks. To start up ZooKeeper on your GKE cluster:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp/zookeeper.yaml
If you're using Google Cloud Storage for topology artifact storage, skip to the Heron tools section below.
To start up an Apache BookKeeper cluster for Heron:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp/bookkeeper.yaml
The so-called “Heron tools” include the Heron UI and the Heron Tracker. To start up the Heron tools:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp/tools.yaml
The Heron API server is the endpoint that the Heron CLI client uses to interact with the other components of Heron. Heron on Google Container Engine has two separate versions of the Heron API server that you can run depending on which artifact storage system you're using (Google Cloud Storage or Apache BookKeeper).
If you're using Google Cloud Storage:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp/gcs-apiserver.yaml
If you're using Apache BookKeeper:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp/bookkeeper-apiserver.yaml
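Since the choice of API server manifest follows directly from the storage backend, a script can select it with a simple case statement. This is only a sketch: apiserver_manifest and the backend labels gcs and bookkeeper are names chosen here for illustration, while the two URLs are the ones given above.

```shell
# Print the GKE API server manifest URL for a storage backend.
# $1 = "gcs" (Google Cloud Storage) or "bookkeeper" (Apache BookKeeper)
# Returns 1 and prints an error to stderr for anything else.
apiserver_manifest() {
  base="https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/gcp"
  case "$1" in
    gcs)        echo "${base}/gcs-apiserver.yaml" ;;
    bookkeeper) echo "${base}/bookkeeper-apiserver.yaml" ;;
    *)          echo "unknown storage backend: $1" >&2; return 1 ;;
  esac
}

apiserver_manifest gcs
```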
Once all of the components have been successfully started up, you need to open up a proxy port to your GKE Kubernetes cluster using the kubectl proxy command:
$ kubectl proxy -p 8001
Note: All of the following Kubernetes-specific URLs are valid as of the Kubernetes 1.10.0 release.
Now, verify that the Heron API server running on GKE is available using curl:
$ curl http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy/api/v1/version
You should get a JSON response like this:
{ "heron.build.git.revision" : "bf9fe93f76b895825d8852e010dffd5342e1f860", "heron.build.git.status" : "Clean", "heron.build.host" : "ci-server-01", "heron.build.time" : "Sun Oct 1 20:42:18 UTC 2017", "heron.build.timestamp" : "1506890538000", "heron.build.user" : "release-agent1", "heron.build.version" : "0.16.2" }
Success! You can now manage Heron topologies on your GKE Kubernetes installation. To submit an example topology to the cluster:
$ heron submit kubernetes \
  --service-url=http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy \
  ~/.heron/examples/heron-api-examples.jar \
  org.apache.heron.examples.api.AckingTopology acking
You can also track the progress of the Kubernetes pods that make up the topology. When you run kubectl get pods, you should see pods with names like acking-0 and acking-1.
Another option is to set the service URL for Heron using the heron config command:
$ heron config kubernetes set service_url \
  http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy
That would enable you to manage topologies without setting the --service-url flag.
The Heron UI is an in-browser dashboard that you can use to monitor your Heron topologies. It should already be running in your GKE cluster.
You can access Heron UI in your browser by navigating to http://localhost:8001/api/v1/namespaces/default/services/heron-ui:8889/proxy/topologies.
Although Minikube and Google Container Engine provide two easy ways to get started running Heron on Kubernetes, you can also run Heron on any Kubernetes cluster. The instructions in this section are tailored to non-Minikube, non-GKE Kubernetes installations.
To run Heron on a general Kubernetes installation, you'll need to fulfill the requirements listed at the top of this doc. Once those requirements are met, you can begin starting up the various components that comprise a Heron on Kubernetes installation.
There are a variety of Heron components that you'll need to start up separately and in order. Make sure that the necessary pods are up and in the Running state before moving on to the next step. You can track the progress of the pods using this command:
$ kubectl get pods -w
Heron uses ZooKeeper for a variety of coordination- and configuration-related tasks. To start up ZooKeeper on your Kubernetes cluster:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/general/zookeeper.yaml
When running Heron on Kubernetes, Apache BookKeeper is used for things like topology artifact storage (unless you're running on GKE). You can start up BookKeeper using this command:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/general/bookkeeper.yaml
The so-called “Heron tools” include the Heron UI and the Heron Tracker. To start up the Heron tools:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/general/tools.yaml
The Heron API server is the endpoint that the Heron CLI client uses to interact with the other components of Heron. To start up the Heron API server on your Kubernetes cluster:
$ kubectl create -f https://raw.githubusercontent.com/apache/incubator-heron/master/deploy/kubernetes/general/apiserver.yaml
Once all of the components have been successfully started up, you need to open up a proxy port to your Kubernetes cluster using the kubectl proxy command:
$ kubectl proxy -p 8001
Note: All of the following Kubernetes-specific URLs are valid as of the Kubernetes 1.10.0 release.
Now, verify that the Heron API server running on your Kubernetes cluster is available using curl:
$ curl http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy/api/v1/version
You should get a JSON response like this:
{ "heron.build.git.revision" : "ddbb98bbf173fb082c6fd575caaa35205abe34df", "heron.build.git.status" : "Clean", "heron.build.host" : "ci-server-01", "heron.build.time" : "Sat Mar 31 09:27:19 UTC 2018", "heron.build.timestamp" : "1522488439000", "heron.build.user" : "release-agent", "heron.build.version" : "0.17.8" }
Success! You can now manage Heron topologies on your Kubernetes installation. To submit an example topology to the cluster:
$ heron submit kubernetes \
  --service-url=http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy \
  ~/.heron/examples/heron-api-examples.jar \
  org.apache.heron.examples.api.AckingTopology acking
You can also track the progress of the Kubernetes pods that make up the topology. When you run kubectl get pods, you should see pods with names like acking-0 and acking-1.
Another option is to set the service URL for Heron using the heron config command:
$ heron config kubernetes set service_url \
  http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy
That would enable you to manage topologies without setting the --service-url flag.
The Heron UI is an in-browser dashboard that you can use to monitor your Heron topologies. It should already be running in your Kubernetes cluster.
You can access Heron UI in your browser by navigating to http://localhost:8001/api/v1/namespaces/default/services/heron-ui:8889/proxy/topologies.
You can configure Heron on Kubernetes using a variety of YAML config files, listed in the sections below.
{{< configtable "kubernetes" "client" >}}
{{< configtable "kubernetes" "heron_internals" >}}
{{< configtable "kubernetes" "packing" >}}
{{< configtable "kubernetes" "scheduler" >}}
{{< configtable "kubernetes" "stateful" >}}
{{< configtable "kubernetes" "statemgr" >}}
{{< configtable "kubernetes" "uploader" >}}