Install AWS CLI (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html#install-tool-bundled)
Install AWS-IAM-AUTHENTICATOR (https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html)
Install eksctl (https://docs.aws.amazon.com/eks/latest/userguide/eksctl.html#installing-eksctl)
Log in to your AWS account:
aws configure
Note that the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY will override the configuration in the file ~/.aws/credentials.
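As a quick illustration of the lookup order described above, the sketch below reports which credential source the CLI will pick up. The variable names are the real AWS CLI ones; the script itself is purely illustrative.

```shell
# Sketch of the credential lookup order: environment variables win over
# ~/.aws/credentials when both access key variables are present.
if [ -n "${AWS_ACCESS_KEY_ID}" ] && [ -n "${AWS_SECRET_ACCESS_KEY}" ]; then
  CRED_SOURCE="environment variables"
else
  CRED_SOURCE="~/.aws/credentials"
fi
echo "aws CLI will read credentials from: ${CRED_SOURCE}"
```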
Please modify the parameters in the example command below:
eksctl create cluster \
  --name pinot-quickstart \
  --version 1.14 \
  --region us-west-2 \
  --nodegroup-name standard-workers \
  --node-type t3.small \
  --nodes 3 \
  --nodes-min 3 \
  --nodes-max 4 \
  --node-ami auto
You can monitor the cluster status with the following command:
EKS_CLUSTER_NAME=pinot-quickstart aws eks describe-cluster --name ${EKS_CLUSTER_NAME}
Once the cluster is in ACTIVE status, it is ready to use.
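Cluster creation takes a while, so the status check can be wrapped in a simple wait loop. In this sketch, `cluster_status` is a hypothetical stub standing in for `aws eks describe-cluster --name ${EKS_CLUSTER_NAME} --query cluster.status --output text` (those `--query`/`--output` flags are standard AWS CLI options), so the loop can be tried locally:

```shell
# Sketch of a wait loop for ACTIVE status; replace the stub below with the
# real aws eks describe-cluster call.
cluster_status() { echo "ACTIVE"; }   # stub for illustration

while [ "$(cluster_status)" != "ACTIVE" ]; do
  echo "Cluster not ready yet; waiting..."
  sleep 30
done
echo "Cluster is ACTIVE"
```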
Run the command below to fetch the credentials for the cluster you just created (or an existing cluster):
EKS_CLUSTER_NAME=pinot-quickstart aws eks update-kubeconfig --name ${EKS_CLUSTER_NAME}
To verify the connection, you can run
kubectl get nodes
pinot-demo will be used as the example value for the ${GCLOUD_PROJECT} variable, and pinot-demo@example.com as the example value for ${GCLOUD_EMAIL}.
The script below creates a cluster named pinot-quickstart with n1-standard-8 machines for the demo. Please fill in both environment variables, ${GCLOUD_PROJECT} and ${GCLOUD_EMAIL}, with your gcloud project and gcloud account email:
GCLOUD_PROJECT=[your gcloud project name] GCLOUD_EMAIL=[Your gcloud account email] ./setup_gke.sh
For example:
GCLOUD_PROJECT=pinot-demo GCLOUD_EMAIL=pinot-demo@example.com ./setup_gke.sh
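Before invoking `setup_gke.sh`, you can guard against unset variables with the shell's `:?` expansion, which aborts with a message when a variable is empty or missing. A minimal sketch, using placeholder values that you should replace with your own:

```shell
# Sketch: fail fast when the required variables are unset.
GCLOUD_PROJECT=pinot-demo             # replace with your gcloud project name
GCLOUD_EMAIL=pinot-demo@example.com   # replace with your gcloud account email
: "${GCLOUD_PROJECT:?Set GCLOUD_PROJECT to your gcloud project name}"
: "${GCLOUD_EMAIL:?Set GCLOUD_EMAIL to your gcloud account email}"
echo "Ready: GCLOUD_PROJECT=${GCLOUD_PROJECT} GCLOUD_EMAIL=${GCLOUD_EMAIL} ./setup_gke.sh"
```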
Run the command below to fetch the credentials for the cluster you just created (or an existing cluster). Modify the environment variables ${GCLOUD_PROJECT}, ${GCLOUD_ZONE}, and ${GCLOUD_CLUSTER} accordingly:
GCLOUD_PROJECT=pinot-demo GCLOUD_ZONE=us-west1-b GCLOUD_CLUSTER=pinot-quickstart \
gcloud container clusters get-credentials ${GCLOUD_CLUSTER} --zone ${GCLOUD_ZONE} --project ${GCLOUD_PROJECT}
az login
AKS_RESOURCE_GROUP=pinot-demo AKS_RESOURCE_GROUP_LOCATION=eastus az group create --name ${AKS_RESOURCE_GROUP} --location ${AKS_RESOURCE_GROUP_LOCATION}
AKS_RESOURCE_GROUP=pinot-demo AKS_CLUSTER_NAME=pinot-quickstart az aks create --resource-group ${AKS_RESOURCE_GROUP} --name ${AKS_CLUSTER_NAME} --node-count 3
(Optional) If the command above fails with the error MissingSubscriptionRegistration, register the default provider:
az provider register --namespace Microsoft.Network
Run the command below to fetch the credentials for the cluster you just created (or an existing cluster):
AKS_RESOURCE_GROUP=pinot-demo AKS_CLUSTER_NAME=pinot-quickstart az aks get-credentials --resource-group ${AKS_RESOURCE_GROUP} --name ${AKS_CLUSTER_NAME}
To verify the connection, you can run
kubectl get nodes
helm dependency update
kubectl create ns pinot-quickstart
helm install -n pinot-quickstart pinot .
If the cluster was just initialized, ensure Helm is initialized by running:
helm init --service-account tiller
Then deploy the Pinot cluster with:
helm install --namespace "pinot-quickstart" --name "pinot" .
If you see the error "Error: could not find tiller", reset Tiller by running:
kubectl -n kube-system delete deployment tiller-deploy
kubectl -n kube-system delete service/tiller-deploy
helm init --service-account tiller
If you see the error below, grant the required permissions by applying the RBAC configuration:
Error: release pinot failed: namespaces "pinot-quickstart" is forbidden: User "system:serviceaccount:kube-system:default" cannot get resource "namespaces" in API group "" in the namespace "pinot-quickstart"
kubectl apply -f helm-rbac.yaml
kubectl get all -n pinot-quickstart
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install -n pinot-quickstart kafka incubator/kafka --set replicas=1
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install --namespace "pinot-quickstart" --name kafka incubator/kafka --set replicas=1
kubectl -n pinot-quickstart exec kafka-0 -- kafka-topics --zookeeper kafka-zookeeper:2181 --topic flights-realtime --create --partitions 1 --replication-factor 1
kubectl -n pinot-quickstart exec kafka-0 -- kafka-topics --zookeeper kafka-zookeeper:2181 --topic flights-realtime-avro --create --partitions 1 --replication-factor 1
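The two topic-creation commands differ only in the topic name, so a loop avoids the repetition. In this sketch, `create_topic` is a hypothetical stand-in for the `kubectl -n pinot-quickstart exec kafka-0 -- kafka-topics ...` invocation shown above, so the loop structure can be tried without a cluster:

```shell
# Sketch: create both demo topics in a loop; replace the stub with the
# real kubectl exec command.
create_topic() { echo "created topic: $1"; }   # stub for illustration

for TOPIC in flights-realtime flights-realtime-avro; do
  create_topic "${TOPIC}"
done
```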
kubectl apply -f pinot-realtime-quickstart.yml
Use the script below to set up local port-forwarding and open the Pinot query console in your web browser:
./query-pinot-data.sh
This chart includes a ZooKeeper chart as a dependency of the Pinot cluster in its requirements.yaml
by default. The chart can be customized using the following configurable parameters:
Parameter | Description | Default |
---|---|---|
image.repository | Pinot Container image repo | apachepinot/pinot |
image.tag | Pinot Container image tag | 0.3.0-SNAPSHOT |
image.pullPolicy | Pinot Container image pull policy | IfNotPresent |
cluster.name | Pinot Cluster name | pinot-quickstart |
controller.name | Name of Pinot Controller | controller |
controller.port | Pinot controller port | 9000 |
controller.replicaCount | Pinot controller replicas | 1 |
controller.data.dir | Pinot controller data directory, should be same as controller.persistence.mountPath or a sub directory of it | /var/pinot/controller/data |
controller.vip.host | Pinot Vip host | pinot-controller |
controller.vip.port | Pinot Vip port | 9000 |
controller.persistence.enabled | Use a PVC to persist Pinot Controller data | true |
controller.persistence.accessMode | Access mode of data volume | ReadWriteOnce |
controller.persistence.size | Size of data volume | 1G |
controller.persistence.mountPath | Mount path of controller data volume | /var/pinot/controller/data |
controller.persistence.storageClass | Storage class of backing PVC | "" |
controller.jvmOpts | Pinot Controller JVM Options | -Xms256M -Xmx1G |
controller.log4j2ConfFile | Pinot Controller log4j2 configuration file | /opt/pinot/conf/pinot-controller-log4j2.xml |
controller.pluginsDir | Pinot Controller plugins directory | /opt/pinot/plugins |
controller.service.port | Service Port | 9000 |
controller.external.enabled | If True, exposes Pinot Controller externally | false |
controller.external.type | Service Type | LoadBalancer |
controller.external.port | Service Port | 9000 |
controller.resources | Pinot Controller resource requests and limits | {} |
controller.nodeSelector | Node labels for controller pod assignment | {} |
controller.affinity | Defines affinities and anti-affinities for pods as defined in: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity preferences | {} |
controller.tolerations | List of node tolerations for the pods. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ | [] |
controller.podAnnotations | Annotations to be added to controller pod | {} |
controller.updateStrategy.type | StatefulSet update strategy to use. | RollingUpdate |
broker.name | Name of Pinot Broker | broker |
broker.port | Pinot broker port | 8099 |
broker.replicaCount | Pinot broker replicas | 1 |
broker.jvmOpts | Pinot Broker JVM Options | -Xms256M -Xmx1G |
broker.log4j2ConfFile | Pinot Broker log4j2 configuration file | /opt/pinot/conf/pinot-broker-log4j2.xml |
broker.pluginsDir | Pinot Broker plugins directory | /opt/pinot/plugins |
broker.service.port | Service Port | 8099 |
broker.external.enabled | If True, exposes Pinot Broker externally | false |
broker.external.type | External service Type | LoadBalancer |
broker.external.port | External service Port | 8099 |
broker.routingTable.builderClass | Routing Table Builder Class | random |
broker.resources | Pinot Broker resource requests and limits | {} |
broker.nodeSelector | Node labels for broker pod assignment | {} |
broker.affinity | Defines affinities and anti-affinities for pods as defined in: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity preferences | {} |
broker.tolerations | List of node tolerations for the pods. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ | [] |
broker.podAnnotations | Annotations to be added to broker pod | {} |
broker.updateStrategy.type | StatefulSet update strategy to use. | RollingUpdate |
server.name | Name of Pinot Server | server |
server.port.netty | Pinot server netty port | 8098 |
server.port.admin | Pinot server admin port | 8097 |
server.replicaCount | Pinot server replicas | 1 |
server.dataDir | Pinot server data directory, should be same as server.persistence.mountPath or a sub directory of it | /var/pinot/server/data/index |
server.segmentTarDir | Pinot server segment directory, should be same as server.persistence.mountPath or a sub directory of it | /var/pinot/server/data/segments |
server.persistence.enabled | Use a PVC to persist Pinot Server data | true |
server.persistence.accessMode | Access mode of data volume | ReadWriteOnce |
server.persistence.size | Size of data volume | 4G |
server.persistence.mountPath | Mount path of server data volume | /var/pinot/server/data |
server.persistence.storageClass | Storage class of backing PVC | "" |
server.jvmOpts | Pinot Server JVM Options | -Xms512M -Xmx1G |
server.log4j2ConfFile | Pinot Server log4j2 configuration file | /opt/pinot/conf/pinot-server-log4j2.xml |
server.pluginsDir | Pinot Server plugins directory | /opt/pinot/plugins |
server.service.port | Service Port | 8098 |
server.resources | Pinot Server resource requests and limits | {} |
server.nodeSelector | Node labels for server pod assignment | {} |
server.affinity | Defines affinities and anti-affinities for pods as defined in: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity preferences | {} |
server.tolerations | List of node tolerations for the pods. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ | [] |
server.podAnnotations | Annotations to be added to server pod | {} |
server.updateStrategy.type | StatefulSet update strategy to use. | RollingUpdate |
zookeeper.enabled | If True, installs Zookeeper Chart | true |
zookeeper.resources | Zookeeper resource requests and limits | {} |
zookeeper.env | Environment variables provided to Zookeeper | {ZK_HEAP_SIZE: "256M"} |
zookeeper.storage | Zookeeper Persistent volume size | 2Gi |
zookeeper.image.PullPolicy | Zookeeper Container pull policy | IfNotPresent |
zookeeper.url | URL of Zookeeper Cluster (unneeded if installing Zookeeper Chart) | "" |
zookeeper.port | Port of Zookeeper Cluster | 2181 |
zookeeper.affinity | Defines affinities and anti-affinities for pods as defined in: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity preferences | {} |
Specify parameters using the --set key=value[,key=value] argument to helm install.
Alternatively, a YAML file that specifies the values for the parameters can be provided:
helm install --name pinot -f values.yaml .
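As an illustration of the --set form, the sketch below assembles overrides for two parameters from the table above; the chosen values are illustrative, and the command is echoed rather than executed so the sketch is safe to run anywhere:

```shell
# Illustrative only: override the controller replica count and the server
# volume size at install time (Helm v2 syntax, as used elsewhere in this guide).
HELM_SET_ARGS="controller.replicaCount=2,server.persistence.size=10G"
echo helm install --namespace "pinot-quickstart" --name pinot . --set "${HELM_SET_ARGS}"
```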
If you are using GKE, Create a storageClass:
kubectl apply -f gke-ssd.yaml
or, if you want to use the pd-standard storageClass:
kubectl apply -f gke-pd.yaml
kubectl apply -f superset.yaml
kubectl exec -it pod/superset-0 -n pinot-quickstart -- bash -c 'export FLASK_APP=superset:app && flask fab create-admin'
kubectl exec -it pod/superset-0 -n pinot-quickstart -- bash -c 'superset db upgrade'
kubectl exec -it pod/superset-0 -n pinot-quickstart -- bash -c 'superset init'
kubectl exec -it pod/superset-0 -n pinot-quickstart -- bash -c 'superset import_datasources -p /etc/superset/pinot_example_datasource.yaml'
kubectl exec -it pod/superset-0 -n pinot-quickstart -- bash -c 'superset import_dashboards -p /etc/superset/pinot_example_dashboard.json'
Run the command below to open Superset in your browser, then log in with the admin credentials created above:
./open-superset-ui.sh
You can open the imported dashboard by clicking the Dashboards banner and then clicking AirlineStats.
Run the command below to deploy a customized Presto with the Pinot plugin:
kubectl apply -f presto-coordinator.yaml
Use the script below to set up local port-forwarding and open the Presto UI in your web browser:
./launch-presto-ui.sh
Once Presto is deployed, you can connect to it with the CLI:
./pinot-presto-cli.sh
presto:default> show catalogs;
 Catalog
---------
 pinot
 system
(2 rows)

Query 20191112_050827_00003_xkm4g, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]
presto:default> show tables;
    Table
--------------
 airlinestats
(1 row)

Query 20191112_050907_00004_xkm4g, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:01 [1 rows, 29B] [1 rows/s, 41B/s]
presto:default> DESCRIBE pinot.dontcare.airlinestats;
        Column        |  Type   | Extra | Comment
----------------------+---------+-------+---------
 flightnum            | integer |       |
 origin               | varchar |       |
 quarter              | integer |       |
 lateaircraftdelay    | integer |       |
 divactualelapsedtime | integer |       |
 ......

Query 20191112_051021_00005_xkm4g, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:02 [80 rows, 6.06KB] [35 rows/s, 2.66KB/s]
presto:default> select count(*) as cnt from pinot.dontcare.airlinestats limit 10;
 cnt
------
 9745
(1 row)

Query 20191112_051114_00006_xkm4g, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 8B] [2 rows/s, 19B/s]
Run the command below to deploy more Presto workers if needed:
kubectl apply -f presto-worker.yaml
Then you can verify that the new worker nodes have been added:
presto:default> select * from system.runtime.nodes;
               node_id                |         http_uri         |      node_version      | coordinator | state
--------------------------------------+--------------------------+------------------------+-------------+--------
 38959968-6262-46a1-a321-ee0db6cbcbd3 | http://10.244.0.182:8080 | 0.230-SNAPSHOT-4e66289 | false       | active
 83851b8c-fe7f-49fe-ae0c-e3daf6d92bef | http://10.244.2.183:8080 | 0.230-SNAPSHOT-4e66289 | false       | active
 presto-coordinator                   | http://10.244.1.25:8080  | 0.230-SNAPSHOT-4e66289 | true        | active
(3 rows)

Query 20191206_095812_00027_na99c, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
0:00 [3 rows, 248B] [11 rows/s, 984B/s]
kubectl delete ns pinot-quickstart