blob: 339a6bfcff2680648d25b40d12f0778353419803 [file] [log] [blame] [view]
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Airflow Operator Custom Resource (API)
The Airflow Operator uses these [CustomResourceDefinitions](https://kubernetes.io/docs/concepts/api-n/custom-resources/):
`AirflowBase` includes MySQL, UI, NFS(DagStore).
`AirflowCluster` includes Airflow Scheduler, Workers, Redis.
Multiple `AirflowCluster` could use the same `AirflowBase`. The way custom resources are defined allows multi-single-tenant (multiple single users) usecases, where users use different airflow plugins (opeartors, packages etc) in their set
up. This improves cluster utilization and provide multiple users (in same trust domain) with some isolation.
## AirflowBase API
| **Field** | **json field**| **Type** | **Info** |
| --- | --- | --- | --- |
| Spec | `spec` | [AirflowBaseSpec](#AirflowBaseSpec) | The specfication for Airflow Base cusotm resource |
| Status | `status` | AirflowBaseStatus | The status for the custom resource |
#### AirflowBaseSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| NodeSelector | map[string]string | `nodeSelector` | [Selector](https://kubernetes.io/docs/concepts/configuration/assign-pod-node) for fitting pods to nodes whose labels match the selector |
| Affinity | \*corev1.Affinity | `affinity` | Define scheduling constraints for pods |
| Annotations | map[string]string | `annotations` | Custom annotations to be added to the pods |
| Labels | map[string]string | `labels` | Custom labels to be added to the pods |
| MySQL | \*MySQLSpec | `mysql` | Spec for MySQL component |
| Storage | \*NFSStoreSpec | `storage` | Spec for NFS component |
| UI | \*AirflowUISpec | `ui` | Spec for Airflow UI component |
| SQLProxy | \*SQLProxySpec | `sqlproxy` | Spec for SQLProxy component. Ignored if SQL(MySQLSpec) is specified |
#### MySQLSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image` | Image defines the MySQL Docker image name |
| Version | string | `version` | Version defines the MySQL Docker image version |
| Replicas | int32 | `replicas` | Replicas defines the number of running MySQL instances in a cluster |
| VolumeClaimTemplate | \*corev1.PersistentVolumeClaim | `volumeClaimTemplate` | VolumeClaimTemplate allows a user to specify volume claim for MySQL Server files |
| BackupVolumeClaimTemplate | \*corev1.PersistentVolumeClaim | `backupVolumeClaimTemplate` | BackupVolumeClaimTemplate allows a user to specify a volume to temporarily store the data for a backup prior to it being shipped to object storage |
| Operator | bool | `operator` | Flag when True generates MySQLOperator CustomResource to be handled by MySQL Operator If False, a StatefulSet with 1 replica is created (not for production setups) |
| Backup | \*MySQLBackup | `backup` | Spec defining the Backup Custom Resource to be handled by MySQLOperator Ignored when Operator is False |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods |
| Options | map[string]string | ` ` | command line options for mysql |
#### MySQLBackup
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Schedule | string | `schedule` | Schedule is the cron string used to schedule backup|
| Storage | StorageSpec | `storage` | Storage has the s3 compatible storage spec|
##### StorageSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| StorageProvider | string | `storageprovider` | Provider is the storage type used for backup and restore e.g. s3, oci-s3-compat, aws-s3, gce-s3, etc |
| SecretRef | \*corev1.LocalObjectReference | `secretRef` | SecretRef is a reference to the Kubernetes secret containing the configuration for uploading the backup to authenticated storage |
| Config | map[string]string | `config` | Config is generic string based key-value map that defines non-secret configuration values for uploading the backup to storage w.r.t the configured storage provider |
#### NFSStoreSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image` | Image defines the NFS Docker image.|
| Version | string | `version` | Version defines the NFS Server Docker image version.|
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods.|
| Volume | \*corev1.PersistentVolumeClaim | `volumeClaimTemplate` | Volume allows a user to specify volume claim template to be used for fileserver|
#### SQLProxySpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image` | Image defines the SQLProxy Docker image name|
| Version | string | `version` | Version defines the SQL Proxy docker image version. example: myProject:us-central1:myInstance=tcp:3306|
| Project | string | `project` | Project defines the SQL instance project|
| Region | string | `region` | Region defines the SQL instance region|
| Instance | string | `instance` | Instance defines the SQL instance name|
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods.|
#### AirflowBaseStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| ObservedGeneration | int64 | `observedGeneration` | ObservedGeneration is the last generation of the AirflowBase as observed by the controller |
| MySQL | ComponentStatus | `mysql` | MySQL is the status of the MySQL component |
| UI | ComponentStatus | `ui` | UI is the status of the Airflow UI component |
| Storage | ComponentStatus | `storage` | Storage is the status of the NFS component |
| SQLProxy | ComponentStatus | `sqlproxy` | SQLProxy is the status of the SQLProxy component |
| LastError | string | `lasterror` | LastError |
| Status | string | `status`| Reaedy or Pending |
## AirflowCluster
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Spec | AirflowClusterSpec | `spec` | |
| Status | AirflowClusterStatus | `status` | |
#### AirflowClusterSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| NodeSelector | map[string]string | `nodeSelector` | [Selector](https://kubernetes.io/docs/concepts/configuration/assign-pod-node) for fitting pods to nodes whose labels match the selector |
| Affinity | \*corev1.Affinity | `affinity` | Define scheduling constraints for pods. |
| Annotations | map[string]string | `annotations` | Custom annotations to be added to the pods. |
| Labels | map[string]string | `labels` | Custom labels to be added to the pods. |
| Executor | string | `executor` | Airflow Executor desired: local,celery,kubernetes |
| Redis | \*RedisSpec | `redis` | Spec for Redis component. |
| Scheduler | \*SchedulerSpec | `scheduler` | Spec for Airflow Scheduler component. |
| Worker | \*WorkerSpec | `worker` | Spec for Airflow Workers |
| UI | \*AirflowUISpec | `ui` | Spec for Airflow UI component. |
| Flower | \*FlowerSpec | `flower` | Spec for Flower component. |
| DAGs | \*DagSpec | `dags` | Spec for DAG source and location |
| AirflowBaseRef | \*corev1.LocalObjectReference | `airflowbase` | AirflowBaseRef is a reference to the AirflowBase CR |
#### AirflowClusterStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| ObservedGeneration | int64 | `observedGeneration"` | ObservedGeneration is the last generation of the AirflowCluster as observed by the controller. |
| Redis | ComponentStatus | `redis` | Redis is the status of the Redis component |
| Scheduler | SchedulerStatus | `scheduler` | Scheduler is the status of the Airflow Scheduler component |
| Worker | ComponentStatus | `worker` | Worker is the status of the Workers |
| UI | ComponentStatus | `ui` | UI is the status of the Airflow UI component |
| Flower | ComponentStatus | `flower` | Flower is the status of the Airflow UI component |
| LastError | string | `lasterror` | LastError |
| Status | string | `status` | Status |
#### RedisSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image"` | Image defines the Redis Docker image name |
| Operator | bool | `operator` | Version defines the Redis Docker image version. Flag when True generates RedisReplica CustomResource to be handled by Redis Operator If False, a StatefulSet with 1 replica is created |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods. |
| VolumeClaimTemplate | \*corev1.PersistentVolumeClaim | `volumeClaimTemplate` | VolumeClaimTemplate allows a user to specify volume claim for MySQL Server files |
| AdditionalArgs | string | `additionalargs` | AdditionalArgs for redis-server |
#### RedisSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image"` | Image defines the Redis Docker image name |
| Version | string | `version"` | Version defines the Redis Docker image version. |
| Operator | bool | `operator` | Flag when True generates RedisReplica CustomResource to be handled by Redis Operator If False, a StatefulSet with 1 replica is created |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods. |
| VolumeClaimTemplate | \*corev1.PersistentVolumeClaim | `volumeClaimTemplate` | VolumeClaimTemplate allows a user to specify volume claim for MySQL Server files |
| AdditionalArgs | string | `additionalargs` | AdditionalArgs for redis-server |
#### FlowerSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image"` | Image defines the Flower Docker image. |
| Version | string | `version"` | Version defines the Flower Docker image version. |
| Replicas | int32 | `replicas` | Replicas defines the number of running Flower instances in a cluster |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods. |
#### SchedulerSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image"` | Image defines the Airflow custom server Docker image. |
| Version | string | `version"` | Version defines the Airflow Docker image version |
| DBName | string | `database"` | DBName defines the Airflow Database to be used |
| DBUser | string | `dbuser"` | DBUser defines the Airflow Database user to be used |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods. |
#### WorkerSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image"` | Image defines the Airflow worker Docker image. |
| Version | string | `version"` | Version defines the Airflow worker Docker image version |
| Replicas | int32 | `replicas` | Replicas is the count of number of workers |
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods. |
#### AirflowUISpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Image | string | `image` | Image defines the AirflowUI Docker image.|
| Version | string | `version` | Version defines the AirflowUI Docker image version.|
| Replicas | int32 | `replicas` | Replicas defines the number of running Airflow UI instances in a cluster|
| Resources | corev1.ResourceRequirements | `resources` | Resources is the resource requests and limits for the pods.|
#### GCSSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Bucket | string | `bucket` | Bucket describes the GCS bucket |
| Once | bool | `once` | Once syncs initially and quits (use init container instead of sidecar) |
#### GitSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Repo | string | `repo,"` | Repo describes the http/ssh uri for git repo |
| Branch | string | `branch` | Branch describes the branch name to be synced |
| Rev | string | `rev` | Rev is the git hash to be used for syncing |
| User | string | `user` | User for git access |
| Once | bool | `once` | Once syncs initially and quits (use init container instead of sidecar) |
| CredSecretRef | \*corev1.LocalObjectReference | `cred` | Reference to a Secret that has git credentials in field `password`. It is injected as env `GIT_SYNC_PASSWORD` in [git-sync](https://github.com/kubernetes/git-sync) container.Refer to how the `password` is [used in git-sync](https://github.com/kubernetes/git-sync/blob/40e188fb26ecad2d8174e486fc104939c6b1271d/cmd/git-sync/main.go#L477:6) |
#### DagSpec
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| DagSubdir | string | `subdir` | DagSubdir is the directory under source where the dags are present |
| Git | \*GitSpec | `git` | GitSpec defines details to pull DAGs from a git repo using github.com/kubernetes/git-sync sidecar |
| NfsPV | \*corev1.PersistentVolumeClaim | `nfspv` | NfsPVSpec |
| Storage | \*StorageSpec | `storage` | Storage has s3 compatible storage spec for copying files from |
| GCS | \*GCSSpec | `gcs` | Gcs config which uses storage spec |
#### SchedulerStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Resources | ComponentStatus | `resources` | Status is a string describing Scheduler status |
| DagCount | int32 | `dagcount` | DagCount is a count of number of Dags observed |
| RunCount | int32 | `runcount` | RunCount is a count of number of Dag Runs observed |
## Common
#### ComponentStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| STS | []StsStatus | `sts` | StatefulSet status|
| SVC | []SvcStatus | `svc` | Service status|
| PDB | []PdbStatus | `pdb` | PDB status|
| LastError | string | `lasterror` | LastError|
| Status | string | `status` | Status|
#### StsStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Link | string | `link` | Link to sts|
| Name | string | `name` | Name of sts|
| Status | string | `status` | Status to rsrc|
| Replicas | int32 | `replicas` | Replicas defines the no of MySQL instances desired|
| ReadyReplicas | int32 | `readycount` | ReadyReplicas defines the no of MySQL instances that are ready|
| CurrentReplicas | int32 | `currentcount` | CurrentReplicas defines the no of MySQL instances that are created|
#### SvcStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Link | string | `link` | Link to rsrc|
| Name | string | `name` | service name|
| Status | string | `status` | Status to rsrc|
#### PdbStatus
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| Link | string | `link` | Link to rsrc|
| Name | string | `name` | Name of pdb|
| Status | string | `status` | Status to rsrc|
| CurrentHealthy | int32 | `currenthealthy` | currentHealthy|
| DesiredHealthy | int32 | `desiredhealthy` | desiredHealthy|
#### ResourceRequests
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| CPU | string | `cpu` | Cpu is the amount of CPU requested for a pod.|
| Memory | string | `memory` | Memory is the amount of RAM requested for a Pod.|
| Disk | string | `disk` | Disk is the amount of Disk requested for a pod.|
| DiskStorageClass | string | `diskStorageClass` | DiskStorageClass is the storage class for Disk. Disk must be present or this field is invalid.|
#### ResourceLimits
| **Field** | **Type** | **json field** | **Info** |
| --- | --- | --- | --- |
| CPU | string | `cpu` | Cpu is the CPU limit for a pod.|
| Memory | string | `memory` | Memory is the RAM limit for a pod.|