## Overview
One of the core capabilities of Apache Amaterasu is configuration management for data pipelines. Configurations are stored in environments. By default, environments are defined in folders named `env`. These folders can be stored at the root of the Amaterasu repo, where they apply to all the actions in the repo, and in the action folder under `src/{action_name}/{env}/`, where they apply only to that specific action.

Note: When the same configuration value is defined at the root and for an action, the action-level definition overrides the global configuration.
The following repo structure defines three environments (`dev`, `test` and `prod`) both at the root and for the `start` action:

    repo
    +-- env/
    |   +-- dev/
    |   |   +-- job.yaml
    |   |   +-- spark.yaml
    |   +-- test/
    |   |   +-- job.yaml
    |   |   +-- spark.yaml
    |   +-- prod/
    |       +-- job.yaml
    |       +-- spark.yaml
    +-- src/
    |   +-- start/
    |       +-- dev/
    |       |   +-- job.yaml
    |       |   +-- spark.yaml
    |       +-- test/
    |       |   +-- job.yaml
    |       |   +-- spark.yaml
    |       +-- prod/
    |           +-- job.yaml
    |           +-- spark.yaml
    +-- maki.yaml

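
The override rule for root-level and action-level configuration can be sketched in Python as follows. This is a minimal illustration, not Amaterasu's implementation: the function name `resolve_config` and the Spark property values are hypothetical, and the sketch assumes that overriding happens per top-level key, which the text above does not specify.

```python
def resolve_config(global_cfg: dict, action_cfg: dict) -> dict:
    """Return the effective configuration for an action.

    Assumption: values are overridden per top-level key. Nested
    structures are replaced as a whole, not merged.
    """
    effective = dict(global_cfg)   # start from the root-level values
    effective.update(action_cfg)   # action-level values win on conflict
    return effective


# Hypothetical values standing in for env/dev/spark.yaml (root level)
# and src/start/dev/spark.yaml (action level).
global_spark = {"spark.executor.memory": "1g", "spark.executor.cores": "2"}
start_spark = {"spark.executor.memory": "4g"}

effective = resolve_config(global_spark, start_spark)
print(effective)
```

In this sketch the `start` action sees its own `spark.executor.memory` value, while `spark.executor.cores` still comes from the root-level environment.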
Additional configuration paths can be added for both global and action configurations by specifying the `config` element in `maki.yaml`, as shown in the following example:

    config: myconfig/{env}/
    job-name: amaterasu-test
    flow:
        - name: start
          config: cfg/start/{env}/
          runner:
              group: spark
              type: python
          file: start.py

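
To make the `{env}` placeholder concrete, the following Python sketch lists the folders that would hold configuration for a given action and environment. It is an illustration under stated assumptions rather than Amaterasu's own logic: the helper `config_paths` is hypothetical, the `maki` dictionary stands in for an already parsed `maki.yaml`, and the order in which default and additional paths are consulted is not specified by this document.

```python
def config_paths(maki: dict, action_name: str, env: str) -> dict:
    """List the global and action-level configuration folders for one environment."""

    def expand(template: str) -> str:
        # Substitute the environment name for the {env} placeholder.
        return template.replace("{env}", env)

    # Default locations described above, followed by any additional
    # path given in the top-level `config` element.
    global_paths = [expand("env/{env}/")]
    if "config" in maki:
        global_paths.append(expand(maki["config"]))

    action_paths = [f"src/{action_name}/{env}/"]
    for action in maki.get("flow", []):
        if action.get("name") == action_name and "config" in action:
            action_paths.append(expand(action["config"]))

    return {"global": global_paths, "action": action_paths}


# Stand-in for the parsed maki.yaml from the example above.
maki = {
    "config": "myconfig/{env}/",
    "job-name": "amaterasu-test",
    "flow": [
        {
            "name": "start",
            "config": "cfg/start/{env}/",
            "runner": {"group": "spark", "type": "python"},
            "file": "start.py",
        }
    ],
}

paths = config_paths(maki, "start", "prod")
print(paths["global"])  # ['env/prod/', 'myconfig/prod/']
print(paths["action"])  # ['src/start/prod/', 'cfg/start/prod/']
```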
Amaterasu allows the configuration of three main areas:
Each framework has its own configuration. Apache Amaterasu lets frameworks define their configuration per environment, which determines how actions are configured when they are deployed.
For more information about specific framework configuration options, look at the frameworks section of this documentation.