| Aurora Configuration Tutorial |
| ============================= |
| |
| How to write Aurora configuration files, including feature descriptions |
| and best practices. When writing a configuration file, make use of |
| `aurora job inspect`. It takes the same job key and configuration file |
| arguments as `aurora job create` or `aurora update start`. It first ensures the |
| configuration parses, then outputs it in human-readable form. |
| |
| You should read this after going through the general [Aurora Tutorial](../getting-started/tutorial.md). |
| |
| - [The Basics](#user-content-the-basics) |
| - [Use Bottom-To-Top Object Ordering](#user-content-use-bottom-to-top-object-ordering) |
| - [An Example Configuration File](#user-content-an-example-configuration-file) |
| - [Defining Process Objects](#user-content-defining-process-objects) |
| - [Getting Your Code Into The Sandbox](#user-content-getting-your-code-into-the-sandbox) |
| - [Defining Task Objects](#user-content-defining-task-objects) |
| - [SequentialTask: Running Processes in Parallel or Sequentially](#user-content-sequentialtask-running-processes-in-parallel-or-sequentially) |
| - [SimpleTask](#user-content-simpletask) |
| - [Combining tasks](#user-content-combining-tasks) |
| - [Defining Job Objects](#user-content-defining-job-objects) |
| - [The jobs List](#user-content-the-jobs-list) |
| - [Basic Examples](#basic-examples) |
| |
| |
| The Basics |
| ---------- |
| |
| To run a job on Aurora, you must specify a configuration file that tells |
| Aurora what it needs to know to schedule the job, what Mesos needs to |
| run the tasks the job is made up of, and what Thermos needs to run the |
| processes that make up the tasks. This file must have |
| a`.aurora` suffix. |
| |
| A configuration file defines a collection of objects, along with parameter |
| values for their attributes. An Aurora configuration file contains the |
| following three types of objects: |
| |
| - Job |
| - Task |
| - Process |
| |
| A configuration also specifies a list of `Job` objects assigned |
| to the variable `jobs`. |
| |
| - jobs (list of defined Jobs to run) |
| |
| The `.aurora` file format is just Python. However, `Job`, `Task`, |
| `Process`, and other classes are defined by a type-checked dictionary |
| templating library called *Pystachio*, a powerful tool for |
| configuration specification and reuse. Pystachio objects are tailored |
| via {{}} surrounded templates. |
| |
| When writing your `.aurora` file, you may use any Pystachio datatypes, as |
| well as any objects shown in the [*Aurora Configuration |
| Reference*](configuration.md), without `import` statements - the |
| Aurora config loader injects them automatically. Other than that, an `.aurora` |
| file works like any other Python script. |
| |
| [*Aurora Configuration Reference*](configuration.md) |
| has a full reference of all Aurora/Thermos defined Pystachio objects. |
| |
| ### Use Bottom-To-Top Object Ordering |
| |
| A well-structured configuration starts with structural templates (if |
| any). Structural templates encapsulate in their attributes all the |
| differences between Jobs in the configuration that are not directly |
| manipulated at the `Job` level, but typically at the `Process` or `Task` |
| level. For example, if certain processes are invoked with slightly |
| different settings or input. |
| |
| After structural templates, define, in order, `Process`es, `Task`s, and |
| `Job`s. |
| |
| Structural template names should be *UpperCamelCased* and their |
| instantiations are typically *UPPER\_SNAKE\_CASED*. `Process`, `Task`, |
| and `Job` names are typically *lower\_snake\_cased*. Indentation is typically 2 |
| spaces. |
| |
| An Example Configuration File |
| ----------------------------- |
| |
| The following is a typical configuration file. Don't worry if there are |
| parts you don't understand yet, but you may want to refer back to this |
| as you read about its individual parts. Note that names surrounded by |
| curly braces {{}} are template variables, which the system replaces with |
| bound values for the variables. |
| |
| # --- templates here --- |
| class Profile(Struct): |
| package_version = Default(String, 'live') |
| java_binary = Default(String, '/usr/lib/jvm/java-1.7.0-openjdk/bin/java') |
| extra_jvm_options = Default(String, '') |
| parent_environment = Default(String, 'prod') |
| parent_serverset = Default(String, |
| '/foocorp/service/bird/{{parent_environment}}/bird') |
| |
| # --- processes here --- |
| main = Process( |
| name = 'application', |
| cmdline = '{{profile.java_binary}} -server -Xmx1792m ' |
| '{{profile.extra_jvm_options}} ' |
| '-jar application.jar ' |
| '-upstreamService {{profile.parent_serverset}}' |
| ) |
| |
| # --- tasks --- |
| base_task = SequentialTask( |
| name = 'application', |
| processes = [ |
| Process( |
| name = 'fetch', |
| cmdline = 'curl -O |
| https://packages.foocorp.com/{{profile.package_version}}/application.jar'), |
| ] |
| ) |
| |
| # not always necessary but often useful to have separate task |
| # resource classes |
| staging_task = base_task(resources = |
| Resources(cpu = 1.0, |
| ram = 2048*MB, |
| disk = 1*GB)) |
| production_task = base_task(resources = |
| Resources(cpu = 4.0, |
| ram = 2560*MB, |
| disk = 10*GB)) |
| |
| # --- job template --- |
| job_template = Job( |
| name = 'application', |
| role = 'myteam', |
| contact = 'myteam-team@foocorp.com', |
| instances = 20, |
| service = True, |
| task = production_task |
| ) |
| |
| # -- profile instantiations (if any) --- |
| PRODUCTION = Profile() |
| STAGING = Profile( |
| extra_jvm_options = '-Xloggc:gc.log', |
| parent_environment = 'staging' |
| ) |
| |
| # -- job instantiations -- |
| jobs = [ |
| job_template(cluster = 'cluster1', environment = 'prod') |
| .bind(profile = PRODUCTION), |
| |
| job_template(cluster = 'cluster2', environment = 'prod') |
| .bind(profile = PRODUCTION), |
| |
| job_template(cluster = 'cluster1', |
| environment = 'staging', |
| service = False, |
| task = staging_task, |
| instances = 2) |
| .bind(profile = STAGING), |
| ] |
| |
| ## Defining Process Objects |
| |
| Processes are handled by the Thermos system. A process is a single |
| executable step run as a part of an Aurora task, which consists of a |
| bash-executable statement. |
| |
| The key (and required) `Process` attributes are: |
| |
| - `name`: Any string which is a valid Unix filename (no slashes, |
| NULLs, or leading periods). The `name` value must be unique relative |
| to other Processes in a `Task`. |
| - `cmdline`: A command line run in a bash subshell, so you can use |
| bash scripts. Nothing is supplied for command-line arguments, |
| so `$*` is unspecified. |
| |
| Many tiny processes make managing configurations more difficult. For |
| example, the following is a bad way to define processes. |
| |
| copy = Process( |
| name = 'copy', |
| cmdline = 'curl -O https://packages.foocorp.com/app.zip' |
| ) |
| unpack = Process( |
| name = 'unpack', |
| cmdline = 'unzip app.zip' |
| ) |
| remove = Process( |
| name = 'remove', |
| cmdline = 'rm -f app.zip' |
| ) |
| run = Process( |
| name = 'app', |
| cmdline = 'java -jar app.jar' |
| ) |
| run_task = Task( |
| processes = [copy, unpack, remove, run], |
| constraints = order(copy, unpack, remove, run) |
| ) |
| |
| Since `cmdline` runs in a bash subshell, you can chain commands |
| with `&&` or `||`. |
| |
| When defining a `Task` that is just a list of Processes run in a |
| particular order, use `SequentialTask`, as described in the [*Defining* |
| `Task` *Objects*](#Task) section. The following simplifies and combines the |
| above multiple `Process` definitions into just two. |
| |
| stage = Process( |
| name = 'stage', |
| cmdline = 'curl -O https://packages.foocorp.com/app.zip && ' |
| 'unzip app.zip && rm -f app.zip') |
| |
| run = Process(name = 'app', cmdline = 'java -jar app.jar') |
| |
| run_task = SequentialTask(processes = [stage, run]) |
| |
| `Process` also has optional attributes to customize its behaviour. Details can be found in the [Aurora Configuration Reference](configuration.md#process-objects). |
| |
| |
| ## Getting Your Code Into The Sandbox |
| |
| When using Aurora, you need to get your executable code into its "sandbox", specifically |
| the Task sandbox where the code executes for the Processes that make up that Task. |
| |
| Each Task has a sandbox created when the Task starts and garbage |
| collected when it finishes. All of a Task's processes run in its |
| sandbox, so processes can share state by using a shared current |
| working directory. |
| |
| Typically, you save this code somewhere. You then need to define a Process |
| in your `.aurora` configuration file that fetches the code from that somewhere |
| to where the agent can see it. For a public cloud, that can be anywhere public on |
| the Internet, such as S3. For a private cloud internal storage, you need to put in |
| on an accessible HDFS cluster or similar storage. |
| |
| The template for this Process is: |
| |
| <name> = Process( |
| name = '<name>' |
| cmdline = '<command to copy and extract code archive into current working directory>' |
| ) |
| |
| Note: Be sure the extracted code archive has an executable. |
| |
| ## Getting Environment Variables Into The Sandbox |
| |
| Every time a process is forked the Thermos executor checks for the existence of the |
| `.thermos_profile` file, if the `.thermos_profile` file exists it will be sourced. |
| You can utilize this process to pass environment variables to the sandbox. |
| |
| An example for this Process is: |
| |
| setup_env = Process( |
| name = 'setup', |
| cmdline = ( |
| 'cat <<EOF > .thermos_profile\n' |
| 'export RESULT=hello\n' |
| 'EOF\n' |
| ) |
| ) |
| |
| read_env = Process( |
| name = 'read' |
| cmdline = 'echo $RESULT' |
| ) |
| |
| ## Defining Task Objects |
| |
| Tasks are handled by Mesos. A task is a collection of processes that |
| runs in a shared sandbox. It's the fundamental unit Aurora uses to |
| schedule the datacenter; essentially what Aurora does is find places |
| in the cluster to run tasks. |
| |
| The key (and required) parts of a Task are: |
| |
| - `name`: A string giving the Task's name. By default, if a Task is |
| not given a name, it inherits the first name in its Process list. |
| |
| - `processes`: An unordered list of Process objects bound to the Task. |
| The value of the optional `constraints` attribute affects the |
| contents as a whole. Currently, the only constraint, `order`, determines if |
| the processes run in parallel or sequentially. |
| |
| - `resources`: A `Resource` object defining the Task's resource |
| footprint. A `Resource` object has three attributes: |
| - `cpu`: A Float, the fractional number of cores the Task |
| requires. |
| - `ram`: An Integer, RAM bytes the Task requires. |
| - `disk`: An integer, disk bytes the Task requires. |
| |
| A basic Task definition looks like: |
| |
| Task( |
| name="hello_world", |
| processes=[Process(name = "hello_world", cmdline = "echo hello world")], |
| resources=Resources(cpu = 1.0, |
| ram = 1*GB, |
| disk = 1*GB)) |
| |
| A Task has optional attributes to customize its behaviour. Details can be found in the [Aurora Configuration Reference](configuration.md#task-object) |
| |
| |
| ### SequentialTask: Running Processes in Parallel or Sequentially |
| |
| By default, a Task with several Processes runs them in parallel. There |
| are two ways to run Processes sequentially: |
| |
| - Include an `order` constraint in the Task definition's `constraints` |
| attribute whose arguments specify the processes' run order: |
| |
| Task( ... processes=[process1, process2, process3], |
| constraints = order(process1, process2, process3), ...) |
| |
| - Use `SequentialTask` instead of `Task`; it automatically runs |
| processes in the order specified in the `processes` attribute. No |
| `constraint` parameter is needed: |
| |
| SequentialTask( ... processes=[process1, process2, process3] ...) |
| |
| ### SimpleTask |
| |
| For quickly creating simple tasks, use the `SimpleTask` helper. It |
| creates a basic task from a provided name and command line using a |
| default set of resources. For example, in a .`aurora` configuration |
| file: |
| |
| SimpleTask(name="hello_world", command="echo hello world") |
| |
| is equivalent to |
| |
| Task(name="hello_world", |
| processes=[Process(name = "hello_world", cmdline = "echo hello world")], |
| resources=Resources(cpu = 1.0, |
| ram = 1*GB, |
| disk = 1*GB)) |
| |
| The simplest idiomatic Job configuration thus becomes: |
| |
| import os |
| hello_world_job = Job( |
| task=SimpleTask(name="hello_world", command="echo hello world"), |
| role=os.getenv('USER'), |
| cluster="cluster1") |
| |
| When written to `hello_world.aurora`, you invoke it with a simple |
| `aurora job create cluster1/$USER/test/hello_world hello_world.aurora`. |
| |
| ### Combining tasks |
| |
| `Tasks.concat`(synonym,`concat_tasks`) and |
| `Tasks.combine`(synonym,`combine_tasks`) merge multiple Task definitions |
| into a single Task. It may be easier to define complex Jobs |
| as smaller constituent Tasks. But since a Job only includes a single |
| Task, the subtasks must be combined before using them in a Job. |
| Smaller Tasks can also be reused between Jobs, instead of having to |
| repeat their definition for multiple Jobs. |
| |
| With both methods, the merged Task takes the first Task's name. The |
| difference between the two is the result Task's process ordering. |
| |
| - `Tasks.combine` runs its subtasks' processes in no particular order. |
| The new Task's resource consumption is the sum of all its subtasks' |
| consumption. |
| |
| - `Tasks.concat` runs its subtasks in the order supplied, with each |
| subtask's processes run serially between tasks. It is analogous to |
| the `order` constraint helper, except at the Task level instead of |
| the Process level. The new Task's resource consumption is the |
| maximum value specified by any subtask for each Resource attribute |
| (cpu, ram and disk). |
| |
| For example, given the following: |
| |
| setup_task = Task( |
| ... |
| processes=[download_interpreter, update_zookeeper], |
| # It is important to note that {{Tasks.concat}} has |
| # no effect on the ordering of the processes within a task; |
| # hence the necessity of the {{order}} statement below |
| # (otherwise, the order in which {{download_interpreter}} |
| # and {{update_zookeeper}} run will be non-deterministic) |
| constraints=order(download_interpreter, update_zookeeper), |
| ... |
| ) |
| |
| run_task = SequentialTask( |
| ... |
| processes=[download_application, start_application], |
| ... |
| ) |
| |
| combined_task = Tasks.concat(setup_task, run_task) |
| |
| The `Tasks.concat` command merges the two Tasks into a single Task and |
| ensures all processes in `setup_task` run before the processes |
| in `run_task`. Conceptually, the task is reduced to: |
| |
| task = Task( |
| ... |
| processes=[download_interpreter, update_zookeeper, |
| download_application, start_application], |
| constraints=order(download_interpreter, update_zookeeper, |
| download_application, start_application), |
| ... |
| ) |
| |
| In the case of `Tasks.combine`, the two schedules run in parallel: |
| |
| task = Task( |
| ... |
| processes=[download_interpreter, update_zookeeper, |
| download_application, start_application], |
| constraints=order(download_interpreter, update_zookeeper) + |
| order(download_application, start_application), |
| ... |
| ) |
| |
| In the latter case, each of the two sequences may operate in parallel. |
| Of course, this may not be the intended behavior (for example, if |
| the `start_application` Process implicitly relies |
| upon `download_interpreter`). Make sure you understand the difference |
| between using one or the other. |
| |
| ## Defining Job Objects |
| |
| A job is a group of identical tasks that Aurora can run in a Mesos cluster. |
| |
| A `Job` object is defined by the values of several attributes, some |
| required and some optional. The required attributes are: |
| |
| - `task`: Task object to bind to this job. Note that a Job can |
| only take a single Task. |
| |
| - `role`: Job's role account; in other words, the user account to run |
| the job as on a Mesos cluster machine. A common value is |
| `os.getenv('USER')`; using a Python command to get the user who |
| submits the job request. The other common value is the service |
| account that runs the job, e.g. `www-data`. |
| |
| - `environment`: Job's environment, typical values |
| are `devel`, `test`, or `prod`. |
| |
| - `cluster`: Aurora cluster to schedule the job in, defined in |
| `/etc/aurora/clusters.json` or `~/.clusters.json`. You can specify |
| jobs where the only difference is the `cluster`, then at run time |
| only run the Job whose job key includes your desired cluster's name. |
| |
| You usually see a `name` parameter. By default, `name` inherits its |
| value from the Job's associated Task object, but you can override this |
| default. For these four parameters, a Job definition might look like: |
| |
| foo_job = Job( name = 'foo', cluster = 'cluster1', |
| role = os.getenv('USER'), environment = 'prod', |
| task = foo_task) |
| |
| In addition to the required attributes, there are several optional |
| attributes. Details can be found in the [Aurora Configuration Reference](configuration.md#job-objects). |
| |
| |
| ## The jobs List |
| |
| At the end of your `.aurora` file, you need to specify a list of the |
| file's defined Jobs. For example, the following exports the jobs `job1`, |
| `job2`, and `job3`. |
| |
| jobs = [job1, job2, job3] |
| |
| This allows the aurora client to invoke commands on those jobs, such as |
| starting, updating, or killing them. |
| |
| |
| |
| Basic Examples |
| ============== |
| |
| These are provided to give a basic understanding of simple Aurora jobs. |
| |
| ### hello_world.aurora |
| |
| Put the following in a file named `hello_world.aurora`, substituting your own values |
| for values such as `cluster`s. |
| |
| import os |
| hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world') |
| |
| hello_world_task = Task( |
| resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB), |
| processes = [hello_world_process]) |
| |
| hello_world_job = Job( |
| cluster = 'cluster1', |
| role = os.getenv('USER'), |
| task = hello_world_task) |
| |
| jobs = [hello_world_job] |
| |
| Then issue the following commands to create and kill the job, using your own values for the job key. |
| |
| aurora job create cluster1/$USER/test/hello_world hello_world.aurora |
| |
| aurora job kill cluster1/$USER/test/hello_world |
| |
| ### Environment Tailoring |
| |
| Put the following in a file named `hello_world_productionized.aurora`, substituting your own values |
| for values such as `cluster`s. |
| |
| include('hello_world.aurora') |
| |
| production_resources = Resources(cpu = 1.0, ram = 512 * MB, disk = 2 * GB) |
| staging_resources = Resources(cpu = 0.1, ram = 32 * MB, disk = 512 * MB) |
| hello_world_template = hello_world( |
| name = "hello_world-{{cluster}}" |
| task = hello_world(resources=production_resources)) |
| |
| jobs = [ |
| # production jobs |
| hello_world_template(cluster = 'cluster1', instances = 25), |
| hello_world_template(cluster = 'cluster2', instances = 15), |
| |
| # staging jobs |
| hello_world_template( |
| cluster = 'local', |
| instances = 1, |
| task = hello_world(resources=staging_resources)), |
| ] |
| |
| Then issue the following commands to create and kill the job, using your own values for the job key |
| |
| aurora job create cluster1/$USER/test/hello_world-cluster1 hello_world_productionized.aurora |
| |
| aurora job kill cluster1/$USER/test/hello_world-cluster1 |