<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

# Pipelines

In the `pipelines` section you can configure pipelines of tasks that are executed sequentially.
Pipelines can run on a schedule or be triggered manually:

```yaml
pipelines:
  - pipeline: my_data_pipeline
    timeout_minutes: 30
    start_date: 2020-03-01
    schedule: 0 10 * * *
    tasks:
      - task: my_sql_task
        type: sql
        query: "SELECT * FROM
          {{my_database_name}}.{{my_table_name}}
          WHERE event_date_prt >=
          '{{yesterday_ds}}'
          AND cms_platform = 'xsite'"
        output_table: my_db.my_out_table
        output_path: s3://my_bky/{{env}}/mydir
      - task: my_python_task
        type: python
        image: myorg/myrepo:pythonapp
        cmd: python my_python_app.py
        env_vars:
          env: "{{env}}"
          fizz: buzz
```

`pipelines` is a section in the root of your liminal.yml file and is a list of `pipeline`s defined
by the following attributes:

## pipeline attributes

`pipeline`: name of your pipeline (must be unique per liminal server).

`timeout_minutes`: maximum allowed pipeline run time in minutes. If a run exceeds this time, the
pipeline and all of its running tasks fail.

`start_date`: start date for the pipeline.

`schedule`: to be configured if the pipeline should run on a schedule. The format is a
[cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression).
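
As an illustration, the fragment below sketches two pipelines in the same `pipelines` list. All pipeline and task names are hypothetical, and the image and command are borrowed from the example at the top of this page. The first pipeline reuses the cron expression from that example. The second omits `schedule`, on the assumption that a pipeline without a schedule runs only when triggered manually.

```yaml
pipelines:
  # Scheduled: the five cron fields are minute, hour, day of month, month, day of week.
  # "0 10 * * *" therefore means minute 0 of hour 10, every day.
  - pipeline: daily_pipeline            # hypothetical name
    timeout_minutes: 30
    start_date: 2020-03-01
    schedule: 0 10 * * *
    tasks:
      - task: daily_task                # hypothetical name
        type: python
        image: myorg/myrepo:pythonapp
        cmd: python my_python_app.py
  # Assumed manual-only: no `schedule` attribute is set.
  - pipeline: on_demand_pipeline        # hypothetical name; differs from the one above
    timeout_minutes: 30
    start_date: 2020-03-01
    tasks:
      - task: on_demand_task            # hypothetical name
        type: python
        image: myorg/myrepo:pythonapp
        cmd: python my_python_app.py
```

Giving each pipeline its own name keeps the list consistent with the uniqueness requirement on `pipeline`.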

`tasks`: list of `task`s, defined by the following attributes:

## task attributes

For fully detailed information on tasks see: [tasks](tasks).

`task`: name of your task (may contain only alphanumeric, dash and/or underscore characters).
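
A short sketch of task names that satisfy this rule. The names are hypothetical, the fragment is meant to sit under a pipeline's `tasks` attribute, and the image and command are reused from the example at the top of this page.

```yaml
tasks:
  - task: load_events          # hypothetical name: letters and an underscore
    type: python
    image: myorg/myrepo:pythonapp
    cmd: python my_python_app.py
  - task: load-events-2        # hypothetical name: letters, dashes and a digit
    type: python
    image: myorg/myrepo:pythonapp
    cmd: python my_python_app.py
  # Names such as "load events" (space) or "load.events" (dot) would not satisfy the rule.
```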

`type`: type of the task. Examples of available task types are `python` and `sql`; more are available.

Different task types require their own additional configuration. For example, a `python` task
requires `image` to be configured.
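
For instance, a `python` task might look like the sketch below. The task name, image and command are placeholders taken from the example at the top of this page. `image` is the attribute named above as required; `cmd` and `env_vars` appear only because that example uses them, and whether they are optional is not stated here.

```yaml
tasks:
  - task: my_python_task
    type: python
    image: myorg/myrepo:pythonapp   # required for python tasks
    cmd: python my_python_app.py    # as in the example at the top of this page
    env_vars:
      fizz: buzz
```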