Apache DolphinScheduler Python API, aka PyDolphinscheduler.

PyDolphinScheduler

PyDolphinScheduler is the Python API for Apache DolphinScheduler. It allows you to define your workflows in Python code, also known as workflow-as-code.

Quick Start

Version Compatibility

On Nov 7, 2022, we separated PyDolphinScheduler from DolphinScheduler, and PyDolphinScheduler 4.0.0 can match multiple versions of DolphinScheduler. For more details, please refer to version.

Installation

# Install
python -m pip install apache-dolphinscheduler

# Verify that the installation succeeded; this prints the installed version of apache-dolphinscheduler (0.1.0 is used here as an example)
pydolphinscheduler version
# 0.1.0

NOTE: The apache-dolphinscheduler package does not work on Python 3.10 or above on the Windows operating system, because its dependency py4j does not work in those environments.
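
The restriction in the note above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function name `is_supported` is hypothetical and is not part of PyDolphinScheduler, and the rule it encodes is taken only from the note.

```python
import platform
import sys


def is_supported(system: str, version: tuple) -> bool:
    """Return False only for the combination the note describes:
    Windows together with Python 3.10 or above.

    `is_supported` is an illustrative helper, not a PyDolphinScheduler API.
    """
    on_windows = system.lower() == "windows"
    # Compare only (major, minor) so that sys.version_info also works here.
    return not (on_windows and tuple(version[:2]) >= (3, 10))


if __name__ == "__main__":
    ok = is_supported(platform.system(), sys.version_info)
    print("supported" if ok else "unsupported: Windows with Python >= 3.10")
```

Running the script on the target machine prints whether the current interpreter falls into the unsupported combination.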

The following steps show how to start DolphinScheduler and run a simple PyDolphinScheduler example.

Start DolphinScheduler

There are many ways to start DolphinScheduler. Here we use Docker to run it as a standalone server.

# Set this to the DolphinScheduler version you want to use; 3.1.1 is used here as an example
DOLPHINSCHEDULER_VERSION=3.1.1
docker run --name dolphinscheduler-standalone-server -p 12345:12345 -p 25333:25333 -d apache/dolphinscheduler-standalone-server:"${DOLPHINSCHEDULER_VERSION}"

After the container has started, you can access the DolphinScheduler UI at http://localhost:12345/dolphinscheduler. For other ways to start DolphinScheduler and more details about it, please refer to DolphinScheduler.
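
The standalone server may take some time to become reachable after `docker run` returns. The following is a minimal sketch, assuming the UI answers plain HTTP requests at the URL above, that polls until it responds. The helper names `ui_is_up` and `wait_for_ui` and the retry settings are hypothetical, not part of DolphinScheduler or PyDolphinScheduler.

```python
import time
import urllib.error
import urllib.request


def ui_is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET on `url` succeeds with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_ui(url, attempts=30, delay=2.0, probe=ui_is_up, sleep=time.sleep):
    """Call `probe(url)` up to `attempts` times, sleeping `delay` seconds between tries.

    `probe` and `sleep` are injectable so the loop can be tried without a server.
    """
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example call (commented out because it needs a running container):
# wait_for_ui("http://localhost:12345/dolphinscheduler")
```

A return value of `False` only means the URL did not answer within the chosen number of attempts; in that case the container logs are the next place to look.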

Run a simple example

We have many examples in the examples directory. Here we pick a typical one to show how to run it.

# Get the latest code of the example from GitHub
curl https://raw.githubusercontent.com/apache/dolphinscheduler-sdk-python/main/src/pydolphinscheduler/examples/tutorial.py -o ./tutorial.py

# Using any editor you like, change the tenant to one that really exists on the host where DolphinScheduler is running

# Run the example
python ./tutorial.py

NOTICE: Since Apache DolphinScheduler requires a tenant when running commands, you have to change the tenant value in the file tutorial.py. The default value is tenant_exists; change it to a username that exists on your host.
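
If you prefer not to edit the file by hand, the tenant change can be scripted. The following is a minimal sketch, assuming the default value appears in tutorial.py as the literal text `tenant_exists`. The helper names `set_tenant` and `patch_file` are hypothetical, and using the current login name as the tenant is only one possible choice.

```python
import getpass
from pathlib import Path

# Default tenant value named in the notice above.
PLACEHOLDER = "tenant_exists"


def set_tenant(source: str, tenant: str) -> str:
    """Return `source` with the placeholder tenant replaced by `tenant`."""
    if PLACEHOLDER not in source:
        # Assumption: the placeholder is present as plain text in the file.
        raise ValueError(f"{PLACEHOLDER!r} not found; edit the file manually")
    return source.replace(PLACEHOLDER, tenant)


def patch_file(path: str, tenant: str = "") -> None:
    """Rewrite the file at `path` in place, defaulting to the current login name."""
    tenant = tenant or getpass.getuser()
    file = Path(path)
    file.write_text(
        set_tenant(file.read_text(encoding="utf-8"), tenant), encoding="utf-8"
    )


# Example call (commented out because it modifies a file on disk):
# patch_file("./tutorial.py")
```

The user you choose must exist on the host where DolphinScheduler runs, which is not necessarily the machine where this script is executed.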

After that, PyDolphinScheduler creates a new workflow, which you can see on the Project Management page of the DolphinScheduler web UI. The workflow is triggered automatically, so you can also see it running on the Workflow Instance page of the web UI. For more details about the functions of DolphinScheduler Project Management, please refer to DolphinScheduler Workflow.

Documentation

For full documentation, visit document. The documentation is generated from this repository, so please raise issues or pull requests for any additions, corrections, or clarifications.

Contributing

If you would like to contribute, check out the open issues on GitHub. You can also see the guide to contributing.

Release

Follow the release guide to release a new version of PyDolphinScheduler.