Apache Airflow Site Archive - archive of all produced documentation for Apache Airflow

Airflow sync archive

This repository stores the archive of documentation generated for Apache Airflow.

The scripts and workflows here keep the repository in sync with the S3 buckets - both live and staging - where the documentation is stored. Syncing in both directions is possible.

In the future we will automate synchronization of the repository after any change to the buckets. Currently, manual synchronization S3 -> GitHub for the live bucket documentation is done using the S3 to GitHub workflow, which in turn uses the s3_to_github.py script, and syncing the repository to the staging bucket is done using the GitHub to S3 workflow, which uses the github_to_s3.py script. The scripts can also be used to perform manual syncs when we modify the documentation in the repository and want to push the changes to either of the S3 buckets (see the example invocations after the help output below).
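The workflows can be triggered from the GitHub Actions UI or, assuming they expose a workflow_dispatch trigger, with the GitHub CLI. The workflow names below are taken from the descriptions above and should be confirmed against the files in .github/workflows - this is only a sketch:

  # Trigger the live bucket -> repository sync (workflow name assumed from the description above)
  gh workflow run "S3 to GitHub"

  # Trigger the repository -> staging bucket sync (workflow name assumed as well)
  gh workflow run "GitHub to S3"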

You can see the arguments for the s3_to_github.py and github_to_s3.py scripts by passing the --help option; example invocations follow after the help output below:

  • uv run scripts/s3_to_github.py --help:
usage: s3_to_github.py [-h] --bucket-path BUCKET_PATH --local-path LOCAL_PATH [--document-packages DOCUMENT_PACKAGES] [--processes PROCESSES]

Sync S3 to GitHub

options:
  -h, --help            show this help message and exit
  --bucket-path BUCKET_PATH
                        S3 bucket name with path
  --local-path LOCAL_PATH
                        local path to sync
  --document-packages DOCUMENT_PACKAGES
                        Document packages to sync
  --processes PROCESSES
                        Number of processes
  • uv run scripts/github_to_s3.py --help:
usage: github_to_s3.py [-h] --bucket-path BUCKET_PATH --local-path LOCAL_PATH [--document-packages DOCUMENT_PACKAGES] [--commit-ref COMMIT_REF] [--sync-type {full-sync,single-commit}] [--processes PROCESSES]

Sync GitHub to S3

options:
  -h, --help            show this help message and exit
  --bucket-path BUCKET_PATH
                        S3 bucket name with path
  --local-path LOCAL_PATH
                        local path to sync
  --document-packages DOCUMENT_PACKAGES
                        Document package ids to sync (long or short) separated with spaces ('all' means all packages)
  --commit-ref COMMIT_REF
                        Commit ref to sync (sha/HEAD/branch)
  --sync-type {full-sync,single-commit}
                        Sync type
  --processes PROCESSES
                        Number of processes
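
For example, a manual live bucket -> repository sync of a single package might look like the following sketch; the bucket path, local path, and package id are illustrative placeholders, not the real values used by the workflows:

  # Pull documentation for one package from the (placeholder) live bucket into the local checkout,
  # using 4 parallel processes
  uv run scripts/s3_to_github.py \
    --bucket-path my-live-docs-bucket/docs \
    --local-path ./docs-archive \
    --document-packages apache-airflow \
    --processes 4

Similarly, pushing the documentation touched by the latest commit to the staging bucket could look like this (again with placeholder bucket and local paths):

  # Push only the docs changed in HEAD to the (placeholder) staging bucket
  uv run scripts/github_to_s3.py \
    --bucket-path my-staging-docs-bucket/docs \
    --local-path ./docs-archive \
    --document-packages all \
    --commit-ref HEAD \
    --sync-type single-commit \
    --processes 4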