| The Python components of Aurora are built using [Pants](https://pantsbuild.github.io). |
| |
| |
| Python Build Conventions |
| ======================== |
| The Python code is laid out according to the following conventions: |
| |
| 1. One `BUILD` file per third-level directory. For a list of current top-level packages run: |
| |
| % find src/main/python -maxdepth 3 -mindepth 3 -type d |\ |
| while read dname; do echo $dname |\ |
| sed 's@src/main/python/\(.*\)/\(.*\)/\(.*\).*@\1.\2.\3@'; done |
| |
| 2. Each `BUILD` file exports one |
| [`python_library`](https://pantsbuild.github.io/build_dictionary.html#bdict_python_library) |
| that provides a |
| [`setup_py`](https://pantsbuild.github.io/build_dictionary.html#setup_py) |
| containing each |
| [`python_binary`](https://pantsbuild.github.io/build_dictionary.html#python_binary) |
| in the `BUILD` file. The library is named the same as the directory it's in so that it can be |
| referenced without a ':' character. The `sources` field in the `python_library` will almost |
| always be `rglobs('*.py')`. |
| |
| 3. Other BUILD files may only depend on this single public `python_library` |
| target. Any other target is considered a private implementation detail and |
| should be prefixed with an `_`. |
| |
| 4. `python_binary` targets are always named the same as the exported console script. |
| |
| 5. `python_binary` targets must have identical `dependencies` to the `python_library` exported |
| by the package and must use `entry_point`. |
| |
| This means a PEX file generated by Pants will contain exactly the same files that will be |
| available on the `PYTHONPATH` in the case of `pip install` of the corresponding library |
| target. This will help our migration off of Pants in the future. |
| |
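As a rough cross-check of convention 1, the `find`/`sed` pipeline above can be approximated in Python. This is only a sketch: the helper names `third_level_dirs`, `dotted_name`, and `packages_missing_build` are hypothetical and not part of Aurora or Pants, and the check looks only for a file named `BUILD` in each third-level directory.

```python
from pathlib import Path


def third_level_dirs(root):
    """Return every directory exactly three levels below root, sorted."""
    return sorted(p for p in Path(root).glob('*/*/*') if p.is_dir())


def dotted_name(root, path):
    """Turn <root>/apache/thermos/runner into 'apache.thermos.runner'."""
    return '.'.join(Path(path).relative_to(root).parts)


def packages_missing_build(root):
    """Return dotted names of third-level directories without a BUILD file."""
    return [
        dotted_name(root, p)
        for p in third_level_dirs(root)
        if not (p / 'BUILD').is_file()
    ]


# Example (run from the repository root):
#   for p in third_level_dirs('src/main/python'):
#       print(dotted_name('src/main/python', p))
```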
| Annotated example - apache.thermos.runner |
| ----------------------------------------- |
| |
| % find src/main/python/apache/thermos/runner |
| src/main/python/apache/thermos/runner |
| src/main/python/apache/thermos/runner/__init__.py |
| src/main/python/apache/thermos/runner/thermos_runner.py |
| src/main/python/apache/thermos/runner/BUILD |
| % cat src/main/python/apache/thermos/runner/BUILD |
| # License boilerplate omitted |
| import os |
| |
| |
| # Private target so that a setup_py can exist without a circular dependency. Only targets within |
| # this file should depend on this. |
| python_library( |
| name = '_runner', |
| # The target covers every python file under this directory and subdirectories. |
| sources = rglobs('*.py'), |
| dependencies = [ |
| '3rdparty/python:twitter.common.app', |
| '3rdparty/python:twitter.common.log', |
| # Source dependencies are always referenced without a ':'. |
| 'src/main/python/apache/thermos/common', |
| 'src/main/python/apache/thermos/config', |
| 'src/main/python/apache/thermos/core', |
| ], |
| ) |
| |
| # Binary target for thermos_runner.pex. Nothing should depend on this - it's only used as an |
| # argument to ./pants binary. |
| python_binary( |
| name = 'thermos_runner', |
| # Use entry_point, not source, so the files used here are the same ones tests see. |
| entry_point = 'apache.thermos.runner.thermos_runner', |
| dependencies = [ |
| # Notice that we depend only on the single private target from this BUILD file here. |
| ':_runner', |
| ], |
| ) |
| |
| # The public library that everyone importing the runner symbols uses. |
| # The test targets and any other dependent source code should depend on this. |
| python_library( |
| name = 'runner', |
| dependencies = [ |
| # Again, notice that we depend only on the single private target from this BUILD file here. |
| ':_runner', |
| ], |
| # We always provide a setup_py. This will cause any dependee libraries to automatically |
| # reference this library in their requirements.txt rather than copy the source files into their |
| # sdist. |
| provides = setup_py( |
| # Conventionally named and versioned. |
| name = 'apache.thermos.runner', |
| version = open(os.path.join(get_buildroot(), '.auroraversion')).read().strip().upper(), |
| ).with_binaries({ |
| # Every binary in this file should also be repeated here. |
| # Always use the dict-form of .with_binaries so that commands with dashes in their names are |
| # supported. |
| # The console script name is always the same as the PEX with .pex stripped. |
| 'thermos_runner': ':thermos_runner', |
| }), |
| ) |
| |
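The `version` expression in the example reads the `.auroraversion` file at the build root, strips surrounding whitespace, and upper-cases the result. A minimal standalone sketch of the same logic follows. The helper name `read_aurora_version` is hypothetical, and the build root is passed in explicitly here because `get_buildroot()` is only available inside `BUILD` files evaluated by Pants.

```python
from pathlib import Path


def read_aurora_version(buildroot):
    """Return the contents of <buildroot>/.auroraversion, stripped and upper-cased."""
    # Mirrors: open(os.path.join(get_buildroot(), '.auroraversion')).read().strip().upper()
    with open(Path(buildroot) / '.auroraversion') as version_file:
        return version_file.read().strip().upper()
```

Calling `read_aurora_version('.')` from the repository root should give the same string the `setup_py` in the example uses as its `version`.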
| |
| |
| Thermos Test Resources |
| ====================== |
| |
| The Aurora source repository and distributions contain several |
| [binary files](../../src/test/resources/org/apache/thermos/root/checkpoints) to |
| qualify the backwards compatibility of thermos with checkpoint data. Since |
| thermos persists state to disk (to be read by the thermos observer), it is important that we have |
| tests that prevent regressions affecting the ability to parse previously-written data. |
| |
| The files included represent persisted checkpoints that exercise different |
| features of thermos. The existing files should not be modified unless |
| we are accepting backwards incompatibility, such as with a major release. |
| |
| It is not practical to write source code to generate these files on the fly, |
| as source would be vulnerable to drift (e.g. due to refactoring) in ways |
| that would undermine the goal of ensuring backwards compatibility. |
| |
| The most common reason to add a new checkpoint file would be to provide |
| coverage for new thermos features that alter the data format. This is |
| accomplished by writing and running a |
| [job configuration](../reference/configuration.md) that exercises the feature, and |
| copying the checkpoint file from the sandbox directory, which by default is |
| `/var/run/thermos/checkpoints/<aurora task id>`. |
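
The copy step could be scripted along the following lines. This is a sketch under stated assumptions, not an official Aurora tool: the function name `add_checkpoint_fixture` is hypothetical, and it assumes that the checkpoint data for a task lives in a directory named after the task id and that the test resources use the same one-directory-per-task layout. It refuses to overwrite an existing fixture, since existing checkpoint files should not be modified.

```python
import shutil
from pathlib import Path

# Default location quoted in the text above.
DEFAULT_CHECKPOINT_ROOT = Path('/var/run/thermos/checkpoints')
# Test resource directory, relative to the repository root.
TEST_RESOURCE_DIR = Path('src/test/resources/org/apache/thermos/root/checkpoints')


def add_checkpoint_fixture(task_id,
                           checkpoint_root=DEFAULT_CHECKPOINT_ROOT,
                           resource_dir=TEST_RESOURCE_DIR):
    """Copy the checkpoint directory of one task into the test resources."""
    source = Path(checkpoint_root) / task_id
    if not source.is_dir():
        raise FileNotFoundError('no checkpoint directory at %s' % source)

    destination = Path(resource_dir) / task_id
    if destination.exists():
        # Existing fixtures guard backwards compatibility; never replace them.
        raise FileExistsError('fixture already exists at %s' % destination)

    # copytree creates missing parent directories and copies file contents byte for byte.
    shutil.copytree(source, destination)
    return destination
```

Run it from the repository root after the job has finished, passing the Aurora task id of the run whose checkpoint should become a fixture.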