
Hamilton code

Hamilton dataflow

File structure

The Hamilton refactor is composed of a few files:

  • data_processing.py and data_science.py contain regular Python functions that define the Hamilton dataflow. Together they are equivalent to Kedro's pipeline.py and nodes.py files.
  • run.py contains the “driver code” to load and execute the dataflow. There's no direct equivalent in the Kedro tutorial, which prefers running pipelines via the CLI.
  • notebooks/interactive.ipynb contains the same “driver code” as run.py, but uses the Hamilton Jupyter magic to define the dataflow interactively in a notebook.
  • tests/test_dataflow.py includes tests equivalent to tests/pipelines/data_science/test_pipeline.py in the Kedro code.

Instructions

  1. Create a virtual environment and activate it

    python -m venv venv && . venv/bin/activate
    
  2. Install requirements for the Hamilton code

    pip install -r requirements.txt
    
  3. Install the current hamilton-code project

    pip install -e .
    
  4. Run the dataflow

    python run.py
    
  5. Run the tests

    pytest tests/
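
Because Hamilton nodes are plain Python functions, tests like those in tests/test_dataflow.py can call them directly, without any framework setup. A minimal sketch, with a hypothetical node name rather than one from the actual project:

```python
# Hypothetical node and test, illustrating that Hamilton nodes are plain
# Python functions that pytest can exercise directly.

def total_spend(raw_spend: list) -> int:
    """A stand-in dataflow node: sums a list of spend values."""
    return sum(raw_spend)

def test_total_spend():
    assert total_spend([10, 20, 30]) == 60
```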
    

Going further