Hamilton UI - Machine learning example

Learn how to use the HamiltonTracker and the Hamilton UI to track a simple machine learning pipeline.

It also illustrates the following notions:

Splitting a pipeline into separate modules (e.g., data loading, feature engineering, model fitting)
Using DataLoader and DataSaver objects to load and save data and collect extra metadata in the UI
Using @subdag to fit different ML models with the same model training code in the same DAG run
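
For orientation, attaching the tracker to a driver might look roughly like the sketch below. This is not the example's actual run.py. The helper names (tracker_kwargs, build_tracked_driver), the dag_name value, and the tags are assumptions made for illustration. The HamiltonTracker arguments shown (project_id, username, dag_name, tags) follow the adapter's documented constructor as far as I know, so check them against your installed version.

```python
from typing import Any


def tracker_kwargs(username: str, project_id: int, dag_name: str = "ml_example") -> dict[str, Any]:
    """Collect the arguments handed to HamiltonTracker.

    The dag_name default and the tags are illustrative values, not taken from the example.
    """
    return {
        "project_id": int(project_id),  # cast so a string ID from the command line also works
        "username": username,
        "dag_name": dag_name,
        "tags": {"environment": "DEV"},
    }


def build_tracked_driver(username: str, project_id: int, *modules):
    """Build a Hamilton driver whose runs are logged to the Hamilton UI."""
    # Imported lazily so the helper above can be used without the packages installed.
    from hamilton import driver
    from hamilton_sdk import adapters

    tracker = adapters.HamiltonTracker(**tracker_kwargs(username, project_id))
    return driver.Builder().with_modules(*modules).with_adapters(tracker).build()
```

Calling build_tracked_driver with your username, project ID, and the pipeline's modules requires the dependencies installed in the next section and a running Hamilton UI server.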

Getting started

First, you need to have the Hamilton UI running. You can either pip install the Hamilton UI (recommended) or run it as a Docker container.

Install the Python dependencies:

pip install "sf-hamilton[ui,sdk]"

Then launch the Hamilton UI server:

hamilton ui
# On Windows: python -m hamilton.cli.__main__ ui

Alternatively, to run it as a Docker container:

git clone https://github.com/dagworks-inc/hamilton
cd hamilton/ui/deployment
./run.sh

Then go to http://localhost:8242 to create (1) a username and (2) a project. See this video for a walkthrough.

With the Hamilton UI running, open another terminal tab, move to the example directory, and install its requirements:

cd hamilton/examples/hamilton_ui
pip install -r requirements.txt

Run the run.py script, providing your username and project ID so that the run is logged to the Hamilton UI:

python run.py --username <username> --project_id <project_id>

Once that run has finished, run the script again with the --load-from-parquet flag:

python run.py --username <username> --project_id <project_id> --load-from-parquet
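
To make the command-line interface above concrete, here is a minimal sketch of how a script could accept these three options. It is an assumption-laden illustration using argparse, not the actual run.py, which may parse its arguments differently. The help texts, including the presumed meaning of --load-from-parquet, are guesses.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the options used in the commands above (hypothetical re-implementation)."""
    parser = argparse.ArgumentParser(description="Run the example pipeline.")
    parser.add_argument("--username", required=True, help="Hamilton UI username")
    parser.add_argument("--project_id", type=int, required=True, help="Hamilton UI project ID")
    # Assumed meaning: read the data from a previously saved parquet file.
    parser.add_argument("--load-from-parquet", action="store_true",
                        help="load data from parquet (assumed behavior)")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    # argparse exposes --load-from-parquet as args.load_from_parquet
    print(args.username, args.project_id, args.load_from_parquet)
```

The flag is a boolean switch here, so omitting it corresponds to the first command and passing it corresponds to the second.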

Explore results in the Hamilton UI. Find your project under http://localhost:8242/dashboard/projects.

Introduce an error in the code, e.g. raise ValueError("I'm an error"), and see how it shows up in the Hamilton UI.
In models.py, change "data_set": source("data_set_v1") to "data_set": source("data_set_v2"), update what is requested in run.py accordingly (i.e. change or add saving data_set_v2), and see how the lineage changes in the Hamilton UI.
Add a new feature and propagate it through the pipeline. For example, add a new feature to features.py and then to a dataset, execute the pipeline, and compare the data observed in the Hamilton UI against a prior run.
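
As a rough sketch of the first and last suggestions: in Hamilton, each function defines a node named after the function, and its parameter names refer to upstream nodes. The names below (petal_area, petal_length, petal_width, broken_feature) are hypothetical and are not taken from features.py. Plain lists are used to keep the sketch self-contained, whereas the example itself may use other data types.

```python
def petal_area(petal_length: list[float], petal_width: list[float]) -> list[float]:
    """Hypothetical new feature: element-wise product of two upstream columns."""
    if len(petal_length) != len(petal_width):
        raise ValueError("petal_length and petal_width must have the same length")
    return [length * width for length, width in zip(petal_length, petal_width)]


def broken_feature(petal_area: list[float]) -> list[float]:
    """Deliberately failing node, to see how an error is reported."""
    raise ValueError("I'm an error")
```

A function like petal_area would only take effect once it is also added to a dataset and requested in the run, as described in the last suggestion.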