This template shows an ML pipeline.
It shows a few things:
How to use @subdag to fit different models in the same DAG run while reusing the same fitting code.

To get started, you need to have the Hamilton UI running.
See https://hamilton.dagworks.io/en/latest/concepts/ui/ for details; here are the cliff notes:
    git clone https://github.com/dagworks-inc/hamilton
    cd hamilton/ui/deployment
    ./run.sh
Then go to http://localhost:8242 and create (1) an email and (2) a project. See this video for a walkthrough.
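Before opening the browser, it can help to confirm that something is actually listening on http://localhost:8242. The following is a minimal sketch using only the Python standard library; the function name ui_is_reachable and the 3-second timeout are illustrative choices, not part of Hamilton.

```python
import urllib.error
import urllib.request


def ui_is_reachable(url: str = "http://localhost:8242", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for illustration; it only checks that a server responds,
    not that the responder is really the Hamilton UI.
    """
    # Use an opener with no proxies so a local address is contacted directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A server answered with a non-2xx status, so something is listening.
        return True
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    print("Hamilton UI reachable:", ui_is_reachable())
```

If this prints False, the containers started by ./run.sh may still be coming up, so waiting a little and retrying is a reasonable first step.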
Ensure you have the right Python dependencies installed:
    cd hamilton/examples/hamilton_ui
    pip install -r requirements.txt
Run the run.py script, providing the email and project ID so that it can log to the Hamilton UI:
    python run.py --email <email> --project_id <project_id>
Once that run has finished, run the script again, this time loading from parquet:
    python run.py --email <email> --project_id <project_id> --load-from-parquet True
Then you can go see the difference in the Hamilton UI. Find your project under http://localhost:8242/dashboard/projects.
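The two invocations above differ only in the --load-from-parquet flag, which is passed an explicit value (True). As a rough illustration of how a script could accept these three flags, here is a minimal sketch using argparse. It is not the actual run.py, whose argument handling may differ; parse_args and str_to_bool are hypothetical names, and treating the project ID as an integer is an assumption.

```python
import argparse
from typing import List, Optional


def str_to_bool(text: str) -> bool:
    """Convert a command-line string such as "True" or "false" to a bool."""
    lowered = text.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {text!r}")


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags used in the commands above (hypothetical stand-in for run.py)."""
    parser = argparse.ArgumentParser(description="Illustrative CLI, not the real run.py")
    parser.add_argument("--email", required=True, help="email registered in the Hamilton UI")
    # Assumption: the project ID is numeric.
    parser.add_argument("--project_id", required=True, type=int, help="project ID from the UI")
    # The flag takes a value, so plain bool() would not work: bool("False") is True.
    parser.add_argument("--load-from-parquet", type=str_to_bool, default=False)
    return parser.parse_args(argv)


first_run = parse_args(["--email", "user@example.com", "--project_id", "1"])
second_run = parse_args(
    ["--email", "user@example.com", "--project_id", "1", "--load-from-parquet", "True"]
)
print(first_run.load_from_parquet, second_run.load_from_parquet)  # False True
```

The explicit string-to-bool conversion matters because any non-empty string, including "False", is truthy in Python.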
Things to try:
1. Add an error to the code, e.g. raise ValueError("I'm an error"), and see how it shows up in the Hamilton UI.
2. In models.py, change "data_set": source("data_set_v1"), to "data_set": source("data_set_v2"), and update what is requested in run.py accordingly (i.e. change/add saving data_set_v2). Then see how the lineage changes in the Hamilton UI.
3. Add a new feature to features.py and then to a dataset. Execute it and then compare the data observed in the Hamilton UI against a prior run.