Ibis is a portable dataframe library that lets you write procedural data transformations in Python and execute them directly on various SQL backends (DuckDB, Snowflake, Postgres, Flink; see the full list). Hamilton provides a declarative way to define testable, modular, self-documenting dataflows that encode lineage and metadata.
In this example, we'll show how to get started with creating feature transformations and training a machine learning model. You'll learn the basics of Ibis and IbisML and how they integrate with Hamilton.
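To give a sense of the integration, here is a minimal sketch (not this example's actual code) of a Hamilton dataflow module that builds features with Ibis table expressions. Each function defines a node and its parameter names declare its dependencies; the input `raw_data_path` and the columns `amount` and `n_purchases` are hypothetical.

```python
import ibis
import ibis.expr.types as ir


def raw_table(raw_data_path: str) -> ir.Table:
    """Load the raw data as a lazily evaluated Ibis table expression."""
    return ibis.read_csv(raw_data_path)


def feature_table(raw_table: ir.Table) -> ir.Table:
    """Derive a feature column; `amount` and `n_purchases` are hypothetical columns."""
    return raw_table.mutate(
        avg_purchase=raw_table.amount / raw_table.n_purchases,
    )
```

Because the functions return Ibis expressions, nothing is computed until a backend executes the expression, which is what keeps the same dataflow portable across the SQL engines listed above.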
Follow these steps to get the example working:
1. Create and activate a virtual environment:
   `python -m venv venv && . venv/bin/activate`
2. Install the requirements:
   `pip install -r requirements.txt`
3. Execute the Hamilton feature engineering dataflow at the table or column level:
   `python run.py --level table` or `python run.py --level column`
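Under the hood, run.py follows the standard Hamilton driver pattern: build a driver from the chosen module and request the outputs you want. The sketch below shows that pattern; the requested node name `feature_table` and the input value are illustrative, not the example's exact code.

```python
from hamilton import driver

import table_dataflow  # or column_dataflow, depending on --level

# Build a driver from the dataflow module and execute the requested nodes.
dr = driver.Builder().with_modules(table_dataflow).build()
results = dr.execute(
    ["feature_table"],                     # hypothetical output node
    inputs={"raw_data_path": "data.csv"},  # hypothetical input
)
print(results["feature_table"])
```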
- `table_dataflow.py` and `column_dataflow.py` contain the same Ibis feature engineering dataflow, expressed at different levels of granularity.
- `tables.png` and `columns.png` were generated by Hamilton directly from the code.
- `ibis_feature_set.png` was generated by Ibis and describes the atomic data transformations executed by the expression.
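If you want to produce similar artifacts yourself, Hamilton can render the dataflow DAG and Ibis can render an expression graph (both need graphviz installed). The snippet below is a hedged sketch; the output file name and node name are assumptions, not the files shipped with this example.

```python
from hamilton import driver

import table_dataflow

dr = driver.Builder().with_modules(table_dataflow).build()
dr.display_all_functions("my_dataflow.png")  # Hamilton-rendered DAG, like tables.png

# Grab the Ibis expression produced by the dataflow and visualize it,
# similar in spirit to ibis_feature_set.png.
expr = dr.execute(["feature_table"], inputs={"raw_data_path": "data.csv"})["feature_table"]
expr.visualize()
```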