docs/hamilton-ui/ui.rst - hamilton - Git at Google

 --------------------
 UI Overview
 --------------------

 Hamilton comes with a fully open-source UI that can be run both for local deployment and on a remote server.
 The UI consists of the following features:

 1. Telemetry for hamilton executions -- both on the history of executions and the data itself.
 2. A feature/artifact catalog for browsing/connecting executions of nodes -> results.
 3. A dataflow (i.e. DAG) visualizer for exploring and looking at your code, and determining lineage.
 4. A project explorer for viewing curating projects and viewing versions of your Hamilton dataflows.

 In short, the Hamilton UI aims to combine a large swath of MLOps/data observability systems in one simple application.

 .. image:: ../_static/hamilton_ui.jpeg
     :alt: Hamilton UI

 ---

 The Hamilton UI has two modes:
 1. Run locally using sqlite3
 2. Run on docker images with postgres (meant for deployment)

 ----------
 Local Mode
 ----------

 To run the hamilton UI in local mode, you can do the following:

 .. code-block:: bash

     pip install "sf-hamilton[ui,sdk]"
     hamilton ui
     # python -m hamilton.cli.__main__ ui # on windows

 This will launch a browser window in localhost:8241. You can then navigate to the UI and start using it!
 While this can potentially handle a small production workflow, you may want to run on postgres with a separate frontend/backend/db
 for full scalability and a multi-read/write db.

 --------------------
 Docker/Deployed Mode
 --------------------

 The Hamilton UI can be contained within a set of Docker images. You launch with `docker-compose <https://docs.docker.com/compose/>`_, and it will start up the UI, the backend server,
 and a Postgres database. If you'd like a quick overview of some of the features, you can watch the following:

 .. raw:: html

     <iframe width="560" height="315" src="https://www.youtube.com/embed/0VIVSeN7Ij8?si=i3vTsfTNorzh5y2C" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>


 Note: if you run into the "Invalid HTTP_HOST" error, then please set the environment variable `HAMILTON_ALLOWED_HOSTS="*"` (or comma separated list of domains of choice) for the backend docker container. You can inject
 this via `-e` or in the `docker-compose[-prod].yml` file itself.

 Install
 -------

 If you'd like a video walkthrough on getting set up, you can watch the following:

 .. raw:: html

     <iframe width="560" height="315" src="https://www.youtube.com/embed/DPfxlTwaNsM?si=gks5oOAWsNPSJbe_" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>


 As prerequisites, you will need to have Docker installed -- you can follow instructions  `here <https://docs.docker.com/engine/install/>`_.

 1. Clone the Hamilton repository locally

 .. code-block:: bash

     git clone https://github.com/dagworks-inc/hamilton

 2. Navigate to the ``hamilton/ui`` directory

 .. code-block:: bash

     cd hamilton/ui


 3. Execute the installation script with the following command

 .. code-block:: bash

     ./run.sh


 This will:

 - Pull all Docker images from the Docker Hub
 - Start a local Postgres database
 - Start the backend server
 - Start the frontend server

 This takes a bit of time! So be patient. The server will be running on port 8242.

 4. Then navigate to ``http://localhost:8242`` in your browser, and enter your email (this will be the username used within the app).

 Building the Docker Images locally
 -----------------------------------
 If building the Docker containers from scratch, increase your Docker memory to 10gb or more -- you can do this in the Docker Desktop settings.

 To build the images locally, you can run the following command:

 .. code-block:: bash

     # from the hamilton/ui directory
     ./dev.sh --build

 This will build the containers from scratch. If you just want to mount the local code, you can run just

 .. code-block:: bash

     ./dev.sh

 Self-Hosting
 -------------

 If you know docker, you should be good to go. The one environment variable to know is `HAMILTON_ALLOWED_HOSTS`, which you can set to `*` to allow all hosts, or
 a comma separated list of hosts you want to allow.

 Please reach out to us if you want to deploy on your own infrastructure and need help - `join slack <https://join.slack.com/t/hamilton-opensource/shared_invite/zt-2niepkra8-DGKGf_tTYhXuJWBTXtIs4g>`_.
 More extensive self-hosting documentation is in the works, e.g. Snowflake, Databricks, AWS, GCP, Azure, etc.; we'd love a helm
 chart contribution!


 Running on Snowflake
 ---------------------
 You can run the Hamilton UI on Snowflake Container Services. For a detailed guide, see the blog post
 `Observability of Python code and application logic with Hamilton UI on Snowflake Container Services <https://medium.com/@pkantyka/observability-of-python-code-and-application-logic-with-hamilton-ui-on-snowflake-container-services-a26693b46635>`_ by
 `Greg Kantyka <https://medium.com/@pkantyka>`_ and the `Hamilton Snowflake Example <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/snowflake/hamilton_ui>`_.

 -----------
 Get started
 -----------

 Now that you have your server running, you can run a simple dataflow and watch it in the UI!
 You can follow instructions in the UI when you create a new project, or follow the instructions here.

 First, install the SDK:

 .. code-block:: bash

     pip install "sf-hamilton[sdk]"

 Then, navigate to the project page (dashboard/projects), in the running UI, and click the green ``+ New DAG`` button.

 .. image:: ../_static/new_project.png

 Remember the project ID -- you'll use it for the next steps.

 Existing Hamilton Code
 ----------------------
 Add the following adapter to your code if you have existing Hamilton code:

 .. code-block:: python

     from hamilton_sdk import adapters

     tracker = adapters.HamiltonTracker(
        project_id=PROJECT_ID_FROM_ABOVE,
        username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
        dag_name="my_version_of_the_dag",
        tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"}
     )

     dr = (
       driver.Builder()
         .with_config(your_config)
         .with_modules(*your_modules)
         .with_adapters(tracker)
         .build()
     )

 Then run your DAG, and follow the links in the logs! Note that the link is correct if you're using
 the local mode -- if you're on postgres it links to 8241 (but you'll want to follow it to 8241).

 I need some Hamilton code to run
 --------------------------------
 If you don't have Hamilton code to run this with, you can run Hamilton UI example under `examples/hamilton_ui <https://github.com/dagworks-inc/hamilton/tree/main/examples/hamilton_ui>`_:

 .. code-block:: bash

     # we assume you're in the Hamilton repository root
     cd examples/hamilton_ui
     # make sure you have the right python packages installed
     pip install -r requirements.txt
     # run the pipeline providing the email and project_id you created in the UI
     python run.py --email <email> --project_id <project_id>

 You should see links in the `logs to the UI <http://localhost:8242/dashboard/projects>`_, where you can see the DAG run + the data summaries captured.

 ----------
 Features
 ----------

 Once you get to the UI, you can navigate to the projects page (left hand nav-bar). Assuming you have created a project
 and logged to it, you can then navigate to view it and then more details about it. E.g. versions, code, lineage, catalog, execution runs.
 See below for a few screenshots of the UI.


 Dataflow versioning
 --------------------

 Select a dataflow versions to compare and visualize.

 .. image:: ../_static/version_tracking.png
     :alt: DAG Version Tracking

 Assets/features catalog
 -----------------------

 View functions, nodes, and assets across a history of runs.

 .. image:: ../_static/catalog.png
     :alt: Catalog

 Browser
 --------

 View dataflow structure and code.


 .. image:: ../_static/code_browser.png
     :alt: Browser

 .. image:: ../_static/dag_view.png
     :alt: Browser

 Run tracking + telemetry
 -------------------------

 View a history of runs, telemetry on runs/comparison, and data for specific runs:

 .. image:: ../_static/run_tracking.png
     :alt: Run Tracking

 .. image:: ../_static/run_telemetry.png
     :alt: Run Telemetry

 .. image:: ../_static/run_data.png
     :alt: Run Data


 ------------------
 SDK Configuration
 ------------------
 This section documents HamiltonTracker configuration options.

 Changing where data is sent
 ----------------------------
 You can change where telemetry is logged by passing in `hamilton_api_url` and/or `hamilton_ui_url` to the
 HamiltonTracker constructor. By default, these are set to `localhost:8241/8242`.

 .. code-block:: python

     from hamilton_sdk import adapters

     tracker = adapters.HamiltonTracker(
        project_id=PROJECT_ID_FROM_ABOVE,
        username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
        dag_name="my_version_of_the_dag",
        tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"},
        hamilton_api_url="http://YOUR_DOMAIN_HERE:8241",
        hamilton_ui_url="http://YOUR_DOMAIN_HERE:8242" # if using docker the UI is on 8242.
     )

     dr = (
       driver.Builder()
         .with_config(your_config)
         .with_modules(*your_modules)
         .with_adapters(tracker)
         .build()
     )


 Changing behavior of what is captured
 -------------------------------------
 By default, a lot is captured and sent to the Hamilton UI.

 Here are a few options that can change that - these can be found
 in `hamilton_sdk.tracking.constants`. You can either change the defaults by
 directly changing the constants, by specifying them in a config file, or via environment variables.

 Here we first explain the options:

 .. table:: Simple Invocation
    :align: left

    +-----------------------------+-----------------------------+----------------------------------------------------------+
    | Option                      | Default                     | Explanation                                              |
    +=============================+=============================+==========================================================+
    | CAPTURE_DATA_STATISTICS     | True                        | Whether to capture any data insights/statistics          |
    +-----------------------------+-----------------------------+----------------------------------------------------------+
    | MAX_LIST_LENGTH_CAPTURE     | 50                          | Max length for list capture                              |
    +-----------------------------+-----------------------------+----------------------------------------------------------+
    | MAX_DICT_LENGTH_CAPTURE     | 100                         | Max length for dict capture                              |
    +-----------------------------+-----------------------------+----------------------------------------------------------+
    | DEFAULT_CONFIG_URI          | ~/.hamilton.conf            | Default config file URI.                                 |
    +-----------------------------+-----------------------------+----------------------------------------------------------+


 To change the defaults via a config file, you can do the following:

 .. code-block:: ini

     [SDK_CONSTANTS]
     MAX_LIST_LENGTH_CAPTURE=100
     MAX_DICT_LENGTH_CAPTURE=200

     # save this to ~/.hamilton.conf


 To change the defaults via environment variables, you can do the following, prefixing them with `HAMILTON_`:

 .. code-block:: bash

     export HAMILTON_MAX_LIST_LENGTH_CAPTURE=100
     export HAMILTON_MAX_DICT_LENGTH_CAPTURE=200
     python run_my_hamilton_code.py

 To change the defaults directly, you can do the following:

 .. code-block:: python

     from hamilton_sdk.tracking import constants

     constants.MAX_LIST_LENGTH_CAPTURE = 100
     constants.MAX_DICT_LENGTH_CAPTURE = 200

     tracker = adapters.HamiltonTracker(
        project_id=PROJECT_ID_FROM_ABOVE,
        username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
        dag_name="my_version_of_the_dag",
        tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"}
     )

     dr = (
       driver.Builder()
         .with_config(your_config)
         .with_modules(*your_modules)
         .with_adapters(tracker)
         .build()
     )
     dr.execute(...)

 In terms of precedence, the order is:

 1. Module default.
 2. Config file values.
 3. Environment variables.
 4. Directly set values.
	--------------------
	UI Overview
	--------------------

	Hamilton comes with a fully open-source UI that can be run both for local deployment and on a remote server.
	The UI consists of the following features:

	1. Telemetry for hamilton executions -- both on the history of executions and the data itself.
	2. A feature/artifact catalog for browsing/connecting executions of nodes -> results.
	3. A dataflow (i.e. DAG) visualizer for exploring and looking at your code, and determining lineage.
	4. A project explorer for viewing curating projects and viewing versions of your Hamilton dataflows.

	In short, the Hamilton UI aims to combine a large swath of MLOps/data observability systems in one simple application.

	.. image:: ../_static/hamilton_ui.jpeg
	:alt: Hamilton UI

	---

	The Hamilton UI has two modes:
	1. Run locally using sqlite3
	2. Run on docker images with postgres (meant for deployment)

	----------
	Local Mode
	----------

	To run the hamilton UI in local mode, you can do the following:

	.. code-block:: bash

	pip install "sf-hamilton[ui,sdk]"
	hamilton ui
	# python -m hamilton.cli.__main__ ui # on windows

	This will launch a browser window in localhost:8241. You can then navigate to the UI and start using it!
	While this can potentially handle a small production workflow, you may want to run on postgres with a separate frontend/backend/db
	for full scalability and a multi-read/write db.

	--------------------
	Docker/Deployed Mode
	--------------------

	The Hamilton UI can be contained within a set of Docker images. You launch with `docker-compose <https://docs.docker.com/compose/>`_, and it will start up the UI, the backend server,
	and a Postgres database. If you'd like a quick overview of some of the features, you can watch the following:

	.. raw:: html

	<iframe width="560" height="315" src="https://www.youtube.com/embed/0VIVSeN7Ij8?si=i3vTsfTNorzh5y2C" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>


	Note: if you run into the "Invalid HTTP_HOST" error, then please set the environment variable `HAMILTON_ALLOWED_HOSTS="*"` (or comma separated list of domains of choice) for the backend docker container. You can inject
	this via `-e` or in the `docker-compose[-prod].yml` file itself.

	Install
	-------

	If you'd like a video walkthrough on getting set up, you can watch the following:

	.. raw:: html

	<iframe width="560" height="315" src="https://www.youtube.com/embed/DPfxlTwaNsM?si=gks5oOAWsNPSJbe_" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>


	As prerequisites, you will need to have Docker installed -- you can follow instructions `here <https://docs.docker.com/engine/install/>`_.

	1. Clone the Hamilton repository locally

	.. code-block:: bash

	git clone https://github.com/dagworks-inc/hamilton

	2. Navigate to the ``hamilton/ui`` directory

	.. code-block:: bash

	cd hamilton/ui


	3. Execute the installation script with the following command

	.. code-block:: bash

	./run.sh


	This will:

	- Pull all Docker images from the Docker Hub
	- Start a local Postgres database
	- Start the backend server
	- Start the frontend server

	This takes a bit of time! So be patient. The server will be running on port 8242.

	4. Then navigate to ``http://localhost:8242`` in your browser, and enter your email (this will be the username used within the app).

	Building the Docker Images locally
	-----------------------------------
	If building the Docker containers from scratch, increase your Docker memory to 10gb or more -- you can do this in the Docker Desktop settings.

	To build the images locally, you can run the following command:

	.. code-block:: bash

	# from the hamilton/ui directory
	./dev.sh --build

	This will build the containers from scratch. If you just want to mount the local code, you can run just

	.. code-block:: bash

	./dev.sh

	Self-Hosting
	-------------

	If you know docker, you should be good to go. The one environment variable to know is `HAMILTON_ALLOWED_HOSTS`, which you can set to `*` to allow all hosts, or
	a comma separated list of hosts you want to allow.

	Please reach out to us if you want to deploy on your own infrastructure and need help - `join slack <https://join.slack.com/t/hamilton-opensource/shared_invite/zt-2niepkra8-DGKGf_tTYhXuJWBTXtIs4g>`_.
	More extensive self-hosting documentation is in the works, e.g. Snowflake, Databricks, AWS, GCP, Azure, etc.; we'd love a helm
	chart contribution!


	Running on Snowflake
	---------------------
	You can run the Hamilton UI on Snowflake Container Services. For a detailed guide, see the blog post
	`Observability of Python code and application logic with Hamilton UI on Snowflake Container Services <https://medium.com/@pkantyka/observability-of-python-code-and-application-logic-with-hamilton-ui-on-snowflake-container-services-a26693b46635>`_ by
	`Greg Kantyka <https://medium.com/@pkantyka>`_ and the `Hamilton Snowflake Example <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/snowflake/hamilton_ui>`_.

	-----------
	Get started
	-----------

	Now that you have your server running, you can run a simple dataflow and watch it in the UI!
	You can follow instructions in the UI when you create a new project, or follow the instructions here.

	First, install the SDK:

	.. code-block:: bash

	pip install "sf-hamilton[sdk]"

	Then, navigate to the project page (dashboard/projects), in the running UI, and click the green ``+ New DAG`` button.

	.. image:: ../_static/new_project.png

	Remember the project ID -- you'll use it for the next steps.

	Existing Hamilton Code
	----------------------
	Add the following adapter to your code if you have existing Hamilton code:

	.. code-block:: python

	from hamilton_sdk import adapters

	tracker = adapters.HamiltonTracker(
	project_id=PROJECT_ID_FROM_ABOVE,
	username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
	dag_name="my_version_of_the_dag",
	tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"}
	)

	dr = (
	driver.Builder()
	.with_config(your_config)
	.with_modules(*your_modules)
	.with_adapters(tracker)
	.build()
	)

	Then run your DAG, and follow the links in the logs! Note that the link is correct if you're using
	the local mode -- if you're on postgres it links to 8241 (but you'll want to follow it to 8241).

	I need some Hamilton code to run
	--------------------------------
	If you don't have Hamilton code to run this with, you can run Hamilton UI example under `examples/hamilton_ui <https://github.com/dagworks-inc/hamilton/tree/main/examples/hamilton_ui>`_:

	.. code-block:: bash

	# we assume you're in the Hamilton repository root
	cd examples/hamilton_ui
	# make sure you have the right python packages installed
	pip install -r requirements.txt
	# run the pipeline providing the email and project_id you created in the UI
	python run.py --email <email> --project_id <project_id>

	You should see links in the `logs to the UI <http://localhost:8242/dashboard/projects>`_, where you can see the DAG run + the data summaries captured.

	----------
	Features
	----------

	Once you get to the UI, you can navigate to the projects page (left hand nav-bar). Assuming you have created a project
	and logged to it, you can then navigate to view it and then more details about it. E.g. versions, code, lineage, catalog, execution runs.
	See below for a few screenshots of the UI.


	Dataflow versioning
	--------------------

	Select a dataflow versions to compare and visualize.

	.. image:: ../_static/version_tracking.png
	:alt: DAG Version Tracking

	Assets/features catalog
	-----------------------

	View functions, nodes, and assets across a history of runs.

	.. image:: ../_static/catalog.png
	:alt: Catalog

	Browser
	--------

	View dataflow structure and code.


	.. image:: ../_static/code_browser.png
	:alt: Browser

	.. image:: ../_static/dag_view.png
	:alt: Browser

	Run tracking + telemetry
	-------------------------

	View a history of runs, telemetry on runs/comparison, and data for specific runs:

	.. image:: ../_static/run_tracking.png
	:alt: Run Tracking

	.. image:: ../_static/run_telemetry.png
	:alt: Run Telemetry

	.. image:: ../_static/run_data.png
	:alt: Run Data


	------------------
	SDK Configuration
	------------------
	This section documents HamiltonTracker configuration options.

	Changing where data is sent
	----------------------------
	You can change where telemetry is logged by passing in `hamilton_api_url` and/or `hamilton_ui_url` to the
	HamiltonTracker constructor. By default, these are set to `localhost:8241/8242`.

	.. code-block:: python

	from hamilton_sdk import adapters

	tracker = adapters.HamiltonTracker(
	project_id=PROJECT_ID_FROM_ABOVE,
	username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
	dag_name="my_version_of_the_dag",
	tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"},
	hamilton_api_url="http://YOUR_DOMAIN_HERE:8241",
	hamilton_ui_url="http://YOUR_DOMAIN_HERE:8242" # if using docker the UI is on 8242.
	)

	dr = (
	driver.Builder()
	.with_config(your_config)
	.with_modules(*your_modules)
	.with_adapters(tracker)
	.build()
	)


	Changing behavior of what is captured
	-------------------------------------
	By default, a lot is captured and sent to the Hamilton UI.

	Here are a few options that can change that - these can be found
	in `hamilton_sdk.tracking.constants`. You can either change the defaults by
	directly changing the constants, by specifying them in a config file, or via environment variables.

	Here we first explain the options:

	.. table:: Simple Invocation
	:align: left

	+-----------------------------+-----------------------------+----------------------------------------------------------+
	\| Option \| Default \| Explanation \|
	+=============================+=============================+==========================================================+
	\| CAPTURE_DATA_STATISTICS \| True \| Whether to capture any data insights/statistics \|
	+-----------------------------+-----------------------------+----------------------------------------------------------+
	\| MAX_LIST_LENGTH_CAPTURE \| 50 \| Max length for list capture \|
	+-----------------------------+-----------------------------+----------------------------------------------------------+
	\| MAX_DICT_LENGTH_CAPTURE \| 100 \| Max length for dict capture \|
	+-----------------------------+-----------------------------+----------------------------------------------------------+
	\| DEFAULT_CONFIG_URI \| ~/.hamilton.conf \| Default config file URI. \|
	+-----------------------------+-----------------------------+----------------------------------------------------------+


	To change the defaults via a config file, you can do the following:

	.. code-block:: ini

	[SDK_CONSTANTS]
	MAX_LIST_LENGTH_CAPTURE=100
	MAX_DICT_LENGTH_CAPTURE=200

	# save this to ~/.hamilton.conf


	To change the defaults via environment variables, you can do the following, prefixing them with `HAMILTON_`:

	.. code-block:: bash

	export HAMILTON_MAX_LIST_LENGTH_CAPTURE=100
	export HAMILTON_MAX_DICT_LENGTH_CAPTURE=200
	python run_my_hamilton_code.py

	To change the defaults directly, you can do the following:

	.. code-block:: python

	from hamilton_sdk.tracking import constants

	constants.MAX_LIST_LENGTH_CAPTURE = 100
	constants.MAX_DICT_LENGTH_CAPTURE = 200

	tracker = adapters.HamiltonTracker(
	project_id=PROJECT_ID_FROM_ABOVE,
	username="USERNAME/EMAIL_YOU_PUT_IN_THE_UI",
	dag_name="my_version_of_the_dag",
	tags={"environment": "DEV", "team": "MY_TEAM", "version": "X"}
	)

	dr = (
	driver.Builder()
	.with_config(your_config)
	.with_modules(*your_modules)
	.with_adapters(tracker)
	.build()
	)
	dr.execute(...)

	In terms of precedence, the order is:

	1. Module default.
	2. Config file values.
	3. Environment variables.
	4. Directly set values.