This application lets you search arXiv for PDFs, or import arbitrary PDF files, and search over them using LLMs. For each file, the text is divided into chunks that are embedded with OpenAI and stored in Weaviate. When you query the system, the most relevant chunks are retrieved and a summary answer is generated with OpenAI.
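The chunking step can be sketched as follows. The function name and the `chunk_size`/`overlap` parameters are illustrative assumptions, not the exact code used in this example:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document's text into overlapping, fixed-size chunks.

    chunk_size and overlap are hypothetical defaults; the real pipeline
    may size chunks by tokens rather than characters.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be embedded (e.g. via OpenAI's embeddings API)
# and stored in Weaviate alongside its source-file metadata.
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighboring chunks.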
The ingestion and retrieval steps are implemented as Hamilton dataflows and exposed via FastAPI endpoints. The frontend is built with Streamlit and exposes the different functionalities through a simple web UI. Everything is packaged as containers with Docker Compose.
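A Hamilton dataflow is just a module of plain Python functions: each function name defines a node, and its parameter names declare which nodes or inputs it depends on. A minimal sketch of what the ingestion dataflow might look like (node and parameter names here are illustrative, not this example's actual code):

```python
# Hedged sketch of a Hamilton dataflow module; node and parameter
# names are illustrative assumptions, not this example's actual code.

def raw_text(pdf_text: str) -> str:
    """Node 'raw_text': depends on the external input 'pdf_text'."""
    return pdf_text.strip()

def text_chunks(raw_text: str, chunk_size: int = 200) -> list[str]:
    """Node 'text_chunks': depends on the 'raw_text' node above."""
    return [raw_text[i:i + chunk_size] for i in range(0, len(raw_text), chunk_size)]
```

Hamilton's driver builds the dependency graph from such a module and executes the requested nodes, e.g. `driver.Driver({}, ingestion_module).execute(["text_chunks"], inputs={"pdf_text": ...})`; the FastAPI endpoints then simply invoke the driver.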
This example draws from previous simpler examples (Knowledge Retrieval, Modular LLM Stack, PDF Summarizer).
Below is a list of references for the technical concepts found in this example:
To run the application:

1. Clone this repository: `git clone https://github.com/dagworks-inc/hamilton.git`
2. Move to the example directory: `cd hamilton/examples/LLM_Workflows/retrieval_augmented_generation`
3. Create a `.env` file from the template with `cp .env.template .env`
4. Edit `.env` with your OpenAI API key such that `OPENAI_API_KEY=YOUR_API_KEY`
5. Build and launch the containers with `docker compose up -d --build`

Alternatively, make the build script executable with `chmod +x build_app.sh`, then run `./build_app.sh DOWNLOAD_DIRECTORY YOUR_OPENAI_API_KEY`.

- Use `docker compose down` to stop the containers.
- Use `docker compose logs -f` to tail the logs (ctrl+c to stop tailing the logs).
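The `docker compose` commands above bring up the whole stack. A minimal sketch of what such a compose file could look like — service names, ports, build paths, and image tags here are assumptions, not this example's actual `docker-compose.yml`:

```yaml
services:
  weaviate:            # vector store holding the embedded chunks
    image: semitechnologies/weaviate:latest
    ports:
      - "8083:8080"
  fastapi:             # backend exposing the Hamilton dataflows
    build: ./backend   # hypothetical build context
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - weaviate
  streamlit:           # web UI calling the FastAPI backend
    build: ./frontend  # hypothetical build context
    ports:
      - "8080:8080"
    depends_on:
      - fastapi
```

The `depends_on` entries only order container startup; the containers share the default compose network, so each service can reach the others by service name.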