This example demonstrates how to use a Named Entity Recognition (NER) model to extract entities from text, alongside embeddings, to make querying more precise. Specifically, we'll use the extracted entities to filter down to the documents that mention the entities of interest.
The general concept we're showing here is that extra metadata you extract, such as the entities a text mentions, can be used to find the most relevant text to pass to an LLM in a retrieval augmented generation (RAG) context.
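The idea of filtering on extracted entities before ranking by embedding similarity can be illustrated with a minimal sketch. This is not the code from this example: the toy corpus, the 3-dimensional embeddings, and the `retrieve` and `cosine` helpers are all hypothetical, and a real pipeline would get the entities from an NER model and the vectors from an embedding model.

```python
import math

# Hypothetical toy corpus. In a real pipeline the "entities" would come from
# an NER model and the "embedding" from an embedding model; the values here
# are made up for illustration only.
DOCS = [
    {"text": "Apple opened a new office in Berlin.",
     "entities": {"apple", "berlin"}, "embedding": [0.9, 0.1, 0.0]},
    {"text": "Berlin hosts a large film festival.",
     "entities": {"berlin"}, "embedding": [0.2, 0.9, 0.1]},
    {"text": "Apple released quarterly earnings.",
     "entities": {"apple"}, "embedding": [0.8, 0.0, 0.3]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, query_entities, docs, top_k=2):
    """Keep only docs mentioning every query entity, then rank by similarity."""
    wanted = {e.lower() for e in query_entities}
    # Metadata filter: a doc qualifies if it contains all requested entities.
    candidates = [d for d in docs if wanted <= d["entities"]]
    # Rank the remaining candidates by embedding similarity to the query.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_embedding, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


results = retrieve([1.0, 0.0, 0.0], ["Apple"], DOCS)
for doc in results:
    print(doc["text"])
```

Filtering first means the similarity ranking only runs over documents that mention the entities of interest, so a document that is close in embedding space but about a different entity is never handed to the LLM.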
The pipeline we create can be seen in the image below.
To run this in a notebook:
1. Install the requirements by running pip install -r requirements.txt.
2. Install jupyter by running pip install jupyter.
3. Run jupyter notebook in the current directory and open notebook.ipynb.

Alternatively, open this notebook in Google Colab by clicking the button below:
To run this example from the command line:
Install the requirements by running pip install -r requirements.txt
Run the script python run.py. Some example commands:
To see the full list of commands run python run.py --help.