This README introduces the basic UI functions and details. Updates and improvements are welcome at any time.
Construct a knowledge graph, chunk vectors, and graph vid vectors from the text.
```mermaid
graph TD;
    A[Raw Text] --> B[Text Segmentation]
    B --> C[Vectorization]
    C --> D[Store in Vector Database]
    A --> F[Text Segmentation]
    F --> G["LLM extracts graph based on schema \nand segmented text"]
    G --> H["Store graph in Graph Database, \nautomatically vectorize vertices \nand store in Vector Database"]
    I[Retrieve vertices from Graph Database] --> J["Vectorize vertices and store in Vector Database \nNote: Incremental update"]
```
Get RAG Info
Get Vector Index Info: Retrieve vector index information
Get Graph Index Info: Retrieve graph index information
Clear RAG Data
Import into Vector: Convert the text in Doc(s) into vectors (requires chunking the text first and then converting the chunks into vectors)
Extract Graph Data (1): Extract graph data from Doc(s) based on the Schema, using the Graph Extract Prompt Header and chunked content as the prompt
Load into GraphDB (2): Store the extracted graph data into the database (automatically calls Update Vid Embedding to store vectors in the vector database)
Update Vid Embedding: Convert graph vids (vertex IDs) into vectors
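The two vectorization buttons above can be sketched as follows. This is a hypothetical illustration, not the project's real API: `embed()` is a stub, and the "databases" are plain Python containers.

```python
# Hypothetical sketch of Import into Vector / Update Vid Embedding;
# embed() is a stub, and dicts stand in for the vector database.

def embed(text: str) -> list[float]:
    # Stand-in embedding; real code would call an embedding model.
    return [float(ord(c)) for c in text[:4]]

def chunk_text(text: str, size: int = 200) -> list[str]:
    """Naive fixed-size chunking; the real splitter is smarter."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def import_into_vector(doc: str, vector_db: dict) -> None:
    # Chunk the text first, then convert each chunk into a vector.
    for chunk in chunk_text(doc):
        vector_db[chunk] = embed(chunk)

def update_vid_embedding(vids: list[str], vector_db: dict) -> None:
    # Incremental update: only vectorize vids not already stored.
    for vid in vids:
        if vid not in vector_db:
            vector_db[vid] = embed(vid)
```

Note the incremental check in `update_vid_embedding`: vids already present in the vector database are skipped, matching the "Note: Incremental update" in the diagram.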
The Import into Vector button in the previous module converts text chunks into vectors, and the Update Vid Embedding button converts graph vids into vectors. These vectors are stored separately and supplement the context for queries (answer generation) in this module. In other words, the previous module prepares the data for RAG (vectorization), while this module executes RAG.
This module consists of two parts:
The first part handles single queries, while the second part handles multiple queries at once. Below is an explanation of the first part.
```mermaid
graph TD;
    A[Question] --> B["Vectorize the question and search \nfor the most similar chunk in the Vector Database (chunk)"]
    A --> F[Extract keywords using LLM]
    F --> G["Match vertices precisely in Graph Database \nusing keywords; perform fuzzy matching in \nVector Database (graph vid)"]
    G --> H[Generate Gremlin query using matched vertices and query with LLM]
    H --> I["Execute Gremlin query; if successful, finish; if failed, fallback to BFS"]
    B --> J[Sort results]
    I --> J
    J --> K[Generate answer]
```
Use the extracted keywords to:
First, perform an exact match in the graph database.
If no match is found, perform a fuzzy match in the vector database (graph vid vector) to retrieve relevant vertices.
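The exact-then-fuzzy matching above can be sketched like this. It is a simplified illustration: the real fuzzy step searches the graph vid vectors by embedding similarity, while here `difflib` string similarity stands in for it.

```python
# Hypothetical sketch of keyword-to-vertex matching. difflib string
# similarity stands in for the real vector-based fuzzy match.
import difflib

def match_vertices(keywords: list[str], graph_vids: list[str]) -> list[str]:
    matched = []
    for kw in keywords:
        if kw in graph_vids:
            # Exact match in the graph database.
            matched.append(kw)
        else:
            # Fuzzy match (stand-in for graph vid vector similarity).
            matched.extend(
                difflib.get_close_matches(kw, graph_vids, n=1, cutoff=0.6)
            )
    return matched
```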
text2gql: Call the text2gql-related interface, using the matched vertices as entities to convert the question into a Gremlin query and execute it in the graph database.
BFS: If text2gql fails (LLM-generated queries might be invalid), fall back to executing a graph query using a predefined Gremlin query template (essentially a BFS traversal).
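The text2gql-with-BFS-fallback logic amounts to a try/except around the LLM-generated query. A minimal sketch, where `llm_text2gremlin`, `execute`, and the BFS template string are all hypothetical stand-ins for the real interfaces:

```python
# Hypothetical sketch: try the LLM-generated Gremlin first, fall back
# to a predefined BFS-style Gremlin template if execution fails.

BFS_TEMPLATE = "g.V({vids}).repeat(both().simplePath()).times(2).path()"

def graph_query(question, vids, llm_text2gremlin, execute):
    try:
        # LLM-generated queries might be invalid, so execution can fail.
        gremlin = llm_text2gremlin(question, vids)
        return execute(gremlin)
    except Exception:
        # Fallback: predefined template (essentially a BFS traversal)
        # seeded with the matched vertices.
        vid_list = ", ".join(repr(v) for v in vids)
        return execute(BFS_TEMPLATE.format(vids=vid_list))
```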
Convert the query into a vector.
Search for the most similar content in the chunk vector dataset in the vector database.
After retrieval, sort the retrieved results to construct the final prompt.
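The merge-and-sort step can be sketched as below. This is an illustrative simplification (the function names and scoring are assumptions, not the project's actual code): chunk hits and graph hits are merged, deduplicated, and ranked by score before being placed into the prompt.

```python
# Hypothetical sketch of ranking retrieval results and building the
# final prompt; hits are (text, score) pairs from the two retrievers.

def rank_results(chunk_hits, graph_hits, top_k=4):
    # Keep the best score per unique text, then sort descending.
    best: dict[str, float] = {}
    for text, score in chunk_hits + graph_hits:
        best[text] = max(score, best.get(text, float("-inf")))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, contexts):
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"
```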
Generate answers based on different prompt configurations and display them in different output fields:
Converts natural language queries into Gremlin queries.
This module consists of two parts:
The first part is straightforward, so the focus is on the second part.
```mermaid
graph TD;
    A[Gremlin Pairs File] --> C[Vectorize query]
    C --> D[Store in Vector Database]
    F[Natural Language Query] --> G["Search for the most similar query \nin the Vector Database \n(If no Gremlin pairs exist in the Vector Database, \ndefault files will be automatically vectorized) \nand retrieve the corresponding Gremlin"]
    G --> H["Add the matched pair to the prompt \nand use LLM to generate the Gremlin \ncorresponding to the Natural Language Query"]
```
Input the query (natural language) into the Natural Language Query field.
Input the graph schema into the Schema field.
Click the Text2Gremlin button, and the following execution logic applies:
Convert the query into a vector.
Construct the prompt:
Generate the Gremlin query using the constructed prompt.
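The retrieval-plus-few-shot-prompt flow above can be sketched as follows. It is a hedged illustration: the dot-product similarity, pair-store layout, and prompt template are assumptions, not the project's actual implementation.

```python
# Hypothetical sketch of Text2Gremlin: find the most similar stored
# (query, gremlin) pair and use it as a few-shot example in the prompt.

def nearest_pair(query_vec, pair_store):
    # pair_store: list of (vector, nl_query, gremlin) tuples.
    # Dot product stands in for the real vector-database search.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(pair_store, key=lambda p: dot(query_vec, p[0]))

def build_text2gremlin_prompt(query, schema, example_query, example_gremlin):
    # The matched pair becomes a few-shot example for the LLM.
    return (
        f"Schema:\n{schema}\n\n"
        f"Example:\nQuery: {example_query}\nGremlin: {example_gremlin}\n\n"
        f"Query: {query}\nGremlin:"
    )
```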
Input Gremlin queries to execute corresponding operations.