Langfuse is an open-source LLM engineering platform that provides trace tracking, performance evaluation, prompt management, and metrics monitoring for large language model applications. Langfuse on Doris uses Apache Doris as the analytics backend and is suitable for processing large-scale LLM application observability data.
This document describes how to deploy the Langfuse solution based on Apache Doris and integrate application traces through the Langfuse SDK, LangChain SDK, and LlamaIndex SDK.
When you want to uniformly record call traces, analyze model performance, manage prompts, and monitor cost and quality in LLM applications, you can use Langfuse on Doris. It uses Langfuse for application-side observability and Apache Doris to host OLAP analytics data.
| Capability | User Scenario |
|---|---|
| Trace tracking | Records the complete call traces and execution flows of LLM applications |
| Performance evaluation | Provides multi-dimensional model performance evaluation and quality analysis |
| Prompt management | Centrally manages and version-controls prompt templates |
| Metrics monitoring | Monitors application performance, cost, and quality metrics in real time |
Langfuse on Doris adopts a microservices architecture. Langfuse Web and Worker handle application interactions, API integration, and asynchronous tasks. PostgreSQL, Redis, and MinIO handle transactional data, cache and queue, and object storage respectively. Doris serves as the OLAP analytics backend that stores and queries observability data.
| Component | Port | Description |
|---|---|---|
| Langfuse Web | 3000 | Web interface and API service that provides user interaction and data ingestion |
| Langfuse Worker | 3030 | Asynchronous task processing, responsible for data processing and analytics tasks |
| PostgreSQL | 5432 | Transactional data storage that holds user configurations and metadata |
| Redis | 6379 | Cache layer and message queue that improves system response performance |
| MinIO | 9000 | Object storage service that stores raw events and multi-modal attachments |
| Doris FE | 9030, 8030 | Doris Frontend, responsible for receiving user requests, query parsing and planning, metadata management, and node management |
| Doris BE | 8040, 8050 | Doris Backend, responsible for data storage and query plan execution. Data is split into shards and stored in BEs in multiple replicas |
:::note
When deploying Apache Doris, you can choose either the integrated storage-compute architecture or the separated storage-compute architecture based on your hardware environment and business requirements.
In the Langfuse deployment, Docker Doris is not recommended for production environments. The Doris FE and Doris BE in the example Docker Compose are only intended to help users quickly experience the capabilities of Langfuse on Doris.
:::
flowchart TB User["UI, API, SDKs"] subgraph vpc["VPC"] Web["Web Server<br/>(langfuse/langfuse)"] Worker["Async Worker<br/>(langfuse/worker)"] Postgres["Postgres - OLTP<br/>(Transactional Data)"] Cache["Redis/Valkey<br/>(Cache, Queue)"] Doris["Doris - OLAP<br/>(Observability Data)"] S3["S3 / Blob Storage<br/>(Raw events, multi-modal attachments)"] end LLM["LLM API/Gateway<br/>(optional)"] User --> Web Web --> S3 Web --> Postgres Web --> Cache Web --> Doris Web -.->|"optional for playground"| LLM Cache --> Worker Worker --> Doris Worker --> Postgres Worker --> S3 Worker -.->|"optional for evals"| LLM
Before deployment, confirm software versions, hardware resources, and network connectivity. Doris is recommended to be deployed independently for better performance and stability.
| Component | Version Requirement | Description |
|---|---|---|
| Docker | 20.0+ | Container runtime environment |
| Docker Compose | 2.0+ | Container orchestration tool |
| Apache Doris | 2.1.10+ | Analytics database, must be deployed independently |
| Resource Type | Minimum Requirement | Recommended Configuration | Description |
|---|---|---|---|
| Memory | 8 GB | 16 GB+ | Supports concurrent execution of multiple services |
| Disk | 50 GB | 100 GB+ | Stores container data and logs |
| Network | 1 Gbps | 10 Gbps | Ensures data transfer performance |
Doris Cluster Preparation
Network Connectivity
:::tip Deployment Recommendation
It is recommended to deploy the Langfuse service components, including Web, Worker, Redis, and PostgreSQL, with Docker. Doris is recommended to be deployed independently for better performance and stability. For the detailed Doris deployment guide, refer to the official documentation.
:::
The Langfuse service connects to components such as Doris, PostgreSQL, and Redis through environment variables. Replace the example values according to your actual environment, especially the secrets, passwords, and service addresses.
| Parameter | Example Value | Description |
|---|---|---|
LANGFUSE_ANALYTICS_BACKEND | doris | Specifies Doris as the analytics backend |
DORIS_FE_HTTP_URL | http://localhost:8030 | Doris FE HTTP service address |
DORIS_FE_QUERY_PORT | 9030 | Doris FE query port |
DORIS_DB | langfuse | Doris database name |
DORIS_USER | root | Doris username |
DORIS_PASSWORD | 123456 | Doris password |
DORIS_MAX_OPEN_CONNECTIONS | 100 | Maximum number of database connections |
DORIS_REQUEST_TIMEOUT_MS | 300000 | Request timeout, in milliseconds |
| Parameter | Example Value | Description |
|---|---|---|
DATABASE_URL | postgresql://postgres:postgres@langfuse-postgres:5432/postgres | PostgreSQL database connection address |
NEXTAUTH_SECRET | your-debug-secret-key-here-must-be-long-enough | NextAuth authentication secret, used for session encryption |
SALT | your-super-secret-salt-with-at-least-32-characters-for-encryption | Data encryption salt, at least 32 characters |
ENCRYPTION_KEY | 0000000000000000000000000000000000000000000000000000000000000000 | Data encryption key, 64 characters |
NEXTAUTH_URL | http://localhost:3000 | Langfuse Web service address |
TZ | UTC | System time zone |
| Parameter | Example Value | Description |
|---|---|---|
REDIS_HOST | langfuse-redis | Redis service host address |
REDIS_PORT | 6379 | Redis service port |
REDIS_AUTH | myredissecret | Redis authentication password |
REDIS_TLS_ENABLED | false | Whether to enable TLS encryption |
REDIS_TLS_CA | - | Path to the TLS CA certificate |
REDIS_TLS_CERT | - | Path to the TLS client certificate |
REDIS_TLS_KEY | - | Path to the TLS private key |
| Parameter | Example Value | Description |
|---|---|---|
LANGFUSE_ENABLE_BACKGROUND_MIGRATIONS | false | Disables background migrations, must be turned off when using Doris |
LANGFUSE_AUTO_DORIS_MIGRATION_DISABLED | false | Enables Doris automatic migration |
This section provides a Docker Compose example that you can start directly. You can modify the configuration based on your actual deployment requirements.
wget https://apache-doris-releases.oss-cn-beijing.aliyuncs.com/extension/docker-langfuse-doris.tar.gz
After downloading and extracting, the directory structure of the Compose file and configuration files is as follows:
docker-langfuse-doris ├── docker-compose.yml └── doris-config └── fe_custom.conf
$ docker compose up -d [+] Running 9/9 ✔ Network docker-langfuse-doris_doris_internal Created 0.1s ✔ Network docker-langfuse-doris_default Created 0.1s ✔ Container doris_fe Healthy 13.8s ✔ Container langfuse-postgres Healthy 13.8s ✔ Container langfuse-redis Healthy 13.8s ✔ Container langfuse-minio Healthy 13.8s ✔ Container doris_be Healthy 54.3s ✔ Container langfuse-worker Started 54.8s ✔ Container langfuse-web Started
Check the service status. When all services are in the Healthy state, Compose has started successfully.
$ docker compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS doris_be apache/doris:be-2.1.11 "bash entry_point.sh" doris_be 2 minutes ago Up 2 minutes (healthy) 0.0.0.0:8040->8040/tcp, :::8040->8040/tcp, 0.0.0.0:8060->8060/tcp, :::8060->8060/tcp, 0.0.0.0:9050->9050/tcp, :::9050->9050/tcp, 0.0.0.0:9060->9060/tcp, :::9060->9060/tcp doris_fe apache/doris:fe-2.1.11 "bash init_fe.sh" doris_fe 2 minutes ago Up 2 minutes (healthy) 0.0.0.0:8030->8030/tcp, :::8030->8030/tcp, 0.0.0.0:9010->9010/tcp, :::9010->9010/tcp, 0.0.0.0:9030->9030/tcp, :::9030->9030/tcp langfuse-minio minio/minio "sh -c 'mkdir -p /da…" minio 2 minutes ago Up 2 minutes (healthy) 0.0.0.0:19090->9000/tcp, :::19090->9000/tcp, 127.0.0.1:19091->9001/tcp langfuse-postgres postgres:latest "docker-entrypoint.s…" postgres 2 minutes ago Up 2 minutes (healthy) 127.0.0.1:5432->5432/tcp langfuse-redis redis:7 "docker-entrypoint.s…" redis 2 minutes ago Up 2 minutes (healthy) 127.0.0.1:16379->6379/tcp langfuse-web selectdb/langfuse-web:latest "dumb-init -- ./web/…" langfuse-web 2 minutes ago Up About a minute (healthy) 0.0.0.0:13000->3000/tcp, :::13000->3000/tcp langfuse-worker selectdb/langfuse-worker:latest "dumb-init -- ./work…" langfuse-worker 2 minutes ago Up About a minute (healthy) 0.0.0.0:3030->3030/tcp, :::3030->3030/tcp
After the deployment is complete, access the Langfuse Web interface and finish project initialization.
Access address:
http://localhost:3000
Initialization steps:
After service initialization is complete, you can use the Langfuse SDK, LangChain SDK, or LlamaIndex SDK to integrate applications. The following examples use the DeepSeek API and write trace data to the local Langfuse service.
import os # Instead of: import openai from langfuse.openai import OpenAI # from langfuse import observe # Langfuse config os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-******-******" os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-******-******" os.environ["LANGFUSE_HOST"] = "http://localhost:3000" # use OpenAI client to access DeepSeek API client = OpenAI( base_url="https://api.deepseek.com" ) # ask a question question = "What are the characteristics of the Doris observability solution? Answer concisely and clearly." print(f"question: {question}") completion = client.chat.completions.create( model="deepseek-chat", messages=[ {"role": "user", "content": question} ] ) response = completion.choices[0].message.content print(f"response: {response}")
import os from langfuse.langchain import CallbackHandler from langchain_openai import ChatOpenAI # Langfuse config os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-******-******" os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-******-******" os.environ["LANGFUSE_HOST"] = "http://localhost:3000" # Create your LangChain components (using DeepSeek API) llm = ChatOpenAI( model="deepseek-chat", openai_api_base="https://api.deepseek.com" ) # ask a question question = "What are the characteristics of the Doris observability solution? Answer concisely and clearly." print(f"question: {question} \n") # Run your chain with Langfuse tracing try: # Initialize the Langfuse handler langfuse_handler = CallbackHandler() response = llm.invoke(question, config={"callbacks": [langfuse_handler]}) print(f"response: {response.content}") except Exception as e: print(f"Error during chain execution: {e}")
import os from langfuse import get_client from openinference.instrumentation.llama_index import LlamaIndexInstrumentor from llama_index.llms.deepseek import DeepSeek # Langfuse config os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-******-******" os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-******-******" os.environ["LANGFUSE_HOST"] = "http://localhost:3000" langfuse = get_client() # Initialize LlamaIndex instrumentation LlamaIndexInstrumentor().instrument() # Set up the DeepSeek class with the required model and API key llm = DeepSeek(model="deepseek-chat") # ask a question question = "What are the characteristics of the Doris observability solution? Answer concisely and clearly." print(f"question: {question} \n") with langfuse.start_as_current_span(name="llama-index-trace"): response = llm.complete(question) print(f"response: {response}")