---
title: ai-proxy
---
The ai-proxy plugin simplifies access to LLM providers and models by defining a standard request format that allows key fields in plugin configuration to be embedded into the request.
The plugin currently supports proxying requests to OpenAI; support for other LLM providers is planned. The standard request format is shown below:
| Name | Type | Required | Description |
|---|---|---|---|
| messages | Array | Yes | An array of message objects. |
| messages.role | String | Yes | Role of the message (`system`, `user`, `assistant`). |
| messages.content | String | Yes | Content of the message. |
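For instance, a request body in this format looks like the following (the message contents are illustrative and taken from the example later in this document):

```json
{
  "messages": [
    { "role": "system", "content": "You are a mathematician" },
    { "role": "user", "content": "What is 1+1?" }
  ]
}
```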
The following fields can be configured in the plugin:

| Field | Required | Type | Description |
|---|---|---|---|
| auth | Yes | Object | Authentication configuration. |
| auth.header | No | Object | Authentication headers. Keys must match the pattern `^[a-zA-Z0-9._-]+$`. |
| auth.query | No | Object | Authentication query parameters. Keys must match the pattern `^[a-zA-Z0-9._-]+$`. |
| model.provider | Yes | String | Name of the AI service provider (`openai` or `openai-compatible`). |
| model.name | Yes | String | Name of the model to execute. |
| model.options | No | Object | Key/value settings for the model. |
| override.endpoint | No | String | Override the default endpoint of the AI provider. |
| timeout | No | Integer | Timeout in milliseconds for requests to the LLM. Range: 1 to 60000. Default: 30000. |
| keepalive | No | Boolean | Enable keepalive for requests to the LLM. Default: true. |
| keepalive_timeout | No | Integer | Keepalive timeout in milliseconds for requests to the LLM. Minimum: 1000. Default: 60000. |
| keepalive_pool | No | Integer | Keepalive pool size for requests to the LLM. Minimum: 1. Default: 30. |
| ssl_verify | No | Boolean | Whether to verify the SSL certificate for requests to the LLM. Default: true. |
Create a route with the ai-proxy plugin like so:
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \ -H "X-API-KEY: ${ADMIN_API_KEY}" \ -d '{ "uri": "/anything", "plugins": { "ai-proxy": { "auth": { "header": { "Authorization": "Bearer <some-token>" } }, "model": { "provider": "openai", "name": "gpt-4", "options": { "max_tokens": 512, "temperature": 1.0 } } } }, "upstream": { "type": "roundrobin", "nodes": { "somerandom.com:443": 1 }, "scheme": "https", "pass_host": "node" } }'
The upstream node can be any arbitrary value because it will not be contacted.
Now send a request:
```shell
curl "http://127.0.0.1:9080/anything" -i -X POST \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'
```
You will receive a response like this:
{ "choices": [ { "finish_reason": "stop", "index": 0, "message": { "content": "The sum of \\(1 + 1\\) is \\(2\\).", "role": "assistant" } } ], "created": 1723777034, "id": "chatcmpl-9whRKFodKl5sGhOgHIjWltdeB8sr7", "model": "gpt-4o-2024-05-13", "object": "chat.completion", "system_fingerprint": "fp_abc28019ad", "usage": { "completion_tokens": 15, "prompt_tokens": 23, "total_tokens": 38 } }
To use a provider that exposes an OpenAI-compatible API, create a route with the provider set to `openai-compatible` and the model's endpoint specified in `override.endpoint`, like so:
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \ -H "X-API-KEY: ${ADMIN_API_KEY}" \ -d '{ "uri": "/anything", "plugins": { "ai-proxy": { "auth": { "header": { "Authorization": "Bearer <some-token>" } }, "model": { "provider": "openai-compatible", "name": "qwen-plus" }, "override": { "endpoint": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions" } } }, "upstream": { "type": "roundrobin", "nodes": { "somerandom.com:443": 1 }, "scheme": "https", "pass_host": "node" } }'