ZenLLM Docs
One OpenAI-compatible gateway for every model. Point your client at https://api.zenllm.org/v1, use your key, and keep the rest of your app intact.
- Compatible with: OpenAI, Anthropic, Responses
- Models: 50+ across 12 providers
- Billing: Provider list price, no markup
Getting Started
ZenLLM stays close to OpenAI-compatible request shapes, so the common migration path is simple: swap the base URL, use your ZenLLM key, and choose the model you need.
Quick start
- Create an account at zenllm.org/register
- Choose a plan that fits your usage
- Copy your API key from the dashboard
- Set https://api.zenllm.org/v1 as your base URL
- Start making requests. That's it.
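The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper name build_chat_request and the ZENLLM_API_KEY environment variable are hypothetical, and the request is only built here, not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.zenllm.org/v1"  # base URL from the quick start


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# ZENLLM_API_KEY is a hypothetical variable name chosen for this sketch.
request = build_chat_request(
    os.environ.get("ZENLLM_API_KEY", "sk-zenith-..."), "openai/gpt-5.5", "hello"
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON body of the response.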
Authentication
All requests require an API key. Pass it via the appropriate header depending on the API format you're using.
- OpenAI format: Authorization: Bearer sk-zenith-...
- Anthropic format: x-api-key: sk-zenith-...
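As a rough sketch, the header choice can be wrapped in a small helper. The function name auth_headers and the format labels are hypothetical; treating "responses" like the OpenAI format is an assumption based on the Bearer header used in the Responses curl example in these docs.

```python
from typing import Dict


def auth_headers(api_format: str, api_key: str) -> Dict[str, str]:
    """Return the auth header for a given API format (labels are hypothetical)."""
    if api_format in ("openai", "responses"):
        # OpenAI-style endpoints take a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    if api_format == "anthropic":
        # The Anthropic format passes the key in x-api-key instead.
        return {"x-api-key": api_key}
    raise ValueError(f"unknown API format: {api_format!r}")


print(auth_headers("openai", "sk-zenith-..."))
print(auth_headers("anthropic", "sk-zenith-..."))
```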
Base URLs
All three API formats share the same base URL, so one setting covers every client.
- OpenAI: https://api.zenllm.org/v1
- Anthropic: https://api.zenllm.org/v1
- Responses: https://api.zenllm.org/v1
Core endpoints
- /v1/chat/completions (OpenAI)
- /v1/responses (OpenAI)
- /v1/messages (Anthropic)
- /v1/embeddings (OpenAI)
- /v1/rerank (OpenAI)
- /v1/models (OpenAI)
- /v1/models/:id (OpenAI)
Chat Completions
OpenAI-compatible chat completions. Works with Cursor, Cline, Aider, OpenCode, and any OpenAI SDK.
Non-streaming
curl https://api.zenllm.org/v1/chat/completions \
-H "Authorization: Bearer sk-zenith-..." \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-5.5","messages":[{"role":"user","content":"hello"}]}'
Streaming
curl -N https://api.zenllm.org/v1/chat/completions \
-H "Authorization: Bearer sk-zenith-..." \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-5.5","messages":[{"role":"user","content":"hello"}],"stream":true}'
Streaming emits SSE data: lines with chat.completion.chunk objects and ends with data: [DONE].
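A minimal sketch of consuming that stream on the client side, assuming the chunks follow the usual OpenAI layout (text under choices[].delta.content). The sample lines are hand-written for illustration and are not captured gateway output; iter_chat_deltas is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from chat.completion.chunk SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel that closes the stream
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative sample, shaped like an OpenAI-style chunk stream.
sample = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_chat_deltas(sample)))
```

In a real client the same function can be fed the decoded lines of the HTTP response body instead of the sample list.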
Responses API
Newer OpenAI Responses API format. Used by Codex CLI, OpenAI Agents SDK, and OpenAI SDK v5+.
curl https://api.zenllm.org/v1/responses \
-H "Authorization: Bearer sk-zenith-..." \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5.5",
"input": "hello",
"instructions": "You are a helpful assistant."
}'
Uses named SSE events (event: response.created, response.output_text.delta, etc.). No [DONE] sentinel.
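Because this stream uses named events and has no [DONE] sentinel, a client has to pair each event: line with its data: line and stop on a terminal event or when the connection closes. The sketch below assumes the OpenAI Responses layout, where response.output_text.delta carries the text in a "delta" field and response.completed marks the end. The sample lines are illustrative, and both helper names are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, decoded_data) pairs; a blank line ends each event."""
    event_name = "message"
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
    if data_lines:  # stream closed without a trailing blank line
        yield event_name, json.loads("\n".join(data_lines))


def collect_output_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas until the response is reported complete."""
    parts = []
    for name, payload in iter_sse_events(lines):
        if name == "response.output_text.delta":
            parts.append(payload["delta"])
        elif name == "response.completed":
            break
    return "".join(parts)


# Illustrative sample, not captured gateway output.
sample = [
    "event: response.created",
    'data: {"type":"response.created"}',
    "",
    "event: response.output_text.delta",
    'data: {"type":"response.output_text.delta","delta":"Hel"}',
    "",
    "event: response.output_text.delta",
    'data: {"type":"response.output_text.delta","delta":"lo"}',
    "",
    "event: response.completed",
    'data: {"type":"response.completed"}',
    "",
]
print(collect_output_text(sample))
```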
Messages (Anthropic)
Anthropic Messages API format. Used by Claude Code and the Anthropic SDK.
curl https://api.zenllm.org/v1/messages \
-H "x-api-key: sk-zenith-..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5.5",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "hello"}]
}'
Uses Anthropic SSE events (event: message_start, content_block_delta, etc.). Requires the anthropic-version: 2023-06-01 header.
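In the Anthropic streaming format each data: payload repeats the event name in its own "type" field, so a simple reader can dispatch on that field alone. The sketch below assumes the standard Anthropic layout (text arrives in content_block_delta events with a text_delta, and message_stop ends the stream). The sample lines are illustrative, and collect_message_text is a hypothetical helper name.

```python
import json
from typing import Iterable


def collect_message_text(lines: Iterable[str]) -> str:
    """Accumulate assistant text from an Anthropic-style SSE stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # event: lines are redundant here; the payload has "type"
        payload = json.loads(line[len("data:"):].strip())
        kind = payload.get("type")
        if kind == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif kind == "message_stop":
            break
    return "".join(parts)


# Illustrative sample, not captured gateway output.
sample = [
    "event: message_start",
    'data: {"type":"message_start","message":{"role":"assistant"}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}',
    "",
    "event: message_stop",
    'data: {"type":"message_stop"}',
    "",
]
print(collect_message_text(sample))
```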
Models
List available models. Most clients call this endpoint for model discovery.
curl https://api.zenllm.org/v1/models \
-H "Authorization: Bearer sk-zenith-..."
Browse all available models on the Models page.
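A sketch of how a client might use the listing for discovery, assuming the response follows the OpenAI list shape ({"object": "list", "data": [{"id": ...}]}) and that IDs use the provider/model form seen in the examples. The sample listing is invented for illustration and is not actual gateway output; group_models_by_provider is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List


def group_models_by_provider(listing: dict) -> Dict[str, List[str]]:
    """Group model IDs such as "openai/gpt-5.5" by their provider prefix."""
    groups = defaultdict(list)
    for entry in listing.get("data", []):
        model_id = entry["id"]
        provider, _, name = model_id.partition("/")
        if not name:  # ID has no "provider/" prefix
            provider, name = "unknown", model_id
        groups[provider].append(name)
    return dict(groups)


# Illustrative listing in the assumed OpenAI list shape.
sample_listing = {
    "object": "list",
    "data": [
        {"id": "openai/gpt-5.5", "object": "model"},
        {"id": "openai/text-embedding-3-large", "object": "model"},
        {"id": "qwen/qwen3-embedding-8b", "object": "model"},
    ],
}
print(group_models_by_provider(sample_listing))
```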
Embeddings
Generate vector embeddings for text, for semantic search and RAG.
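For the semantic search use case, the usual next step after embedding is to compare vectors by cosine similarity. The sketch below uses tiny hand-made vectors in place of real embeddings; in practice the vectors would come from the embeddings response (assumed here to follow the OpenAI shape, data[i].embedding). Both function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document_index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional vectors standing in for real embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [[0.8, 0.6], [0.0, 1.0], [1.0, 0.0]]
ranked = rank_by_similarity(query_vec, doc_vecs)
print([index for index, _ in ranked])
```

The request that produces the vectors looks like this: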
curl https://api.zenllm.org/v1/embeddings \
-H "Authorization: Bearer sk-zenith-..." \
-H "Content-Type: application/json" \
-d '{
"model": "openai/text-embedding-3-large",
"input": "Your text to embed"
}'
Available models
- qwen3-embedding-8b (qwen/qwen3-embedding-8b)
Reranking
Rerank documents by relevance to a query.
curl https://api.zenllm.org/v1/rerank \
-H "Authorization: Bearer sk-zenith-..." \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-reranker-8b",
"query": "What is the capital of France?",
"documents": ["Paris is the capital of France.", "London is the capital of the UK."]
}'
Available models
- qwen3-reranker-8b (qwen/qwen3-reranker-8b)
Support
Stuck on an integration, hit an unexpected error, or have a billing question? The fastest way to reach the team and community is Discord. We're active and usually reply within a few hours.
Join discord.gg/zenllm