API Reference

ZenLLM Docs

One OpenAI-compatible gateway for every model. Point your client at https://api.zenllm.org/v1, use your key, and keep the rest of your app intact.

Compatible with: OpenAI, Anthropic, Responses
Models: 50+ across 12 providers
Billing: Provider list price, no markup

Getting Started

ZenLLM stays close to OpenAI-compatible request shapes, so the common migration path is simple: swap the base URL, use your ZenLLM key, and choose the model you need.

Quick start

1. Create an account at zenllm.org/register
2. Choose a plan that fits your usage
3. Copy your API key from the dashboard
4. Set https://api.zenllm.org/v1 as your base URL
5. Start making requests. That's it.

Authentication

All requests require an API key. Pass it via the appropriate header depending on the API format you're using.

OpenAI format: Authorization: Bearer sk-zenith-...
Anthropic format: x-api-key: sk-zenith-...
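
As a minimal sketch of the header choice above, the helper below returns the auth header for each format. The function name `auth_headers` and the format labels are hypothetical names for this illustration, not part of ZenLLM's API. Treating the Responses format like the OpenAI format follows the Bearer header used in the Responses example later in this page.

```python
def auth_headers(api_format: str, api_key: str) -> dict[str, str]:
    """Return the authentication header for one API format.

    `api_format` is a hypothetical label used only in this sketch.
    """
    if api_format in ("openai", "responses"):
        # OpenAI-style clients send the key as a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    if api_format == "anthropic":
        # Anthropic-style clients send the key in x-api-key.
        return {"x-api-key": api_key}
    raise ValueError(f"unknown API format: {api_format!r}")


print(auth_headers("openai", "sk-zenith-..."))
print(auth_headers("anthropic", "sk-zenith-..."))
```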

Base URLs

Use the appropriate base URL for your client's API format.

OpenAI: https://api.zenllm.org/v1
Anthropic: https://api.zenllm.org/v1
Responses: https://api.zenllm.org/v1

Core endpoints

POST /v1/chat/completions (OpenAI)
POST /v1/responses (OpenAI)
POST /v1/messages (Anthropic)
POST /v1/embeddings (OpenAI)
POST /v1/rerank (OpenAI)
GET /v1/models (OpenAI)
GET /v1/models/:id (OpenAI)
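
One way to keep these routes in a client is a small lookup table. The sketch below is illustrative only: the names `CORE_ENDPOINTS` and `endpoint_url` are hypothetical, and it covers just the fixed paths (the per-model route is left out because this page does not say how model IDs containing a slash should be encoded in the path).

```python
BASE_URL = "https://api.zenllm.org/v1"

# (HTTP method, path relative to BASE_URL, API format) for each fixed endpoint.
CORE_ENDPOINTS = {
    "chat_completions": ("POST", "/chat/completions", "openai"),
    "responses": ("POST", "/responses", "openai"),
    "messages": ("POST", "/messages", "anthropic"),
    "embeddings": ("POST", "/embeddings", "openai"),
    "rerank": ("POST", "/rerank", "openai"),
    "models": ("GET", "/models", "openai"),
}


def endpoint_url(name: str) -> str:
    """Return the full URL for a named endpoint from the table above."""
    _method, path, _api_format = CORE_ENDPOINTS[name]
    return BASE_URL + path


print(endpoint_url("chat_completions"))
```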

Chat Completions

OpenAI-compatible chat completions. Works with Cursor, Cline, Aider, OpenCode, and any OpenAI SDK.

Non-streaming

curl https://api.zenllm.org/v1/chat/completions \
  -H "Authorization: Bearer sk-zenith-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-5.5","messages":[{"role":"user","content":"hello"}]}'
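
For clients that do not use an SDK, the same request can be sketched with Python's standard library. The helper names `build_chat_request` and `first_message_text` are hypothetical, and the response shape (`choices[0].message.content`) is the standard OpenAI chat completion layout, which this sketch assumes the gateway returns unchanged.

```python
import json
import urllib.request

BASE_URL = "https://api.zenllm.org/v1"


def build_chat_request(api_key: str, prompt: str,
                       model: str = "openai/gpt-5.5") -> urllib.request.Request:
    """Build (but do not send) a non-streaming chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_message_text(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion body."""
    return response_body["choices"][0]["message"]["content"]


request = build_chat_request("sk-zenith-...", "hello")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`, decode the body with `json.load`, and hand the result to `first_message_text`.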

Streaming

curl -N https://api.zenllm.org/v1/chat/completions \
  -H "Authorization: Bearer sk-zenith-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-5.5","messages":[{"role":"user","content":"hello"}],"stream":true}'

Streaming emits SSE data: lines with chat.completion.chunk objects and ends with data: [DONE].
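
A minimal sketch of consuming that stream: read the `data:` lines, stop at `[DONE]`, and join the text deltas. The function name `iter_chat_deltas` and the sample lines are illustrative, and the `choices[].delta.content` layout is the standard OpenAI chunk shape, assumed here to pass through the gateway unchanged.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from chat.completion.chunk SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured gateway output.
SAMPLE_STREAM = [
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_chat_deltas(SAMPLE_STREAM)))  # prints "Hello"
```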

Responses API

Newer OpenAI Responses API format. Used by Codex CLI, OpenAI Agents SDK, and OpenAI SDK v5+.

curl https://api.zenllm.org/v1/responses \
  -H "Authorization: Bearer sk-zenith-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.5",
    "input": "hello",
    "instructions": "You are a helpful assistant."
  }'

With "stream": true, the response uses named SSE events (event: response.created, response.output_text.delta, etc.). There is no [DONE] sentinel.
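
Because these events are named, a client has to track the `event:` line as well as the `data:` line. The sketch below groups lines into events and joins the text deltas. The helper names and the sample stream are illustrative; the `delta` field on `response.output_text.delta` follows OpenAI's Responses streaming format, which this sketch assumes the gateway preserves.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event name, data string) pairs; a blank line ends each event."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield event_name, "\n".join(data_lines)
            event_name, data_lines = "message", []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:  # stream closed without a trailing blank line
        yield event_name, "\n".join(data_lines)


def collect_output_text(lines: Iterable[str]) -> str:
    """Join the text carried by response.output_text.delta events."""
    parts = []
    for event_name, data in iter_sse_events(lines):
        if event_name == "response.output_text.delta":
            parts.append(json.loads(data)["delta"])
    return "".join(parts)


# Hand-written sample lines for illustration, not captured gateway output.
SAMPLE_STREAM = [
    "event: response.created",
    'data: {"type": "response.created"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "Hi"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": " there"}',
    "",
    "event: response.completed",
    'data: {"type": "response.completed"}',
    "",
]
print(collect_output_text(SAMPLE_STREAM))  # prints "Hi there"
```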

Messages (Anthropic)

Anthropic Messages API format. Used by Claude Code and the Anthropic SDK.

curl https://api.zenllm.org/v1/messages \
  -H "x-api-key: sk-zenith-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "hello"}]
  }'

With "stream": true, the response uses Anthropic SSE events (event: message_start, content_block_delta, etc.). Requests require the anthropic-version: 2023-06-01 header.
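
In the Anthropic format each `data:` payload also carries its own `type` field, so a simple client can dispatch on that instead of the `event:` line. The sketch below collects text from `content_block_delta` events and stops at `message_stop`. The function name and sample lines are illustrative, and the payload layout follows Anthropic's published streaming format, assumed here to be preserved by the gateway.

```python
import json
from typing import Iterable


def collect_message_text(lines: Iterable[str]) -> str:
    """Join text_delta fragments from an Anthropic-style SSE stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # event: lines and blank separators are not needed here
        event = json.loads(line[len("data:"):].strip())
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif event.get("type") == "message_stop":
            break  # final event of the stream
    return "".join(parts)


# Hand-written sample lines for illustration, not captured gateway output.
SAMPLE_STREAM = [
    "event: message_start",
    'data: {"type": "message_start"}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]
print(collect_message_text(SAMPLE_STREAM))  # prints "Hello"
```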

Models

List available models. Most clients call this endpoint for model discovery.

curl https://api.zenllm.org/v1/models \
  -H "Authorization: Bearer sk-zenith-..."

Browse all available models on the Models page.
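
A short sketch of using the list for discovery: group the returned IDs by their provider prefix. The function name `models_by_provider` is hypothetical, and the response shape (`{"object": "list", "data": [{"id": ...}]}`) is the OpenAI list format, which is an assumption about what the gateway returns. The sample payload is hand-written from IDs that appear on this page.

```python
from collections import defaultdict


def models_by_provider(payload: dict) -> dict[str, list[str]]:
    """Group model IDs such as "openai/gpt-5.5" by the part before the slash."""
    grouped = defaultdict(list)
    for model in payload.get("data", []):
        model_id = model["id"]
        provider, sep, _name = model_id.partition("/")
        # IDs without a slash are filed under "unknown".
        grouped[provider if sep else "unknown"].append(model_id)
    return dict(grouped)


# Hand-written sample payload for illustration, not captured gateway output.
SAMPLE_RESPONSE = {
    "object": "list",
    "data": [
        {"id": "openai/gpt-5.5", "object": "model"},
        {"id": "qwen/qwen3-embedding-8b", "object": "model"},
        {"id": "qwen/qwen3-reranker-8b", "object": "model"},
    ],
}
print(models_by_provider(SAMPLE_RESPONSE))
```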

Embeddings

Generate vector embeddings for text, for semantic search and RAG.

Pricing: $0.10 / M tokens

Context: 40k tokens

curl https://api.zenllm.org/v1/embeddings \
  -H "Authorization: Bearer sk-zenith-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-embedding-8b",
    "input": "Your text to embed"
  }'

Available models

qwen3-embedding-8b (qwen/qwen3-embedding-8b)
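
For the semantic search use case, the returned vectors are typically compared with cosine similarity. The sketch below is one way to do that with the standard library. The helper names are hypothetical, the response shape (`data[].index` and `data[].embedding`) is the OpenAI embeddings layout and is an assumption here, and the toy vectors are made up for illustration.

```python
import math
from typing import List, Sequence


def embeddings_from_response(payload: dict) -> List[List[float]]:
    """Return vectors in input order from an OpenAI-style embeddings body."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: Sequence[float],
                       doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional vectors; real embeddings have many more dimensions.
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(rank_by_similarity(query, documents))  # prints [1, 2, 0]
```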

Reranking

Rerank documents by relevance to a query.

Pricing: $0.20 / M tokens

Context: 40k tokens

curl https://api.zenllm.org/v1/rerank \
  -H "Authorization: Bearer sk-zenith-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-reranker-8b",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "London is the capital of the UK."]
  }'

Available models

qwen3-reranker-8b (qwen/qwen3-reranker-8b)
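
This page does not show the rerank response body, so the sketch below rests on an assumption: that results come back as a `results` list of `{"index", "relevance_score"}` objects, a shape used by several rerank APIs. Check an actual response before relying on it. The function name `order_documents` is hypothetical and the scores are made up for illustration.

```python
from typing import List, Tuple


def order_documents(documents: List[str],
                    rerank_response: dict) -> List[Tuple[str, float]]:
    """Pair each document with its score, highest score first.

    ASSUMPTION: rerank_response looks like
    {"results": [{"index": int, "relevance_score": float}, ...]}.
    """
    results = sorted(
        rerank_response["results"],
        key=lambda result: result["relevance_score"],
        reverse=True,
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


documents = [
    "London is the capital of the UK.",
    "Paris is the capital of France.",
]
# Made-up scores in the assumed response shape, not real gateway output.
sample_response = {
    "results": [
        {"index": 0, "relevance_score": 0.12},
        {"index": 1, "relevance_score": 0.98},
    ]
}
for text, score in order_documents(documents, sample_response):
    print(f"{score:.2f}  {text}")
```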

Support

Stuck on an integration, hit an unexpected error, or have a billing question? The fastest way to reach the team and community is Discord. We're active and usually reply within a few hours.

Join discord.gg/zenllm