anthropic/claude-3-haikuFast, low-cost chat for high-volume tasks.
Input $0.25/M, output $1.25/M. Cache write 5m $0.30/M, 1h $0.50/M. Cache read $0.03/M.
Browse the production catalog: 65 routed models across chat, realtime, embeddings, image, and video. Point your client at one endpoint and call any of them with exact IDs like openai/gpt-5.5.
anthropic/claude-3-haikuFast, low-cost chat for high-volume tasks.
Input $0.25/M, output $1.25/M. Cache write 5m $0.30/M, 1h $0.50/M. Cache read $0.03/M.
anthropic/claude-3-sonnetGeneral-purpose chat with tool calling.
Input $3/M, output $15/M. Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.
anthropic/claude-3.5-haikuFast, low-cost chat for high-volume tasks.
Input $0.80/M, output $4/M. Cache write 5m $1/M, 1h $1.60/M. Cache read $0.08/M.
anthropic/claude-haiku-4.5Fast, low-cost chat for high-volume tasks.
Input $1/M, output $5/M. Cache write 5m $1.25/M, 1h $2/M. Cache read $0.10/M.
anthropic/claude-opus-4.1Frontier reasoning for complex, multi-step work.
Input $15/M, output $75/M. Cache write 5m $18.75/M, 1h $30/M. Cache read $1.50/M.
anthropic/claude-opus-4.5Frontier reasoning for complex, multi-step work.
Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.
anthropic/claude-opus-4.6Frontier reasoning for complex, multi-step work.
Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.
anthropic/claude-opus-4.7Frontier reasoning for complex, multi-step work.
Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.
anthropic/claude-opus-4.8Frontier reasoning for complex, multi-step work.
Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.
anthropic/claude-sonnet-4General-purpose chat with tool calling.
Input $3/M (>200K $6/M), output $15/M (>200K $22.50/M). Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.
anthropic/claude-sonnet-4.5General-purpose chat with tool calling.
Input $3/M (>200K $6/M), output $15/M (>200K $22.50/M). Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.
anthropic/claude-sonnet-4.6General-purpose chat with tool calling.
Input $3/M, output $15/M. Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.
deepseek/deepseek-v4-flashFast, low-cost chat for high-volume tasks.
DeepSeek V4 Flash. Free of cost.
deepseek/deepseek-v4-proFrontier reasoning for complex, multi-step work.
DeepSeek V4 Pro published in RMB; USD estimate shown here. Pricing may step above provider long-context tiers.
free/claude-haiku-4.5Fast, low-cost chat for high-volume tasks.
Free tier. $0 per token.
free/claude-opus-4.8Frontier reasoning for complex, multi-step work.
Free tier. $0 per token.
free/claude-sonnet-4.6General-purpose chat with tool calling.
Free tier. $0 per token.
free/deepseek-v3.2General-purpose chat with tool calling.
Free tier. $0 per token.
free/deepseek-v4-flashFast, low-cost chat for high-volume tasks.
Free tier. $0 per token.
free/deepseek-v4-proFrontier reasoning for complex, multi-step work.
Free tier. $0 per token.
free/gemini-2.5-flashFast, low-cost chat for high-volume tasks.
Free tier. $0 per token.
free/gemini-2.5-proFrontier reasoning for complex, multi-step work.
Free tier. $0 per token.
free/gemini-3.1-proFrontier reasoning for complex, multi-step work.
Free tier. $0 per token.
free/glm-4.7-flashFast, low-cost chat for high-volume tasks.
Free tier. $0 per token.
free/glm-5.1General-purpose chat with tool calling.
Free tier. $0 per token.
free/gpt-5.4General-purpose chat with tool calling.
Free tier. $0 per token.
free/gpt-5.5General-purpose chat with tool calling.
Free tier. $0 per token.
free/gpt-oss-120bGeneral-purpose chat with tool calling.
Free tier. $0 per token.
free/kimi-k2.6General-purpose chat with tool calling.
Free tier. $0 per token.
free/llama-3.3-70bGeneral-purpose chat with tool calling.
Free tier. $0 per token.
free/mimo-v2.5-proFrontier reasoning for complex, multi-step work.
Free tier. $0 per token.
free/minimax-m3Fast, low-cost chat for high-volume tasks.
Free tier. $0 per token.
google/gemini-2.5-flashFast, low-cost chat for high-volume tasks.
Input $0.30/M, output $2.50/M. Cache read $0.03/M. Cache storage $1/M/hour.
google/gemini-2.5-flash-liteFast, low-cost chat for high-volume tasks.
Input $0.10/M, output $0.40/M. Cache read $0.01/M. Cache storage $1/M/hour.
google/gemini-2.5-proFrontier reasoning for complex, multi-step work.
Standard pricing through 200K context. Above 200K context: $2.50 input, $15 output, $0.25 cache read per 1M tokens. Cache storage $4.50/M/hour.
google/gemini-3-flashFast, low-cost chat for high-volume tasks.
Input $0.50/M, output $3/M. Cache read $0.05/M. Cache storage $1/M/hour.
google/gemini-3.1-flash-liteFast, low-cost chat for high-volume tasks.
Input $0.25/M, output $1.50/M. Cache read $0.025/M. Cache storage $1/M/hour.
google/gemini-3.1-proFrontier reasoning for complex, multi-step work.
Standard pricing through 200K context. Above 200K context: $4 input, $18 output, $0.40 cache read per 1M tokens. Cache storage $4.50/M/hour.
google/gemini-3.5-flashFast, low-cost chat for high-volume tasks.
Input $0.50/M, output $3/M. Cache read $0.05/M. Cache storage $1/M/hour.
moonshot/kimi-k2.6General-purpose chat with tool calling.
Context-dependent provider pricing may increase at higher token windows.
nvidia/nemotron-3-ultraGeneral-purpose chat with tool calling.
Nemotron 3 Ultra pricing based on OpenRouter-listed rates.
openai/gpt-5.4General-purpose chat with tool calling.
Standard pricing through 200K context. Above 200K context: $5 input, $0.50 cached input, $22.50 output per 1M tokens.
openai/gpt-5.4-miniFast, low-cost chat for high-volume tasks.
openai/gpt-5.4-nanoFast, low-cost chat for high-volume tasks.
openai/gpt-5.4-proFrontier reasoning for complex, multi-step work.
Standard pricing through 200K context. Above 200K context: $60 input and $270 output per 1M tokens.
openai/gpt-5.5General-purpose chat with tool calling.
Standard pricing through 200K context. Above 200K context: $10 input, $1 cached input, $45 output per 1M tokens.
openai/gpt-image-2Image generation and editing.
Text rate. Image input: $8 input and $2 cached per 1M tokens. Image output: $30 per 1M tokens.
openai/text-embedding-3-largeVector embeddings for search and RAG.
$0.13 per 1M input tokens.
xai/grok-3General-purpose chat with tool calling.
xai/grok-4General-purpose chat with tool calling.
xai/grok-4.20General-purpose chat with tool calling.
xai/grok-4.20-autoGeneral-purpose chat with tool calling.
xai/grok-4.20-expertGeneral-purpose chat with tool calling.
xai/grok-4.20-fastFast, low-cost chat for high-volume tasks.
xai/grok-4.20-multi-agentGeneral-purpose chat with tool calling.
xai/grok-4.3General-purpose chat with tool calling.
xai/grok-buildGeneral-purpose chat with tool calling.
xai/grok-build-0.1General-purpose chat with tool calling.
xai/grok-code-fast-1Fast, low-cost chat for high-volume tasks.
xai/grok-imagine-imageImage generation and editing.
family fallback
xai/grok-imagine-image-liteImage generation and editing.
family fallback
xai/grok-tts-1Text to speech synthesis.
$0.20 / request
xiaomi/mimo-v2.5General-purpose chat with tool calling.
Xiaomi direct overseas pricing for <=256K context. Higher 256K-1M tier is billed above this base rate.
zhipu/glm-5.1General-purpose chat with tool calling.
Context-dependent provider pricing may increase at higher token windows.
zhipu/glm-5.2General-purpose chat with tool calling.
GLM-5.2 (FP8). Free of cost.