Model Catalog

Every model, one API

Browse the production catalog: 65 routed models across chat, realtime, embeddings, image, and video. Point your client at one endpoint and call any of them with exact IDs like openai/gpt-5.5.

65 models

Owners

Model Type

Anthropic

anthropic/claude-3-haiku

Fast, low-cost chat for high-volume tasks.

Input /M

$0.25

Output /M

$1.25

Cached /M

$0.03

Input $0.25/M, output $1.25/M. Cache write 5m $0.30/M, 1h $0.50/M. Cache read $0.03/M.

Anthropic

anthropic/claude-3-sonnet

General-purpose chat with tool calling.

Input /M

$3.00

Output /M

$15.00

Cached /M

$0.30

Input $3/M, output $15/M. Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.

Anthropic

anthropic/claude-3.5-haiku

Fast, low-cost chat for high-volume tasks.

Input /M

$0.80

Output /M

$4.00

Cached /M

$0.08

Input $0.80/M, output $4/M. Cache write 5m $1/M, 1h $1.60/M. Cache read $0.08/M.

Anthropic

anthropic/claude-haiku-4.5

Fast, low-cost chat for high-volume tasks.

Input /M

$1.00

Output /M

$5.00

Cached /M

$0.10

Input $1/M, output $5/M. Cache write 5m $1.25/M, 1h $2/M. Cache read $0.10/M.

Anthropic

anthropic/claude-opus-4.1

Frontier reasoning for complex, multi-step work.

Input /M

$15.00

Output /M

$75.00

Cached /M

$1.50

Input $15/M, output $75/M. Cache write 5m $18.75/M, 1h $30/M. Cache read $1.50/M.

Anthropic

anthropic/claude-opus-4.5

Frontier reasoning for complex, multi-step work.

Input /M

$5.00

Output /M

$25.00

Cached /M

$0.50

Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.

Anthropic

anthropic/claude-opus-4.6

Frontier reasoning for complex, multi-step work.

Input /M

$5.00

Output /M

$25.00

Cached /M

$0.50

Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.

Anthropic

anthropic/claude-opus-4.7

Frontier reasoning for complex, multi-step work.

Input /M

$5.00

Output /M

$25.00

Cached /M

$0.50

Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.

Anthropic

anthropic/claude-opus-4.8

Frontier reasoning for complex, multi-step work.

Input /M

$5.00

Output /M

$25.00

Cached /M

$0.50

Input $5/M, output $25/M. Cache write 5m $6.25/M, 1h $10/M. Cache read $0.50/M.

Anthropic

anthropic/claude-sonnet-4

General-purpose chat with tool calling.

Input /M

$3.00

Output /M

$15.00

Cached /M

$0.30

Input $3/M (>200K $6/M), output $15/M (>200K $22.50/M). Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.

Anthropic

anthropic/claude-sonnet-4.5

General-purpose chat with tool calling.

Input /M

$3.00

Output /M

$15.00

Cached /M

$0.30

Input $3/M (>200K $6/M), output $15/M (>200K $22.50/M). Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.

Anthropic

anthropic/claude-sonnet-4.6

General-purpose chat with tool calling.

Input /M

$3.00

Output /M

$15.00

Cached /M

$0.30

Input $3/M, output $15/M. Cache write 5m $3.75/M, 1h $6/M. Cache read $0.30/M.

DeepSeek

deepseek/deepseek-v4-flash

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

DeepSeek V4 Flash. Free of cost.

DeepSeek

deepseek/deepseek-v4-pro

Frontier reasoning for complex, multi-step work.

Input /M

$0.44

Output /M

$0.89

Cached /M

$0.0037

DeepSeek V4 Pro published in RMB; USD estimate shown here. Pricing may step above provider long-context tiers.

Anthropic

free/claude-haiku-4.5

Free

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Anthropic

free/claude-opus-4.8

Free

Frontier reasoning for complex, multi-step work.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Anthropic

free/claude-sonnet-4.6

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

DeepSeek

free/deepseek-v3.2

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

DeepSeek

free/deepseek-v4-flash

Free

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

DeepSeek

free/deepseek-v4-pro

Free

Frontier reasoning for complex, multi-step work.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Google

free/gemini-2.5-flash

Free

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Google

free/gemini-2.5-pro

Free

Frontier reasoning for complex, multi-step work.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Google

free/gemini-3.1-pro

Free

Frontier reasoning for complex, multi-step work.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Zhipu AI

free/glm-4.7-flash

Free

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Zhipu AI

free/glm-5.1

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

OpenAI

free/gpt-5.4

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

OpenAI

free/gpt-5.5

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

OpenAI

free/gpt-oss-120b

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Moonshot AI

free/kimi-k2.6

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

meta

free/llama-3.3-70b

Free

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Xiaomi

free/mimo-v2.5-pro

Free

Frontier reasoning for complex, multi-step work.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

MiniMax

free/minimax-m3

Free

Fast, low-cost chat for high-volume tasks.

Input /M

Free

Output /M

Free

Cached /M

Free

Free tier. $0 per token.

Google

google/gemini-2.5-flash

Fast, low-cost chat for high-volume tasks.

Input /M

$0.60

Output /M

$5.00

Cached /M

$0.06

Input $0.30/M, output $2.50/M. Cache read $0.03/M. Cache storage $1/M/hour.

Google

google/gemini-2.5-flash-lite

Fast, low-cost chat for high-volume tasks.

Input /M

$0.20

Output /M

$0.80

Cached /M

$0.02

Input $0.10/M, output $0.40/M. Cache read $0.01/M. Cache storage $1/M/hour.

Google

google/gemini-2.5-pro

Frontier reasoning for complex, multi-step work.

Input /M

$2.50

Output /M

$20.00

Cached /M

$0.25

Standard pricing through 200K context. Above 200K context: $2.50 input, $15 output, $0.25 cache read per 1M tokens. Cache storage $4.50/M/hour.

Google

google/gemini-3-flash

Fast, low-cost chat for high-volume tasks.

Input /M

$1.00

Output /M

$6.00

Cached /M

$0.10

Input $0.50/M, output $3/M. Cache read $0.05/M. Cache storage $1/M/hour.

Google

google/gemini-3.1-flash-lite

Fast, low-cost chat for high-volume tasks.

Input /M

$0.50

Output /M

$3.00

Cached /M

$0.05

Input $0.25/M, output $1.50/M. Cache read $0.025/M. Cache storage $1/M/hour.

Google

google/gemini-3.1-pro

Frontier reasoning for complex, multi-step work.

Input /M

$4.00

Output /M

$24.00

Cached /M

$0.40

Standard pricing through 200K context. Above 200K context: $4 input, $18 output, $0.40 cache read per 1M tokens. Cache storage $4.50/M/hour.

Google

google/gemini-3.5-flash

Fast, low-cost chat for high-volume tasks.

Input /M

$1.00

Output /M

$6.00

Cached /M

$0.10

Input $0.50/M, output $3/M. Cache read $0.05/M. Cache storage $1/M/hour.

Moonshot AI

moonshot/kimi-k2.6

General-purpose chat with tool calling.

Input /M

$0.95

Output /M

$4.00

Cached /M

$0.16

Context-dependent provider pricing may increase at higher token windows.

NVIDIA

nvidia/nemotron-3-ultra

General-purpose chat with tool calling.

Input /M

$1.00

Output /M

$5.00

Cached /M

Free

Nemotron 3 Ultra pricing based on OpenRouter-listed rates.

OpenAI

openai/gpt-5.4

General-purpose chat with tool calling.

Input /M

$2.50

Output /M

$15.00

Cached /M

$0.25

Standard pricing through 200K context. Above 200K context: $5 input, $0.50 cached input, $22.50 output per 1M tokens.

OpenAI

openai/gpt-5.4-mini

Fast, low-cost chat for high-volume tasks.

Input /M

$1.50

Output /M

$9.00

Cached /M

$0.15

OpenAI

openai/gpt-5.4-nano

Fast, low-cost chat for high-volume tasks.

Input /M

$0.40

Output /M

$2.50

Cached /M

$0.04

OpenAI

openai/gpt-5.4-pro

Frontier reasoning for complex, multi-step work.

Input /M

$30.00

Output /M

$180.00

Standard pricing through 200K context. Above 200K context: $60 input and $270 output per 1M tokens.

OpenAI

openai/gpt-5.5

General-purpose chat with tool calling.

Input /M

$5.00

Output /M

$30.00

Cached /M

$0.50

Standard pricing through 200K context. Above 200K context: $10 input, $1 cached input, $45 output per 1M tokens.

OpenAI

openai/gpt-image-2

Image generation and editing.

Per image

$30.00

Text rate. Image input: $8 input and $2 cached per 1M tokens. Image output: $30 per 1M tokens.

OpenAI

openai/text-embedding-3-large

Vector embeddings for search and RAG.

Input /M

$0.13

Output /M

Free

$0.13 per 1M input tokens.

xAI

xai/grok-3

General-purpose chat with tool calling.

Input /M

$2.00

Output /M

$10.00

Cached /M

$0.30

xAI

xai/grok-4

General-purpose chat with tool calling.

Input /M

$3.00

Output /M

$15.00

Cached /M

$0.75

xAI

xai/grok-4.20

General-purpose chat with tool calling.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-4.20-auto

General-purpose chat with tool calling.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-4.20-expert

General-purpose chat with tool calling.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-4.20-fast

Fast, low-cost chat for high-volume tasks.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-4.20-multi-agent

General-purpose chat with tool calling.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-4.3

General-purpose chat with tool calling.

Input /M

$1.25

Output /M

$2.50

Cached /M

$0.20

xAI

xai/grok-build

General-purpose chat with tool calling.

Input /M

$1.00

Output /M

$2.00

Cached /M

$0.20

xAI

xai/grok-build-0.1

General-purpose chat with tool calling.

Input /M

$1.00

Output /M

$2.00

Cached /M

$0.20

xAI

xai/grok-code-fast-1

Fast, low-cost chat for high-volume tasks.

Input /M

$1.00

Output /M

$2.00

Cached /M

$0.20

xAI

xai/grok-imagine-image

Image generation and editing.

Per image

$10.00

family fallback

xAI

xai/grok-imagine-image-lite

Image generation and editing.

Per image

$10.00

family fallback

xAI

xai/grok-tts-1

Text to speech synthesis.

Per request

$0.20

$0.20 / request

Xiaomi

xiaomi/mimo-v2.5

General-purpose chat with tool calling.

Input /M

$0.80

Output /M

$4.00

Cached /M

$0.16

Xiaomi direct overseas pricing for <=256K context. Higher 256K-1M tier is billed above this base rate.

Zhipu AI

zhipu/glm-5.1

General-purpose chat with tool calling.

Input /M

$1.40

Output /M

$4.40

Cached /M

$0.26

Context-dependent provider pricing may increase at higher token windows.

Zhipu AI

zhipu/glm-5.2

General-purpose chat with tool calling.

Input /M

Free

Output /M

Free

Cached /M

Free

GLM-5.2 (FP8). Free of cost.