AI model pricing and provider fit.
Compare supported providers, scan model capabilities, and filter the list quickly. Prices are displayed per 1 million tokens for fast planning.
OpenAI
Latest GPT-5.4 and GPT-5 mini with best intelligence for agentic and coding workflows
GPT-5.4
Context window: 1M
Best intelligence at scale for agentic, coding, and professional workflows
Input: $2.50 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: gpt-5.4

GPT-5 mini
Context window: 400K
Near-frontier intelligence for cost-sensitive, low latency, high volume workloads
Input: $0.25 / 1M tokens
Output: $2.00 / 1M tokens
Model ID: gpt-5-mini-2025-08-07

GPT-4o
Context window: 128K
General purpose, vision, coding (Legacy)
Input: $2.50 / 1M tokens
Output: $10.00 / 1M tokens
Model ID: gpt-4o

GPT-4o-mini
Context window: 128K
Fast, cost-effective tasks (Legacy)
Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
Model ID: gpt-4o-mini

Anthropic
Latest Claude 4.6 models with state-of-the-art reasoning and agent capabilities
Claude 4.6 Opus
Context window: 200K
Most intelligent model for agents and coding
Input: $5.00 / 1M tokens
Output: $25.00 / 1M tokens
Model ID: claude-opus-4-6-20251001

Claude 4.6 Sonnet
Context window: 200K
Optimal balance of intelligence, cost, and speed
Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: claude-sonnet-4-6-20251001

Claude 4.6 Haiku
Context window: 200K
Fastest, most cost-efficient model
Input: $1.00 / 1M tokens
Output: $5.00 / 1M tokens
Model ID: claude-haiku-4-6-20251001

Claude 3.5 Sonnet
Context window: 200K
Coding, analysis, writing (Legacy)
Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: claude-3-5-sonnet-20241022

Claude 3 Opus
Context window: 200K
Most complex tasks (Legacy)
Input: $15.00 / 1M tokens
Output: $75.00 / 1M tokens
Model ID: claude-3-opus-20240229

Claude 3 Sonnet
Context window: 200K
Balanced performance
Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: claude-3-sonnet-20240229

Claude 3 Haiku
Context window: 200K
Fast responses, simple tasks
Input: $0.25 / 1M tokens
Output: $1.25 / 1M tokens
Model ID: claude-3-haiku-20240307

Google
Latest Gemini 2.5 models with native multimodal support and 1M+ context windows
Gemini 2.5 Pro
Context window: 1M
State-of-the-art reasoning and agent capabilities
Input: $1.25 / 1M tokens
Output: $10.00 / 1M tokens
Model ID: gemini-2.5-pro-preview-03-25

Gemini 2.5 Flash
Context window: 1M
Fast, cost-effective multimodal tasks
Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
Model ID: gemini-2.5-flash-preview-03-25

Gemini 2.0 Flash
Context window: 1M
Fast, multimodal tasks
Input: $0.07 / 1M tokens
Output: $0.30 / 1M tokens
Model ID: gemini-2.0-flash-exp

Gemini 1.5 Pro
Context window: 2M
Long documents, analysis
Input: $1.25 / 1M tokens
Output: $5.00 / 1M tokens
Model ID: gemini-1.5-pro

Gemini 1.5 Flash
Context window: 1M
Fast, cost-effective
Input: $0.07 / 1M tokens
Output: $0.30 / 1M tokens
Model ID: gemini-1.5-flash

Moonshot AI
Chinese LLM with strong long-context capabilities
Kimi K2.5
Context window: 256K
Long context, Chinese tasks
Input: $0.50 / 1M tokens
Output: $2.00 / 1M tokens
Model ID: kimi-k2.5

DeepSeek
Cost-effective models with strong coding capabilities
DeepSeek V3
Context window: 64K
Coding, cost-effective tasks
Input: $0.27 / 1M tokens
Output: $1.10 / 1M tokens
Model ID: deepseek-chat

DeepSeek R1
Context window: 64K
Reasoning, math, logic
Input: $0.55 / 1M tokens
Output: $2.19 / 1M tokens
Model ID: deepseek-reasoner

Mistral AI
Efficient open-source models with strong performance
Mistral Large 2
Context window: 128K
Complex tasks, reasoning
Input: $2.00 / 1M tokens
Output: $6.00 / 1M tokens
Model ID: mistral-large-latest

Mistral Medium
Context window: 32K
Balanced performance
Input: $0.90 / 1M tokens
Output: $2.70 / 1M tokens
Model ID: mistral-medium

Mistral Small
Context window: 32K
Simple tasks, speed
Input: $0.20 / 1M tokens
Output: $0.60 / 1M tokens
Model ID: mistral-small

Codestral
Context window: 32K
Code generation
Input: $0.30 / 1M tokens
Output: $0.90 / 1M tokens
Model ID: codestral-latest

Perplexity
Search-augmented models with real-time information access
Sonar Pro
Context window: 200K
Research, complex queries
Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: sonar-pro

Sonar
Context window: 128K
General search queries
Input: $1.00 / 1M tokens
Output: $1.00 / 1M tokens
Model ID: sonar

Cohere
Enterprise-grade models with strong RAG capabilities
Command R+
Context window: 128K
Enterprise RAG, complex tasks
Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: command-r-plus

Command R
Context window: 128K
RAG, conversational
Input: $0.50 / 1M tokens
Output: $1.50 / 1M tokens
Model ID: command-r

Command
Context window: 4K
General tasks
Input: $1.00 / 1M tokens
Output: $2.00 / 1M tokens
Model ID: command

Groq
Ultra-fast inference with competitive pricing
Llama 3.3 70B
Context window: 128K
Fast inference, general tasks
Input: $0.59 / 1M tokens
Output: $0.79 / 1M tokens
Model ID: llama-3.3-70b-versatile

Llama 3.1 8B
Context window: 128K
Ultra-fast simple tasks
Input: $0.05 / 1M tokens
Output: $0.08 / 1M tokens
Model ID: llama-3.1-8b-instant

Mixtral 8x7B
Context window: 32K
Balanced performance
Input: $0.24 / 1M tokens
Output: $0.24 / 1M tokens
Model ID: mixtral-8x7b-32768

Together AI
Access to a wide range of open-source models
Llama 3.3 70B
Context window: 128K
General purpose, chat
Input: $0.88 / 1M tokens
Output: $0.88 / 1M tokens
Model ID: meta-llama/Llama-3.3-70B-Instruct-Turbo

Qwen 2.5 72B
Context window: 128K
Multilingual, coding
Input: $1.20 / 1M tokens
Output: $1.20 / 1M tokens
Model ID: Qwen/Qwen2.5-72B-Instruct-Turbo

DeepSeek V3
Context window: 64K
Coding, reasoning
Input: $1.25 / 1M tokens
Output: $1.25 / 1M tokens
Model ID: deepseek-ai/DeepSeek-V3

Fireworks AI
Fast inference API for open-source models
Llama 3.3 70B
Context window: 128K
Fast inference
Input: $0.90 / 1M tokens
Output: $0.90 / 1M tokens
Model ID: accounts/fireworks/models/llama-v3p3-70b-instruct

Mixtral 8x22B
Context window: 64K
Complex reasoning
Input: $1.20 / 1M tokens
Output: $1.20 / 1M tokens
Model ID: accounts/fireworks/models/mixtral-8x22b-instruct

xAI
Grok models with real-time X platform integration
Grok 2
Context window: 128K
Real-time info, analysis
Input: $5.00 / 1M tokens
Output: $15.00 / 1M tokens
Model ID: grok-2-latest

Grok 2 Mini
Context window: 128K
Fast responses
Input: $0.60 / 1M tokens
Output: $2.00 / 1M tokens
Model ID: grok-2-mini

About Pricing
Prices are shown in USD per 1 million tokens. Input tokens refer to the text you send to the model, while output tokens are the model's response. Context window indicates the maximum number of tokens the model can process in a single request. Prices may vary and are subject to change by the providers. Always check the official provider documentation for the most current pricing.
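
To make the per-million pricing concrete, here is a minimal Python sketch of how a single request's cost could be estimated from the figures listed on this page. The names `ModelPrice`, `PRICES`, and `estimate_cost` are hypothetical helpers invented for this illustration, not part of any provider SDK, and the two prices are copied from the list above, so they may be out of date.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical record holding one model's listed prices (USD per 1M tokens)."""
    model_id: str
    input_usd_per_million: float
    output_usd_per_million: float


# Figures copied from the list on this page; check provider docs before relying on them.
PRICES = {
    "gpt-5-mini-2025-08-07": ModelPrice("gpt-5-mini-2025-08-07", 0.25, 2.00),
    "claude-sonnet-4-6-20251001": ModelPrice("claude-sonnet-4-6-20251001", 3.00, 15.00),
}


def estimate_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * price.input_usd_per_million
             + output_tokens * price.output_usd_per_million)
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example request: 10,000 input tokens and 2,000 output tokens.
    for price in PRICES.values():
        cost = estimate_cost(price, input_tokens=10_000, output_tokens=2_000)
        print(f"{price.model_id}: ${cost:.4f}")
```

Because output tokens are usually priced higher than input tokens, the output count often dominates the estimate for long responses. This sketch ignores anything the page does not list, such as caching discounts, batch pricing, or minimum charges.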