
AI Model Pricing

Compare the pricing and capabilities of all AI models supported by OpenClaw. All prices are shown in USD per 1 million tokens.

OpenAI

Industry-leading models with excellent reasoning and code capabilities

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-4o (recommended) | `gpt-4o` | 128K | General purpose, vision, coding | $2.50 | $10.00 | Vision, function calling, JSON mode |
| GPT-4o-mini | `gpt-4o-mini` | 128K | Fast, cost-effective tasks | $0.15 | $0.60 | Vision, function calling, low latency |
| GPT-4 Turbo | `gpt-4-turbo` | 128K | Complex reasoning tasks | $10.00 | $30.00 | Vision, advanced reasoning, knowledge cutoff 2023 |
| GPT-3.5 Turbo | `gpt-3.5-turbo` | 16K | Simple tasks, legacy support | $0.50 | $1.50 | Fast, cost-effective, reliable |

Anthropic

Claude models excel at analysis, writing, and complex reasoning

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Claude 3.5 Sonnet (recommended) | `claude-3-5-sonnet-20241022` | 200K | Coding, analysis, writing | $3.00 | $15.00 | Excellent coding, long context, vision |
| Claude 3 Opus | `claude-3-opus-20240229` | 200K | Most complex tasks | $15.00 | $75.00 | Highest capability, deep analysis, research |
| Claude 3 Sonnet | `claude-3-sonnet-20240229` | 200K | Balanced performance | $3.00 | $15.00 | Reliable, good speed, versatile |
| Claude 3 Haiku | `claude-3-haiku-20240307` | 200K | Fast responses, simple tasks | $0.25 | $1.25 | Fastest, cost-effective, lightweight |

Google

Gemini models offer competitive pricing and large context windows

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Gemini 2.0 Flash (recommended) | `gemini-2.0-flash-exp` | 1M | Fast, multimodal tasks | $0.07 | $0.30 | 1M context; vision, audio, video |
| Gemini 1.5 Pro | `gemini-1.5-pro` | 2M | Long documents, analysis | $1.25 | $5.00 | 2M context, complex reasoning, multimodal |
| Gemini 1.5 Flash | `gemini-1.5-flash` | 1M | Fast, cost-effective | $0.07 | $0.30 | 1M context, speed, efficiency |

Moonshot AI

Chinese LLM with strong long-context capabilities

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Kimi K2.5 (recommended) | `kimi-k2.5` | 256K | Long context, Chinese tasks | $0.50 | $2.00 | 256K context, Chinese-optimized, reasoning |

DeepSeek

Cost-effective models with strong coding capabilities

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek V3 (recommended) | `deepseek-chat` | 64K | Coding, cost-effective tasks | $0.27 | $1.10 | Great value, coding, Chinese/English |
| DeepSeek R1 | `deepseek-reasoner` | 64K | Reasoning, math, logic | $0.55 | $2.19 | Chain-of-thought, reasoning, problem solving |

Mistral AI

Efficient open-source models with strong performance

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Mistral Large 2 (recommended) | `mistral-large-latest` | 128K | Complex tasks, reasoning | $2.00 | $6.00 | Multilingual, reasoning, coding |
| Mistral Medium | `mistral-medium` | 32K | Balanced performance | $0.90 | $2.70 | Fast, cost-effective, reliable |
| Mistral Small | `mistral-small` | 32K | Simple tasks, speed | $0.20 | $0.60 | Fastest, efficient, lightweight |
| Codestral | `codestral-latest` | 32K | Code generation | $0.30 | $0.90 | 80+ languages, fill-in-the-middle, code focus |

Perplexity

Search-augmented models with real-time information access

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Sonar Pro (recommended) | `sonar-pro` | 200K | Research, complex queries | $3.00 | $15.00 | Search-augmented, real-time data, citations |
| Sonar | `sonar` | 128K | General search queries | $1.00 | $1.00 | Fast search, cost-effective, real-time |

Cohere

Enterprise-grade models with strong RAG capabilities

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Command R+ (recommended) | `command-r-plus` | 128K | Enterprise RAG, complex tasks | $3.00 | $15.00 | RAG-optimized, tool use, multilingual |
| Command R | `command-r` | 128K | RAG, conversational | $0.50 | $1.50 | Balanced, RAG-ready, fast |
| Command | `command` | 4K | General tasks | $1.00 | $2.00 | Reliable, simple, cost-effective |

Groq

Ultra-fast inference with competitive pricing

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.3 70B (recommended) | `llama-3.3-70b-versatile` | 128K | Fast inference, general tasks | $0.59 | $0.79 | Ultra-fast, 128K context, open source |
| Llama 3.1 8B | `llama-3.1-8b-instant` | 128K | Ultra-fast simple tasks | $0.05 | $0.08 | Fastest, cheapest, efficient |
| Mixtral 8x7B | `mixtral-8x7b-32768` | 32K | Balanced performance | $0.24 | $0.24 | MoE architecture, fast, reliable |

Together AI

Access to a wide range of open-source models

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.3 70B (recommended) | `meta-llama/Llama-3.3-70B-Instruct-Turbo` | 128K | General purpose, chat | $0.88 | $0.88 | 128K context, open source, reliable |
| Qwen 2.5 72B | `Qwen/Qwen2.5-72B-Instruct-Turbo` | 128K | Multilingual, coding | $1.20 | $1.20 | Strong coding, multilingual, 128K context |
| DeepSeek V3 | `deepseek-ai/DeepSeek-V3` | 64K | Coding, reasoning | $1.25 | $1.25 | Strong coding, cost-effective, fast |

Fireworks AI

Fast inference API for open-source models

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.3 70B (recommended) | `accounts/fireworks/models/llama-v3p3-70b-instruct` | 128K | Fast inference | $0.90 | $0.90 | Fast, reliable, 128K context |
| Mixtral 8x22B | `accounts/fireworks/models/mixtral-8x22b-instruct` | 64K | Complex reasoning | $1.20 | $1.20 | MoE, strong reasoning, 64K context |

xAI

Grok models with real-time X platform integration

| Model | ID | Context | Best for | Input / 1M | Output / 1M | Highlights |
| --- | --- | --- | --- | --- | --- | --- |
| Grok 2 (recommended) | `grok-2-latest` | 128K | Real-time info, analysis | $5.00 | $15.00 | X integration, real-time, uncensored |
| Grok 2 Mini | `grok-2-mini` | 128K | Fast responses | $0.60 | $2.00 | Fast, cost-effective, real-time |

About Pricing

Prices are shown in USD per 1 million tokens. Input tokens refer to the text you send to the model, while output tokens are the model's response. Context window indicates the maximum number of tokens the model can process in a single request. Prices may vary and are subject to change by the providers. Always check the official provider documentation for the most current pricing.
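Because input and output tokens are billed at different per-1M rates, the cost of a request is the sum of two simple proportions. A minimal sketch, using a few example models and the per-1M prices listed above (the `PRICES` dict and `request_cost` helper are illustrative, not part of any provider SDK):

```python
# Illustrative cost estimator: prices are USD per 1 million tokens,
# taken from the pricing tables above (subject to change by providers).
PRICES = {
    # model ID: (input price, output price) per 1M tokens
    "gpt-4o": (2.50, 10.00),
    "claude-3-5-sonnet-20241022": (3.00, 15.00),
    "gemini-1.5-flash": (0.07, 0.30),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: tokens / 1,000,000 * price-per-1M, summed for both directions."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1_000_000 * in_price + output_tokens / 1_000_000 * out_price

# Example: 12,000 input tokens + 1,500 output tokens on GPT-4o
# = 0.012 * $2.50 + 0.0015 * $10.00 = $0.03 + $0.015 = $0.045
cost = request_cost("gpt-4o", 12_000, 1_500)
```

Note that output tokens are usually several times more expensive than input tokens, so for generation-heavy workloads the output rate dominates the bill.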