Elastic Model Catalog

SERVERLESS INFERENCE

40+ production models, including the latest from Anthropic, xAI, Meta, Mistral, and more, are ready for instant, scalable inference. Pricing is strictly per million tokens, with zero cold starts.
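
As a rough illustration of per-million-token pricing, the sketch below computes the cost of one request from its input and output token counts. The function name `request_cost` and the prices used are hypothetical; this catalog lists "--" for every price, so substitute the real rates for your model.

```python
from decimal import Decimal

# Prices are quoted per one million tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: Decimal,
    output_price_per_1m: Decimal,
) -> Decimal:
    """Return the cost of one request under per-million-token pricing."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * input_price_per_1m
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * output_price_per_1m
    return input_cost + output_cost


# Hypothetical prices, for illustration only (not taken from the catalog).
cost = request_cost(
    input_tokens=250_000,
    output_tokens=50_000,
    input_price_per_1m=Decimal("2.00"),
    output_price_per_1m=Decimal("10.00"),
)
print(cost)
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.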

| Model | Model ID | Developer | Type | Context | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|---|---|---|
| claude-sonnet-4.6 | us.anthropic.claude-sonnet-4-6 | Anthropic | Chat | 1M | -- | -- |
| claude-opus-4.7 | us.anthropic.claude-opus-4-7 | Anthropic | Chat | 1M | -- | -- |
| claude-haiku-4.5 | us.anthropic.claude-haiku-4-5-20251001-v1:0 | Anthropic | Chat | 200k | -- | -- |
| cohere.embed-v4 | cohere.embed-v4:0 | Cohere | Embedding | 128k | -- | -- |
| grok-4.3 | xai.grok-4.3 | xAI | Chat | 1M | -- | -- |
| grok-4.20-reasoning | xai.grok-4.20-0309-reasoning | xAI | Chat | 2M | -- | -- |
| grok-build-0.1 | xai.grok-build-0.1 | xAI | Chat | 256k | -- | -- |
| grok-imagine-image-quality | xai.grok-imagine-image-quality | xAI | Vision | N/A | -- | -- |
| grok-imagine-image | xai.grok-imagine-image | xAI | Vision | N/A | -- | -- |
| grok-imagine-video | xai.grok-imagine-video | xAI | Chat | N/A | -- | -- |
| grok-tts | xai.tts | xAI | Chat | N/A | -- | -- |
| llama-4-scout-17b | us.meta.llama4-scout-17b-instruct-v1:0 | Meta | Chat | 10M | -- | -- |
| llama-3-3-70b-instruct | us.meta.llama3-3-70b-instruct-v1:0 | Meta | Chat | 128k | -- | -- |
| llama-3.2-11b-instruct | us.meta.llama3-2-11b-instruct-v1:0 | Meta | Chat | 128k | -- | -- |
| minimax-m2.5 | minimax.minimax-m2.5 | MiniMax | Chat | 196k | -- | -- |
| mistral-large-3 | mistral.mistral-large-3-675b-instruct | Mistral AI | Chat | 256k | -- | -- |
| devstral-2-123b | mistral.devstral-2-123b | Mistral AI | Chat | 256k | -- | -- |
| ministral-14b | mistral.ministral-3-14b-instruct | Mistral AI | Chat | 128k | -- | -- |
| deepseek-v3.2 | deepseek.v3.2 | DeepSeek | Chat | 164k | -- | -- |
| gpt-oss-120b | openai.gpt-oss-120b-1:0 | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-20b | openai.gpt-oss-20b-1:0 | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-safeguard-120b | openai.gpt-oss-safeguard-120b | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-safeguard-20b | openai.gpt-oss-safeguard-20b | OpenAI OSS | Chat | 128k | -- | -- |
| gemma-3-27b | google.gemma-3-27b-it | Google | Chat | 128k | -- | -- |
| gemma-3-12b | google.gemma-3-12b-it | Google | Chat | 128k | -- | -- |
| gemma-3-4b | google.gemma-3-4b-it | Google | Chat | 128k | -- | -- |
| nova-2-lite | us.amazon.nova-2-lite-v1:0 | Amazon | Chat | 1M | -- | -- |
| qwen3-coder-next | qwen.qwen3-coder-next | Qwen | Code | 256k | -- | -- |
| qwen3-vl-235b | qwen.qwen3-vl-235b-a22b | Qwen | Chat | 256k | -- | -- |
| qwen3-next-80b | qwen.qwen3-next-80b-a3b | Qwen | Chat | 256k | -- | -- |
| qwen3-32b | qwen.qwen3-32b-v1:0 | Qwen | Chat | 32k | -- | -- |
| nemotron-3-super-120b | nvidia.nemotron-super-3-120b | NVIDIA | Chat | 256k | -- | -- |
| nemotron-nano-12b-v2 | nvidia.nemotron-nano-12b-v2 | NVIDIA | Chat | 128k | -- | -- |
| nemotron-nano-3-30b | nvidia.nemotron-nano-3-30b | NVIDIA | Chat | 256k | -- | -- |
| stable-image-remove-background | us.stability.stable-image-remove-background-v1:0 | Stability AI | Chat | N/A | -- | -- |
| stable-image-control-sketch | us.stability.stable-image-control-sketch-v1:0 | Stability AI | Chat | N/A | -- | -- |
| stable-image-control-structure | us.stability.stable-image-control-structure-v1:0 | Stability AI | Chat | N/A | -- | -- |
| stable-image-style-guide | us.stability.stable-image-style-guide-v1:0 | Stability AI | Chat | N/A | -- | -- |
| palmyra-x5 | us.writer.palmyra-x5-v1:0 | Emerging Labs | Chat | 128k | -- | -- |
| palmyra-x4 | us.writer.palmyra-x4-v1:0 | Emerging Labs | Chat | 128k | -- | -- |
| palmyra-vision-7b | writer.palmyra-vision-7b | Writer | Chat | 4k | -- | -- |
| pegasus-1.2 | us.twelvelabs.pegasus-1-2-v1:0 | TwelveLabs | Chat | N/A | -- | -- |

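
The Context column uses shorthand such as 200k, 1M, and N/A. A minimal sketch of turning those labels into token counts and checking whether a prompt fits is shown here. The helper names `context_tokens` and `fits_context` are hypothetical, and the code assumes decimal multipliers (k = 1,000 and M = 1,000,000), which the catalog does not state.

```python
from typing import Optional

# Assumption: decimal multipliers; the catalog does not define "k" or "M".
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def context_tokens(label: str) -> Optional[int]:
    """Convert a Context label such as "200k" or "1M" to a token count.

    Returns None for "N/A", where no token window is listed.
    """
    text = label.strip()
    if not text:
        raise ValueError("empty context label")
    if text.upper() == "N/A":
        return None
    suffix = text[-1].lower()
    if suffix not in _MULTIPLIERS:
        raise ValueError(f"unrecognized context label: {label!r}")
    return int(float(text[:-1]) * _MULTIPLIERS[suffix])


def fits_context(label: str, prompt_tokens: int) -> bool:
    """Return True if prompt_tokens is within the listed context window."""
    limit = context_tokens(label)
    if limit is None:
        return False  # no token window listed for this model
    return prompt_tokens <= limit


print(context_tokens("200k"))
print(fits_context("32k", 40_000))
```

Token counts depend on each model's tokenizer, so treat a check like this as a coarse pre-filter and not a guarantee.
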
Don't see your model?
We add highly requested models to our elastic tier weekly. Alternatively, you can host any model immediately via our Dedicated Inference tier.