Elastic Model Catalog
SERVERLESS INFERENCE
More than 40 production models, including the latest from Anthropic, xAI, Meta, and Mistral, are ready for instant, scalable inference. Pricing is strictly per million tokens, with zero cold starts.
| Model | Developer | Type | Context | Input (1M) | Output (1M) |
|---|---|---|---|---|---|
| claude-sonnet-4.6 `us.anthropic.claude-sonnet-4-6` | Anthropic | Chat | 1M | -- | -- |
| claude-opus-4.7 `us.anthropic.claude-opus-4-7` | Anthropic | Chat | 1M | -- | -- |
| claude-haiku-4.5 `us.anthropic.claude-haiku-4-5-20251001-v1:0` | Anthropic | Chat | 200k | -- | -- |
| cohere.embed-v4 `cohere.embed-v4:0` | Cohere | Embedding | 128k | -- | -- |
| grok-4.3 `xai.grok-4.3` | xAI | Chat | 1M | -- | -- |
| grok-4.20-reasoning `xai.grok-4.20-0309-reasoning` | xAI | Chat | 2M | -- | -- |
| grok-build-0.1 `xai.grok-build-0.1` | xAI | Chat | 256k | -- | -- |
| grok-imagine-image-quality `xai.grok-imagine-image-quality` | xAI | Vision | N/A | -- | -- |
| grok-imagine-image `xai.grok-imagine-image` | xAI | Vision | N/A | -- | -- |
| grok-imagine-video `xai.grok-imagine-video` | xAI | Chat | N/A | -- | -- |
| grok-tts `xai.tts` | xAI | Chat | N/A | -- | -- |
| llama-4-scout-17b `us.meta.llama4-scout-17b-instruct-v1:0` | Meta | Chat | 10M | -- | -- |
| llama-3-3-70b-instruct `us.meta.llama3-3-70b-instruct-v1:0` | Meta | Chat | 128k | -- | -- |
| llama-3.2-11b-instruct `us.meta.llama3-2-11b-instruct-v1:0` | Meta | Chat | 128k | -- | -- |
| minimax-m2.5 `minimax.minimax-m2.5` | MiniMax | Chat | 196k | -- | -- |
| mistral-large-3 `mistral.mistral-large-3-675b-instruct` | Mistral AI | Chat | 256k | -- | -- |
| devstral-2-123b `mistral.devstral-2-123b` | Mistral AI | Chat | 256k | -- | -- |
| ministral-14b `mistral.ministral-3-14b-instruct` | Mistral AI | Chat | 128k | -- | -- |
| deepseek-v3.2 `deepseek.v3.2` | DeepSeek | Chat | 164k | -- | -- |
| gpt-oss-120b `openai.gpt-oss-120b-1:0` | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-20b `openai.gpt-oss-20b-1:0` | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-safeguard-120b `openai.gpt-oss-safeguard-120b` | OpenAI OSS | Chat | 128k | -- | -- |
| gpt-oss-safeguard-20b `openai.gpt-oss-safeguard-20b` | OpenAI OSS | Chat | 128k | -- | -- |
| gemma-3-27b `google.gemma-3-27b-it` | Google | Chat | 128k | -- | -- |
| gemma-3-12b `google.gemma-3-12b-it` | Google | Chat | 128k | -- | -- |
| gemma-3-4b `google.gemma-3-4b-it` | Google | Chat | 128k | -- | -- |
| nova-2-lite `us.amazon.nova-2-lite-v1:0` | Amazon | Chat | 1M | -- | -- |
| qwen3-coder-next `qwen.qwen3-coder-next` | Qwen | Code | 256k | -- | -- |
| qwen3-vl-235b `qwen.qwen3-vl-235b-a22b` | Qwen | Chat | 256k | -- | -- |
| qwen3-next-80b `qwen.qwen3-next-80b-a3b` | Qwen | Chat | 256k | -- | -- |
| qwen3-32b `qwen.qwen3-32b-v1:0` | Qwen | Chat | 32k | -- | -- |
| nemotron-3-super-120b `nvidia.nemotron-super-3-120b` | NVIDIA | Chat | 256k | -- | -- |
| nemotron-nano-12b-v2 `nvidia.nemotron-nano-12b-v2` | NVIDIA | Chat | 128k | -- | -- |
| nemotron-nano-3-30b `nvidia.nemotron-nano-3-30b` | NVIDIA | Chat | 256k | -- | -- |
| stable-image-remove-background `us.stability.stable-image-remove-background-v1:0` | Stability AI | Chat | N/A | -- | -- |
| stable-image-control-sketch `us.stability.stable-image-control-sketch-v1:0` | Stability AI | Chat | N/A | -- | -- |
| stable-image-control-structure `us.stability.stable-image-control-structure-v1:0` | Stability AI | Chat | N/A | -- | -- |
| stable-image-style-guide `us.stability.stable-image-style-guide-v1:0` | Stability AI | Chat | N/A | -- | -- |
| palmyra-x5 `us.writer.palmyra-x5-v1:0` | Emerging Labs | Chat | 128k | -- | -- |
| palmyra-x4 `us.writer.palmyra-x4-v1:0` | Emerging Labs | Chat | 128k | -- | -- |
| palmyra-vision-7b `writer.palmyra-vision-7b` | Writer | Chat | 4k | -- | -- |
| pegasus-1.2 `us.twelvelabs.pegasus-1-2-v1:0` | TwelveLabs | Chat | N/A | -- | -- |
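
Because billing is per million tokens, the cost of a request can be estimated from its input and output token counts and the two per-1M rates. The sketch below is a minimal illustration of that arithmetic, not an official pricing tool: the function name `estimate_cost` and the rates used in the example call are hypothetical, since the table above lists no prices.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: Decimal,
    output_price_per_1m: Decimal,
) -> Decimal:
    """Estimate the cost of one request, in the same currency as the prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed in proportion to its share of one million tokens.
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * input_price_per_1m
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * output_price_per_1m
    return input_cost + output_cost


# Hypothetical rates for illustration only; substitute the Input (1M) and
# Output (1M) values listed for the model you plan to use.
cost = estimate_cost(
    input_tokens=120_000,
    output_tokens=8_000,
    input_price_per_1m=Decimal("3.00"),
    output_price_per_1m=Decimal("15.00"),
)
print(f"Estimated request cost: {cost}")
```

`Decimal` is used instead of floating point so that small per-request amounts do not pick up rounding error when summed across many requests.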
Don't see your model?
We add highly requested models to our elastic tier weekly. Alternatively, you can host any model immediately on our Dedicated Inference tier.