Models
Available AI models and their capabilities
Access 9 text models and 3 media models through a single API.
Text Models
| Model | ID | Provider | Vision | Specialty |
|---|---|---|---|---|
| Claude | claude | Anthropic | Yes | Thoughtful, analytical reasoning |
| GPT | gpt | OpenAI | Yes | Creative problem-solving |
| Grok | grok | xAI | Yes | Bold, unconventional perspectives |
| Gemini | gemini | Yes | Comprehensive knowledge | |
| DeepSeek | deepseek | DeepSeek | No | Technical depth, coding |
| Kimi | kimi | Moonshot | No | Multilingual capabilities |
| Mistral | mistral | Mistral AI | No | Efficient, practical insights |
| Llama | llama | Meta | No | Open-source AI |
| MiniMax | minimax | MiniMax | No | Strong multilingual, roleplay |
Media Models (MiniMax)
| Model | ID | Type | Description |
|---|---|---|---|
| Speech 2.6 | speech-2.6-hd | TTS | Text-to-speech, 300+ voices, 40 languages |
| Music 2.5 | music-2.5 | Music | Music generation from text/lyrics |
| Hailuo 2.3 | MiniMax-Hailuo-2.3 | Video | AI video generation with camera control |
Model Aliases
Each model supports multiple aliases for convenience.
Claude (Anthropic)
GPT (OpenAI) — includes reasoning models
Grok (xAI)
Gemini (Google) — includes reasoning mode
DeepSeek — includes reasoning (R1)
Kimi (Moonshot)
Mistral — includes code (Codestral)
Llama (Meta)
Qwen (via OpenRouter → DeepSeek provider)
MiniMax
Vision Support
Four models natively support image analysis. Non-vision models automatically receive an AI-generated description (produced once per request by GPT-5.2, then shared across all non-vision models).
| Model | Native Vision | What Happens |
|---|---|---|
| GPT | Yes | Sees the image directly |
| Claude | Yes | Sees the image directly |
| Gemini | Yes | Sees the image directly |
| Grok | Yes | Sees the image directly |
| DeepSeek | No | Receives AI-generated description |
| Kimi | No | Receives AI-generated description |
| Mistral | No | Receives AI-generated description |
| Llama | No | Receives AI-generated description |
Vision works in both the Gateway API (via OpenAI multimodal format) and the Chat API (via image_url parameter). The system handles capabilities automatically — you don't need to check which models support vision.
List Models (API)
GET /api/v1/models
Returns all available models in OpenAI-compatible format.