Chat API
Battle, Fight, and legacy chat endpoints
Use these endpoints when you want multiple AI models to answer the same question — side-by-side, or competing and voting on each other's answers. They can also search the web on their own when your question needs current information.
Endpoints
| Endpoint | What it does | Streaming |
|---|---|---|
| POST /api/v1/battle | Two models answer the same question, side-by-side. No voting. | Streaming or single response |
| POST /api/v1/fight | Up to 8 models answer, see each other's answers, and vote on the best one. | Streaming only |
| POST /api/v1/chat/completions | A single model answers. Works with the OpenAI SDK. See Gateway API. | Streaming or single response |
| POST /api/chat (older) | The original endpoint. Does everything above, but you have to pick the mode with a mode field in the request body. Still fully supported — see the dedicated Chat API (Legacy) page. | Streaming or single response |
Which one should I use?
- Battle (2 models) → /api/v1/battle
- Fight (3+ models with voting) → /api/v1/fight
- Single model → /api/v1/chat/completions (see the Gateway API for OpenAI-SDK-compatible usage)
The older /api/chat endpoint still works if you're already using it. New projects should use one of the three above.
Conversation History
All modes support multi-turn conversations via the optional messages array. Pass previous turns so the AI has context from earlier in the conversation.
The messages array contains all previous turns before the current message. Each entry has:
| Field | Type | Description |
|---|---|---|
| role | string | "user", "assistant", or "system" |
| content | string | The message content |
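As a sketch, here is how a follow-up turn might be assembled. The field names (message, messages, models) come from this page; the values are illustrative placeholders:

```python
import json

# Build a follow-up battle request that carries earlier turns in `messages`.
# Prior turns go oldest-first; the current question goes in `message`.
history = [
    {"role": "user", "content": "What is the tallest building in the world?"},
    {"role": "assistant", "content": "The Burj Khalifa in Dubai, at 828 m."},
]

payload = {
    "message": "How long did it take to build?",  # current turn (max 2000 chars)
    "messages": history,                          # previous turns for context
    "models": ["claude", "gpt"],
}

body = json.dumps(payload)
```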
How History Works Per Mode
- Chat mode: The model sees the full conversation history, enabling natural follow-ups.
- Battle mode: Both models see the same conversation history.
- Fight mode (brawl, default): Every agent sees the conversation history, plus every peer answer from earlier in the round, plus every message from prior rounds.
- Fight mode (blitz): Every agent sees the conversation history and only the winning answer from each prior round — never in-flight peers.
- Multi-instance mode: Runs inside fight mode, so it follows whichever fightStyle you pass.
Web Search (Exa-Powered)
In battle, fight, and multi-instance modes, AI models can autonomously search the web when they need current information. This happens transparently — no extra parameters needed.
How It Works
- A model receives your question and decides it needs fresh data
- The model triggers a web search with a targeted query
- Results are fetched via Exa and fed back to the model
- The model's final response includes a citations array with sources
No Configuration Required
Web search is automatic. Models decide when to search based on the question. Not every response triggers a search.
Citations
When a model performs a web search, its response includes a citations array:
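The page states only that a citations array is present; the exact citation fields below (url, title) are illustrative assumptions. A hypothetical response might be handled like this:

```python
import json

# Hypothetical response payload with a `citations` array. The url/title
# keys are illustrative — only the array itself is documented here.
response = json.loads("""
{
  "message": "...answer text...",
  "citations": [
    {"url": "https://example.com/markets", "title": "Market data"}
  ]
}
""")

# Collect source URLs for display alongside the answer.
sources = [c["url"] for c in response.get("citations", [])]
```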
Streaming Search Events
When streaming, web searches produce additional SSE events:
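As a sketch, a search might surface in the stream like this hypothetical excerpt. The event names (search_start, search_complete) match the SSE event reference on this page; the payload fields are illustrative:

```python
import json

# Hypothetical SSE excerpt for an agent that triggered a web search.
raw = (
    'data: {"type": "search_start", "agent": "gpt", "query": "latest CPI figures"}\n\n'
    'data: {"type": "search_complete", "agent": "gpt", "citations": '
    '[{"url": "https://example.com/cpi"}]}\n\n'
)

# Each event is one `data: {json}` line followed by a blank line.
events = [json.loads(chunk[len("data: "):]) for chunk in raw.strip().split("\n\n")]
```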
| Mode | Web Search |
|---|---|
| Chat (single model) | No |
| Battle (2 models) | Yes |
| Fight (multi-agent) | Yes |
| Multi-instance (4x same model) | Yes |
Vision & Image Analysis
All modes support image inputs via the image_url parameter. Vision models (GPT, Claude, Gemini, Grok) see the image directly. Non-vision models receive an AI-generated description transparently. See Models > Vision Support for the full support matrix.
In battle/fight mode, vision and non-vision models can compete on the same image — the system handles it automatically.
Chatting with a single model
If you just want one AI model to answer (no battle, no fight), use the Gateway API at POST /api/v1/chat/completions. It works with the OpenAI SDK — point the SDK at Concurred and your existing code runs unchanged. It also supports streaming, images, tool calls, and prompt caching.
See the Gateway docs for the full reference.
Battle — POST /api/v1/battle
Compare responses from 2 AI models side by side.
Request
You don't need a mode field — this endpoint is always battle mode. Everything else in the body works the same as the older /api/chat endpoint.
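A minimal battle request might look like the sketch below. The endpoint path and field names come from this page; BASE_URL and any auth headers are placeholders for your own deployment:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder for your host

payload = {
    "message": "Summarise the latest news on EU AI regulation.",
    "models": ["claude", "gpt"],  # exactly two models for battle
    "stream": False,              # battle also supports a single response
}

req = urllib.request.Request(
    f"{BASE_URL}/api/v1/battle",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment against a real host
```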
Response
No Winner in Battle Mode
Battle mode returns raw responses only — no voting, no winner. Use fight mode if you need voting and a winner.
Fight — POST /api/v1/fight
Multiple AIs compete, debate, and vote on each other's responses.
Streaming Required
Fight requires streaming ("stream": true or omit the parameter). Sending "stream": false returns a 400 STREAMING_REQUIRED error.
Request
You don't need a mode field — this endpoint is always fight mode.
Selecting Models
Pass a comma-separated string ("claude,gpt,grok") or an array (["claude","gpt","grok"]). Omit models to use all 8 AI models.
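Since both forms are accepted, client code can normalise them to one shape. A small convenience helper (not part of the API itself) might look like:

```python
def normalize_models(models):
    """Accept either "claude,gpt,grok" or ["claude", "gpt", "grok"]."""
    if isinstance(models, str):
        return [m.strip() for m in models.split(",") if m.strip()]
    return list(models)
```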
Fight Styles
Pick how agents interact within a round with the optional fightStyle parameter.
| fightStyle | Behaviour | Feels like |
|---|---|---|
"brawl" (default) | Agents answer sequentially within a round — each agent sees every peer answer that came before it. Multi-round rolls every prior round's full transcript forward. | A live debate. Reactive, peer-aware, slower. |
"blitz" | Agents answer concurrently within a round from a frozen context — they never see in-flight peer answers. Multi-round carries only the previous round's winner forward. | Parallel competition. Everyone strikes at once, faster, cleaner. |
Omitting fightStyle is the same as "brawl", so existing integrations keep their current behaviour. Unknown values fall back to "brawl".
session_start echoes the chosen style so UIs can render either flow:
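A hypothetical session_start payload might look like the following. The agents and rounds fields appear in the SSE event reference on this page; the exact key that echoes the style is an assumption:

```python
import json

# Hypothetical session_start event echoing the chosen fight style.
event = json.loads(
    '{"type": "session_start", "agents": ["claude", "gpt", "grok"],'
    ' "rounds": 2, "fightStyle": "blitz"}'
)

# blitz runs agents concurrently, so a UI could render parallel lanes.
concurrent = event["fightStyle"] == "blitz"
```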
Response (SSE Stream)
Fight mode returns Server-Sent Events. Each line follows data: {json}\n\n:
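A minimal way to consume that framing, sketched in Python; in production you would read the body incrementally from a streaming HTTP response rather than from a complete string:

```python
import json

def parse_sse(stream_text):
    """Split a fight-mode SSE body into event dicts.

    Each event arrives as one `data: {json}` line followed by a blank line.
    """
    events = []
    for chunk in stream_text.strip().split("\n\n"):
        if chunk.startswith("data: "):
            events.append(json.loads(chunk[len("data: "):]))
    return events
```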
SSE Event Reference
| Event | Key Fields | Description |
|---|---|---|
| session_start | agents[], rounds | Session begins |
| round_start | round, totalRounds | New round begins |
| agent_start | agent, model | Agent about to respond |
| agent_complete | agent, message, citations? | Agent response ready |
| search_start | agent, query | Agent triggered web search |
| search_complete | agent, citations[] | Search results ready |
| voting_start | round | Voting phase begins |
| vote_update | voter, votedFor | Single vote cast |
| voting_complete | round, results[] | All votes tallied |
| leaderboard_update | leaderboard[] | Cumulative scores |
| round_complete | round, winner | Round winner |
| session_complete | winner, finalLeaderboard[] | Final results |
| error | error, agent? | Error occurred |
Multi-Instance Mode
Use agentMode to run 4 instances of the same model competing. Similar to xAI's Grok Heavy.
| agentMode | Description |
|---|---|
| multiple-grok | 4 Grok instances |
| multiple-claude | 4 Claude instances |
| multiple-gpt | 4 GPT instances |
| multiple-gemini | 4 Gemini instances |
| multiple-deepseek | 4 DeepSeek instances |
| multiple-kimi | 4 Kimi instances |
| multiple-mistral | 4 Mistral instances |
| multiple-llama | 4 Llama instances |
Same SSE stream format as fight mode. With multiple-grok, for example, agent IDs are grok-1, grok-2, grok-3, and grok-4.
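A multi-instance request swaps models for agentMode, as in this sketch (field names from this page, values illustrative):

```python
import json

# Multi-instance fight request: `agentMode` takes the place of `models`.
payload = {
    "message": "Design a caching strategy for a read-heavy API.",
    "agentMode": "multiple-grok",  # four Grok instances compete
    "rounds": 2,
    "fightStyle": "blitz",
}

body = json.dumps(payload)
```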
All Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| message | string | Yes | Your question (max 2000 characters) |
| mode | string | Yes (legacy /api/chat only) | "chat", "battle", or "fight" |
| model | string | For chat | Single model ID (see Models) |
| models | array or string | For battle/fight | Model IDs. Array ["claude","gpt"] or comma-separated "claude,gpt" |
| image_url | string | No | Public image URL for vision analysis |
| messages | array | No | Conversation history {role, content} objects |
| rounds | number | No | Debate rounds 1-10 (default: 1) |
| agentMode | string | No | Multi-instance mode (e.g., "multiple-grok") |
| fightStyle | string | No | "brawl" (default, sequential peer-aware) or "blitz" (parallel, frozen context). Fight mode only. |
| stream | boolean | No | Enable SSE streaming (default: true). Must be true for fight mode. |
Mode Comparison
| Feature | Chat | Battle | Fight |
|---|---|---|---|
| Models | 1 | 2 | 2-8 |
| Web Search | No | Yes | Yes |
| Voting & Winner | No | No | Yes |
| stream: false | Yes | Yes | No |
SDK Examples
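As a plain-stdlib sketch of the single-model endpoint: with the OpenAI SDK you would instead point the client's base_url at your Concurred host, as described in the Gateway docs. BASE_URL and any auth headers below are placeholders:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder for your host

# OpenAI-compatible request body for the single-model endpoint.
payload = {
    "model": "claude",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

req = urllib.request.Request(
    f"{BASE_URL}/api/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment against a real host
```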
Already using /api/chat?
The original POST /api/chat endpoint is still fully supported — it picks the mode from a mode field in the request body instead of from the URL. See the dedicated Chat API (Legacy) page for the mode mapping and a short migration checklist.