Gateway API
Streaming
SSE event grammar for Gateway responses, including tool-call deltas
Set "stream": true to receive responses as Server-Sent Events. The frame grammar matches OpenAI exactly so the Vercel AI SDK and OpenAI SDKs parse it out of the box.
Frame grammar
- Every frame is
data: <json>\n\n. - End of stream is
data: [DONE]\n\n. - Each JSON object is a partial
chat.completion.chunk.
Example text stream:
Tool-call deltas
When the model emits tool calls, deltas arrive with a tool_calls array on each chunk. Example stream for a turn that calls two tools then stops:
Required invariants
- The first delta for a new tool call carries
id,type:"function", andfunction.name. Subsequent deltas for the same call fillfunction.argumentsincrementally and may omitid/name. - Each tool call has a stable
indexacross all of its deltas. Clients key offindex, notid(the AI SDK does this too). - The final chunk before
[DONE]hasdelta: {}andfinish_reason: "tool_calls". - Text content and tool-call deltas can interleave in the same stream — Claude sometimes emits
"Let me check..."text before atool_useblock. Handle both as they arrive. - No provider-native event names (
tool_use,content_block_delta,functionCall) leak to clients. All shapes are normalized to OpenAI deltas.
Streaming argument shape per provider
| Provider | How function.arguments streams |
|---|---|
| Anthropic Claude | Character-by-character (input_json_delta). |
| OpenAI | Character-by-character. |
| Google Gemini | One complete JSON object per tool call (Gemini doesn't chunk args). |
| xAI Grok | Character-by-character. |
| DeepSeek / Mistral / Kimi / Llama / MiniMax | Character-by-character via OpenRouter / native endpoints. |
Finish reasons
finish_reason | Meaning |
|---|---|
stop | Natural end of turn. |
tool_calls | Assistant wants one or more tools invoked. |
length | Hit the max_tokens budget. |
content_filter | Guardrails or upstream safety classifier blocked the response. |
Errors mid-stream
If the upstream provider errors after the stream has started, an error frame is emitted followed by [DONE]:
See Errors for the full code list.
Fallback behavior
Failover applies only before any delta has been emitted. See Provider compatibility → Fallback with tool use for the full policy.
See also
- Tool Use — defining tools, tool_choice, and the round-trip contract.
- Caching with tools — why tool responses bypass the response cache.