Streaming
Server-Sent Events in the OpenAI chunk format, with no custom parser required.
CleverRouter streams chat completions as Server-Sent Events in the OpenAI chunk format. Any SDK that understands the OpenAI API can consume the stream.
Requesting a stream
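import OpenAI from 'openai';

// Point the official OpenAI SDK at CleverRouter's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.CLEVERROUTER_API_KEY!,
  baseURL: 'https://cleverouter.eu/v1',
});

const stream = await client.chat.completions.create({
  model: 'mistral/mistral-small-3.2',
  messages: [{ role: 'user', content: 'Write a haiku about Berlin.' }],
  stream: true,
  // Request one extra final chunk that carries the token usage.
  stream_options: { include_usage: true },
});

let usage;
for await (const chunk of stream) {
  // In the OpenAI format the usage chunk has an empty choices array,
  // so the optional chaining below falls back to ''.
  if (chunk.usage) usage = chunk.usage;
  process.stdout.write(chunk.choices[0]?.delta.content ?? '');
}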
console.log('\nusage', usage);
Raw wire format
If you are not going through the SDK (Edge, Bun, Hono):
data: {"id":"chatcmpl-x","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hallo"},"index":0}]}

data: {"id":"chatcmpl-x","object":"chat.completion.chunk","choices":[{"delta":{"content":", "},"index":0}]}

data: [DONE]

Response headers:
content-type: text/event-stream; charset=utf-8
cache-control: no-cache, no-transform
x-accel-buffering: no
Reverse proxies
CleverRouter sets X-Accel-Buffering: no so that nginx/Cloudflare do not buffer chunks. If you run your own proxies in front of it, pass the same header through.
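If your own proxy is written in TypeScript (for example an edge function), passing the header through could look roughly like the sketch below. It is only an illustration: passThroughStream and STREAM_HEADERS are hypothetical names, and the sketch assumes a runtime with the standard Fetch API Response and Headers classes. The point is to copy the streaming-related headers and hand the body on as a stream instead of reading it fully first.

```typescript
// Headers that should reach the client unchanged so intermediaries keep streaming.
// (Hypothetical helper; the list mirrors the response headers shown on this page.)
const STREAM_HEADERS = ['content-type', 'cache-control', 'x-accel-buffering'];

function passThroughStream(upstream: Response): Response {
  const headers = new Headers();
  for (const name of STREAM_HEADERS) {
    const value = upstream.headers.get(name);
    if (value !== null) headers.set(name, value);
  }
  // Forward the body as a stream; calling upstream.text() here would buffer it.
  return new Response(upstream.body, { status: upstream.status, headers });
}
```

A handler would call passThroughStream on the upstream fetch result and return it directly. Proxies such as nginx have their own configuration for this and are not covered by the sketch.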
Tool calls are also streamed as deltas; see Tool Use.
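For runtimes without the SDK, the raw wire format described earlier can be read with a few lines of code. The following is a minimal sketch, not a reference implementation: parseSSE, ChatChunk, and streamChat are hypothetical names, the sketch assumes every event carries its JSON on a single data: line (as in the sample events on this page), and the /chat/completions path and Bearer authorization are assumed from the OpenAI-compatible base URL used in the SDK example.

```typescript
// Minimal shape of a streamed chunk; only the fields used here are typed.
interface ChatChunk {
  choices: { index: number; delta: { content?: string | null } }[];
  usage?: unknown;
}

// Yields one parsed chunk per `data:` line and stops at `data: [DONE]`.
async function* parseSSE(body: ReadableStream<Uint8Array>): AsyncGenerator<ChatChunk> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  try {
    while (true) {
      const { done, value } = await reader.read();
      // At end of stream, add a newline so a final unterminated line is still processed.
      buffer += done ? '\n' : decoder.decode(value, { stream: true });
      let newline: number;
      while ((newline = buffer.indexOf('\n')) !== -1) {
        const line = buffer.slice(0, newline).replace(/\r$/, '');
        buffer = buffer.slice(newline + 1);
        if (!line.startsWith('data:')) continue; // blank separators, comments
        const payload = line.slice(5).trim();
        if (payload === '') continue;
        if (payload === '[DONE]') return;
        yield JSON.parse(payload) as ChatChunk;
      }
      if (done) return;
    }
  } finally {
    reader.releaseLock();
  }
}

// Assumed usage: POST to the OpenAI-compatible endpoint and collect the text.
async function streamChat(apiKey: string, prompt: string): Promise<string> {
  const res = await fetch('https://cleverouter.eu/v1/chat/completions', {
    method: 'POST',
    headers: { authorization: `Bearer ${apiKey}`, 'content-type': 'application/json' },
    body: JSON.stringify({
      model: 'mistral/mistral-small-3.2',
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  });
  if (!res.ok || !res.body) throw new Error(`request failed: ${res.status}`);
  let text = '';
  for await (const chunk of parseSSE(res.body)) {
    text += chunk.choices[0]?.delta.content ?? '';
  }
  return text;
}
```

The parser buffers partial lines because a network read can end in the middle of an event; it only parses once a full line has arrived.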