Streaming

Display each token as it arrives, with no buffering and no flash of finished content. kitn-chat supports two approaches: reassign the messages array on kai-chat as tokens arrive (the standard pattern), or feed an AsyncIterable<string> directly to kai-response-stream.

Stream into kai-chat

kai-chat streams by appending an empty assistant message, then replacing its content as tokens arrive. Set loading = true during the request so the input is disabled and the loading state shows; set it back to false when the stream ends.

import '@kitn.ai/ui/elements';
import '@kitn.ai/ui/theme.css';

const chat = document.querySelector('kai-chat');

chat.addEventListener('kai-submit', async (e) => {
  const text = e.detail.value.trim();
  if (!text) return;

  // Append the user message
  const history = [
    ...chat.messages,
    { id: crypto.randomUUID(), role: 'user', content: text },
  ];
  chat.messages = history;
  chat.loading = true;

  // Seed an empty assistant placeholder to stream into
  const assistantId = crypto.randomUUID();
  chat.messages = [
    ...history,
    { id: assistantId, role: 'assistant', content: '' },
  ];

  try {
    // Point at your own backend in production; never expose an API key in the browser.
    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: history.map((m) => ({ role: m.role, content: m.content })),
      }),
    });
    if (!res.ok || !res.body) {
      throw new Error(`Chat request failed with status ${res.status}`);
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    let answer = '';

    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      const lines = buffer.split('\n');
      buffer = lines.pop(); // carry the incomplete line forward

      for (const line of lines) {
        const s = line.trim();
        if (!s.startsWith('data:')) continue;
        const payload = s.slice(5).trim();
        if (payload === '[DONE]') continue;
        try {
          const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
          if (!delta) continue;
          answer += delta;
          // Replace the assistant placeholder with the growing answer
          chat.messages = chat.messages.map((m) =>
            m.id === assistantId ? { ...m, content: answer } : m
          );
        } catch {
          // Ignore non-JSON keep-alive lines
        }
      }
    }
  } catch (err) {
    // Network or HTTP failure: log it and let the finally block reset the UI
    console.error(err);
  } finally {
    // Re-enable the input whether the stream finished or failed
    chat.loading = false;
  }
});
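The line handling inside the reader loop can also be pulled out into pure functions, which makes the parsing easy to check against sample input. The sketch below assumes OpenAI-style chunks; parseSseLine and splitLines are hypothetical helper names, not part of kitn-chat.

```javascript
// Extract the text delta from one SSE line, or return null when there is none.
// Assumes OpenAI-style chunks: data: {"choices":[{"delta":{"content":"..."}}]}
function parseSseLine(line) {
  const s = line.trim();
  if (!s.startsWith('data:')) return null; // comments, blank lines, other fields
  const payload = s.slice(5).trim();
  if (payload === '[DONE]') return null; // end-of-stream sentinel
  try {
    return JSON.parse(payload).choices?.[0]?.delta?.content || null;
  } catch {
    return null; // non-JSON keep-alive
  }
}

// Split a growing buffer into complete lines plus the unfinished remainder.
function splitLines(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop() ?? '';
  return { lines, rest };
}

// A network chunk boundary can fall in the middle of a line:
const first = splitLines('data: {"choices":[{"delta":{"content":"Hel"}}]}\ndata: {"cho');
const second = splitLines(first.rest + 'ices":[{"delta":{"content":"lo"}}]}\ndata: [DONE]\n');

const answer = [...first.lines, ...second.lines]
  .map(parseSseLine)
  .filter(Boolean)
  .join('');

console.log(answer); // "Hello"
```

Carrying the remainder forward is what keeps a JSON payload that is split across two reads from being dropped.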

OpenRouter example

If you are proxying an OpenRouter request on your server, the body sent upstream looks like this:

{
  "model": "anthropic/claude-sonnet-4",
  "stream": true,
  "messages": [{ "role": "user", "content": "Hello" }]
}

The SSE response follows the OpenAI streaming format (data: {...} lines, terminated with data: [DONE]) — the reader loop above handles it as-is.
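For reference, here is a minimal sketch of what the /api/chat proxy might look like. It assumes a server runtime that provides the Fetch API Request and Response classes (for example Node 18+, Deno, Bun, or an edge worker). The handler name handleChat, the apiKey option, and the injectable fetchImpl are illustrative assumptions; the upstream URL is OpenRouter's chat completions endpoint.

```javascript
// Hypothetical handler for POST /api/chat. It adds the API key on the server
// and pipes the upstream SSE body straight back to the browser.
const OPENROUTER_URL = 'https://openrouter.ai/api/v1/chat/completions';

async function handleChat(request, { apiKey, fetchImpl = fetch }) {
  const { messages } = await request.json();

  const upstream = await fetchImpl(OPENROUTER_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'anthropic/claude-sonnet-4',
      stream: true,
      messages,
    }),
  });

  if (!upstream.ok || !upstream.body) {
    return new Response('Upstream request failed', { status: 502 });
  }

  // Forward the stream as-is instead of buffering it on the server.
  return new Response(upstream.body, {
    status: 200,
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
    },
  });
}
```

How the handler is mounted depends on your framework. The key lives only on the server (for instance in an environment variable), so the browser never sees it.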

Stream with kai-response-stream

kai-response-stream is a lower-level web component for streaming plain text or markdown outside a full chat thread. Pass either a complete string (renders with the reveal animation) or an AsyncIterable<string> (streams tokens live).

| Prop | Type | Default | Notes |
| --- | --- | --- | --- |
| `text` | `string \| AsyncIterable<string>` | `''` | Assign it in JavaScript; an async iterable can’t be an HTML attribute. |
| `mode` | `'typewriter' \| 'fade'` | `'typewriter'` | Reveal animation. |
| `speed` | `number` | `20` | Characters/segments per tick. |
| `as` | `string` | (none) | Element tag to render into. |

Event: kai-complete — fires when streaming finishes.
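As a sketch of how the props and the event fit together, the helper below sets mode, speed, and text as JavaScript properties and resolves once kai-complete fires. The name streamInto is a hypothetical helper, and the sketch assumes the props in the table can be assigned as properties on the element.

```javascript
// Hypothetical helper: configure the element, start the stream, and resolve
// once the component reports that it has finished.
function streamInto(el, source, { mode = 'typewriter', speed = 20 } = {}) {
  return new Promise((resolve) => {
    // Attach the listener before assigning text so the event cannot be missed.
    el.addEventListener('kai-complete', () => resolve(), { once: true });
    el.mode = mode;
    el.speed = speed;
    el.text = source; // a string or an AsyncIterable<string>
  });
}

// In the browser:
// const el = document.querySelector('kai-response-stream');
// await streamInto(el, 'A saved answer.', { mode: 'fade' });
```

Wrapping the event in a promise is only a convenience; a plain addEventListener('kai-complete', ...) call works just as well.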

Feeding an AsyncIterable

import '@kitn.ai/ui/elements';

const el = document.querySelector('kai-response-stream');

async function* tokenStream(res) {
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    const lines = buffer.split('\n');
    buffer = lines.pop();

    for (const line of lines) {
      const s = line.trim();
      if (!s.startsWith('data:')) continue;
      const payload = s.slice(5).trim();
      if (payload === '[DONE]') continue;
      try {
        const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
        if (delta) yield delta;
      } catch { /* skip keep-alives */ }
    }
  }
}

const res = await fetch('/api/chat', { method: 'POST', body: '...' });
el.text = tokenStream(res); // pass the AsyncIterable directly
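Any async generator can serve as the source, which is handy for trying the component without a backend. The sketch below is a stand-in for a network stream; fakeTokens and its delayMs parameter are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for a network stream: yields one word at a time
// with a small delay between tokens.
async function* fakeTokens(text, delayMs = 50) {
  // Keep the surrounding whitespace attached to each word.
  for (const token of text.match(/\s*\S+\s*/g) ?? []) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield token;
  }
}

// In the browser:
// document.querySelector('kai-response-stream').text =
//   fakeTokens('Streaming without a server.');
```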

Passing a finished string

Assigning a complete string runs the reveal animation without a live connection — useful for replaying or previewing a cached response.

el.text = 'The assistant response rendered with the typewriter effect.';

Choosing an approach

| Scenario | Use |
| --- | --- |
| Full chat UI with history | `kai-chat` + reassign `messages` |
| Isolated response widget | `kai-response-stream` |
| Replay a saved response | `kai-response-stream` with a string |
| Render finished markdown | `kai-markdown` with a `content` string |

kai-markdown accepts only a static content string and renders it immediately — use it when the full text is available, not for live streaming.