Voice assistant
A hands-free assistant: the mic captures a question, the transcript becomes a chat message, and the streamed reply is read back aloud. This demo simulates the speech-to-text step with a canned transcript — the spoken reply is real, powered by your browser’s Web Speech API. Tap the mic in the toolbar to try it.
How it works
<kai-chat> does the talking on both ends. Its built-in voice attribute adds a mic button to the input toolbar and fires a kai-voice event when tapped. You own what happens next: capture audio, transcribe it, push the text into the thread, then speak the finished reply with the Web Speech API.
<kai-chat id="chat" chat-title="Voice assistant" voice></kai-chat>

<script type="module">
  import '@kitn.ai/ui/elements';

  const chat = document.getElementById('chat');
  chat.messages = [];

  // Read the finished reply aloud; guarded so unsupported browsers stay silent.
  function speak(text) {
    if (!('speechSynthesis' in window) || !text) return;
    const utter = new SpeechSynthesisUtterance(text);
    utter.lang = 'en-US';
    speechSynthesis.cancel(); // stop any previous utterance
    speechSynthesis.speak(utter);
  }

  // Run one turn: append the question, stream the reply, then speak it.
  async function ask(question) {
    const assistantId = crypto.randomUUID();
    chat.messages = [
      ...chat.messages,
      { id: crypto.randomUUID(), role: 'user', content: question },
      { id: assistantId, role: 'assistant', content: '' },
    ];
    chat.loading = true;

    let answer = '';
    try {
      for await (const token of streamAnswer(question)) {
        answer += token;
        // Replace the placeholder assistant message with the text so far.
        chat.messages = chat.messages.map((m) =>
          m.id === assistantId ? { ...m, content: answer } : m,
        );
      }
    } finally {
      chat.loading = false; // clear the spinner even if the stream fails
    }
    speak(answer); // hear the reply once it finishes streaming
  }

  // Mic tapped: record, transcribe, then run the turn with the transcript.
  chat.addEventListener('kai-voice', async () => {
    const transcript = await transcribeFromMic(); // your STT pipeline
    if (transcript) ask(transcript);
  });

  // Typed messages follow the same path.
  chat.addEventListener('kai-submit', (e) => {
    const text = e.detail.value.trim();
    if (text) ask(text);
  });
</script>

Mic capture needs the MediaRecorder API and a transcription backend, which can’t run in this sandbox, so the demo above replaces transcribeFromMic() with a canned transcript and streams a scripted reply, keeping the flow visible end to end. In production, use <kai-voice-input> inside transcribeFromMic() to record audio and return text from your speech-to-text provider.
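The example calls transcribeFromMic() and streamAnswer() without defining them. One way to stand them in for a demo is sketched below. This is only an illustration: both function bodies, the delayMs option, and the canned strings are assumptions made here, not part of @kitn.ai/ui.

```javascript
// Hypothetical demo stand-ins for the two helpers the example leaves undefined.
// Replace both with a real speech-to-text call and a real model stream.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Pretend to listen for a moment, then resolve with a fixed transcript.
async function transcribeFromMic({ delayMs = 600 } = {}) {
  await sleep(delayMs);
  return 'What is the weather like today?';
}

// Yield a scripted reply one word-sized chunk at a time, pausing between
// chunks so the chat updates the way it would with a real token stream.
async function* streamAnswer(question, { delayMs = 40 } = {}) {
  const reply = `You asked: "${question}". This reply is scripted for the demo.`;
  // Each chunk is a word plus its trailing whitespace, so concatenating
  // the chunks rebuilds the reply exactly.
  const chunks = reply.match(/\S+\s*/g) ?? [];
  for (const chunk of chunks) {
    await sleep(delayMs);
    yield chunk;
  }
}
```

Because streamAnswer() is an async generator, the for await loop in ask() consumes it unchanged, and swapping in a real backend later only means replacing the two function bodies.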
Next steps
- Speech to Text recipe: wire real mic capture with <kai-voice-input> and a transcription backend, the piece this demo stubs out.
- Text to Speech recipe: choosing a voice, stopping playback, and swapping the browser-native path for higher-quality cloud TTS.
- kai-chat reference: the voice attribute, the kai-voice event, and the full streaming loop.
- Drop-in chat: the baseline streaming loop this example layers voice on top of.