
Chat UI Component - Speech to text Specification

Contents

  1. Overview
  2. User Stories
  3. Functionality
  4. Test Scenarios
  5. Accessibility
  6. Assumptions and Limitations
  7. References

Owned by

CodeX Team

Ivan Petrov

Designer Name

Requires approval from

  • Peer Developer Name | Date:
  • Design Manager Name | Date:

Signed off by

  • Product Owner Name | Date:
  • Platform Architect Name | Date:

Revision History

| Version | Users | Date | Notes |
|---|---|---|---|
| 1 | Ivan Petrov | 14.10.2025 | Initial specification |

Objectives

Add speech-to-text (STT) functionality to the Chat UI Component, allowing users to dictate messages using their voice. The feature supports two STT modes:

  1. Backend Transcription Mode – Audio is streamed via WebSocket/SignalR to a backend service that integrates with a third-party transcription service (Google Speech-to-Text, Vertex AI, etc.). This backend service is provided as a NuGet package. Repository here
  2. Frontend (Web Speech API) Mode – Browser-native transcription handled entirely in the frontend (no server dependency).
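The choice between the two modes could be sketched as follows. This is an illustrative helper under assumed names (`resolveProvider`, `SttConfig`); it is not part of the component's actual API.

```typescript
// Hypothetical helper sketching how the component might choose between the
// two STT modes; all names here are assumptions, not the real implementation.
type SttProvider = 'backend' | 'webspeech';

interface SttConfig {
  serviceProvider: SttProvider;
  serviceUri?: string; // needed for backend mode (SignalR hub endpoint)
}

function resolveProvider(config: SttConfig, hasWebSpeech: boolean): SttProvider {
  // Backend transcription needs a hub endpoint to stream audio to.
  if (config.serviceProvider === 'backend' && config.serviceUri) {
    return 'backend';
  }
  // Otherwise fall back to browser-native Web Speech, if the browser has it.
  if (hasWebSpeech) {
    return 'webspeech';
  }
  throw new Error('No speech-to-text provider available');
}
```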

PoC: https://github.com/IgniteUI/igniteui-webcomponents/pull/1893

Complementary Backend project: https://github.com/IgniteUI/igniteui-speech-to-text-server

Acceptance criteria

Must-haves before the feature can be considered a sprint candidate

  1. Users can record and transcribe voice messages directly in the chat input in real-time.
  2. Developers can configure which STT provider is used.
  3. Transcription output appears in the message input field in real-time.
  4. The system automatically stops on silence timeout.
  5. Works across Chrome, Edge, Safari (Web Speech fallback).
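Criterion 5 implies a feature check: Chrome, Edge, and Safari all expose the Web Speech recognition interface, typically under the `webkitSpeechRecognition` prefix. A minimal sketch (the function name is illustrative; it takes the global object as a parameter so it can be exercised outside a browser):

```typescript
// Minimal feature detection for the Web Speech fallback. Checks both the
// unprefixed and the webkit-prefixed recognition interface.
function webSpeechSupported(globals: Record<string, unknown>): boolean {
  return 'SpeechRecognition' in globals || 'webkitSpeechRecognition' in globals;
}
```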

User Stories

Developer stories:

  • Story 1: As a developer, I want to enable STT via component options so I don’t need to write custom integration code.
  • Story 2: As a developer, I want to choose between backend or frontend transcription providers.

End-user stories:

  • Story 1: As an end-user, I want to dictate a message in the chat box using my microphone.
  • Story 2: As an end-user, I want visual feedback (mic pulse and silence countdown) during recording.
  • Story 3: As an end-user, I want the transcription to stop automatically when I stop speaking and auto-submit the message.
  • Story 4: As an end-user, I want the ability to manually stop the transcription. This should not auto-submit the message, so that it remains available for further editing.

Functionality

3.1. End-User Experience

  • A microphone icon is displayed next to the message input field.
  • Clicking the icon starts recording. The microphone icon is replaced by a stop icon.
  • Visual feedback begins when voice is detected - a pulsing stop icon.
  • Live transcription text appears in the message input field.
  • When silence is detected, a timeout animation is presented (countdown circle). If voice is detected again during the countdown, the countdown resets.
  • When the silence timeout ends or the user clicks stop, recording stops and the transcription is finalized.
  • When transcription finishes due to silence timeout, the message is auto-submitted.
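The silence handling described above (grace period, countdown, reset on voice, auto-stop) can be modeled as a small state machine. The following is an illustrative sketch, not the component's implementation; the class name, constants, and time handling are assumptions.

```typescript
// Illustrative model of the silence behavior: while the user speaks the
// tracker stays in 'recording'; after a grace period of silence the
// countdown animation runs; any detected voice resets the timer; when the
// full timeout elapses, recording auto-stops.
class SilenceTracker {
  private lastVoiceAt: number;

  constructor(
    private readonly graceMs: number,   // e.g. SILENCE_GRACE_PERIOD
    private readonly timeoutMs: number, // e.g. SILENCE_TIMEOUT_MS
    now: number,
  ) {
    this.lastVoiceAt = now;
  }

  voiceDetected(now: number): void {
    // Voice resets the silence countdown.
    this.lastVoiceAt = now;
  }

  state(now: number): 'recording' | 'countdown' | 'stopped' {
    const silentFor = now - this.lastVoiceAt;
    if (silentFor >= this.timeoutMs) return 'stopped';
    if (silentFor >= this.graceMs) return 'countdown';
    return 'recording';
  }
}
```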

3.2. Developer Experience

Frontend setup: Add speech-to-text options in the chat component options:

speakPlaceholder: 'Speak...',
...
speechToText: {
  enable: true,
  lang: 'en-US',
  serviceProvider: 'webspeech', // 'webspeech' | 'backend'
  serviceUri: 'https://localhost:5000/sttHub',
},

| Name | Description | Type | Default | Valid values |
|---|---|---|---|---|
| enable | Enables speech-to-text | Boolean | false | true / false |
| lang | Language for transcription | String | null | e.g. "en-US", "de-DE" |
| serviceProvider | Which transcription provider to use. Requires serviceUri | String | null | "backend" / "webspeech" |
| serviceUri | Backend Hub endpoint (SignalR) | String | null | URL |

3.3. Globalization/Localization

Language setting controls transcription locale.

3.4. Keyboard Navigation

| Keys | Description |
|---|---|

3.5. API

Options

| Name | Description | Type | Default value | Valid values |
|---|---|---|---|---|
| SILENCE_TIMEOUT_MS | Timeout before automatic stop, in ms | Number | 4000 | Any integer ≥ 0 |
| SILENCE_GRACE_PERIOD | Time before the silence countdown animation starts, in ms | Number | 1000 | Any integer < SILENCE_TIMEOUT_MS |
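The constraints in the table above might be enforced with a small validation step. This is a sketch under assumed names (`validateSilenceOptions` is hypothetical):

```typescript
// Sketch of validating the two timing options against the table's
// constraints: the timeout is a non-negative integer, and the grace period
// must be an integer strictly less than the timeout.
function validateSilenceOptions(timeoutMs: number, gracePeriodMs: number): void {
  if (!Number.isInteger(timeoutMs) || timeoutMs < 0) {
    throw new RangeError('SILENCE_TIMEOUT_MS must be an integer >= 0');
  }
  if (!Number.isInteger(gracePeriodMs) || gracePeriodMs >= timeoutMs) {
    throw new RangeError('SILENCE_GRACE_PERIOD must be an integer < SILENCE_TIMEOUT_MS');
  }
}
```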

Methods

| Name | Description | Return type | Parameters |
|---|---|---|---|
| start() | Begin recording and transcription | Promise | language?: string |
| stop() | Stop recording and finalize transcription | void | |
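A hedged usage sketch of the two methods, assuming the chat component exposes them as described in the table (the `SpeechToTextApi` interface and `dictate` helper are illustrative, not real exports):

```typescript
// Minimal shape of the methods table above, plus an example caller.
interface SpeechToTextApi {
  start(language?: string): Promise<void>;
  stop(): void;
}

async function dictate(stt: SpeechToTextApi): Promise<void> {
  // Begin recording; the language parameter overrides the configured `lang`.
  await stt.start('en-US');
  // ... the user speaks; a manual stop finalizes the transcription without
  // auto-submitting the message, per Story 4.
  stt.stop();
}
```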

Events

| Name | Description | Cancelable | Parameters |
|---|---|---|---|
| onPulseSignal | Fired when STT detects voice (in practice, fired when a transcription of that voice is received, for simplification) | No | |
| onStartCountdown | Fired when the silence countdown animation should start | No | { ms: number \| null } |
| onTranscript | Fired when transcription text updates | No | { text: string } |
| onStopInProgress | Fired when the user clicks stop but the service awaits the final transcription result | No | |
| onFinishedTranscribing | Fired when transcription completes | No | { finish: 'auto' \| 'manual' } |
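Wiring the events above could look roughly like this. The `SttEmitter` class is a stand-in for however the component actually dispatches events; the event names and payload shapes are taken from the table, everything else is an assumption.

```typescript
// Illustrative typed event wiring for the events table. Payload types
// mirror the Parameters column; the emitter itself is a sketch.
type FinishMode = 'auto' | 'manual';

interface SttEventMap {
  onTranscript: { text: string };
  onStartCountdown: { ms: number | null };
  onFinishedTranscribing: { finish: FinishMode };
}

class SttEmitter {
  private handlers = new Map<string, Array<(detail: unknown) => void>>();

  on<K extends keyof SttEventMap>(name: K, fn: (detail: SttEventMap[K]) => void): void {
    const list = this.handlers.get(name) ?? [];
    list.push(fn as (detail: unknown) => void);
    this.handlers.set(name, list);
  }

  emit<K extends keyof SttEventMap>(name: K, detail: SttEventMap[K]): void {
    for (const fn of this.handlers.get(name) ?? []) fn(detail);
  }
}
```

A consumer would typically subscribe to onTranscript to mirror live text into the message input, and to onFinishedTranscribing to decide whether to auto-submit ('auto') or leave the text editable ('manual').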

Automation

  • Scenario 1:
  • Scenario 2:

ARIA Support

RTL Support

| Assumptions | Limitation | Notes |
|---|---|---|

References
