Class: GeminiBatchSTTProvider

Defined in: transport/gemini-batch-stt-provider.ts:30

STTProvider that uses a separate Gemini model via generateContent() for batch transcription of buffered user audio.

Extracted from GeminiLiveTransport. Audio is buffered via feedAudio(), then transcribed when commit() is called (triggered by model turn start).
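
The buffer-then-commit flow described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `BatchBufferSketch` and `TranscribeFn` are hypothetical names, a synchronous callback stands in for the asynchronous generateContent() request, and the sketch assumes that a commit consumes the buffer.

```typescript
// Hypothetical sketch of the buffer-then-commit pattern; not the library's code.
// A synchronous callback stands in for the asynchronous generateContent() request.
type TranscribeFn = (base64Chunks: string[]) => string;

class BatchBufferSketch {
  onTranscript?: (text: string, turnId: number | undefined) => void;
  private chunks: string[] = [];
  private readonly transcribe: TranscribeFn;

  constructor(transcribe: TranscribeFn) {
    this.transcribe = transcribe;
  }

  // Buffer each base64 PCM chunk; nothing is transcribed yet.
  feedAudio(base64Pcm: string): void {
    this.chunks.push(base64Pcm);
  }

  // Take everything buffered so far, transcribe it as one batch, and report
  // the result together with the turn it belongs to.
  commit(turnId: number): void {
    if (this.chunks.length === 0) return; // nothing was captured for this turn
    const batch = this.chunks;
    this.chunks = []; // assumption: a commit consumes the buffer
    this.onTranscript?.(this.transcribe(batch), turnId);
  }
}

const sketch = new BatchBufferSketch((chunks) => `transcript of ${chunks.length} chunk(s)`);
const results: string[] = [];
sketch.onTranscript = (text, turnId) => results.push(`turn ${turnId}: ${text}`);

sketch.feedAudio("AAAA");
sketch.feedAudio("AAAA");
sketch.commit(1);
console.log(results[0]); // turn 1: transcript of 2 chunk(s)
```

The point of the pattern is that feedAudio() stays cheap and the transcription cost is paid once per turn, at the moment the model starts responding.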

Implements

STTProvider

Constructors

Constructor

new GeminiBatchSTTProvider(config): GeminiBatchSTTProvider

Defined in: transport/gemini-batch-stt-provider.ts:41

Parameters

config

GeminiBatchSTTConfig

Returns

GeminiBatchSTTProvider

Properties

onPartialTranscript()?

optional onPartialTranscript: (text) => void

Defined in: transport/gemini-batch-stt-provider.ts:39

Partial/interim transcription (streaming providers only). Replaces any previous partial for the same turn.

Parameters

text

string

Returns

void

Implementation of

STTProvider.onPartialTranscript


onTranscript()?

optional onTranscript: (text, turnId) => void

Defined in: transport/gemini-batch-stt-provider.ts:38

Final transcription of user speech.

Parameters

text

string

The transcribed text.

turnId

number | undefined

The turn this transcript belongs to (from commit()). Undefined when a streaming provider's VAD auto-commits before the framework calls commit().

Returns

void

Implementation of

STTProvider.onTranscript
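
Because transcription runs separately from the live session, a callback for an earlier turn may arrive after one for a later turn. The consumer-side sketch below shows one way to use turnId to restore order and to handle the undefined case. `TranscriptLog` is a hypothetical helper, not part of bodhi-realtime-agent; only the `(text, turnId)` callback shape comes from this page.

```typescript
// Consumer-side sketch: collect final transcripts and read them back in turn order.
// `TranscriptLog` is a hypothetical helper, not part of bodhi-realtime-agent.
class TranscriptLog {
  private readonly byTurn = new Map<number, string>();
  private readonly unassigned: string[] = [];

  // Same shape as onTranscript: (text, turnId) => void.
  record = (text: string, turnId: number | undefined): void => {
    if (turnId === undefined) {
      // No turn to attach to (e.g. a streaming provider's VAD auto-committed).
      this.unassigned.push(text);
      return;
    }
    this.byTurn.set(turnId, text);
  };

  // Transcripts sorted by turnId, regardless of arrival order.
  ordered(): string[] {
    return Array.from(this.byTurn.entries())
      .sort((a, b) => a[0] - b[0])
      .map((entry) => entry[1]);
  }

  // Transcripts that arrived without a turnId.
  orphans(): string[] {
    return this.unassigned.slice();
  }
}

const log = new TranscriptLog();
log.record("second question", 2); // turn 2 happens to finish first
log.record("first question", 1);
log.record("stray words", undefined);
console.log(log.ordered().join(" | ")); // first question | second question
```

A callback like `log.record` could be assigned directly to a provider's onTranscript property, since the signatures match.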

Methods

commit()

commit(turnId): void

Defined in: transport/gemini-batch-stt-provider.ts:77

Signal that the user's turn has ended (model started responding). For batch providers, this triggers transcription. For streaming providers, this may trigger a manual commit.

Parameters

turnId

number

Monotonically increasing turn counter for ordering.

Returns

void

Implementation of

STTProvider.commit


configure()

configure(audio): void

Defined in: transport/gemini-batch-stt-provider.ts:46

Configure the audio format that feedAudio() will deliver. Called once before start(). The provider MUST resample or reject if it cannot handle the given format.

Parameters

audio

STTAudioConfig

Returns

void

Implementation of

STTProvider.configure
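
The "resample or reject" requirement can be illustrated with a provider that rejects formats it cannot pass through unchanged. This is a sketch under stated assumptions: the fields of `SketchAudioConfig` (`sampleRate`, `channels`, `bitDepth`) are hypothetical, since the actual shape of STTAudioConfig is not shown on this page, and `FormatCheckingProvider` is not a library class.

```typescript
// Hypothetical config shape; the real STTAudioConfig fields may differ.
interface SketchAudioConfig {
  sampleRate: number; // Hz
  channels: number;
  bitDepth: number;
}

class FormatCheckingProvider {
  private audio?: SketchAudioConfig;

  // Accept only formats this sketch can handle without resampling, and reject
  // everything else up front instead of producing bad transcripts later.
  configure(audio: SketchAudioConfig): void {
    if (audio.channels !== 1 || audio.bitDepth !== 16) {
      throw new Error(
        `Unsupported audio format: ${audio.channels} channel(s), ${audio.bitDepth}-bit`,
      );
    }
    this.audio = audio;
  }

  get sampleRate(): number | undefined {
    return this.audio?.sampleRate;
  }
}

const checked = new FormatCheckingProvider();
checked.configure({ sampleRate: 16000, channels: 1, bitDepth: 16 }); // accepted
console.log(checked.sampleRate); // 16000
```

Failing in configure(), before start(), surfaces a format mismatch at setup time rather than in the middle of a conversation.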


feedAudio()

feedAudio(base64Pcm): void

Defined in: transport/gemini-batch-stt-provider.ts:65

Feed audio data. Format matches the STTAudioConfig from configure().

Parameters

base64Pcm

string

Base64-encoded PCM audio chunk.

Returns

void

Implementation of

STTProvider.feedAudio
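
To make the chunk format concrete, the sketch below decodes base64 PCM chunks and computes how many seconds of audio they hold. It assumes Node.js (for `Buffer`) and 16-bit mono PCM; `bufferedSeconds` is a hypothetical helper, and the sample rate is passed in because the real format is whatever configure() was given.

```typescript
// Decode base64 PCM chunks and report how much audio they contain.
// Assumes Node.js (Buffer) and 16-bit mono PCM.
function bufferedSeconds(base64Chunks: string[], sampleRate: number): number {
  const bytesPerSample = 2; // 16-bit PCM
  const totalBytes = base64Chunks
    .map((chunk) => Buffer.from(chunk, "base64").length)
    .reduce((sum, n) => sum + n, 0);
  return totalBytes / (sampleRate * bytesPerSample);
}

// 16000 samples of silence = 32000 bytes = one second at 16 kHz.
const oneSecond = Buffer.alloc(32000).toString("base64");
console.log(bufferedSeconds([oneSecond, oneSecond], 16000)); // 2
```

A figure like this can be useful for deciding whether a buffer is too short to be worth transcribing, or too long for a single request.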


handleInterrupted()

handleInterrupted(): void

Defined in: transport/gemini-batch-stt-provider.ts:123

Signal that the current turn was interrupted by the user. Providers MUST preserve buffered audio for the next commit().

Returns

void

Implementation of

STTProvider.handleInterrupted


handleTurnComplete()

handleTurnComplete(): void

Defined in: transport/gemini-batch-stt-provider.ts:127

Signal a natural turn completion (model finished, no interruption). Batch providers SHOULD clear buffers. Streaming providers may no-op.

Returns

void

Implementation of

STTProvider.handleTurnComplete
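
The difference between the two end-of-turn signals, handleInterrupted() and handleTurnComplete(), comes down to what happens to the buffer. A minimal sketch of that policy, assuming a simple in-memory chunk list (`TurnBufferSketch` and `pendingChunks` are hypothetical, not library names):

```typescript
// Hypothetical buffer policy for the two end-of-turn signals; not the library's code.
class TurnBufferSketch {
  private chunks: string[] = [];

  feedAudio(base64Pcm: string): void {
    this.chunks.push(base64Pcm);
  }

  // The user spoke over the model: keep what was captured so the next
  // commit() can still transcribe it.
  handleInterrupted(): void {
    // Intentionally empty: the buffer is preserved.
  }

  // The model finished on its own: drop anything captured during its turn.
  handleTurnComplete(): void {
    this.chunks = [];
  }

  get pendingChunks(): number {
    return this.chunks.length;
  }
}

const turn = new TurnBufferSketch();
turn.feedAudio("AAAA");
turn.handleInterrupted();
console.log(turn.pendingChunks); // 1
turn.handleTurnComplete();
console.log(turn.pendingChunks); // 0
```

Preserving audio on interruption matters because the interrupting speech is usually the start of the user's next turn; discarding it would truncate the following transcript.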


start()

start(): Promise<void>

Defined in: transport/gemini-batch-stt-provider.ts:56

Start the STT session (e.g. open WebSocket).

Returns

Promise<void>

Implementation of

STTProvider.start


stop()

stop(): Promise<void>

Defined in: transport/gemini-batch-stt-provider.ts:60

Stop the STT session (e.g. close WebSocket).

Returns

Promise<void>

Implementation of

STTProvider.stop