

Interface: VoiceSessionConfig

Defined in: core/voice-session.ts:31

Configuration for creating a VoiceSession.
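As a rough orientation, the required properties documented below can be combined as in the following sketch. The `...Like` types are local stand-ins written for this example only (the real `MainAgent` and `LanguageModelV1` types are not reproduced here), the assumption that an agent exposes a `name` field is ours, and all values are placeholders.

```typescript
// Local stand-ins for illustration only; the real types come from the
// library (MainAgent) and the Vercel AI SDK (LanguageModelV1).
interface MainAgentLike {
  name: string; // assumed field, matched against initialAgent
}
type LanguageModelLike = object;

// Mirrors only the required properties listed on this page, plus two optional ones.
interface VoiceSessionConfigLike {
  sessionId: string;
  userId: string;
  apiKey: string;
  agents: MainAgentLike[];
  initialAgent: string;
  port: number;
  model: LanguageModelLike;
  host?: string;
  geminiModel?: string;
}

const exampleConfig: VoiceSessionConfigLike = {
  sessionId: "session-001",       // placeholder
  userId: "user-42",              // placeholder
  apiKey: "YOUR_GOOGLE_API_KEY",  // placeholder, never commit a real key
  agents: [{ name: "main" }],
  initialAgent: "main",           // must name one of the agents above
  port: 9900,                     // placeholder port
  model: {},                      // stand-in for a Vercel AI SDK model
};
```

Optional properties such as `host` and `geminiModel` are left out here and would fall back to whatever defaults the library applies.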

Properties

agents

agents: MainAgent[]

Defined in: core/voice-session.ts:39

All agents available in this session.


apiKey

apiKey: string

Defined in: core/voice-session.ts:37

Google API key for the Gemini Live API (used when no transport is provided).


behaviors?

optional behaviors: BehaviorCategory[]

Defined in: core/voice-session.ts:67

Behavior categories for dynamic runtime tuning (speech speed, verbosity, etc.).


compressionConfig?

optional compressionConfig: object

Defined in: core/voice-session.ts:57

Context window compression thresholds.

targetTokens

targetTokens: number

triggerTokens

triggerTokens: number
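The two thresholds are easiest to reason about together. The sketch below assumes the usual reading of these names (compression starts once the context reaches `triggerTokens` and shrinks it toward `targetTokens`); this page does not spell out the exact semantics, and both helper functions are hypothetical, not part of the library.

```typescript
// Local stand-in for the documented compressionConfig shape.
interface CompressionConfigLike {
  triggerTokens: number;
  targetTokens: number;
}

// Hypothetical sanity check: returns a list of problems, empty when the config looks sane.
function validateCompressionConfig(c: CompressionConfigLike): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(c.triggerTokens) || c.triggerTokens <= 0) {
    problems.push("triggerTokens must be a positive integer");
  }
  if (!Number.isInteger(c.targetTokens) || c.targetTokens <= 0) {
    problems.push("targetTokens must be a positive integer");
  }
  // Assumption: compressing to a size at or above the trigger would never help.
  if (c.targetTokens >= c.triggerTokens) {
    problems.push("targetTokens should be below triggerTokens");
  }
  return problems;
}

// Assumed trigger rule: compress once usage reaches the trigger threshold.
function shouldCompress(usedTokens: number, c: CompressionConfigLike): boolean {
  return usedTokens >= c.triggerTokens;
}
```

A check like this could run before the session is created, so that a swapped pair of values is caught early rather than at runtime.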


geminiModel?

optional geminiModel: string

Defined in: core/voice-session.ts:51

LLM model name (e.g. "gemini-live-2.5-flash-preview").


hooks?

optional hooks: FrameworkHooks

Defined in: core/voice-session.ts:45

Lifecycle hooks for observability.


host?

optional host: string

Defined in: core/voice-session.ts:49

Host for the client WebSocket server (default: '0.0.0.0' for all interfaces).


initialAgent

initialAgent: string

Defined in: core/voice-session.ts:41

Name of the agent to activate on start.
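Because `initialAgent` is a plain string, a typo only surfaces at startup. A small pre-flight check could look like the sketch below; it assumes each agent exposes a `name` field that `initialAgent` is matched against, and `findInitialAgent` is a hypothetical helper, not a library function.

```typescript
// Assumed minimal shape of an agent: only the name matters for this check.
interface NamedAgent {
  name: string;
}

// Hypothetical helper: returns the agent named by initialAgent or throws a descriptive error.
function findInitialAgent<A extends NamedAgent>(agents: A[], initialAgent: string): A {
  const match = agents.find((agent) => agent.name === initialAgent);
  if (match === undefined) {
    const known = agents.map((agent) => agent.name).join(", ") || "(none)";
    throw new Error(`initialAgent "${initialAgent}" is not in agents: ${known}`);
  }
  return match;
}
```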


inputAudioTranscription?

optional inputAudioTranscription: boolean

Defined in: core/voice-session.ts:61

Enable server-side transcription of user audio input (default: true). Has no effect when sttProvider is set, because the built-in transcription is then disabled automatically. When no sttProvider is set, use false to disable all input transcription, for example for privacy or cost control.


memory?

optional memory: object

Defined in: core/voice-session.ts:69

Enable memory distillation. Extracts durable user facts from conversation and persists them.

store

store: MemoryStore

Where to persist extracted facts.

turnFrequency?

optional turnFrequency: number

Extract every N turns (default: 5).
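To make "every N turns" concrete, here is a minimal sketch of the cadence check. It assumes turns are counted from 1 and that extraction runs on exact multiples of `turnFrequency`; the library's actual counting may differ, and `isExtractionTurn` is a hypothetical name.

```typescript
// Documented default for memory.turnFrequency.
const DEFAULT_TURN_FREQUENCY = 5;

// Hypothetical cadence check: true on turns N, 2N, 3N, ... (turns counted from 1).
function isExtractionTurn(
  turn: number,
  turnFrequency: number = DEFAULT_TURN_FREQUENCY,
): boolean {
  if (turnFrequency <= 0) {
    return false; // a non-positive frequency never triggers in this sketch
  }
  return turn > 0 && turn % turnFrequency === 0;
}
```

Under these assumptions, the default would trigger extraction on turns 5, 10, 15, and so on.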


model

model: LanguageModelV1

Defined in: core/voice-session.ts:53

Vercel AI SDK model for subagent text generation.


port

port: number

Defined in: core/voice-session.ts:47

Port for the client WebSocket server.


sessionId

sessionId: string

Defined in: core/voice-session.ts:33

Unique session identifier.


speechConfig?

optional speechConfig: object

Defined in: core/voice-session.ts:55

Voice configuration for Gemini's speech output.

voiceName?

optional voiceName: string


sttProvider?

optional sttProvider: STTProvider

Defined in: core/voice-session.ts:65

External STT provider for user input transcription. When set, the transport's built-in transcription is automatically disabled. When omitted, the transport's built-in transcription is used.
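The interaction between `sttProvider` and `inputAudioTranscription` can be summarized as a small decision function. This is a sketch of the rules as documented on this page, not the library's code; the type and function names are hypothetical.

```typescript
type TranscriptionSource = "external" | "built-in" | "off";

// Only the two properties that influence input transcription.
interface TranscriptionOptions {
  inputAudioTranscription?: boolean; // documented default: true
  sttProvider?: object;              // stand-in for the STTProvider type
}

function resolveTranscriptionSource(options: TranscriptionOptions): TranscriptionSource {
  // An external provider always wins; inputAudioTranscription has no effect then.
  if (options.sttProvider !== undefined) {
    return "external";
  }
  // Without a provider, only an explicit false turns transcription off.
  return options.inputAudioTranscription === false ? "off" : "built-in";
}
```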


subagentConfigs?

optional subagentConfigs: Record<string, SubagentConfig>

Defined in: core/voice-session.ts:43

Background subagent configs keyed by tool name.


transport?

optional transport: LLMTransport

Defined in: core/voice-session.ts:76

Pre-constructed LLM transport. If provided, apiKey/geminiModel/speechConfig/compressionConfig are ignored.
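Since these fields are silently ignored once a transport is supplied, it can help to flag them during setup. The helper below is a hypothetical sketch, not part of the library; it only encodes the list of ignored properties given above.

```typescript
// Properties documented as ignored when a pre-constructed transport is provided.
const TRANSPORT_OVERRIDDEN_KEYS = [
  "apiKey",
  "geminiModel",
  "speechConfig",
  "compressionConfig",
] as const;

// Hypothetical helper: lists the set properties that will have no effect.
function ignoredByTransport(config: Record<string, unknown>): string[] {
  if (config["transport"] === undefined) {
    return []; // no custom transport, so every property is honored
  }
  return TRANSPORT_OVERRIDDEN_KEYS.filter((key) => config[key] !== undefined);
}
```

Note that `apiKey` is declared as a required property, so it will appear in the result whenever a transport is present.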


userId

userId: string

Defined in: core/voice-session.ts:35

User identifier (used for memory storage and history).
