bodhi-realtime-agent / VoiceSessionConfig
Interface: VoiceSessionConfig
Defined in: core/voice-session.ts:31
Configuration for creating a VoiceSession.
Properties
agents
agents: MainAgent[]
Defined in: core/voice-session.ts:39
All agents available in this session.
apiKey
apiKey: string
Defined in: core/voice-session.ts:37
Google API key for the Gemini Live API (used when no transport is provided).
behaviors?
optional behaviors: BehaviorCategory[]
Defined in: core/voice-session.ts:67
Behavior categories for dynamic runtime tuning (speech speed, verbosity, etc.).
compressionConfig?
optional compressionConfig: object
Defined in: core/voice-session.ts:57
Context window compression thresholds.
targetTokens
targetTokens: number
triggerTokens
triggerTokens: number
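The two thresholds are not described further on this page. The following is a minimal sketch of a sanity check, assuming compression starts once the context reaches triggerTokens and reduces it toward targetTokens, so targetTokens should be the smaller value. The helper name checkCompressionConfig and the local CompressionConfig type are hypothetical and not part of the library.

```typescript
// Local stand-in for the documented shape (not imported from the library).
interface CompressionConfig {
  targetTokens: number;
  triggerTokens: number;
}

// Hypothetical helper: returns a list of problems, empty when the values look sane.
// Assumption: compression starts at triggerTokens and shrinks context toward targetTokens.
function checkCompressionConfig(cfg: CompressionConfig): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(cfg.targetTokens) || cfg.targetTokens <= 0) {
    problems.push("targetTokens must be a positive integer");
  }
  if (!Number.isInteger(cfg.triggerTokens) || cfg.triggerTokens <= 0) {
    problems.push("triggerTokens must be a positive integer");
  }
  if (cfg.targetTokens >= cfg.triggerTokens) {
    problems.push("targetTokens should be smaller than triggerTokens");
  }
  return problems;
}
```

Running such a check before constructing the session would surface swapped thresholds early instead of at runtime.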
geminiModel?
optional geminiModel: string
Defined in: core/voice-session.ts:51
LLM model name (e.g. "gemini-live-2.5-flash-preview").
hooks?
optional hooks: FrameworkHooks
Defined in: core/voice-session.ts:45
Lifecycle hooks for observability.
host?
optional host: string
Defined in: core/voice-session.ts:49
Host for the client WebSocket server (default: '0.0.0.0' for all interfaces).
initialAgent
initialAgent: string
Defined in: core/voice-session.ts:41
Name of the agent to activate on start.
inputAudioTranscription?
optional inputAudioTranscription: boolean
Defined in: core/voice-session.ts:61
Enables server-side transcription of user audio input (default: true). Has no effect when sttProvider is set, because built-in transcription is then disabled automatically. Set to false to disable all input transcription for privacy or cost control.
memory?
optional memory: object
Defined in: core/voice-session.ts:69
Enable memory distillation. Extracts durable user facts from conversation and persists them.
store
store: MemoryStore
Where to persist extracted facts.
turnFrequency?
optional turnFrequency: number
Extract every N turns (default: 5).
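The "every N turns" cadence can be sketched as follows. This assumes turns are counted from 1 and that extraction runs whenever the count is a multiple of turnFrequency; the library's actual counting may differ, and shouldExtract is a hypothetical name used only for illustration.

```typescript
// Default taken from the turnFrequency description.
const DEFAULT_TURN_FREQUENCY = 5;

// Hypothetical helper: true when a memory extraction would be due after `turnCount` turns.
// Assumption: turns are 1-based and extraction fires on every multiple of turnFrequency.
function shouldExtract(
  turnCount: number,
  turnFrequency: number = DEFAULT_TURN_FREQUENCY,
): boolean {
  if (!Number.isInteger(turnFrequency) || turnFrequency < 1) {
    throw new RangeError("turnFrequency must be a positive integer");
  }
  return turnCount > 0 && turnCount % turnFrequency === 0;
}
```

Under these assumptions the default setting would extract after turns 5, 10, 15, and so on, while a turnFrequency of 1 would extract after every turn.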
model
model: LanguageModelV1
Defined in: core/voice-session.ts:53
Vercel AI SDK model for subagent text generation.
port
port: number
Defined in: core/voice-session.ts:47
Port for the client WebSocket server.
sessionId
sessionId: string
Defined in: core/voice-session.ts:33
Unique session identifier.
speechConfig?
optional speechConfig: object
Defined in: core/voice-session.ts:55
Voice configuration for Gemini's speech output.
voiceName?
optional voiceName: string
sttProvider?
optional sttProvider: STTProvider
Defined in: core/voice-session.ts:65
External STT provider for user input transcription. When set, the transport's built-in transcription is disabled automatically. When omitted, the transport's built-in transcription is used.
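The interaction between sttProvider and inputAudioTranscription can be summarized in a small decision function. This is a sketch of the rules as described on this page, not the library's implementation; the names TranscriptionSource, TranscriptionOptions, and resolveTranscriptionSource are hypothetical, and sttProvider is typed as unknown as a stand-in for STTProvider.

```typescript
type TranscriptionSource = "external" | "built-in" | "none";

// Local stand-in for the two relevant options.
interface TranscriptionOptions {
  inputAudioTranscription?: boolean; // documented default: true
  sttProvider?: unknown;             // stand-in for the library's STTProvider
}

// Hypothetical helper mirroring the documented rules:
// 1. An external provider takes over and inputAudioTranscription has no effect.
// 2. Otherwise built-in transcription is used unless explicitly set to false.
function resolveTranscriptionSource(opts: TranscriptionOptions): TranscriptionSource {
  if (opts.sttProvider !== undefined) {
    return "external";
  }
  return opts.inputAudioTranscription === false ? "none" : "built-in";
}
```

Read this way, the only combination that yields no input transcription at all is an omitted sttProvider together with inputAudioTranscription set to false.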
subagentConfigs?
optional subagentConfigs: Record<string, SubagentConfig>
Defined in: core/voice-session.ts:43
Background subagent configs keyed by tool name.
transport?
optional transport: LLMTransport
Defined in: core/voice-session.ts:76
Pre-constructed LLM transport. If provided, apiKey/geminiModel/speechConfig/compressionConfig are ignored.
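Because these four options are silently ignored when a transport is supplied, it can help to detect them up front. The sketch below uses a local stand-in type rather than the real VoiceSessionConfig (apiKey is optional here only to keep the example small), and ignoredOptions is a hypothetical helper, not a library function.

```typescript
// The options this page says are ignored when `transport` is provided.
const TRANSPORT_OVERRIDDEN_KEYS = [
  "apiKey",
  "geminiModel",
  "speechConfig",
  "compressionConfig",
] as const;
type OverriddenKey = (typeof TRANSPORT_OVERRIDDEN_KEYS)[number];

// Local stand-in covering only the transport-related options.
interface TransportRelatedOptions {
  transport?: unknown; // stand-in for LLMTransport
  apiKey?: string;
  geminiModel?: string;
  speechConfig?: { voiceName?: string };
  compressionConfig?: { targetTokens: number; triggerTokens: number };
}

// Hypothetical helper: lists the options that are set but would be ignored.
function ignoredOptions(cfg: TransportRelatedOptions): OverriddenKey[] {
  if (cfg.transport === undefined) {
    return [];
  }
  return TRANSPORT_OVERRIDDEN_KEYS.filter((key) => cfg[key] !== undefined);
}
```

A caller could log the returned names as a warning so that, for example, a speechConfig that never takes effect does not go unnoticed.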
userId
userId: string
Defined in: core/voice-session.ts:35
User identifier (used for memory storage and history).
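Putting the required properties together, a configuration might look like the sketch below. The types are local stand-ins (AgentStub and ModelStub are hypothetical, and the assumption that agents are identified by a name field is inferred from initialAgent being described as an agent name); all values are placeholders, and the real MainAgent and LanguageModelV1 types are not reproduced here.

```typescript
// Local stand-ins for types defined elsewhere.
interface AgentStub {
  name: string; // assumption: agents are identified by a `name` field
}
type ModelStub = unknown; // placeholder for a Vercel AI SDK model instance

// Only the properties this page lists without the `optional` marker.
interface RequiredSessionFields {
  sessionId: string;
  userId: string;
  apiKey: string;
  agents: AgentStub[];
  initialAgent: string;
  port: number;
  model: ModelStub;
}

// Hypothetical check: initialAgent should name one of the configured agents.
function initialAgentIsKnown(cfg: RequiredSessionFields): boolean {
  return cfg.agents.some((agent) => agent.name === cfg.initialAgent);
}

// Placeholder values throughout; none of them come from the library.
const exampleConfig: RequiredSessionFields = {
  sessionId: "session-001",
  userId: "user-42",
  apiKey: "YOUR_GOOGLE_API_KEY",
  agents: [{ name: "main" }],
  initialAgent: "main",
  port: 9900,
  model: {},
};
```

Checking initialAgent against the agents list before starting a session is a cheap way to catch a misspelled agent name.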