bodhi-realtime-agent / VoiceSessionConfig
Interface: VoiceSessionConfig
Defined in: core/voice-session.ts:39
Configuration for creating a VoiceSession.
Properties
agents
agents: MainAgent[]
Defined in: core/voice-session.ts:47
All agents available in this session.
apiKey
apiKey: string
Defined in: core/voice-session.ts:45
Google API key for the Gemini Live API (used when no transport is provided).
artifactRegistry?
optional artifactRegistry: object
Defined in: core/voice-session.ts:101
Optional per-session artifact registry for cross-tool binary sharing (images, documents).
dispose()
dispose(): void
Returns
void
store()
store(base64, mimeType, description, source?, fileName?): string
Parameters
base64
string
mimeType
string
description
string
source?
string
fileName?
string
Returns
string
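The object shape above can be satisfied by a small in-memory implementation. The following is a minimal sketch, not the library's own registry. It assumes the string returned by store() is an identifier for the stored artifact, which this reference does not state. `ArtifactRegistryLike`, `createInMemoryRegistry`, the `size()` helper, and the `artifact-N` ID scheme are hypothetical names used only for illustration.

```typescript
// Hypothetical local type mirroring the documented artifactRegistry shape.
interface ArtifactRegistryLike {
  store(
    base64: string,
    mimeType: string,
    description: string,
    source?: string,
    fileName?: string,
  ): string;
  dispose(): void;
}

interface StoredArtifact {
  base64: string;
  mimeType: string;
  description: string;
  source?: string;
  fileName?: string;
}

// Assumption: store() returns an ID that other tools can use to refer to the artifact.
function createInMemoryRegistry(): ArtifactRegistryLike & { size(): number } {
  const artifacts = new Map<string, StoredArtifact>();
  let nextId = 1;
  return {
    store(
      base64: string,
      mimeType: string,
      description: string,
      source?: string,
      fileName?: string,
    ): string {
      const id = `artifact-${nextId++}`;
      artifacts.set(id, { base64, mimeType, description, source, fileName });
      return id;
    },
    // Drop every stored artifact, e.g. when the session ends.
    dispose(): void {
      artifacts.clear();
    },
    // Illustration-only helper, not part of the documented shape.
    size(): number {
      return artifacts.size;
    },
  };
}
```

Keeping the registry per session, as the description suggests, means dispose() can release all binary payloads at once when the session closes.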
artifactStore?
optional artifactStore: ArtifactStore
Defined in: core/voice-session.ts:93
When provided, agents/tools can persist artifacts (images, docs, etc.) via session.workspace.saveArtifact().
behaviors?
optional behaviors: BehaviorCategory[]
Defined in: core/voice-session.ts:82
Behavior categories for dynamic runtime tuning (speech speed, verbosity, etc.).
clientSender?
optional clientSender: SessionClientSender
Defined in: core/voice-session.ts:58
Sender for all output to the client. The server owns the socket: it feeds client input to the session via feedAudioFromClient / feedJsonFromClient and signals connection changes via notifyClientConnected / notifyClientDisconnected.
compressionConfig?
optional compressionConfig: object
Defined in: core/voice-session.ts:72
Context window compression thresholds.
targetTokens
targetTokens: number
triggerTokens
triggerTokens: number
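This reference gives only the two field names, so the following sketch rests on one plausible reading: compression starts once the context exceeds triggerTokens and aims to shrink it to roughly targetTokens, which implies targetTokens should be smaller than triggerTokens. That reading, the `CompressionConfigLike` type, and the `validateCompressionConfig` helper are assumptions for illustration, not library behavior.

```typescript
// Hypothetical local type mirroring the documented compressionConfig shape.
interface CompressionConfigLike {
  triggerTokens: number;
  targetTokens: number;
}

// Returns a list of problems; an empty list means the thresholds look consistent.
function validateCompressionConfig(c: CompressionConfigLike): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(c.triggerTokens) || c.triggerTokens <= 0) {
    problems.push("triggerTokens must be a positive integer");
  }
  if (!Number.isInteger(c.targetTokens) || c.targetTokens <= 0) {
    problems.push("targetTokens must be a positive integer");
  }
  // Assumption: compressing "down to" the target only makes sense below the trigger.
  if (c.targetTokens >= c.triggerTokens) {
    problems.push("targetTokens should be below triggerTokens");
  }
  return problems;
}
```

A check like this could run before the config is passed on, so that inverted thresholds are caught at startup rather than mid-session.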
conversationHistoryStore?
optional conversationHistoryStore: ConversationHistoryStore
Defined in: core/voice-session.ts:91
When provided, conversation items are persisted at turn boundaries and on session close.
geminiModel?
optional geminiModel: string
Defined in: core/voice-session.ts:66
LLM model name (e.g. "gemini-live-2.5-flash-preview").
hooks?
optional hooks: FrameworkHooks
Defined in: core/voice-session.ts:53
Lifecycle hooks for observability.
host?
optional host: string
Defined in: core/voice-session.ts:62
Host for the local client WebSocket server (legacy/local mode).
initialAgent
initialAgent: string
Defined in: core/voice-session.ts:49
Name of the agent to activate on start.
inputAudioTranscription?
optional inputAudioTranscription: boolean
Defined in: core/voice-session.ts:76
Enables server-side transcription of user audio input (default: true). Has no effect when sttProvider is set, because the built-in transcription is then disabled automatically. Set to false to disable input transcription for privacy or cost control.
listenTimeoutMs?
optional listenTimeoutMs: number
Defined in: core/voice-session.ts:64
Timeout, in milliseconds, for the local client WebSocket server to start listening (legacy/local mode).
memory?
optional memory: object
Defined in: core/voice-session.ts:84
Enables memory distillation, which extracts durable user facts from the conversation and persists them.
store
store: MemoryStore
Where to persist extracted facts.
turnFrequency?
optional turnFrequency: number
Extract every N turns (default: 5).
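The "every N turns" rule can be illustrated with a small predicate. This is a sketch only: it assumes extraction fires on completed turns that are exact multiples of turnFrequency, which is one natural reading of the description but is not confirmed here. `shouldExtractMemory` is a hypothetical helper name; the default of 5 comes from the description above.

```typescript
// Documented default for memory.turnFrequency.
const DEFAULT_TURN_FREQUENCY = 5;

// Assumption: extraction runs after turns 5, 10, 15, ... for the default frequency.
function shouldExtractMemory(
  completedTurns: number,
  turnFrequency: number = DEFAULT_TURN_FREQUENCY,
): boolean {
  if (turnFrequency <= 0) {
    throw new RangeError("turnFrequency must be positive");
  }
  return completedTurns > 0 && completedTurns % turnFrequency === 0;
}
```

A lower turnFrequency extracts facts sooner at the cost of more model calls; a higher one batches more conversation into each extraction.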
model
model: LanguageModelV1
Defined in: core/voice-session.ts:68
Vercel AI SDK model for subagent text generation.
port?
optional port: number
Defined in: core/voice-session.ts:60
Port for the local client WebSocket server (legacy/local mode).
sessionId
sessionId: string
Defined in: core/voice-session.ts:41
Unique session identifier.
speechConfig?
optional speechConfig: object
Defined in: core/voice-session.ts:70
Voice configuration for Gemini's speech output.
voiceName?
optional voiceName: string
sttProvider?
optional sttProvider: STTProvider
Defined in: core/voice-session.ts:80
External STT provider for user input transcription. When set, the transport's built-in transcription is automatically disabled. When omitted, the transport's built-in transcription is used.
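The interaction between sttProvider and inputAudioTranscription can be summarized as a small decision function. This sketch encodes the rules as documented: an external provider takes precedence, and otherwise inputAudioTranscription (default true) selects between built-in transcription and none. It reads "has no effect when sttProvider is set" as meaning the external provider is still used even if inputAudioTranscription is false. The type and function names are hypothetical.

```typescript
type TranscriptionSource = "external" | "built-in" | "none";

// Hypothetical local type holding only the two relevant options.
interface TranscriptionOptions {
  inputAudioTranscription?: boolean;
  sttProvider?: unknown; // stand-in for the library's STTProvider
}

function resolveTranscriptionSource(o: TranscriptionOptions): TranscriptionSource {
  // An external provider disables the transport's built-in transcription.
  if (o.sttProvider !== undefined) {
    return "external";
  }
  // Without a provider, the flag decides; it defaults to true.
  return (o.inputAudioTranscription ?? true) ? "built-in" : "none";
}
```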
subagentConfigs?
optional subagentConfigs: Record<string, SubagentConfig>
Defined in: core/voice-session.ts:51
Background subagent configs keyed by tool name.
transport?
optional transport: LLMTransport
Defined in: core/voice-session.ts:99
Pre-constructed LLM transport. If provided, apiKey, geminiModel, speechConfig, and compressionConfig are ignored.
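Because a supplied transport silently overrides four other fields, it can help to surface which configured values will not take effect. The sketch below lists them; the field names come from the description above, while `TransportRelatedOptions` and `ignoredFields` are hypothetical helpers and `unknown` stands in for the library's LLMTransport type.

```typescript
// Hypothetical local type holding only the transport-related options.
interface TransportRelatedOptions {
  transport?: unknown; // stand-in for LLMTransport
  apiKey?: string;
  geminiModel?: string;
  speechConfig?: { voiceName?: string };
  compressionConfig?: { triggerTokens: number; targetTokens: number };
}

// Fields documented as ignored when a transport is provided.
const FIELDS_IGNORED_WITH_TRANSPORT = [
  "apiKey",
  "geminiModel",
  "speechConfig",
  "compressionConfig",
] as const;

// Returns the names of fields that are set but will be ignored.
function ignoredFields(o: TransportRelatedOptions): string[] {
  if (o.transport === undefined) {
    return [];
  }
  return FIELDS_IGNORED_WITH_TRANSPORT.filter((key) => o[key] !== undefined);
}
```

Logging the result at startup is one way to avoid debugging a geminiModel or speechConfig change that never reaches the custom transport.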
ttsProvider?
optional ttsProvider: TTSProvider
Defined in: core/voice-session.ts:97
External TTS provider for speech synthesis. When set, the LLM is configured for text-mode responses. When omitted, the LLM's native audio generation is used (default).
userId
userId: string
Defined in: core/voice-session.ts:43
User identifier (used for memory storage and history).
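Putting the required properties together, a configuration might look like the sketch below. It uses a local stand-in type rather than the library's exported interface so that it is self-contained: `VoiceSessionConfigLike`, `buildExampleConfig`, and `initialAgentExists` are hypothetical, `unknown` stands in for LanguageModelV1, and the `name` field on agents is an assumption inferred from initialAgent being described as an agent name. The session and user IDs are made-up values.

```typescript
// Local stand-in limited to the fields used here.
interface VoiceSessionConfigLike {
  sessionId: string;
  userId: string;
  apiKey: string;
  agents: { name: string }[]; // stand-in for MainAgent[]; `name` is an assumption
  initialAgent: string;
  model: unknown; // stand-in for LanguageModelV1
  geminiModel?: string;
  speechConfig?: { voiceName?: string };
  inputAudioTranscription?: boolean;
}

function buildExampleConfig(apiKey: string, model: unknown): VoiceSessionConfigLike {
  return {
    sessionId: "session-001", // made-up identifier
    userId: "user-42", // made-up identifier
    apiKey,
    agents: [{ name: "main" }],
    initialAgent: "main",
    model,
    geminiModel: "gemini-live-2.5-flash-preview", // example value from the geminiModel description
    inputAudioTranscription: true, // documented default, shown explicitly
  };
}

// initialAgent is a name, so it should match one of the configured agents.
function initialAgentExists(c: VoiceSessionConfigLike): boolean {
  return c.agents.some((agent) => agent.name === c.initialAgent);
}
```

Only sessionId, userId, apiKey, agents, initialAgent, and model are listed without the optional marker; every other property can be omitted.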