Google GenAI SDK (Gemini Live)

This guide shows how to integrate the Telcoflow SDK with the Google GenAI SDK for bidirectional real-time audio streaming with Gemini’s native audio model.

Overview

The integration bridges two real-time streams:

  • Caller audio -> Gemini: Incoming phone audio is forwarded to a Gemini Live session
  • Gemini audio -> Caller: Gemini’s voice responses are sent back to the caller via send_audio()

Interruption handling is included: when Gemini reports that the caller is speaking over the model, the example clears the outgoing audio buffer so queued speech stops playing immediately.

Full Example

import asyncio
import os

from google import genai
from google.genai import types
from telcoflow_sdk import TelcoflowClient, TelcoflowClientConfig, ActiveCall
import telcoflow_sdk.events as events

gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
MODEL = "gemini-2.5-flash-native-audio-preview-12-2025"


async def start_gemini_session(call: ActiveCall):
    await call.answer()

    async with gemini_client.aio.live.connect(model=MODEL) as session:

        async def stream_to_gemini():
            # Forward each chunk of caller audio to the Gemini Live session.
            async for chunk in call.audio_stream():
                await session.send_realtime_input(
                    audio=types.Blob(
                        data=chunk, mime_type="audio/pcm;rate=24000"
                    )
                )

        async def receive_from_gemini():
            # In current google-genai releases, session.receive() yields the
            # messages of one model turn and then stops, so loop to keep
            # listening for later turns.
            while True:
                async for response in session.receive():
                    if content := response.server_content:
                        if content.interrupted:
                            # Caller spoke over the model: drop queued audio.
                            await call.clear_send_audio_buffer()
                        elif content.model_turn:
                            for part in content.model_turn.parts:
                                if part.inline_data:
                                    await call.send_audio(part.inline_data.data)

        # Run both directions concurrently.
        await asyncio.gather(stream_to_gemini(), receive_from_gemini())


async def main():
    config = TelcoflowClientConfig.sandbox(
        api_key=os.getenv("WSS_API_KEY"),
        connector_uuid=os.getenv("WSS_CONNECTOR_UUID"),
        sample_rate=24000,
    )

    async with TelcoflowClient(config) as client:

        @client.on(events.INCOMING_CALL)
        async def on_call(call: ActiveCall):
            await start_gemini_session(call)

        await client.run_forever()


if __name__ == "__main__":
    asyncio.run(main())

How It Works

Stream to Gemini

The stream_to_gemini() coroutine reads audio chunks from call.audio_stream() and forwards them to the Gemini Live session using send_realtime_input(). Each chunk is wrapped in a types.Blob whose MIME type (audio/pcm;rate=24000) declares raw PCM at 24 kHz.
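The sample rate appears in two places in the full example: the Telcoflow config and the MIME type string. A minimal sketch, assuming you want a single constant to drive both; SAMPLE_RATE_HZ, pcm_mime_type and AUDIO_MIME_TYPE are illustrative names, not part of either SDK.

```python
# Hypothetical helper: keep the MIME type in sync with the configured sample rate.
SAMPLE_RATE_HZ = 24000  # assumed to match sample_rate in TelcoflowClientConfig


def pcm_mime_type(rate_hz: int) -> str:
    """Build the MIME type string that declares raw PCM at a given sample rate."""
    if rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return f"audio/pcm;rate={rate_hz}"


AUDIO_MIME_TYPE = pcm_mime_type(SAMPLE_RATE_HZ)
```

AUDIO_MIME_TYPE could then be passed as mime_type when building the types.Blob, and SAMPLE_RATE_HZ as sample_rate in the config, so the two values cannot drift apart.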

Receive from Gemini

The receive_from_gemini() coroutine listens for Gemini responses:

  • Interruption: When content.interrupted is True, the caller has started speaking over the model. clear_send_audio_buffer() is called to immediately stop any queued audio.
  • Model audio: When content.model_turn contains inline_data, the raw audio bytes are sent to the caller via send_audio().
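The effect of the two branches can be seen without a live call. The sketch below uses a stand-in FakeCall class and a simplified event list, both hypothetical; it is not how the Telcoflow SDK buffers audio internally, only an illustration of the branch order used in receive_from_gemini().

```python
import asyncio
from collections import deque


class FakeCall:
    """Stand-in for ActiveCall: queues outgoing audio instead of playing it."""

    def __init__(self):
        self.buffer = deque()

    async def send_audio(self, data: bytes) -> None:
        self.buffer.append(data)

    async def clear_send_audio_buffer(self) -> None:
        self.buffer.clear()


async def handle_events(call, incoming):
    """Mirror the branch order above: check for interruption before model audio."""
    for kind, payload in incoming:
        if kind == "interrupted":
            await call.clear_send_audio_buffer()
        elif kind == "audio":
            await call.send_audio(payload)


async def demo():
    call = FakeCall()
    incoming = [
        ("audio", b"a"),
        ("audio", b"b"),
        ("interrupted", None),  # caller starts talking: queued audio is dropped
        ("audio", b"c"),        # audio from the model's next turn
    ]
    await handle_events(call, incoming)
    return list(call.buffer)


remaining = asyncio.run(demo())
```

After the run, only the chunk that arrived after the interruption is still queued.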

Concurrency

Both coroutines run concurrently via asyncio.gather(). This allows the system to simultaneously listen to the caller and send AI responses without blocking.
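One design point worth noting: asyncio.gather() waits for every coroutine, so if one direction ends (for example, if the caller audio stream finishes on hang-up, which is an assumption about call.audio_stream()), the other keeps running. A possible alternative, sketched here with the hypothetical helper run_until_first_done, is to stop both as soon as either one finishes.

```python
import asyncio


async def run_until_first_done(*coros):
    """Run coroutines concurrently; when one finishes, cancel the others."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks unwind; return_exceptions collects CancelledError.
    await asyncio.gather(*pending, return_exceptions=True)
    for task in done:
        task.result()  # re-raise any exception from the task that finished
    return len(done), len(pending)


async def short():
    # Stands in for a stream that ends, e.g. when the caller hangs up.
    await asyncio.sleep(0.01)


async def endless():
    # Stands in for a receive loop that would otherwise never return.
    while True:
        await asyncio.sleep(0.01)


finished, cancelled = asyncio.run(run_until_first_done(short(), endless()))
```

In the full example, this helper could take the place of the asyncio.gather() call if ending the Gemini session together with the call is the desired behavior.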

Environment Variables

Variable            Description
GEMINI_API_KEY      Google API key with Gemini API access
WSS_API_KEY         Telcoflow API key
WSS_CONNECTOR_UUID  Telcoflow connector UUID
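Because os.getenv() returns None for an unset variable, a missing key would only surface later as a connection or authentication error. A minimal sketch of a fail-fast check; missing_env_vars and REQUIRED_VARS are illustrative names, not part of either SDK.

```python
import os

REQUIRED_VARS = ("GEMINI_API_KEY", "WSS_API_KEY", "WSS_CONNECTOR_UUID")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling it at the top of main() and exiting with a message such as "Missing environment variables: ..." when the list is non-empty makes misconfiguration visible before any connection is attempted.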

Audio Format

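The example configures Telcoflow for 16-bit linear PCM, 24 kHz, mono, and declares the same format to Gemini through the MIME type audio/pcm;rate=24000. Gemini's native audio output is also 16-bit PCM at 24 kHz, so no transcoding is needed in either direction.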
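The format above fixes the relationship between bytes and time, which is useful when sizing buffers or estimating latency. A small worked sketch; the helper names are illustrative, and the 20 ms frame used in the usage note is an assumed example size, not a chunk size documented by either SDK.

```python
BYTES_PER_SAMPLE = 2  # 16-bit linear PCM
CHANNELS = 1          # mono


def pcm_bytes_for(duration_ms: int, rate_hz: int = 24000) -> int:
    """Number of bytes needed to hold duration_ms of audio."""
    return rate_hz * duration_ms // 1000 * BYTES_PER_SAMPLE * CHANNELS


def pcm_duration_ms(num_bytes: int, rate_hz: int = 24000) -> float:
    """Playback time, in milliseconds, of num_bytes of audio."""
    return num_bytes * 1000 / (rate_hz * BYTES_PER_SAMPLE * CHANNELS)
```

At 24 kHz, one second of audio is 48,000 bytes, so pcm_bytes_for(20) gives 960 bytes for a 20 ms frame and pcm_duration_ms(48000) gives 1000.0.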

Next Steps