Audio Streaming
Receiving Audio
Use the async iterator audio_stream() to receive incoming audio chunks from the caller. Each chunk it yields is a bytes object containing raw PCM audio data.
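As a minimal sketch of the receive loop, the example below consumes audio_stream() with `async for`. The `FakeCall` class and the `collect_audio` and `demo_receive` helpers are hypothetical stand-ins so the snippet runs on its own; the real SDK object that exposes audio_stream() is not shown here and may differ.

```python
import asyncio
from typing import AsyncIterator, List


class FakeCall:
    """Hypothetical stand-in for a call object; not the real SDK class."""

    def __init__(self, chunks: List[bytes]) -> None:
        self._chunks = chunks

    async def audio_stream(self) -> AsyncIterator[bytes]:
        # Yield each chunk as raw PCM bytes, mimicking the documented iterator.
        for chunk in self._chunks:
            await asyncio.sleep(0)  # hand control back to the event loop
            yield chunk


async def collect_audio(call) -> bytes:
    """Consume audio_stream() and concatenate the raw PCM chunks."""
    received = bytearray()
    async for audio_chunk in call.audio_stream():
        received.extend(audio_chunk)
    return bytes(received)


def demo_receive() -> bytes:
    # Two chunks of silence, 960 bytes each (20 ms at 48,000 bytes per second).
    call = FakeCall([b"\x00" * 960, b"\x00" * 960])
    return asyncio.run(collect_audio(call))
```

In a real application the loop body would typically forward each chunk to a speech-to-text or AI model rather than accumulate it in memory.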
Sending Audio
Use send_audio() to queue audio data for playback to the caller.
Queued audio is placed in an internal ConcurrentByteBuffer. The SDK sends data to the server only when the server requests it (pull-based flow control), which keeps playback smooth and avoids jitter.
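To illustrate pull-based flow control, here is a toy buffer in which the producer writes at its own pace and data leaves only when a consumer asks for it. `PullBuffer` and `demo_pull` are hypothetical names for this sketch; it is not the SDK's ConcurrentByteBuffer, whose actual interface is not documented in this section.

```python
import threading
from typing import List


class PullBuffer:
    """Toy thread-safe byte buffer; illustrative only, not the SDK class."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._lock = threading.Lock()

    def write(self, audio: bytes) -> None:
        # Producer side: roughly what queuing audio for playback amounts to.
        with self._lock:
            self._data.extend(audio)

    def read(self, max_bytes: int) -> bytes:
        # Consumer side: invoked only when the server requests more audio.
        with self._lock:
            chunk = bytes(self._data[:max_bytes])
            del self._data[:max_bytes]
            return chunk

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)


def demo_pull() -> List[int]:
    buf = PullBuffer()
    buf.write(b"\x01" * 2400)  # the application queues 50 ms of audio at once
    sizes = []
    for _ in range(3):  # the "server" asks for up to 20 ms, three times
        sizes.append(len(buf.read(960)))
    return sizes
```

Because the consumer sets the pace, the producer can write audio in bursts (for example, as fast as a TTS engine generates it) without flooding the connection.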
Audio Format
All audio sent to and received from Telcoflow must be raw PCM with 16-bit signed integer samples, in little-endian byte order, at 24,000 samples per second.
This produces 48,000 bytes per second of audio (24,000 samples × 2 bytes per sample), which corresponds to a single (mono) channel.
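The arithmetic above can be turned into small helpers for sizing chunks and buffers. This is a sketch that assumes a single channel, as the 48,000 bytes per second figure implies; the constant and function names are illustrative and are not part of the SDK.

```python
import struct

SAMPLE_RATE_HZ = 24_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit signed integer
BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE


def bytes_for_duration(milliseconds: int) -> int:
    """Number of PCM bytes needed to hold the given duration of audio."""
    return BYTES_PER_SECOND * milliseconds // 1000


def duration_ms(num_bytes: int) -> float:
    """Playback duration, in milliseconds, of a PCM byte string of this length."""
    return num_bytes * 1000 / BYTES_PER_SECOND


def encode_sample(value: int) -> bytes:
    """Pack one sample as a 16-bit signed little-endian integer."""
    return struct.pack("<h", value)
```

For example, `bytes_for_duration(20)` gives 960, so a 20 ms frame occupies 960 bytes, and `encode_sample(1)` yields `b"\x01\x00"` because the low-order byte comes first in little-endian order.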
This format is consistent across:
- Telcoflow media connections (both send and receive)
- Google Gemini native audio
- Deepgram speech-to-text input
No headers, containers, or codecs are involved. The audio data is raw PCM bytes. If you are generating audio from a TTS engine or other source, make sure to strip any file headers (e.g., WAV headers) before sending.
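One way to strip a WAV header is Python's standard `wave` module, which parses the container and returns only the sample frames. The sketch below also rejects audio that does not match the documented format; the mono check is an assumption derived from the byte rate, and `wav_to_raw_pcm` and `make_test_wav` are illustrative helper names, not SDK functions.

```python
import io
import wave


def wav_to_raw_pcm(wav_bytes: bytes) -> bytes:
    """Return the PCM frames from an in-memory WAV file, dropping the header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Reject audio that does not match the expected format.
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != 24_000:
            raise ValueError("expected a 24,000 Hz sample rate")
        if wav.getnchannels() != 1:  # assumption: single channel
            raise ValueError("expected mono audio")
        return wav.readframes(wav.getnframes())


def make_test_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container (stands in for TTS output)."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(24_000)
        wav.writeframes(pcm)
    return out.getvalue()
```

Audio at a different sample rate or channel count needs to be resampled or downmixed before sending; dropping the header alone does not convert it.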
Buffer Management
Why Use a Buffer?
- Smooth Playback - Prevents audio jitter by maintaining a steady supply of data for the server
- Flow Control - Automatically handles the rate at which audio should be sent
- Interruption Handling - If your AI model gets interrupted (e.g., via a Gemini Live interruption event), you can instantly clear the buffer to stop any pending audio from being played
Configuring Buffer Size
The public 0.24.0 docs emphasize the factory helpers for client setup.
Handling Interruptions
When your AI model is interrupted (e.g., the caller starts talking while the AI is responding), clear the audio buffer to immediately stop playback.
This prevents stale audio from playing after the model has already moved on to a new response. Both the Google GenAI SDK and Google ADK surface interruption events that you can use to trigger this.
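The pattern can be sketched as an event loop that queues model audio and flushes the queue when an interruption arrives. Everything here is hypothetical: `PlaybackBuffer`, its `clear()` method, and the `("audio", ...)` / `("interrupted", ...)` event tuples are placeholders for whatever the SDK and your model client actually expose.

```python
import asyncio


class PlaybackBuffer:
    """Hypothetical stand-in for the SDK's outgoing audio buffer."""

    def __init__(self) -> None:
        self._pending = bytearray()

    def write(self, audio: bytes) -> None:
        self._pending.extend(audio)

    def clear(self) -> None:
        # Drop everything that has not been pulled by the server yet.
        self._pending.clear()

    def __len__(self) -> int:
        return len(self._pending)


async def relay_model_events(events, buffer: PlaybackBuffer) -> None:
    """Queue model audio, and flush the queue when an interruption arrives."""
    async for kind, payload in events:
        if kind == "audio":
            buffer.write(payload)
        elif kind == "interrupted":
            buffer.clear()


async def fake_model_events():
    # Hypothetical event tuples; real clients expose their own event objects.
    yield ("audio", b"\x00" * 960)
    yield ("audio", b"\x00" * 960)
    yield ("interrupted", b"")
    yield ("audio", b"\x01" * 480)  # start of the new response


def demo_interrupt() -> int:
    buffer = PlaybackBuffer()
    asyncio.run(relay_model_events(fake_model_events(), buffer))
    return len(buffer)
```

In this run the two chunks queued before the interruption are discarded, so only the 480 bytes of the new response remain in the buffer.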
Next Steps
- Call Commands - Commands that control the call
- Event Handling - Listen for call events
- Google GenAI Integration - Full bidirectional audio example with Gemini
- Google ADK Integration - Multi-agent audio pipeline with ADK
