Audio Streaming

Receiving Audio

Use the async iterator audio_stream() to receive incoming audio chunks from the caller:

async for audio_chunk in call.audio_stream():
    await process(audio_chunk)

Each audio_chunk is a bytes object containing raw PCM audio data.
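As an illustration of working with a received chunk, the sketch below decodes the bytes into signed 16-bit samples and computes an RMS level, which could feed a simple silence check. It assumes the 16-bit little-endian mono layout described under Audio Format; the helper name chunk_rms is hypothetical and not part of the SDK.

```python
import math
import struct


def chunk_rms(audio_chunk: bytes) -> float:
    """Return the RMS amplitude of a 16-bit little-endian mono PCM chunk.

    Hypothetical helper for illustration; not part of the Telcoflow SDK.
    """
    sample_count = len(audio_chunk) // 2  # 2 bytes per sample
    if sample_count == 0:
        return 0.0
    # "<" = little-endian, "h" = signed 16-bit integer; ignore a trailing odd byte
    samples = struct.unpack(f"<{sample_count}h", audio_chunk[: sample_count * 2])
    return math.sqrt(sum(s * s for s in samples) / sample_count)
```

Inside the receive loop, a call such as chunk_rms(audio_chunk) could be compared against a threshold of your choosing before the chunk is passed on.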

Sending Audio

Use send_audio() to queue audio data for playback to the caller:

await call.send_audio(pcm_audio_bytes)

Audio is placed in an internal ConcurrentByteBuffer. The SDK sends data to the server only when the server requests it (pull-based flow control). This ensures smooth playback without jitter.
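To make the pull-based idea concrete, here is a minimal toy model: the producer pushes bytes whenever it has them, and the consumer takes only as many bytes as it asks for. The PullBuffer class and its methods are hypothetical and written for illustration only; they are not the SDK's ConcurrentByteBuffer, whose internals are not shown in this guide.

```python
import threading


class PullBuffer:
    """Toy thread-safe byte buffer; a stand-in, not the SDK's implementation."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._lock = threading.Lock()

    def push(self, chunk: bytes) -> None:
        """Producer side: queue audio without sending anything yet."""
        with self._lock:
            self._data.extend(chunk)

    def pull(self, max_bytes: int) -> bytes:
        """Consumer side: hand over at most max_bytes when data is requested."""
        with self._lock:
            out = bytes(self._data[:max_bytes])
            del self._data[:max_bytes]
            return out

    def size(self) -> int:
        """Number of bytes still queued."""
        with self._lock:
            return len(self._data)

    def clear(self) -> None:
        """Drop everything that has not been pulled yet."""
        with self._lock:
            self._data.clear()
```

In this model the consumer sets the pace, so the producer can queue audio faster than real time without flooding the receiving side.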

Audio Format

All audio sent to and received from Telcoflow must use the following format:

| Property    | Value                      |
|-------------|----------------------------|
| Encoding    | PCM (raw, uncompressed)    |
| Bit Depth   | 16-bit (signed integer)    |
| Byte Order  | Little-endian              |
| Channels    | Mono (single channel)      |
| Sample Rate | 24,000 Hz (24 kHz)         |
| MIME Type   | audio/pcm;rate=24000       |

This means each audio sample is a 16-bit signed integer in little-endian byte order, producing 48,000 bytes per second of audio (24,000 samples per second × 2 bytes per sample).
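The same arithmetic can be written as small helpers for converting between byte counts and durations. The constant and function names below are illustrative, not SDK identifiers.

```python
SAMPLE_RATE_HZ = 24_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 1              # mono
BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def pcm_duration_ms(num_bytes: int) -> float:
    """Playback duration, in milliseconds, of num_bytes of audio in this format."""
    return num_bytes * 1000 / BYTES_PER_SECOND


def pcm_bytes_for_ms(duration_ms: int) -> int:
    """Number of bytes needed to hold duration_ms of audio (whole samples only)."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS
```

For example, a 20 ms frame works out to 480 samples, or 960 bytes.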

This format is consistent across:

  • Telcoflow media connections (both send and receive)
  • Google Gemini native audio
  • Deepgram speech-to-text input

No headers, containers, or codecs are involved. The audio data is raw PCM bytes. If you are generating audio from a TTS engine or other source, make sure to strip any file headers (e.g., WAV headers) before sending.
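One way to strip a WAV header is Python's standard wave module, which parses the container and returns only the sample frames. The sketch below is a minimal example that also rejects files not matching the format above; the function name wav_to_raw_pcm is hypothetical, and resampling or channel conversion is out of scope here.

```python
import io
import wave


def wav_to_raw_pcm(wav_bytes: bytes) -> bytes:
    """Return the raw PCM frames from an in-memory WAV file.

    Raises ValueError unless the file is 16-bit, mono, 24 kHz.
    Hypothetical helper; not part of the Telcoflow SDK.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if (
            wav.getnchannels() != 1
            or wav.getsampwidth() != 2
            or wav.getframerate() != 24_000
        ):
            raise ValueError("expected 16-bit mono PCM at 24,000 Hz")
        # readframes() returns sample data only, with no header bytes
        return wav.readframes(wav.getnframes())
```

The returned bytes can then be passed to send_audio() as shown under Sending Audio.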

Buffer Management

# Check buffer size
size = call.get_send_audio_buffer_size()

# Clear all queued audio (for interruption handling)
await call.clear_send_audio_buffer()

Why Use a Buffer?

  • Smooth Playback - Prevents audio jitter by maintaining a steady supply of data for the server
  • Flow Control - Automatically handles the rate at which audio should be sent
  • Interruption Handling - If your AI model gets interrupted (e.g., via a Gemini Live interruption event), you can instantly clear the buffer to stop any pending audio from being played
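If you generate audio much faster than real time, you may want to cap how much is queued so that less audio has to be discarded on an interruption. The sketch below polls the buffer size before each send. It assumes that get_send_audio_buffer_size() is synchronous and returns a byte count, which this guide does not state explicitly; the threshold, the polling interval, and the function name send_with_backpressure are illustrative choices.

```python
import asyncio

# Assumption: about one second of audio at 48,000 bytes per second
MAX_QUEUED_BYTES = 48_000


async def send_with_backpressure(call, chunks, poll_interval_s: float = 0.02) -> None:
    """Queue chunks one by one, pausing while the send buffer is at the cap.

    Assumes call.get_send_audio_buffer_size() returns the queued size in bytes.
    """
    for chunk in chunks:
        # Wait until playback has drained the buffer below the cap
        while call.get_send_audio_buffer_size() >= MAX_QUEUED_BYTES:
            await asyncio.sleep(poll_interval_s)
        await call.send_audio(chunk)
```

A smaller cap reduces wasted audio after an interruption, while a larger cap gives more protection against gaps if your audio source stalls.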

Configuring Buffer Size

The public 0.24.0 docs emphasize factory helpers such as TelcoflowClientConfig.sandbox() for client setup:

config = TelcoflowClientConfig.sandbox(
    api_key="YOUR_API_KEY",
    connector_uuid="YOUR_APP_UUID",
    sample_rate=24000,
)

Handling Interruptions

When your AI model is interrupted (e.g., the caller starts talking while the AI is responding), clear the audio buffer to immediately stop playback:

if event.interrupted:
    await call.clear_send_audio_buffer()
    # Now safe to start sending new audio

This prevents stale audio from playing after the model has already moved on to a new response. Both the Google GenAI SDK and Google ADK surface interruption events that you can use to trigger this.
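A fuller handler might look like the sketch below, which clears the buffer on an interruption and otherwise forwards any audio the event carries. The event shape is an assumption: the interrupted attribute follows the snippet above, while the audio attribute is hypothetical and will differ depending on which model SDK you use.

```python
import asyncio


async def handle_model_event(call, event) -> None:
    """Route one model event to the call.

    Assumes event.interrupted (bool) and a hypothetical event.audio (bytes or None).
    """
    if getattr(event, "interrupted", False):
        # Drop queued audio from the response that was cut off
        await call.clear_send_audio_buffer()
        return
    audio = getattr(event, "audio", None)  # hypothetical attribute
    if audio:
        await call.send_audio(audio)
```

Returning right after the clear keeps any audio attached to the interrupted event from being queued again.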

Next Steps