Audio-Only

Audio-only recordings capture just the audio from your calls, with no video. They are ideal for podcasts, voice calls, transcription, meeting minutes, and any scenario where video isn't needed or you want to reduce storage costs.

You can request audio-only recordings in three formats:

  • Composite: Delivers one file with the audio from all participants mixed together
  • Individual: Delivers one audio file for each participant in the call
  • Raw: Delivers one archive containing audio-only packets and metadata, which you then process yourself

Quickstart

// Option 1: Composite audio (single mixed MP3)
await call.update({
  settings_override: {
    recording: { mode: "available", audio_only: true },
  },
});
// Triggers: call.recording_started → call.recording_stopped → call.recording_ready
await call.startRecording("composite");
await call.stopRecording("composite");

// Option 2: Individual audio (separate file per participant)
await call.update({
  settings_override: {
    individual_recording: {
      mode: "available",
      output_types: ["audio_only", "screenshare_audio_only"],
    },
  },
});
// Triggers: call.recording_started → call.recording_stopped → call.recording_ready (one per file)
await call.startRecording("individual");
await call.stopRecording("individual");

// Option 3: Raw audio (unprocessed, requires CLI post-processing)
await call.update({
  settings_override: {
    raw_recording: { mode: "available", audio_only: true },
  },
});
// Triggers: call.recording_started → call.recording_stopped → call.recording_ready
await call.startRecording("raw");
await call.stopRecording("raw");

// List all recordings for this call
const response = await call.listRecordings();

// Download using the URL from response.recordings or the call.recording_ready webhook payload
// response.recordings[0].url
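
As a rough sketch of the download step, the helper below collects download URLs from a `listRecordings()`-style response. It relies only on the `recordings[].url` shape shown in the quickstart. The `filename` field, the helper name `recordingUrls`, and the mock data are assumptions for illustration.

```javascript
// Sketch only: `url` is the field the quickstart reads; `filename` is an assumed field.
function recordingUrls(response) {
  const recordings = (response && response.recordings) || [];
  return recordings
    // Skip entries that do not carry a usable URL.
    .filter((rec) => typeof rec.url === "string" && rec.url !== "")
    .map((rec) => ({ filename: rec.filename ?? null, url: rec.url }));
}

// Mock data standing in for `await call.listRecordings()`.
const mockResponse = {
  recordings: [
    { filename: "mixed.mp3", url: "https://example.com/mixed.mp3" },
    { filename: "pending.mp3" }, // no URL, so it is skipped
  ],
};

console.log(recordingUrls(mockResponse));
```

Filtering on `url` keeps the caller from attempting a download for an entry that has nothing to fetch.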

Each recording type supports audio-only output. Here is an overview of the trade-offs:

Method     | Output                                              | Best for
Composite  | Single mixed audio file (all participants combined) | Ready-to-share podcasts, meeting recordings
Individual | Separate audio file per participant                 | Post-production editing, speaker isolation
Raw        | Unprocessed RTP packets                             | Maximum flexibility, lowest cost, custom encoding

Each recording type requires specific settings to enable audio-only mode. See the sections below for configuration details.
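
The three `settings_override` payloads from the quickstart can be gathered into one small helper, which may be convenient if the recording type is chosen at runtime. This is a minimal sketch: `audioOnlySettings` is a hypothetical helper name, not part of the SDK, and the payloads are copied from the quickstart.

```javascript
// Hypothetical helper: returns the settings_override payload for a recording type.
function audioOnlySettings(type) {
  switch (type) {
    case "composite":
      return { recording: { mode: "available", audio_only: true } };
    case "individual":
      return {
        individual_recording: {
          mode: "available",
          output_types: ["audio_only", "screenshare_audio_only"],
        },
      };
    case "raw":
      return { raw_recording: { mode: "available", audio_only: true } };
    default:
      throw new Error(`Unknown recording type: ${type}`);
  }
}

// Intended use (not executed here, since it needs a real call object):
//   await call.update({ settings_override: audioOnlySettings("raw") });
console.log(JSON.stringify(audioOnlySettings("raw")));
```

Throwing on an unknown type surfaces a typo before any request is sent.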

Composite audio

This section covers audio-only configuration. For complete documentation, see the Composite recording guide.

Composite audio (also known as mixed audio) produces a single MP3 file with all participants mixed together. This is the simplest option: you get one ready-to-use file with no post-processing required.

Use cases: Meeting recordings, webinar audio, podcast distribution, simple transcription

How to enable: Set audio_only: true in your composite recording settings.

// Enable audio-only composite recording
await call.update({
  settings_override: {
    recording: {
      mode: "available",
      audio_only: true,
    },
  },
});

// Start recording (after the settings update has completed)
// Triggers webhook event: call.recording_started
await call.startRecording("composite");

// When stopped: call.recording_stopped
// When ready: call.recording_ready (contains recording URL)

Output: Single MP3 file
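
The three webhook events named above can be mapped to a simple status on the receiving side. The sketch below assumes the webhook body has a `type` field and that the ready event carries the URL at `call_recording.url`. Both field paths are assumptions here and should be checked against the actual payload, and `handleRecordingEvent` is a hypothetical name.

```javascript
// Sketch: translate a recording webhook event into a status object.
// Assumed payload shape: { type, call_recording: { url } }.
function handleRecordingEvent(event) {
  switch (event.type) {
    case "call.recording_started":
      return { status: "recording", url: null };
    case "call.recording_stopped":
      return { status: "processing", url: null };
    case "call.recording_ready":
      // Only the ready event is expected to contain the recording URL.
      return { status: "ready", url: event.call_recording?.url ?? null };
    default:
      return null; // not a recording event
  }
}

const readyEvent = {
  type: "call.recording_ready",
  call_recording: { url: "https://example.com/mixed.mp3" },
};
console.log(handleRecordingEvent(readyEvent));
```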

Individual audio tracks

This section covers audio-only configuration. For complete documentation, see the Individual recording guide.

Individual audio recording captures each participant's audio separately. This gives you isolated tracks per speaker, which is essential for professional post-production where you need to edit, mix, or process each participant independently.

Use cases: Podcast editing with per-speaker control, speaker diarization, noise reduction per participant, creating highlight clips

How to enable: Set output_types to include audio_only (and optionally screenshare_audio_only) in your individual recording settings.

// Enable individual audio-only recording
await call.update({
  settings_override: {
    individual_recording: {
      mode: "available",
      output_types: ["audio_only", "screenshare_audio_only"],
    },
  },
});

// Start recording (after the settings update has completed)
// Triggers webhook event: call.recording_started
await call.startRecording("individual");

// When stopped: call.recording_stopped
// When ready: call.recording_ready (one event per file, contains recording URL)

Output: One MKV audio file per participant
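
Because individual recording emits one `call.recording_ready` event per file, a receiver usually has to accumulate URLs instead of handling a single event. The class below is a minimal sketch of that bookkeeping. The `call_cid` and `call_recording.url` fields are assumed payload fields, and `IndividualRecordingCollector` is a hypothetical name.

```javascript
// Sketch: accumulate per-participant file URLs across several ready events.
// Assumed payload shape: { type, call_cid, call_recording: { url } }.
class IndividualRecordingCollector {
  constructor() {
    this.urlsByCall = new Map();
  }

  // Records the event's URL (if any) and returns all URLs seen so far for that call.
  add(event) {
    const urls = this.urlsByCall.get(event.call_cid) ?? [];
    const url = event.call_recording?.url;
    if (event.type === "call.recording_ready" && url && !urls.includes(url)) {
      urls.push(url); // de-duplicate in case the same event is delivered twice
      this.urlsByCall.set(event.call_cid, urls);
    }
    return urls;
  }
}

const collector = new IndividualRecordingCollector();
collector.add({ type: "call.recording_ready", call_cid: "default:demo", call_recording: { url: "https://example.com/alice.mkv" } });
const collected = collector.add({ type: "call.recording_ready", call_cid: "default:demo", call_recording: { url: "https://example.com/bob.mkv" } });
console.log(collected);
```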

Raw audio

This section covers audio extraction from raw recordings. For complete documentation, see the Raw recording guide.

Raw recording captures unprocessed RTP audio packets, giving you maximum flexibility for custom post-processing. This is the most cost-effective option but requires using the Stream CLI to extract playable audio files.

Use cases: Cost-effective archival, custom encoding pipelines, compliance requirements, generating MP3 or other formats

How to enable: Enable raw recording with audio_only: true, then use the Stream CLI to extract and mix audio tracks.

// Enable raw recording with audio only
await call.update({
  settings_override: {
    raw_recording: {
      mode: "available",
      audio_only: true,
    },
  },
});

// Start recording (after the settings update has completed)
// Triggers webhook event: call.recording_started
await call.startRecording("raw");

// When stopped: call.recording_stopped
// When ready: call.recording_ready (contains archive URL)

After the call, use the Stream CLI to extract audio:

# Extract individual audio tracks per participant (outputs MKV files)
stream-cli video raw-recording extract-audio --input-file recording.tar.gz --output ./audio

# Mix all participants into a single MP3 file
stream-cli video raw-recording mix-audio --input-file recording.tar.gz --output ./mixed

You can convert extracted MKV files to other formats using FFmpeg:

ffmpeg -i individual_audio.mkv -codec:a libmp3lame -qscale:a 2 output.mp3
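
When several MKV files need converting, the FFmpeg arguments above can be generated per file. The sketch below only builds the argument list, using the same flags as the command above. `ffmpegMp3Args` is a hypothetical helper name, and the file paths are made up for illustration.

```javascript
// Sketch: build the argument list for the FFmpeg command shown above.
function ffmpegMp3Args(inputMkv) {
  // Swap a trailing .mkv extension for .mp3; other names just get .mp3 appended.
  const output = inputMkv.replace(/\.mkv$/i, "") + ".mp3";
  return ["-i", inputMkv, "-codec:a", "libmp3lame", "-qscale:a", "2", output];
}

const inputs = ["audio/alice.mkv", "audio/bob.mkv"];
for (const input of inputs) {
  console.log(["ffmpeg", ...ffmpegMp3Args(input)].join(" "));
}
```

Each list could then be passed to `execFile("ffmpeg", args)` from Node's `child_process` module, which avoids shell quoting issues with unusual file names.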

Output:

  • extract-audio → MKV audio files per participant
  • mix-audio → Single MP3 file with all participants mixed

Choosing the right option

Use case                         | Recommended             | Why
Meeting recording for playback   | Composite audio         | Single file, ready to share
Podcast with multiple hosts      | Individual audio        | Edit each speaker separately
Transcription service            | Composite or Individual | Composite for simplicity, Individual for speaker attribution
Archival with future flexibility | Raw audio               | Lowest cost, can generate any format later
Custom MP3 encoding              | Raw audio               | Full control over encoding settings
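
The recommendations above can be reduced to a small decision function. This is a simplified sketch under stated assumptions: the option names `willPostProcess` and `needsSpeakerIsolation` are hypothetical, and real requirements may involve more factors than these two flags.

```javascript
// Sketch: pick a recording type from two hypothetical requirement flags.
function recommendRecordingType({ willPostProcess = false, needsSpeakerIsolation = false } = {}) {
  // Raw suits archival and custom encoding, but requires CLI post-processing.
  if (willPostProcess) return "raw";
  // Individual gives one track per participant for editing or speaker attribution.
  if (needsSpeakerIsolation) return "individual";
  // Composite is the ready-to-share default.
  return "composite";
}

console.log(recommendRecordingType({ needsSpeakerIsolation: true }));
```

The returned string matches the values passed to `call.startRecording` in the quickstart.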