Voice AI | Now in Early Access

Voice AI
Accelerated.

Skill, prompt, agent.
Production-grade voice agents on global edge. Pick any STT, LLM and TTS. Connect your data over MCP. Ship to phone, web, mobile and video.

$ npx skills add GetStream/agent-skills -s stream

Join the waitlist | Try the live demo

~240 ms

P50 voice round-trip

60+

Edge locations

99.999%

Uptime SLA

Incoming call | Reservation

Bella Notte

+1 (415) 555‑0142

Caller

Hi, I'd like to book a table for four on Friday at 7.

Assistant

Got it: Friday, 7:00 PM, party of four. Anything else I can add?

listening...

STT: inworld-stt, deepgram-nova-3, assemblyai-universal

LLM: gemini-3.5-flash, gpt-5, claude-4.7-sonnet

TTS: inworld-tts, elevenlabs-v3, cartesia-sonic-2

Round-trip 238 ms

Edge: us-west-1, us-east-1, eu-west-1

MCP: reservations-db

Build voice agents for

Restaurant bookings, Hotel concierges, Telehealth intake, Customer support, Sales outreach, Banking & insurance, Field service dispatch, Recruiting screens, Insurance claims

01 Install the skill

Drop the Voice AI skill into your project.

$ npx skills add GetStream/agent-skills -s stream

Claude Code | Cursor | Codex

02 Prompt

Tell your agent what to do, in plain English.

"I'd like a voice agent attached to a new phone number in the US to handle my restaurant bookings. Set up an agent that can handle calls and manage reservations."

Agent scaffolded | agent.py generated | phone number provisioned

R1 Result | Live demo

See the agent take a call.

A simulated reservation call that mirrors a deployed Stream agent: watch it greet the caller, capture the booking, and confirm by SMS in real time.

Ready

Demo | agent-bella-01

Incoming call

+1 (415) 555-0142

00:00 | us-west-1 edge

Real-time metrics

healthy

Round-trip

STT first byte

LLM first tok.

TTS first byte

Pipeline: STT -> LLM -> TTS | co-located

Reservation captured

Name: -

Party: -

Date: -

Notes: -

Status: -

R2 Result | Generated agent

One file. One function. Production-ready.

Scaffold your agent with the Stream skills: hosted STT, LLM and TTS models, or bring your own, and deploy to Stream's infra.

agent.py | python | ~12 lines

async def create_agent(**kwargs) -> Agent:
    return Agent(
        edge=stream.Edge(),  # low-latency edge: React, iOS, Android, RN, Flutter
        agent_user=User(name="Assistant", id="agent"),
        instructions="You're a helpful voice assistant. Be concise.",
        realtime=stream.Realtime(
            model="models/gemini-3.5-flash",
            stt="models/inworld-stt",
            tts="models/inworld-tts",
            number="+1-800-my-number",
        ),
    )

R3 Why it works well

Built on the same infrastructure as the rest of Stream.

Same edge, same SDKs, same dashboard you already trust for chat and video.

Global edge

Sessions route to the closest data center automatically. 60+ locations, sub-300 ms anywhere.

Any STT / LLM / TTS

Inworld, Deepgram, AssemblyAI, OpenAI, Gemini, Claude, ElevenLabs, Cartesia. Swap providers in a line.

MCP & RAG built-in

Plug your CRM, calendar, knowledge base or Postgres straight into the agent over MCP.

Voice anywhere

Phone numbers, browser, iOS, Android, video rooms: one agent, every channel, one transcript.

Smart co-location

STT, LLM and TTS run in the same region when the provider supports it. Fewer hops, lower P50.

HIPAA, SOC 2, GDPR

Telehealth-ready compliance posture. Per-region data residency on by default.

R4 Observability

API & CLI driven. Beautiful monitoring out of the box.

Every call records latency, transcripts, model choice and cost. Replay any session, jump to the exact turn, and ship a fix without a debugger.

agent-bella-01 | us-west-1

Calls | Latency | Quality | Cost

Last 24h | Export

Calls today: 1,284 (+12.4%)

Avg duration: 1m 47s (-4.1%)

P50 round-trip: 238 ms (-18 ms)

Resolution rate: 92.3% (+1.1%)

End-to-end latency

238 ms P50

STT | LLM | TTS | Edge

Recent calls

Call ID | Caller | Duration | Round-trip
c_8aF2 | +1 415 555 0142 | 02:14 | 218 ms
c_8aE9 | +1 212 555 0188 | 01:08 | 244 ms
c_8aDc | +44 20 7946 0721 | 03:47 | 312 ms
c_8aCa | +1 510 555 0103 | 00:42 | 201 ms
c_8aB6 | +1 415 555 0142 | 04:31 | 229 ms
c_8aA1 | +1 646 555 0192 | 02:02 | 246 ms

REST + gRPC: Every metric available as API

Slack / PagerDuty: Alert on P95 latency drift

Replays: Step through any turn in a call
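The alerting line above mentions P95 latency drift. As a rough illustration (not Stream's implementation), the sketch below computes P50 and a nearest-rank P95 from the six round-trip values in the Recent calls sample and applies a hypothetical drift rule. The names `percentile_nearest_rank` and `p95_drifted`, the baseline and the tolerance are all assumptions made for this example.

```python
import math
import statistics

# Round-trip latencies (ms) copied from the "Recent calls" sample on this page.
latencies_ms = [218, 244, 312, 201, 229, 246]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def p95_drifted(values, baseline_ms, tolerance_ms):
    """Hypothetical alert rule: True when P95 exceeds the baseline by more than the tolerance."""
    return percentile_nearest_rank(values, 95) - baseline_ms > tolerance_ms


p50 = statistics.median(latencies_ms)
p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")  # P50 = 236.5 ms, P95 = 312 ms
```

Six calls are far too few for a stable P95, and the sample median differs from the 238 ms dashboard figure, which covers the last 24 hours.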

Verifying it works

Ship voice agents you can trust in production.

Scenario tests, guardrails and replays, the same way you ship the rest of your backend.

QA Testing scripts

Reservations | 12 scenarios

11 / 12 passing

Books a table for 2 on a weekday | 1.8s
Handles 'do you have anything earlier?' | 2.1s
Asks for dietary requirements | 1.4s
Confirms via SMS to the caller's number | 0.9s
Politely declines off-menu requests | 1.2s
Escalates when overbooked | 1.6s
Handles caller switching to Spanish (flaky) | 2.4s

Run on every PR | GitHub Actions | 11 / 12 | 91.7%
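To make the scenario-suite idea concrete, here is a minimal sketch of a runner that retries scenarios marked as flaky and prints a summary in the same "passed / total | rate" shape as the line above. It does not use any Stream API: `Scenario`, `run_suite`, `summary` and the stub checks are hypothetical stand-ins for real calls against an agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    name: str
    check: Callable[[], bool]  # returns True when the agent behaved as expected
    retries: int = 0           # extra attempts allowed for known-flaky scenarios


def run_suite(scenarios):
    """Run each scenario, retrying flaky ones, and return (passed, total)."""
    passed = 0
    for scenario in scenarios:
        # any() stops at the first successful attempt.
        if any(scenario.check() for _ in range(1 + scenario.retries)):
            passed += 1
    return passed, len(scenarios)


def summary(passed, total):
    """Format results like the dashboard line, e.g. '11 / 12 | 91.7%'."""
    return f"{passed} / {total} | {passed / total:.1%}"


# Stub checks for illustration only: one fails once and then passes.
attempts = {"n": 0}


def flaky_check():
    attempts["n"] += 1
    return attempts["n"] >= 2


suite = [
    Scenario("Books a table for 2 on a weekday", lambda: True),
    Scenario("Handles caller switching to Spanish", flaky_check, retries=1),
    Scenario("Stand-in failing scenario", lambda: False),
]

passed, total = run_suite(suite)
print(summary(passed, total))  # 2 / 3 | 66.7%
```

In a CI job, a non-zero exit code when `passed < total` would be one way to block a pull request; whether retries should count as passes is a policy choice left open here.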

GR Guardrails

Live policy enforcement

2 triggered today

Never quote prices

Stops the agent committing to a number; defers to the manager.

PII redaction in logs

Phone numbers and emails masked in transcripts.

Off-topic deflection | 1x today

Politely steers caller back to reservations.

Profanity filter | 1x today

Single-strike warning, then hand-off.

Max 2-minute hold

Auto-callback if escalation queue is busy.
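As an illustration of the "PII redaction in logs" guardrail listed above, the sketch below masks emails and phone numbers in a transcript line using regular expressions. It is a simplified assumption of how such a filter could look, not Stream's implementation: `redact`, the patterns and the `[email]` / `[phone]` placeholders are invented for this example.

```python
import re

# Deliberately simple patterns; production redaction would need broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# A digit, then at least seven digits/spaces/punctuation, ending on a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace emails and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return PHONE_RE.sub("[phone]", text)


line = "Call me at +1 (415) 555-0142 or mail ana@example.com about Friday at 7."
print(redact(line))
# Call me at [phone] or mail [email] about Friday at 7.
```

The phone pattern is intentionally loose, so it can also mask other long digit runs such as order numbers or timestamps; tightening it trades false positives for missed numbers.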

Get started

One CLI command from idea to phone call.

Free for the first 10,000 minutes a month. No credit card. Bring your own model keys, or use Stream's.

$ stream agents deploy

Start Coding Free | Read the docs
