Build low-latency Vision AI applications using our new open-source Vision AI SDK. ⭐️ on GitHub

Real-Time Vision AI Agents

Multi-modal AI agents that see, hear, & remember.
Open-source. Edge-agnostic. Low-latency.

Vision Agents Playground

Join our Partner Ecosystem

Building models, tools, or platforms that work with real-time voice or video AI?

We’re actively adding first-party integrations, and we co-build and co-market with partners.

  • Model providers (STT, TTS, LLM, STS, etc.)
  • Competing video edge networks
  • Avatar and visual-effects companies
  • Hosting providers, both AI-focused and general-purpose

See Vision Agents in Action

Selling Assistant

Create a product page for selling a used item that includes a product image, title, description, and a suggested price.

Security Camera

Facial recognition, package detection, automated package theft response, and posting to X.

Video Content Moderation

Detect and censor offensive gestures, and give three verbal warnings before kicking the user out.
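The three-strike escalation in this demo can be sketched as a small policy object. The class name and return values below are illustrative only; they are not part of the Vision Agents SDK API.

```python
# Hypothetical sketch of the "three verbal warnings, then kick" flow.
# Names and policy shape are assumptions, not the SDK's actual API.

class ModerationPolicy:
    """Issue up to `max_warnings` verbal warnings, then remove the user."""

    def __init__(self, max_warnings: int = 3):
        self.max_warnings = max_warnings
        self.warnings: dict[str, int] = {}

    def on_offensive_gesture(self, user_id: str) -> str:
        """Return the action the agent should take for this detection."""
        count = self.warnings.get(user_id, 0) + 1
        self.warnings[user_id] = count
        if count <= self.max_warnings:
            return f"warn:{count}"   # speak a verbal warning
        return "kick"                # remove the user from the call


policy = ModerationPolicy()
actions = [policy.on_offensive_gesture("user-42") for _ in range(4)]
# actions == ["warn:1", "warn:2", "warn:3", "kick"]
```

Tracking warnings per user ID keeps the policy fair in multi-participant calls, since one user's strikes never count against another.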

I Want a Plugin To...

Handle calls and respond naturally by voice

Realtime: End-to-end voice agent with multimodal support, unified under one plugin and model.

Connect to my own tools, APIs, or knowledge base

Language Models: Function calling, RAG, and full control over STT/TTS choices

Transcribe what users say in real time

Speech-to-Text: Streaming transcription, some with built-in turn detection

Give my agent a distinct, natural voice

Text-to-Speech: Cloud and local options, from expressive to ultra-low latency

See and understand what’s on camera

Vision & Video: Object detection, video analysis, and style transfer

Put a face on my agent

Avatars: Real-time lip-synced visual characters

Make conversations feel natural, not robotic

Turn Detection: Smart interruption handling and silence detection
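One common technique behind silence-based turn detection is tracking frame energy: the turn ends after a run of consecutive low-energy frames. The sketch below is illustrative; the threshold and frame-count values are assumptions, not parameters of any specific Vision Agents plugin.

```python
# Illustrative energy-based silence detection, a common building block
# for turn detection. Thresholds here are assumed, not SDK defaults.

def detect_turn_end(frames, energy_threshold=0.01, silence_frames=5):
    """Return the index of the frame where the speaker's turn ends
    (after `silence_frames` consecutive quiet frames), or None."""
    quiet = 0
    for i, frame in enumerate(frames):
        energy = sum(s * s for s in frame) / len(frame)  # mean-square energy
        quiet = quiet + 1 if energy < energy_threshold else 0
        if quiet >= silence_frames:
            return i  # turn ended here; the agent may start responding
    return None  # speaker is still talking


speech = [[0.5, -0.4, 0.3]] * 4          # high-energy (speech) frames
silence = [[0.001, -0.002, 0.001]] * 5   # low-energy (silence) frames
print(detect_turn_end(speech + silence))  # -> 8
```

Requiring several consecutive quiet frames, rather than reacting to the first one, is what lets an agent ride out brief pauses instead of interrupting mid-sentence.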

Run open-source models on my own infrastructure

Infrastructure: Self-hosted inference, model routing, and vector search
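The plugin categories above all compose into a single agent pipeline: audio goes through an STT plugin, the transcript through a language model, and the reply through a TTS plugin. Here is a minimal sketch of that pattern with stand-in classes; the real Vision Agents API may differ, and every name below is an assumption.

```python
# Minimal sketch of the plugin-composition pattern described above.
# All class names are stand-ins, not the actual Vision Agents SDK API.
from dataclasses import dataclass
from typing import Protocol


class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...

class LLM(Protocol):
    def respond(self, text: str) -> str: ...

class TTS(Protocol):
    def speak(self, text: str) -> bytes: ...


@dataclass
class Agent:
    """Pipe audio through interchangeable STT -> LLM -> TTS plugins."""
    stt: STT
    llm: LLM
    tts: TTS

    def handle_audio(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        reply = self.llm.respond(text)
        return self.tts.speak(reply)


# Toy plugin implementations, just to show the wiring:
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode()

class UppercaseLLM:
    def respond(self, text: str) -> str:
        return text.upper()

class BytesTTS:
    def speak(self, text: str) -> bytes:
        return text.encode()


agent = Agent(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
print(agent.handle_audio(b"hello"))  # -> b'HELLO'
```

Because each stage is behind a small interface, swapping a cloud STT for a local one, or one LLM provider for another, is a one-line change at construction time, which is the point of a plugin ecosystem.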

Community & Open Source

Join the Community

Follow Stream on X, star the Vision Agents GitHub repo, and join the discussion on Discord to try demos, share feedback, and contribute.