Stream Blog
Vision Agents v0.5.0 Release: Local Hardware I/O, Anam Avatars, and Faster Deepgram TTS
Stream’s AI Moderation Roadmap: What We’re Building Next
How to Build an App Like TikTok Shop (+ Turn Livestreams into Revenue)
The 8 Best Platforms To Build Voice AI Agents
The End of the Orb: Building AI Agents That Feel Present
TLDR: Agents these days are blind and not very engaging, so we decided to team up with Anam and Inworld to build an agent using Vision Agents that feels personal and aware of the world around you. Give it a try here. Most voice agents today are blind. They hear words, convert them to text,
Gemini Live API & Lyria 3: Generate Music From Text, Phone & Video Calls
The instrumental background music in the video below is AI-generated using Lyria 3 by Google DeepMind. Lyria 3 allows anyone to generate AI music from text and image prompts. The music demos in this article take it further by adding another input prompt modality, your voice. Let’s proceed to generate your first music with Lyria
Chat Application Architecture, Explained
TLDR: Wide-column stores like Cassandra handle messages while Redis holds read state, because each subsystem’s access patterns differ significantly. Presence alone generates a write on every connect, disconnect, and heartbeat, making it orders of magnitude more write-heavy than messages. End-to-end encryption prevents the server from searching, moderating, or generating push notification previews, which transport-plus-at-rest encryption
How to Clone Any Voice in Minutes Using Voxtral TTS
What You Will Build: This tutorial demonstrates how to build an AI speech app with in-app voice cloning support. You can clone your favorite voice by supplying a reference audio clip of about 3 seconds. Here is a demo. Voice cloning example demonstrating reference and agent’s output voices
Community Sift Moderation Alternatives – Top 6 Competitors Compared
Community Sift has been one of the most purpose-built content moderation platforms for gaming and online communities. If you’re evaluating whether it’s still the right fit, or your trust and safety team is looking at what else is out there, this guide gives you an honest comparison of the strongest moderation alternatives available today. We’ll
Popup Frees Creators from the Algorithm with Stream’s Livestreaming Infrastructure
Popup was founded in early 2025 with a simple but powerful premise: give creators a branded virtual space to connect with and monetize their audiences directly—no algorithm standing between them and their communities. The idea emerged from a clear shift happening across the creator economy. For years, creators have depended on brand sponsorships and social
How To Design AI Voices in Minutes Using Qwen3-TTS
Before You Start: To begin, ensure that you meet these requirements and have the following credentials: Python 3.13 or a later version; an Apple Silicon Mac (recommended) or any modern laptop; Stream API credentials (for realtime audio and video communication); a HuggingFace account and access token (HF_TOKEN); a Deepgram API key (for speech-to-text); a Google
Shipping WebRTC Video From a $10 Microcontroller: Challenges Building the Stream Video ESP32 SDK
We recently open-sourced the Stream Video ESP32 SDK — an SDK that lets an ESP32-S3 or ESP32-P4 join a Stream Video call, capture camera and microphone input, encode H.264 + Opus in real-time, and publish it over WebRTC. Someone on a browser or mobile device can then see and hear the ESP32 live. If you’re

