Stream Blog
Vision Agents v0.5.0 Release: Local Hardware I/O, Anam Avatars, and Faster Deepgram TTS
It's been a busy period since our last release, and now it's time to share Vision Agents v0.5.0: a step toward making production-grade multimodal AI agents easy to build and deploy. While previous versions laid the groundwork for real-time voice, video, and Vision Agents, v0.5.0 focuses on stability at scale and even more expressive
Stream's AI Moderation Roadmap: What We're Building Next
Moderation has quietly become one of the hardest problems in modern apps. As chat, feeds, and real-time video interactions expand globally, the challenge isn't just catching bad content; it's doing it in real time, across languages, with context, and at scale. At Stream, we've been investing deeply in solving that problem. This roadmap is a
How to Build an App Like TikTok Shop (+ Turn Livestreams into Revenue)
Livestream shopping is changing how people discover and buy products online by combining real-time video with instant purchasing. Platforms like TikTok have popularised this model, enabling creators and brands to showcase products live while viewers shop without leaving the stream. In this tutorial, you'll learn how to build a TikTok-style livestream shopping application using Next.js.
The 8 Best Platforms To Build Voice AI Agents
Voice assistants like Siri and Alexa are great for non-trivial everyday personal assistive tasks. However, they are limited when it comes to answering complex questions accurately, providing real-time information, and handling turns and user interruptions. Try asking Siri about the best things
Stream Skills: Build a Marketplace App To Buy, Sell, and Shop Online
Let's build an online marketplace platform that enables safe, secure buying and selling, combining Stream's AI agent skills for chat, activity feed, moderation, and video into a unified product. What You Can Build Agent Skills improve developers' productivity and help them integrate features more quickly and build complex applications from scratch. Stream now has skills
How Do I Choose Between Different Chat API Providers for a Chatbot?
Picking a chat infrastructure provider sounds like a feature comparison, but it's mostly a question of fit. Pricing, SDK depth, and AI capabilities matter, but they are downstream of the higher-order questions about what you're building and what trade-offs you can tolerate. Should I Build Chat In-House or Use a Provider? Building chat from scratch
Using AI Agent Skills: Build an iOS Chat Messaging App With a Single Prompt
As developers, we typically spend time reading docs and tutorials, and watching YouTube videos to integrate APIs and SDKs to add specific functionality to apps and services. These integrations can now be completed much more quickly using AI Agent Skills. Agent Skills are sets of instructions, scripts, and reference documents that equip AI models to
The End of the Orb: Building AI Agents That Feel Present
TLDR: Agents these days are blind and not very engaging, so we decided to team up with Anam and Inworld to build an agent using Vision Agents that feels personal and aware of the world around you. Give it a try here. Most voice agents today are blind. They hear words, convert them to text,
Gemini Live API & Lyria 3: Generate Music From Text, Phone & Video Calls
The instrumental background music in the video below is AI-generated using Lyria 3 by Google DeepMind. Lyria 3 allows anyone to generate AI music from text and image prompts. The music demos in this article take it further by adding another input prompt modality, your voice. Let's proceed to generate your first music with Lyria
Chat Application Architecture, Explained
TLDR: Wide-column stores like Cassandra handle messages while Redis holds read state, because each subsystem's access patterns differ significantly. Presence alone generates a write on every connect, disconnect, and heartbeat, making it orders of magnitude more write-heavy than messages. End-to-end encryption prevents the server from searching, moderating, or generating push notification previews, which transport-plus-at-rest encryption
How to Clone Any Voice in Minutes Using Voxtral TTS
What You Will Build This tutorial demonstrates how to build an AI speech app with in-app voice cloning support. You can clone your favorite voice by supplying a reference audio of about 3 seconds. Here is a demo. Voice cloning example demonstrating reference and agent's output voices
Community Sift Moderation Alternatives - Top 6 Competitors Compared
Community Sift has been one of the most purpose-built content moderation platforms for gaming and online communities. If you're evaluating whether it's still the right fit, or your trust and safety team is looking at what else is out there, this guide gives you an honest comparison of the strongest moderation alternatives available today. We'll

