Stream Blog
Open Vision Agents by Stream: Open Source SDK for Building Low-Latency Vision AI Apps
The 8 Best Platforms To Build Voice AI Agents
The 6 Best LLM Tools To Run Models Locally
Using Stream to Build a Livestream Chat App in Next.js
Build an Instagram-Style For-You Feed in React Native
Personalized content feeds keep users engaged by surfacing content they’re most likely to enjoy. In this tutorial, you’ll build an Instagram-style “For You” feed with React Native and Expo that recommends images and videos to users based on their interests and content popularity. Get a free Stream account and use your API credentials to get started.
Create Speech-to-Text Experiences with ElevenLabs Scribe v2 Realtime & Vision Agents
ElevenLabs released Scribe v2 Realtime, an ultra-low latency speech-to-text model with ~150ms end-to-end transcription, supporting 90+ languages and claiming the lowest Word Error Rate in benchmarks for major languages and accents. It’s built specifically for agentic apps, live meetings, note-taking, and conversational AI, where every millisecond and every word matters. In this demo, Scribe v2
How Text-to-Speech Works: Neural Models, Latency, and Deployment
Not long ago, text-to-speech (TTS) was a laughing stock. Robotic, obviously synthetic output that made customer service jokes write themselves and relegated TTS to accessibility contexts where users had no alternative. Now, you may have listened to text-to-speech today without even realizing it. AI-generated podcasts, automated customer service calls, voice assistants that actually sound like assistants.
Marketplace Content Moderation: How to Build Trust and Prevent Abuse at Scale
Marketplaces only work when people trust each other. Buyers trust that listings accurately represent what they’re purchasing. Sellers trust they won’t be scammed, harassed, or pushed off the platform by bad actors. And both trust that the marketplace itself is actively protecting them, not reacting after damage is already done. As marketplaces scale, maintaining that
Edge-Optimized Speech Workflows: Combining Deepgram Nova-3 STT with Fish Speech V1.5 TTS
AI won’t stay online. It won’t stay on your laptop. It won’t stay centralized. It will move to every device and to the edge of every network, into your earbuds, your car, your factory floor, and your doorbell. This opens up a remarkable number of use cases. A fitness coach who listens continuously, counts your
Building A2UI-Powered Interfaces with Stream Chat
A2UI (Agent-to-UI) is a protocol designed by Google to standardize how AI agents communicate with user interfaces. Instead of tightly coupling agents to specific frontends, A2UI defines a clear contract for intent, state, and actions – making it easier to build interactive, agent-driven experiences that are portable, composable, and UI-agnostic. As AI systems move from
Scaling Activity Feeds to 100M Users: Stream’s Latest Benchmarks
Stream has reached a major milestone in activity feed infrastructure, successfully benchmarking over 37 million operations with a 10% write and 90% read workload distribution across a dataset of 100M users, 500M activities, and 200M follow relationships. Each scenario was tested at 500, 1,000, and 1,500 requests per second to measure performance under increasing load.
Scaling WebRTC Video to 100,000 Participants: Stream’s Latest Video Benchmarks
Stream has reached a major milestone in real-time video infrastructure: successfully scaling a single WebRTC-based livestream to 100,000 concurrent participants while maintaining ultra-low latency, stable frame rates, and zero packet loss. Today, Stream powers real-time chat, activity feeds, moderation, audio, and video for applications serving over one billion end users worldwide, backed by a 99.999%
