Stream Blog
Open Vision Agents by Stream: Open Source SDK for Building Low-Latency Vision AI Apps
The 8 Best Platforms To Build Voice AI Agents
The 6 Best LLM Tools To Run Models Locally
Using Stream to Build a Livestream Chat App in Next.js
Grok TTS + Vision: Build a Healthcare Appointment Agent
This step-by-step guide will help you build an AI front-desk receptionist that interacts with patients through conversations, assesses their conditions, and advises whether to visit a doctor or seek online medical advice. When an agent can see the patient’s condition in real time, it can make a smarter recommendation, saving patients an unnecessary trip to
The Architecture and Best Practices for Mobile App Stability
A frozen message composer. A feed that won’t load. A draft that vanishes. None of these register as crashes, but all of them lose users. Add real-time features, like chat, activity feeds, or live streaming, and your crash rate can look pristine in Crashlytics while your app silently drops messages and bleeds memory. This guide
How to Build a Social Media App: A Technical Guide
Building a social media app means a single user action must propagate to potentially millions of other users in real time, while staying fast, safe, and cheap. Every feature touches every other feature. And the hard problems shift as you scale. At 100K users, it’s the database. At 1M users, it’s the fan-out strategy. At
How to Build an App Like TikTok Shop (+ Turn Livestreams into Revenue)
Livestream shopping is changing how people discover and buy products online by combining real-time video with instant purchasing. Platforms like TikTok have popularised this model, enabling creators and brands to showcase products live while viewers shop without leaving the stream. In this tutorial, you’ll learn how to build a TikTok-style livestream shopping application using Next.js.
Developer’s Guide to Ultralytics YOLO: From Theory to Real-Time Pose Detection
In most of the world, if you’re YOLO’ing, you’re jumping out of a plane, asking out your future spouse, or eating gas station sushi. In vision AI, You’re Only Looking Once. Ultralytics’ YOLO is a real-time object detection framework with a simple premise: instead of scanning an image multiple times to find and classify objects,
Build a Local AI Agent with Qwen 3.5 Small on macOS
Qwen 3.5 Small is a new family of lightweight, high-performance models from Alibaba (0.8B, 2B, 4B, and 9B parameters) that are now available on Ollama. These models support multimodal input, native tool calling, and strong reasoning, all while running efficiently on laptops, Macs, and even mobile/IoT devices. In this demo, the agent runs completely locally
Using Opus 4.6: Vibe Code a Custom Python Plugin for Vision Agents
Vision Agents has out-of-the-box support for the LLM services and providers developers need to build voice, vision, and video AI applications. The framework also makes it easy to integrate custom AI services, either by following a step-by-step guide or by vibe coding them using state-of-the-art (SoTA) models. Let’s use Claude Opus 4.6 to create a
Developer’s Guide to Building Vision AI Pipelines Using Grok
Grok tends to fly under the radar. While ChatGPT, Claude, and Gemini have found their footing in enterprise workflows and agentic toolchains, Grok remains mostly associated with X, which has overshadowed some genuinely strong capabilities. Chief among them is vision: Grok can understand and generate images, produce entire videos from a single prompt, and with
