
Engineering: AI (2)

Open Vision Agents by Stream: Open Source SDK for Building Low-Latency Vision AI Apps

Vision Agents is a new, open-source framework from Stream that helps developers quickly build low-latency vision AI applications. The project ships with over ten out-of-the-box integrations, including day-one support for leading real-time voice and video models like OpenAI Realtime and Gemini Live. Text-to-speech, speech-to-text, and speech-to-speech models are also natively
Read more ->
4 min read

Top 5 Real-Time Speech-to-Speech APIs and Libraries To Build Voice Agents

There are two ways to build conversational voice agents for enterprise and production use cases. Developers can use a real-time, speech-to-speech (STS) API that takes audio input from a user and sends it to a large language model (LLM) to return a voice output. Or they can use a turn-based architecture, which consists of
Read more ->
13 min read

The 8 Best Platforms To Build Voice AI Agents

Voice assistants like Siri and Alexa are great for non-trivial everyday personal assistive tasks. However, they are limited in providing accurate answers to complex questions and real-time information, and in handling turns and user interruptions. Try asking Siri about the best things
Read more ->
13 min read

Compare the Top 5 Agentic CLI Coding Tools

Agentic AI coding tools vary in how they help you write, debug, and ship code. Some, like Lovable and Bolt, allow developers to quickly build web and mobile apps using prompts. Others, like Cursor and Windsurf, provide developers with a fully AI-featured IDE to solve engineering problems. Generally, AI coding platforms can be categorized into
Read more ->
12 min read

LLM Context Protocols: Agent2Agent vs. MCP

Two buzzwords circulating in the developer ecosystem today are MCP and A2A. Model Context Protocol (MCP) has been around since November 2024. Google released Agent2Agent (A2A) in April 2025 with an extensive list of technology partners. Developers can use the MCP and A2A open standards to provide context to models for building AI applications. MCP
Read more ->
13 min read

Exploring Google’s Agent Development Kit (ADK)

This article has an accompanying GitHub repository containing runnable samples: adk-samples. In recent years, the development of autonomous agents (software entities capable of reasoning, planning, and taking actions on behalf of users) has moved from research labs into real-world applications. These AI agents are rapidly becoming central to building intelligent systems, whether through task automation, information retrieval,
Read more ->
39 min read

How To Run OpenAI Agents SDK Locally With 100+ LLMs, and Custom Tracing

The OpenAI Agents SDK for Python provides developers with the building blocks to implement two agentic solutions for AI applications. You can create text-generation agents, allowing users to get responses from text prompts. Additionally, you can build voice agents using the SDK. To create your first agent with the OpenAI Agents SDK, get started here.
Read more ->
14 min read

Scaling AI Chat: 10 Best Practices for Performance, Cost, and Resource Optimization

Your AI chatbot is up and running. It’s helping customers, getting them the information they need in the tone and manner that is right for your brand. CX costs are down, and your support team are moving up the value chain. Everyone is happy. And then it happens: spam. Automated bots flood your system with
Read more ->
12 min read

The Top 7 MCP-Supported AI Frameworks

Toolkits for AI agents expose developers to various APIs to equip AI solutions with tools to carry out tasks and ensure accurate results for user satisfaction. However, integrating these tools into AI apps and managing them can be messy. This article introduces you to an industry standard of providing context to LLMs and agents using
Read more ->
20 min read

OpenAI Agents SDK — Getting Started

In the ever-evolving landscape of artificial intelligence, AI agents have emerged as a groundbreaking approach to building more sophisticated and autonomous systems. These agents represent a significant leap forward in AI development, offering capabilities beyond traditional static models. What makes AI assistants particularly powerful is their ability to orchestrate complex tasks and continuously self-assess their
Read more ->
8 min read

The 3 Best Python Frameworks To Build UIs for AI Apps

Python offers various packages and frameworks for building interactive, production-ready AI app interfaces, including chat UIs. This article details the top platforms. AI Chat UIs: Overview Chat UIs for AI applications provide the front-end interface to interact with AI assistants, local and cloud-based large language models (LLMs), and agentic workflows. Like
Read more ->
16 min read

How to Connect Any AI Model to Your Chatbot

We are speedrunning AI development. This week alone, Claude 3.7 and GPT-4.5 were released. Before that, DeepSeek R1, Deep Research, and Grok 3 were released. This speed makes it almost impossible for developers to keep up. No sooner have you implemented an open-source DeepSeek model into your chatbot than the newest OpenAI/Mistral/Llama/Anthropic model comes out,
Read more ->
9 min read

Choosing the Best AI Model for Your Chatbot

OpenAI just launched GPT-4.5. Here’s a snippet from their release blog post: Early testing shows that interacting with GPT‑4.5 feels more natural. Its broader knowledge base, improved ability to follow user intent, and greater “EQ” make it useful for tasks like improving writing, programming, and solving practical problems. We also expect it to hallucinate less.
Read more ->
7 min read

Cursor for Large Projects

With all this "vibe" coding, many devs think that Cursor and Claude are just for prototypes. While Cursor is great at writing new code, it’s also very effective at structuring code, standardizing, refactoring, and maintaining large projects. It’s super exciting since you can build software 5-30x faster. This guide shares my workflow for Cursor and
Read more ->
5 min read

The Best Pre-Built Toolkits for AI Agents

Python- and TypeScript-based AI agent frameworks such as CrewAI, LangChain, Agno, and Vercel AI SDK allow developers to build AI applications with multiple agents that act as Computer-Using Agents or Deep Research Agents, automating browser tasks like clicking, scrolling, and ordering products on the web, and performing complex, multi-step tasks. These multi-AI agents may be put
Read more ->
10 min read

Moderation API Introduction

Content moderation is crucial for maintaining a safe and positive user experience. Stream's Moderation API offers a powerful solution for integrating robust moderation capabilities into your applications. Stream's Moderation Dashboard enables developers to prevent users from posting harmful content and build custom moderation workflows tailored to their specific needs. This article will explore the key
Read more ->
4 min read

Building an AI Chatbot for Customer Success

Earlier this year, payment service Klarna launched an AI customer service assistant. Within the first month, this AI had 2.3 million conversations, the equivalent of 700 agents, and led to a 25% drop in repeated inquiries. Klarna estimated it would “drive $40 million USD in profit improvement” in 2024. Customer success is an ideal opportunity
Read more ->
21 min read

Understanding RAG vs Function Calling for LLMs

Unless you’ve been living under a rock, you probably know Large Language Models (LLMs) are all the rage right now. LLMs like OpenAI's ChatGPT and Google’s Gemini have redefined productivity and have more or less changed the world as we know it. However, their capabilities are not without limits. Static models trained on a fixed
Read more ->
7 min read

Exploring Reasoning LLMs and Their Real-World Applications

LLMs have excelled in writing, coding, and problem-solving tasks and prompts based on the data sets they were trained with. However, these models fall short when used to solve complex puzzles because they respond with the information they were trained with and lack the ability to self-correct. Recent LLMs, like OpenAI's o1 and o3 models,
Read more ->
12 min read

The 6 Best LLM Tools To Run Models Locally

Running large language models (LLMs) like DeepSeek Chat, ChatGPT, and Claude usually involves sending data to servers managed by DeepSeek, OpenAI, and other AI model providers. While these services are secure, some businesses prefer to keep their data offline for greater privacy.
Read more ->
12 min read

How to Achieve a 9ms Inference Time for Transformer Models

It is crucial for technology platforms to moderate any harmful content as early as possible. Most modern moderation tools take a few hundred milliseconds to a few seconds to detect harmful content. Often the action against detected harm is taken after the
Read more ->
5 min read

Transformations in Machine Learning

On 8th September 2020, the Guardian published an article written by a robot called GPT-3. The editors asked the robot to write an article about why humans should not be scared of robots and Artificial Intelligence. The human editors wrote the introduction for the article and instructed GPT-3 to generate the next possible sentences iteratively.
Read more ->
17 min read

Google Feed Personalization and Recommender Systems

Lately, I’ve been using Google’s feed on Android and it contains several interesting best practices for content discovery. Google’s feed strikes an effective balance between machine learning and follow relationships. With the recent advancements in AI, it can be hard to know when to apply AI and when to use a more manual method. This
Read more ->
4 min read

Building an End-to-End Deep Learning GitHub Discovery Feed

There's hardly a developer who doesn’t use GitHub. With all those stars, pulls, pushes and merges, GitHub has a plethora of data available describing the developer universe. As a Data Scientist at Stream, my job is to develop recommender systems for our clients so that they can provide a better user experience for their customers. With that said, I wanted to see if I could build a recommendation
Read more ->
11 min read

Moving Beyond EdgeRank for Personalized Newsfeeds

This blog post is broken into two parts and harkens back to learnings from a prior post. Together, these parts are my best effort to provide you with a framework for taking the creation of personalized news feeds to the next level. Part 1: Theory behind a very basic
Read more ->
6 min read

Building Your Own Instagram Discovery Engine: A Step-By-Step Tutorial

Isn’t it great how Instagram’s “Explore” section displays content that matches your interests? When you open the application, the content and recommendations shown are almost always relevant to your specific likes, interests, connections, etc. While it may be fun to think we’re the center of the Instagram universe, the reality is that personalized, relevant content
Read more ->
7 min read

Follow Recommendations in Social Networks

Social media is a series of networks connecting individuals, companies, organizations, and groups to one another. These networks can transcend local, national, and international borders, connecting people to networks far and wide. With all those connections, how can a user find the ones that they want to connect with? That’s where follow suggestions come in.
Read more ->
4 min read

Best Practices for Recommendation Engines

In this blog post I will describe how to implement a feature-rich activity feed that will make relevant and accurate personalization algorithms easier to implement. As we have already explored in previous blog posts, app personalization is about linking activity feeds and user engagement data. In most cases, a well-thought-out feed structure provides valuable information
Read more ->
3 min read

Factorization Machines for Recommendation Systems

As a Data Scientist who works on Feed Personalization, I find it important to stay up to date with the current state of Machine Learning and its applications. Most of the time, using some of the better-known recommendation algorithms yields good initial results; however, sometimes a change in the model is essential to provide customers
Read more ->
6 min read

Example Ranking Methods for Your Feeds

In this short tutorial we will show you how to use Custom Ranking for your activity streams and news feeds. By default all feeds on Stream are ranked chronologically. Custom ranking allows you to take full control over how your feeds are sorted. Some common use cases include: Showing popular activities higher in the feed
Read more ->
4 min read

Personalization & Machine Learning for News Feeds and Social Networks

Winds is an open-source RSS reader powered by React, Redux, Sails and Stream. This tutorial explains how we’ve built personalization for Winds, as an example of how using Stream makes it easy to build personalized feeds. About Personalization Personalization is a very broad concept. In this case, personalization equates to leveraging engagement data
Read more ->
5 min read

An Introduction to Contextual Bandits

In this post I discuss the Multi-Armed Bandit problem and its applications to feed personalization. First, I will use a simple synthetic example to visualize arm selection with bandit algorithms. I also evaluate the performance of some of the best-known algorithms on a dataset for musical genre recommendations. What is a Multi-Armed Bandit? Imagine
Read more ->
6 min read

Fast Recommendations for Activity Streams Using Vowpal Wabbit

The problem of content discovery and recommendation is very common in many machine learning applications: social networks, news aggregators and search engines are constantly updating and tweaking their algorithms to give individual users a unique experience. Personalization engines suggest relevant content with the objective of maximizing a specific metric. For example: a news website might want to increase
Read more ->
5 min read