Engineering: AI (2)

Build a Flash Answer AI Assistant Like Le Chat

Our aim today is straightforward: we will learn how to integrate an AI assistant with Stream's Chat APIs and make it blazing fast, like Le Chat by Mistral AI, using LLM inference platforms. After the February 2025 updates and improvements to Le Chat by Mistral, many developers and power users have compared its response speed
Read more ->
11 min read

How to Add DeepSeek LLM to Your Chat App Using AWS Bedrock

DeepSeek is the latest LLM to hit the digital shelves. It boasts high-quality reasoning at a fraction of the cost of current state-of-the-art models such as OpenAI o1 and o3-mini and Gemini 2.0 Flash Thinking. DeepSeek R1 is open-source, which means two things. First, developers can examine the model's architecture, training process, and weights directly, enabling a
Read more ->
9 min read

Gesture Recognition Using TensorFlow.js

Hand gesture recognition has become increasingly important in computer vision and human-computer interaction. With the rise of video conferencing and virtual interactions, there's a growing need for intuitive ways to control our digital environments. In this tutorial, we'll explore building a hand gesture detection system using TensorFlow.js that can recognize various hand poses in real-time
Read more ->
10 min read

The Best Pre-Built Toolkits for AI Agents

Python and TypeScript-based AI agent frameworks such as CrewAI, LangChain, Agno, and Vercel AI SDK allow developers to build AI applications in which multiple agents act as Computer-Using Agents or Deep Research Agents, automating browser tasks like clicking, scrolling, and ordering products on the web, and performing complex, multi-step tasks. These multi-AI agents may be put
Read more ->
10 min read

Build an AI Image Moderation System with AWS Rekognition

Adding image uploads to your live streams or chats increases engagement and interactivity. It allows your users to share visual content with the community, express themselves through photos and memes, and create more dynamic conversations in real time. But it also introduces the challenge of bad actors uploading inappropriate content and exposing your community to
Read more ->
11 min read

DeepSeek R1 - The Best Local LLM Tools To Run Offline

Many people (especially developers) want to use the new DeepSeek R1 thinking model but are concerned about sending their data to DeepSeek. Read this article to learn how to run the DeepSeek R1 reasoning model locally without the Internet, or through a trusted hosting service. You run the model offline, so your
Read more ->
6 min read

Multilingual Content Moderation with LLMs

Content warning: This article contains some NSFW Hungarian and Korean words and phrases. Did you know for the sitcom Mork and Mindy, the production team needed censors who knew four languages just to keep up with Robin Williams' sneaky swearing attempts? That was in the seventies, but today’s content moderators have the same problem, writ
Read more ->
5 min read

How to Add RAG-Based AI to Team Chat With Stream

AI agent chats are mostly a 1:1 experience. But that misses a clear opportunity--having an AI member of your team. If every team member were participating in a chat with AI, you could collaborate as a group, create shared knowledge bases, or solve problems together more efficiently. So, let's build that. We're going to extend
Read more ->
6 min read

Moderation API Introduction

Content moderation is crucial for maintaining a safe and positive user experience. Stream's Moderation API offers a powerful solution for integrating robust moderation capabilities into your applications. Stream's Moderation Dashboard enables developers to prevent users from posting harmful content and build custom moderation workflows tailored to their specific needs. This article will explore the key
Read more ->
4 min read

Build an Agentic RAG System With OpenAI, LanceDB, and Phidata

When integrating AI into enterprise applications, it is often challenging to get accurate and efficient results from LLMs. The main reason is that LLMs are trained on large general datasets rather than specifically on your enterprise's data. The resulting problems typically include hallucinations, outdated information, and more. This article explores the integration of AI agents, or, Agentic Retrieval Augmented
Read more ->
8 min read

Build an AI Assistant Using Python

In this post, we will see how to build a Python server allowing frontend chat SDKs to start and stop an AI agent for a channel in Stream Chat. Building polished AI assistants can be challenging. Features like streaming responses, table components, and code generation require complex implementation across SDKs and the backend. To ease
Read more ->
15 min read

Use Pinecone, OpenAI, and Stream To Chat With Any Book

Have you ever wanted to chat with the characters in your favorite book? Talk to Heathcliff about his origins, Harry Potter about his first impressions of Hogwarts, Jane Eyre about Lowood, or Lizzie Bennet about Mr. Darcy's proposal. Or, maybe, like us, you can’t wait to interrogate WebRTC For The Curious to learn more about
Read more ->
20 min read

Building an AI Chatbot for Customer Success

Earlier this year, payment service Klarna launched an AI customer service assistant. Within the first month, this AI had 2.3 million conversations, the equivalent of 700 agents, and led to a 25% drop in repeated inquiries. Klarna estimated it would “drive $40 million USD in profit improvement” in 2024. Customer success is an ideal opportunity
Read more ->
21 min read

Build an AI Assistant Using NodeJS

In this post, we will see how we can build a NodeJS server that will allow frontend chat SDKs to start and stop an assistant for a given channel in Stream Chat. Building polished AI assistants can be challenging. Features like streaming responses, table components, and code generation require complex implementation across SDKs and the
Read more ->
10 min read

Build an AI Assistant with React Native

This tutorial will demonstrate how easy it is to build an AI assistant with Stream React Native Chat SDK. While this tutorial features Anthropic and OpenAI APIs as the LLM provider, you can integrate any LLM service with Stream and still benefit from the same features, such as generation indicators, markdown support, tables, etc. No
Read more ->
12 min read

Build an AI Assistant with Flutter

In this tutorial, we will demonstrate how easy it is to build an AI assistant for iOS using the Stream Flutter Chat SDK on both the client and server sides. For this example, we will use the Anthropic and OpenAI APIs as the LLM service, but you can use any LLM service with Stream Chat.
Read more ->
10 min read

Build an AI Assistant for iOS Using Swift

In this tutorial, we will demonstrate how easy it is to create an AI assistant for iOS using Stream Chat. In this example, we will use the Anthropic and OpenAI APIs as our example LLM; however, developers are free to use whichever LLM provider they like and still benefit from Stream’s rich UI support for
Read more ->
9 min read

Build an AI Assistant for Android Using Compose

This tutorial guides you through building an AI assistant seamlessly integrated with the Stream Chat SDK for Jetpack Compose. You'll learn how to handle interactions on both the client and server sides by setting up and running your own simple backend. The AI assistant leverages Stream's edge network for optimal performance and uses APIs from
Read more ->
9 min read

Build an AI Assistant with React

In this tutorial, we will demonstrate how easy it is to build an assistant integrated into Stream’s React Chat SDK and learn how to incorporate the interaction on both the client and server sides. We will use the Anthropic and OpenAI APIs as the out-of-the-box examples, but using this method, developers can integrate any LLM
Read more ->
9 min read

Harness the Power of Stream, Cronofy, and OpenAI for Team Collaboration

Geographically dispersed teams often have a hard time scheduling meetings that work for all participants. Human Resources departments also face this challenge when working with existing employees and job candidates alike. Employees have the benefit of a defined and somewhat uniform computing environment; job applicants are a whole different challenge. Each candidate uses whatever computer
Read more ->
7 min read

Build AI-Powered Chatbot Apps for Android Using Firebase

AI-powered chatbots are widely used across industries like education, food delivery, and now, even software development. Since the release of large language models (LLMs) from Google and OpenAI, implementing AI-powered chatbots in projects has become much more accessible. Google’s Generative AI offers substantial benefits by enabling content creation, personalization, decision support, and simulation, which improve
Read more ->
8 min read

Best 5 Frameworks To Build Multi-Agent AI Applications

This article aims to help you build AI agents powered by memory, a knowledge base, tools, and reasoning, and chat with them using the command line and beautiful agent UIs. What is an Agent? Large language models (LLMs) can automate complex and sequential workflows and tasks. For example, you can use LLMs to build assistants that can
Read more ->
17 min read

Understanding RAG vs Function Calling for LLMs

Unless you’ve been living under a rock, you probably know Large Language Models (LLMs) are all the rage right now. LLMs like OpenAI's ChatGPT and Google’s Gemini have redefined productivity and have more or less changed the world as we know it. However, their capabilities are not without limits. Static models trained on a fixed
Read more ->
7 min read

Using a Speech Language Model That Can Listen While Speaking

Traditional speech language models like Siri or Alexa use turn-taking as the primary interaction style. Although these systems can detect single human voices, they cannot be interrupted in real time. Let's discover an advanced AI speech dialogue system that integrates listening and speaking capabilities to engage in conversations in real time, allowing seamless to-and-fro communication
Read more ->
8 min read

The 6 Best LLM Tools To Run Models Locally

Running large language models (LLMs) like DeepSeek Chat, ChatGPT, and Claude usually involves sending data to servers managed by DeepSeek, OpenAI, and other AI model providers. While these services are secure, some businesses prefer to keep their data offline for greater privacy. This article covers the top six tools developers can use to run and
Read more ->
12 min read

How to Achieve a 9ms Inference Time for Transformer Models

Interested in Moderation for your product? Check out Stream's Auto-Moderation Platform! It is crucial for technology platforms to moderate harmful content as early as possible. Most modern moderation tools take a few hundred milliseconds to a few seconds to detect harmful content. Often the action against detected harm is taken after the harm
Read more ->
5 min read

Transformations in Machine Learning

On 8th September 2020, an article in the Guardian was written by a robot called GPT-3. They asked the robot to write an article about why humans should not be scared of robots and Artificial Intelligence. The human editors wrote the introduction for the article and instructed GPT-3 to generate the next possible sentences iteratively.
Read more ->
17 min read

Activity Feed Personalization 101: Top Feed Features to Improve User Engagement

Personalization comes in many flavors, and the data science team at Stream can help you build your own feed personalization engine based on your specific needs. In conjunction with our analytics client (we recommend tracking every event for every user, such as clicking on a link), we use both engagement and feed data to power
Read more ->
4 min read

Google Feed Personalization and Recommender Systems

Lately, I’ve been using Google’s feed on Android and it contains several interesting best practices for content discovery. Google’s feed strikes an effective balance between machine learning and follow relationships. With the recent advancements in AI, it can be hard to know when to apply AI and when to use a more manual method. This
Read more ->
4 min read

Building an End-to-End Deep Learning GitHub Discovery Feed

There's hardly a developer who doesn’t use GitHub. With all those stars, pulls, pushes and merges, GitHub has a plethora of data available describing the developer universe. As a Data Scientist at Stream, my job is to develop recommender systems for our clients so that they can provide a better user experience for their customers. With that said, I wanted to see if I could build a recommendation
Read more ->
11 min read

Moving Beyond EdgeRank for Personalized Newsfeeds

This blog post is broken into two parts and harkens back to learnings from a prior post. The sum of all these parts is altogether my best effort to provide you with a framework of how to take the creation of personalized news feeds to the next level. Part 1: Theory behind a very basic
Read more ->
6 min read

Building Your Own Instagram Discovery Engine: A Step-By-Step Tutorial

Isn’t it great how Instagram’s “Explore” section displays content that matches your interests? When you open the application, the content and recommendations shown are almost always relevant to your specific likes, interests, connections, etc. While it may be fun to think we’re the center of the Instagram universe, the reality is that personalized, relevant content
Read more ->
7 min read

Follow Recommendations in Social Networks

Social media is a series of networks connecting individuals, companies, organizations, and groups to one another. These networks can transcend local, national, and international borders connecting people to networks far and wide. With all those connections, how can a user find the ones that they want to connect with? That’s where follow suggestions come in.
Read more ->
4 min read

Best Practices for Recommendation Engines

In this blog post, I will describe how to implement a feature-rich activity feed that will make relevant and accurate personalization algorithms easier to implement. As we have already explored in previous blog posts, app personalization is linking activity feeds and user engagement data. In most cases, a well-thought-out feed structure provides valuable information
Read more ->
3 min read

Factorization Machines for Recommendation Systems

As a Data Scientist who works on Feed Personalization, I find it important to stay up to date with the current state of Machine Learning and its applications. Most of the time, using some of the better-known recommendation algorithms yields good initial results; however, sometimes a change in the model is essential to provide customers
Read more ->
6 min read

Example Ranking Methods for Your Feeds

In this short tutorial we will show you how to use Custom Ranking for your activity streams and news feeds. By default all feeds on Stream are ranked chronologically. Custom ranking allows you to take full control over how your feeds are sorted. Some common use cases include: Showing popular activities higher in the feed
Read more ->
4 min read

Personalization & Machine Learning for News Feeds and Social Networks

Winds is an open source RSS reader powered by React, Redux, Sails and Stream. This tutorial explains how we’ve built personalization for Winds, as an example of how using Stream makes it easy to build personalized feeds. About Personalization Personalization is a very broad concept. In this case, personalization equates to leveraging engagement data
Read more ->
5 min read

An Introduction to Contextual Bandits

In this post I discuss the Multi-Armed Bandit problem and its applications to feed personalization. First, I will use a simple synthetic example to visualize arm selection with bandit algorithms. I also evaluate the performance of some of the best-known algorithms on a dataset for musical genre recommendations. What is a Multi-Armed Bandit? Imagine
Read more ->
6 min read

Fast Recommendations for Activity Streams Using Vowpal Wabbit

The problem of content discovery and recommendation is very common in many machine learning applications: social networks, news aggregators and search engines are constantly updating and tweaking their algorithms to give individual users a unique experience. Personalization engines suggest relevant content with the objective of maximizing a specific metric. For example: a news website might want to increase
Read more ->
5 min read