Super Bowl weekend is one of the biggest moments in sports. On January 31st, in San Francisco, a group of developers decided to build for it.
The Gemini 3 SuperHack, hosted with Google DeepMind, challenged participants to reimagine sports and live entertainment. Entries ran the gamut from in-game performance analytics and strategy insights to halftime shows and live media.
The tool: Gemini 3, available across AI Studio, Vertex AI, and Antigravity.
Stream's Vision Agents was one of the event's featured partners, and we loved seeing more than a few teams put it to work.
What three of them made (in a single day) is worth sharing.
Why Vision Agents?
Handling video with AI is genuinely hard. Think real-time streams, fast-moving subjects, low-latency requirements… most frameworks weren't built for this.
Vision Agents was.
It's designed to make video a first-class input for AI applications, and across all three projects, developers cited the same things:
- Fast setup
- Great examples
- The ability to focus on the product instead of the infrastructure
The Builds
Each project tackled a different corner of the sports experience, but all three reached for Vision Agents to handle real-time video.
Let's take a look.
Super Analytics — Bharat Satya
Gemini 3 SuperHack Finalist & Stream Vision Agents Prize Winner
Bharat built Super Analytics, a real-time sports analytics platform that functions like a computational sideline assistant.
Using Google Gemini, Stream WebRTC, and Vision Agents, the platform takes live game footage and turns it into actionable intelligence with positional analytics, strategic insights, and visual decision matrices for coaches, teams, and analysts during live matches.
The core idea is that coaches need strategy answers at halftime, not after the game; teams need live positional analytics; and analysts need visual decision matrices to drive winning plays.
"I was looking for a way to handle video, specifically live video, with AI, and Vision Agents was the only option that fit my needs. It was pretty easy to set up the SDK with the examples provided. It made development very simple, and I was able to come up with an initial version quite fast.
I would definitely share this with other developers who want to use a video agent in their product development. Handling video with LLM and AI is tough, and Vision Agents is the best way to do that."
Personalized AI Commentator — Divya Saini
Stream Vision Agents Prize Runner-Up
Divya built a real-time AI sports commentator that narrates live streams differently for every viewer. It's a fun idea on the surface: the system adapts to your favorite team, your knowledge level, and your preferred style (and yes, it can roast the other team). But the problem it's solving is bigger than it might seem.
Sports is a global language, but commentary rarely is. With AI commentary, viewers can get real-time narration in their own language, at their own level, and even ask questions about what's happening on screen.
Plus, the fact that it was built in a hackathon window speaks to how much Vision Agents handles behind the scenes.
"I attended the workshop to use Vision Agents, and I saw that there were already strong example projects on GitHub, which would make it really easy to get started quickly. I wouldn't have to build everything from scratch. I could focus on experimenting and actually shipping something interesting.
Vision Agents handled a lot of the heavy lifting around real-time video understanding, which let me focus on the experience and logic of the product. It made something that would normally feel complex and infrastructure-heavy actually buildable within a hackathon timeframe."
Divya is already thinking ahead; future versions will automatically trigger commentary based on live stream events and support multiple voices for different commentary styles.
"The team is incredibly supportive when it comes to integrations, and the platform itself feels stable and production-ready. I genuinely had fun building on top of Stream Vision Agents. It lowers the barrier to turning ambitious ideas into working demos fast."
Scrym Vision — Gretchen Boria & Jorge Jimenez
Gretchen and Jorge built a real-time "Offensive Coordinator" dashboard, a tactical command center that lives and breathes with the game.
The system watches a live football video feed, detects formations in real time, and renders them on a digital tactical board that updates dynamically. Think of it as a digital twin of the field.
The dashboard is divided into three zones: a live commentary feed and win probability gauge on the left, a strategic play call card with success rates and matchup advantages in the center, and the live tactical board on the right, where player markers shift and jitter to mimic pre-snap adjustments.
What makes it technically interesting is the "Fast Eye, Slow Brain" architecture. Vision Agents acts as the orchestration layer, connecting specialized processors (in this case, Roboflow for object detection and Ultralytics YOLO for tracking) to handle the heavy lifting on every frame. Rather than building computer vision from scratch, Vision Agents lets developers plug in best-in-class models and coordinate them.
The "Slow Brain" side of the equation, a large language model, is then invoked only when meaningful events occur, such as a formation change.
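The event-gated pattern can be sketched in a few lines of plain Python. This is an illustrative sketch of the general technique, not the actual Vision Agents, Roboflow, or Gemini APIs; `detect` and `analyze` are hypothetical stand-ins for a fast per-frame model and a slow LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class FastEyeSlowBrain:
    """Event-gated pipeline: cheap detection every frame, costly analysis on change.

    `detect` stands in for a fast vision model (e.g. a YOLO-style detector);
    `analyze` stands in for a slow LLM call. Both are hypothetical stubs.
    """
    detect: Callable   # frame -> formation label; runs on every frame
    analyze: Callable  # formation label -> commentary; runs only on change
    last_formation: Optional[str] = None
    events: list = field(default_factory=list)

    def on_frame(self, frame):
        formation = self.detect(frame)         # fast eye: every frame
        if formation != self.last_formation:   # meaningful event detected
            self.last_formation = formation
            self.events.append(self.analyze(formation))  # slow brain: gated

# Toy run: five frames arrive, but only two formation changes occur,
# so the "slow brain" is invoked just twice.
calls = []
pipeline = FastEyeSlowBrain(
    detect=lambda frame: frame["formation"],
    analyze=lambda f: calls.append(f) or f"analysis of {f}",
)
for name in ["shotgun", "shotgun", "shotgun", "i-form", "i-form"]:
    pipeline.on_frame({"formation": name})
print(calls)  # → ['shotgun', 'i-form']
```

Gating the expensive call on a detected state change, rather than running it per frame, is what keeps latency and cost down while still reacting the moment something tactically meaningful happens.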
"We were immediately impressed by the object detection and visual reasoning capabilities that Vision Agents could afford via API, as it presented a way to analyze in-game activity in real time. The application for play-by-play sports analysis seemed novel, and we found it easy to draw value from the football use case."
That event-gated design prevents hallucinations and keeps inference efficient.
"We found a way to use it for the purpose of dynamically updating a canvas to position players during scrimmage, according to the Vision Agents output whenever a formation was detected. Because object recognition and real-time commentary are useful, we naturally used them to visualize field coordinates."
What These Projects Have in Common
It's worth stepping back for a moment.
Three independent teams, none coordinating with each other, all built real-time sports analytics tools.
All three cited the same friction point: real-time video AI is hard to work with, and Vision Agents removed that friction.
All three shipped working demos in a single day.
That's the point of a good developer framework. You shouldn't have to solve the infrastructure to get to the interesting part.
Build with Vision Agents
If you're a developer who works with video and wants to build something novel, Vision Agents is worth a look.
🔗 Check out the links below to get started:
