Architecture & Benchmark
Architecture Overview
The chat API scales to apps with hundreds of millions of users. A few key points about the architecture:
- Edge network: a global network of edge servers keeps connections close to users around the world
- Offline storage & optimistic UI updates: the SDKs persist data locally and update the UI optimistically, so the app stays responsive even on poor connections
- Excellent performance: scales to 5M users online in a single channel, with <40ms latency
- Highly redundant infrastructure
On the SDK side we provide both a low-level client and UI components, giving you the flexibility to build any type of chat or messaging UI.
The tech behind the Stream chat API
Stream uses Go as the backend language for chat.
WebSockets
To reach 5M concurrent online users in a single channel, we have several optimizations in place. First, we use cuckoo filters as a fast, memory-efficient way to check whether a user might be online on a given server. We also use skip lists and smart locking to prevent slowdowns under high traffic.
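A cuckoo filter answers "possibly present" or "definitely absent" in constant time with a small memory footprint, which is why it works well as a first-pass presence check. The sketch below is an illustrative minimal cuckoo filter in Go, not Stream's implementation; the bucket count, 8-bit fingerprints, and FNV hashing are assumed parameters chosen for brevity:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

const (
	numBuckets = 1 << 10 // power of two so indices can be masked
	bucketSize = 4       // fingerprints per bucket
	maxKicks   = 500     // eviction attempts before giving up
)

// Filter is a minimal cuckoo filter: each item has two candidate
// buckets (partial-key cuckoo hashing) holding short fingerprints.
type Filter struct {
	buckets [numBuckets][bucketSize]byte
}

func fingerprint(item string) byte {
	h := fnv.New32a()
	h.Write([]byte(item))
	fp := byte(h.Sum32())
	if fp == 0 { // 0 marks an empty slot
		fp = 1
	}
	return fp
}

func hashIndex(item string) uint32 {
	h := fnv.New64a()
	h.Write([]byte(item))
	return uint32(h.Sum64()) & (numBuckets - 1)
}

// altIndex derives the second bucket from the first bucket and the
// fingerprint alone, so lookups never need the original key.
func altIndex(i uint32, fp byte) uint32 {
	h := fnv.New32a()
	h.Write([]byte{fp})
	return (i ^ h.Sum32()) & (numBuckets - 1)
}

func (f *Filter) insertAt(i uint32, fp byte) bool {
	for s := 0; s < bucketSize; s++ {
		if f.buckets[i][s] == 0 {
			f.buckets[i][s] = fp
			return true
		}
	}
	return false
}

func (f *Filter) Insert(item string) bool {
	fp := fingerprint(item)
	i1 := hashIndex(item)
	i2 := altIndex(i1, fp)
	if f.insertAt(i1, fp) || f.insertAt(i2, fp) {
		return true
	}
	// Both buckets full: evict a random resident and relocate it.
	i := i1
	for k := 0; k < maxKicks; k++ {
		s := rand.Intn(bucketSize)
		fp, f.buckets[i][s] = f.buckets[i][s], fp
		i = altIndex(i, fp)
		if f.insertAt(i, fp) {
			return true
		}
	}
	return false // filter is too full
}

func (f *Filter) Lookup(item string) bool {
	fp := fingerprint(item)
	i1 := hashIndex(item)
	i2 := altIndex(i1, fp)
	for s := 0; s < bucketSize; s++ {
		if f.buckets[i1][s] == fp || f.buckets[i2][s] == fp {
			return true
		}
	}
	return false
}

func main() {
	f := &Filter{}
	f.Insert("user:42")
	fmt.Println(f.Lookup("user:42")) // true
	fmt.Println(f.Lookup("user:99")) // false, barring a rare false positive
}
```

The key property for presence checks is the one-sided error: a negative answer is definitive, so only positive answers need a (more expensive) authoritative check.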
Next, we use the writev syscall to improve WebSocket memory usage and network throughput. In addition, we make extensive use of sync.Pool to reduce pressure on the Go garbage collector.
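To illustrate the sync.Pool technique (a sketch under assumed names, not Stream's actual code): pooling buffers means each websocket write reuses memory instead of allocating fresh, so the garbage collector sees far fewer short-lived objects. The `encodeFrame` helper and its JSON shape below are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles byte buffers across writes so the garbage
// collector sees far fewer short-lived allocations.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeFrame borrows a buffer from the pool, builds a payload,
// and returns the bytes plus a release func that recycles the buffer.
func encodeFrame(msg string) ([]byte, func()) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old data
	buf.WriteString(`{"text":"`)
	buf.WriteString(msg)
	buf.WriteString(`"}`)
	return buf.Bytes(), func() { bufPool.Put(buf) }
}

func main() {
	frame, release := encodeFrame("hello")
	fmt.Println(string(frame)) // {"text":"hello"}
	release()                  // return the buffer for reuse
}
```

On the writev side, Go exposes vectored writes through the standard library: writing a `net.Buffers` slice to a TCP connection is batched into an OS-level operation such as writev where the platform supports it, so multiple frames go out in one syscall.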
API performance
The API typically responds in under 50ms. This is achieved through a combination of denormalization, caching, and Redis client-side caching. Where it makes sense, we use Redis with Lua scripting. The design of the caching libraries prevents most thundering-herd problems.
Members & Limits
There are no limits on the number of concurrent users, the number of members, the number of messages or the number of channels.
Infra & testing
There is a large Go integration test suite, a set of smoke tests running in production, and a JavaScript QA suite. Issues can occasionally slip through, but in general this approach is very effective at catching problems before they reach users.
The infra runs on AWS and is highly available.
Benchmark
The benchmark results, scaling up to 5M concurrent users, can be found here: https://getstream.io/blog/scaling-chat-5-million-concurrent-connections/