New Stream Chat Edge Infrastructure Reduces Latency by up to 5x

Stream is proud to announce that our new Edge API Infrastructure is now available to all Stream Chat customers. This represents a major performance improvement for platforms built with Stream that serve users on multiple continents. Unlike other messaging services hosted on traditional cloud infrastructure, Stream Chat's performance no longer relies on users’ proximity to a regional data center, reducing latency by up to five times and eliminating timeout errors caused by poor WiFi or mobile connectivity. As the first chat API and SDK provider to offer this technology, Stream is uniquely equipped to power in-app messaging for the world’s largest enterprise organizations.

For existing customers and anyone evaluating with a Stream Chat free trial, switching to this new edge API infrastructure does not require any complex change. All you have to do is change the base URL of your API endpoint or SDK to the new URL: https://chat.stream-io-api.com. API calls using the old URL and the new edge URL will still connect to the same infrastructure and still access the same channels and messages. We invite you to make the switch and take advantage of improved performance today.

As part of our commitment to provide product and engineering teams with the highest-performing chat development tools in the world, we’ve decided to make this infrastructure upgrade available for all Stream Chat plans at no extra charge.

Let’s take a closer look at some of the architectural challenges this new approach solves, how our team approached the transition process, and why we believe edge computing is the future of cloud infrastructure.

Challenges With Traditional Cloud Infrastructure

The cloud and SaaS revolutions changed how we think about developing and delivering apps to users, paving the way for faster iteration and massive improvements in performance and overall functionality. Users expect seamless, instantaneous responses from their apps, but in certain situations, geographic distance and the limitations of physical infrastructure can cause perceptible latency. The problem boils down to a couple of factors:

  • Data cannot cross the internet instantly, and the further it needs to travel, the higher the latency.
  • TLS, CORS, and TCP over poor connections compound and create a multiplication problem.

Connecting Users Across Multiple Continents

With a traditional approach to cloud infrastructure, you choose the regional data center for your API servers that is closest to most of your users. But if even a few of those users are located on a different continent, the need to communicate with this single location can add hundreds of milliseconds to load times for those users. Some Stream Chat customers, for example, have end users spread across the US, the EU, and Australia, making it difficult to select a single infrastructure region suitable for all users.

Balancing Security and Performance

TLS encryption is critical to ensure security and privacy for end users and to achieve regulatory compliance within a number of different industries. For these reasons, all Stream Chat traffic is encrypted with TLS, and while this is great for security, it has a significant impact on latency. Here’s an overview of some of the inherent latency challenges:

  • The first request needs to perform a TLS handshake; this requires four full round trips. If base latency is 100ms, that adds up to 400ms before any application data is exchanged.
  • CORS – Before your browser makes an API request, a preflight request is made to ensure the call is authorized. This adds another full round trip to the latency of the first API call. (This is a web-only issue.)
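To make the arithmetic above concrete, here is a minimal sketch that counts round trips for a first API call. The function name and parameters are illustrative assumptions and not part of any Stream SDK; the four-round-trip handshake and the 100ms base latency come from the bullets above.

```python
def first_request_latency_ms(base_rtt_ms, handshake_round_trips=4, cors_preflight=False):
    """Estimate the time until the first API response arrives.

    Counts the handshake round trips, an optional CORS preflight round trip
    (web only), and one round trip for the API request itself.
    """
    round_trips = handshake_round_trips + (1 if cors_preflight else 0) + 1
    return round_trips * base_rtt_ms


# Native mobile client: handshake plus the request itself.
mobile_ms = first_request_latency_ms(100)

# Browser client: the preflight adds one more round trip.
web_ms = first_request_latency_ms(100, cors_preflight=True)

print(mobile_ms, web_ms)
```

Every term scales with the base round-trip time, which is why shortening the distance to the server pays off several times over on the first request.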

Mitigating Connectivity Issues

Even today, millions of end users don’t have access to consistently reliable high-speed internet, whether via wired broadband or mobile networks. Connectivity can vary based on socioeconomic factors, urban/rural divides, and even minute-to-minute location changes depending on mobile network coverage.

The best apps are developed to accommodate intermittent connectivity issues with minimal effect on the user experience. With in-app chat, this often means support for offline messaging, allowing users to send messages and reactions and even create channels while they’re offline. When the user comes back online, the client library automatically recovers lost events and retries sending messages.

But even with offline messaging and other provisions like optimistic UI updates, the internet’s Transmission Control Protocol (TCP) can create a latency multiplication problem when a user connects to the chat API over a poor WiFi connection or a spotty mobile connection. When a packet is dropped, TCP must retransmit it. Depending on the error rate, this can quickly increase latency to the point that the API client times out, creating an unacceptable user experience.
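The multiplication effect can be illustrated with a deliberately simplified model. The sketch below assumes each attempt is lost independently with a fixed probability and that every loss costs one retransmission timeout proportional to the round-trip time. Real TCP stacks use minimum timeouts, exponential backoff, and fast retransmit, so treat the function, its name, and the `rto_factor` parameter as assumptions for illustration only.

```python
def expected_exchange_ms(rtt_ms, loss_rate, rto_factor=2.0):
    """Expected time for one request/response exchange under packet loss.

    Simplified model: the number of lost attempts before the first success
    follows a geometric distribution with mean loss_rate / (1 - loss_rate).
    Each lost attempt costs one timeout of rto_factor * rtt_ms, and the
    successful attempt costs one round trip.
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    expected_losses = loss_rate / (1 - loss_rate)
    return rtt_ms + expected_losses * rto_factor * rtt_ms


# Same 20% loss rate, two different distances (both RTT values are made up).
far_ms = expected_exchange_ms(rtt_ms=300, loss_rate=0.2)   # roughly 450 ms
near_ms = expected_exchange_ms(rtt_ms=30, loss_rate=0.2)   # roughly 45 ms
print(far_ms, near_ms)
```

In this model the penalty for each dropped packet is proportional to the round-trip time, so the same poor connection hurts far less when retransmissions only have to reach a nearby server.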

The Edge Solution Explained

With the above challenges in mind, it’s clear that an alternative to the traditional region-based approach is necessary to ensure consistent high performance and low latency for users around the globe. Our solution was to create a network of edge servers and use a combination of DNS and BGP to route user traffic to the nearest edge.

In this setup, the TLS handshake, preflight requests, and TCP retransmissions all happen between the client and an edge to ensure low latency, instead of traversing a longer distance across more networks.

A common hack to achieve similar results is to encrypt traffic between edges and clients and transmit plain text traffic between edges. While this approach brings significant latency improvements, it also opens up substantial security problems. Our solution relies on fully encrypted edge-to-edge channels, and we use persistent multiplexed HTTP/2 connections to achieve this without incurring latency penalties. Our mobile and web SDK libraries support HTTP/2 connections out of the box and use these connections without requiring any configuration change.
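As a rough sketch of why terminating the handshake at the edge helps, the model below compares a direct connection with an edge connection. It assumes the setup round trips are paid only on the short client-to-edge hop and that the request then travels over an already-open persistent connection behind the edge. The function names and all latency values are hypothetical and are not measurements from Stream's network.

```python
def direct_first_request_ms(client_origin_rtt_ms, setup_round_trips=4):
    """Setup round trips plus the request, all over the long path."""
    return (setup_round_trips + 1) * client_origin_rtt_ms


def edge_first_request_ms(client_edge_rtt_ms, upstream_rtt_ms, setup_round_trips=4):
    """Setup round trips over the short hop; only the request goes upstream.

    The connection behind the edge is assumed to be persistent, so it adds
    no handshake cost of its own.
    """
    setup = setup_round_trips * client_edge_rtt_ms
    request = client_edge_rtt_ms + upstream_rtt_ms
    return setup + request


# Hypothetical numbers: 100 ms to the origin directly, or 10 ms to a nearby
# edge plus 95 ms from that edge to the origin.
print(direct_first_request_ms(100))     # 500
print(edge_first_request_ms(10, 95))    # 145
```

Note that the edge path can be slightly longer in total distance and still win, because only one round trip has to cover the long leg.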

GDPR Compliance with Stream Edge Infrastructure

Many organizations that use Stream Chat must demonstrate that their software products are GDPR compliant. These organizations may have questions about how data storage works given the international distribution of the new edge infrastructure. In order to meet the requirements of GDPR and similar regulatory frameworks, data is still only stored in one region. We’re committed to ensuring that the Stream Chat API and SDKs always come with the tools you need to integrate fully compliant messaging functionality.

Performance Improvements by the Numbers

Initial testing of Stream’s new edge infrastructure demonstrates remarkable performance improvements compared to traditional region-based infrastructure, with users located the furthest from a given regional data center benefitting the most. The examples below indicate latency reductions between 2x and 5x. Actual round-trip ping times will vary based on users’ individual circumstances, but in general apps with users on multiple continents should see impressive results.

Connect user latency from Amsterdam to US East

                  best     median   p95       p99
Chat Edge         165ms    170ms    327ms     359ms
Regional Proxy    356ms    362ms    444.5ms   1155ms

Connect user latency from Amsterdam to Mumbai

                  best     median   p95       p99
Chat Edge         186ms    193ms    203ms     219ms
Regional Proxy    502ms    515ms    526ms     2197ms

Connect user latency from Amsterdam to Singapore

                  best     median   p95       p99
Chat Edge         228ms    232ms    241ms     255ms
Regional Proxy    660ms    678ms    693ms     3112ms
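The three tables above can be reduced to speedup ratios with a few lines of code. This sketch only divides the Regional Proxy figure by the Chat Edge figure for each column; the dictionary layout and function name are our own choices for illustration.

```python
COLUMNS = ("best", "median", "p95", "p99")

# Connect-user latency in milliseconds, copied from the tables above.
MEASUREMENTS = {
    "US East":   {"edge": (165, 170, 327, 359), "regional": (356, 362, 444.5, 1155)},
    "Mumbai":    {"edge": (186, 193, 203, 219), "regional": (502, 515, 526, 2197)},
    "Singapore": {"edge": (228, 232, 241, 255), "regional": (660, 678, 693, 3112)},
}


def speedups(edge, regional):
    """Return the regional/edge latency ratio for each column."""
    return {name: r / e for name, e, r in zip(COLUMNS, edge, regional)}


for destination, row in MEASUREMENTS.items():
    ratios = speedups(row["edge"], row["regional"])
    summary = ", ".join(f"{name} {value:.1f}x" for name, value in ratios.items())
    print(f"Amsterdam to {destination}: {summary}")
```

Looking at the ratio per percentile, rather than a single average, shows how the improvement differs between typical requests and the slowest ones.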

Easily Migrate to Stream’s New Edge Infrastructure

We’re excited for all new and existing Stream Chat customers to benefit from the increased performance and reliability afforded by this infrastructure overhaul, and we invite you to migrate to the new edge servers at your earliest convenience.

You can easily switch to our new edge infrastructure today by changing the base URL of your API endpoint or SDK. The new URL is https://chat.stream-io-api.com. Clients using the old URL and the new edge URL will still connect to the same infrastructure and still access the same channels and messages.
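The exact setting differs per SDK, so the following is only a generic sketch of what the change amounts to. `ChatClientConfig`, `migrate_to_edge`, `endpoint`, the placeholder old URL, and the `channels` path are all hypothetical names used for illustration and are not Stream SDK APIs; only the edge URL itself comes from this announcement. Check your SDK's documentation for the actual base URL option.

```python
from dataclasses import dataclass, replace

# The new edge base URL, as given in this announcement.
EDGE_BASE_URL = "https://chat.stream-io-api.com"


@dataclass(frozen=True)
class ChatClientConfig:
    """Hypothetical client configuration; not a real Stream SDK type."""
    api_key: str
    base_url: str


def migrate_to_edge(config):
    """Return a copy of the config that points at the edge URL.

    Only the base URL changes; the API key stays the same.
    """
    return replace(config, base_url=EDGE_BASE_URL)


def endpoint(config, path):
    """Join the base URL and a request path with exactly one slash."""
    return config.base_url.rstrip("/") + "/" + path.lstrip("/")


# Placeholder values for illustration only.
old = ChatClientConfig(api_key="YOUR_API_KEY", base_url="https://old-base-url.example")
new = migrate_to_edge(old)
print(endpoint(new, "/channels"))
```

Keeping the base URL in a single configuration value means the migration is a one-line change that is just as easy to roll back.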

Coming Up Next

This is just our first step in building a globally available edge API service. User experience and latency are inversely related, and we plan to reduce latency even further with the following initiatives:

  • Increase the number of edge servers
  • Implement more intelligent routing by moving more logic to the DNS layer
  • Handle connect and channel API endpoints entirely at the edge