Rate Limits

LAST EDIT Jul 23 2024

Stream powers chat and activity feeds for a billion end users. The platform is built on Go, RocksDB, and Raft, and it scales well for some of the largest sites and apps. Even so, Stream does enforce rate limits to:

  • Prevent integration issues or abuse from impacting your application. Making many API requests also triggers client-side events, and excessive calls can easily degrade your app's performance.

  • Prevent integration issues or abuse from using more capacity than is provisioned for your plan. This protects Stream infrastructure.

  • Identify integration issues early, like opening more than one WebSocket for a single user, long before cost or scale impacts become a significant challenge. This is equally helpful if your application experiences abuse, such as bots generating users.

Stream rate limits are set higher than alternative chat providers in almost all cases and still far below the maximum capacity of our system.

Every Application has rate limits applied based on a combination of API endpoint and platform: these limits are set on a 1-minute time window. For example, reading a channel has a different limit than sending a message. Likewise, different platforms such as iOS, Android or your server-side infrastructure have independent counters for each API endpoint's rate limit.

For example, the default connect endpoint limit is 10,000 per minute. If you've built an integration on iOS and Android, and 6,000 users connect in under a minute from iOS while another 6,000 connect from Android, you will NOT encounter a rate limit. You would only hit the limit if more than 10,000 users connected from a single platform, say iOS, within one minute. Independent platforms have independent counters.

Types of Rate Limits


There are two kinds of rate limits for Chat:

  1. User Rate Limits: Apply to each user and platform combination and help prevent a single user from consuming your Application's rate limits.

  2. App Rate Limits: Apply per endpoint and platform combination across your entire application.

User Rate Limits


To avoid individual users consuming your entire quota, every single user is limited to at most 60 requests per minute (per API endpoint and platform). When the limit is exceeded, requests from that user and platform will be rejected.
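As a sketch of how this limit behaves, the guard below mirrors the per-user rule on the client side: it counts requests per (user, endpoint) pair in 1-minute windows and rejects anything beyond the limit. The class and its names are illustrative, not part of any Stream SDK.

```python
import time
from collections import defaultdict

# Default mirrors the documented per-user limit: 60 requests per minute
# for each (user, endpoint) pair on a given platform.
USER_LIMIT_PER_MINUTE = 60

class UserRateGuard:
    """Hypothetical client-side guard approximating the per-user limit."""

    def __init__(self, limit=USER_LIMIT_PER_MINUTE):
        self.limit = limit
        # key -> (window_start, count) for the current 1-minute window
        self.windows = defaultdict(lambda: (0.0, 0))

    def allow(self, user_id, endpoint, now=None):
        """Return True if a request for this user/endpoint fits in the window."""
        now = time.time() if now is None else now
        key = (user_id, endpoint)
        window_start, count = self.windows[key]
        if now - window_start >= 60:
            # A new 1-minute window starts with this request.
            self.windows[key] = (now, 1)
            return True
        if count < self.limit:
            self.windows[key] = (window_start, count + 1)
            return True
        return False
```

Each endpoint gets its own counter, so exhausting one endpoint's quota does not block a user's requests to a different endpoint.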

App Rate Limits


Stream supports four different platforms via our official SDKs:

  • Server: SDKs that execute on the server including Node, Python, Ruby, Go, C#, PHP, and Java.

  • Android: SDKs that execute on an Android device including Kotlin, Java, Flutter, and React Native for Android clients.

  • iOS: SDKs that execute on an iOS device including Swift, Flutter, and React Native for iOS clients.

  • Web: SDKs that execute in a browser including React, Angular, or vanilla JavaScript clients.

Rate limit quotas are not shared across platforms, so if a server-side script accidentally hits a rate limit, your mobile and web applications are unaffected. When a limit is hit, all calls from the same app, platform, and endpoint will result in an error with a 429 HTTP status code.

App rate limits are enforced both per minute and per second. The per-second limit is the per-minute limit divided by 30, which allows for short bursts.
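Since the per-second limit is derived from the per-minute one, the arithmetic is simple; for instance, a 10,000-per-minute limit allows bursts of roughly 333 requests per second. The docs don't say whether the result is rounded up or down, so this sketch floors it:

```python
# Per-second burst limit: the per-minute limit divided by 30.
# Rounding direction is an assumption (floored here).
def per_second_limit(per_minute_limit: int) -> int:
    return per_minute_limit // 30

per_second_limit(10_000)  # 333
```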

What To Do When You've Hit a Rate Limit


You should always review responses from Stream to watch for error conditions. If you receive a 429 status, your API request was rate limited and you will need to retry. We recommend implementing an exponential back-off retry mechanism.
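A minimal back-off helper might look like the following. It assumes only that the request function returns an object with a `status_code` attribute (as responses from the popular `requests` library do); the function names are illustrative, not a Stream SDK API.

```python
import random
import time

def request_with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` with exponential back-off while it is rate limited.

    `call` is any zero-argument function that performs the API request and
    returns a response object exposing `status_code`.
    """
    for attempt in range(max_retries):
        response = call()
        if response.status_code != 429:
            return response
        # Exponential back-off with full jitter: up to 0.5s, 1s, 2s, 4s, ...
        delay = base_delay * (2 ** attempt)
        time.sleep(random.uniform(0, delay))
    # One final attempt; the caller handles a persistent 429.
    return call()
```

Adding jitter (the `random.uniform` call) prevents many clients from retrying in lockstep after a shared rate-limit window resets.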

Here are a few things to keep in mind to avoid rate limits:

1. Slow down your scripts: This is the most common cause of rate limiting. You're running a cron job or script that makes many API calls in quick succession. Adding a small delay between API calls typically solves the issue.

2. Use batch endpoints: Batch update endpoints exist for many operations. So instead of doing 100 calls to update 1 user each, call the batch endpoint for updating many users.

3. (Re)render logic in your apps: Sometimes client-side logic such as infinite pagination goes wrong and ends up sending an endless stream of API calls. Watch out for these mistakes.

4. Query only when needed: Sometimes apps will call QueryChannels to see if a channel exists before creating it. This isn't needed; you can simply create the channel, since the endpoint behaves in an upsert fashion. For more information about querying channels, see this page.

5. Open only one websocket connection per user. Opening multiple WebSockets per user opens your application to a myriad of problems, including performance, billing, and unexpected behavior. See Instantiating the Client for more information.

6. For Livestream and Live Events, which can have significant messaging volume, even more best practices can be found here.

7. If rate limits are still a problem, Stream can set higher limits for certain pricing plans:

  • For Standard plans, Stream may raise rate limits in certain instances; an integration review is required to ensure your integration makes optimal use of the default rate limits before any increase is applied.

  • For Enterprise plans, Stream will review your architecture, and set higher rate limits for your production application.
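The pacing advice in tip 1 can be sketched as a short pause between successive calls in a bulk script. Here `update_user` is a placeholder for whatever call your script makes, not a Stream SDK method:

```python
import time

def run_paced(items, update_user, delay_seconds=0.1):
    """Process items one at a time with a small delay between API calls.

    At delay_seconds=0.1 this makes ~10 calls per second, comfortably
    below typical per-second app limits.
    """
    results = []
    for item in items:
        results.append(update_user(item))
        time.sleep(delay_seconds)
    return results
```

For large jobs, tip 2 (batch endpoints) is still the better fix: one batched call replaces many individual ones and consumes far less of your quota.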

Rate Limit Headers


API responses include headers that report the current rate limit state:

  • X-RateLimit-Limit: the total limit allowed for the resource requested (e.g. 5000)

  • X-RateLimit-Remaining: the remaining limit (e.g. 4999)

  • X-RateLimit-Reset: when the current limit will reset (Unix timestamp)
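Reading these values in a client could look like the sketch below. The `X-RateLimit-*` names are the conventional ones for these headers; confirm the exact names against your own API responses.

```python
def parse_rate_limit_headers(headers):
    """Extract rate-limit state from a response's headers dict.

    Assumes the conventional X-RateLimit-* header names.
    """
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "reset": int(headers["X-RateLimit-Reset"]),  # Unix timestamp
    }
```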

Inspecting rate limits


Stream offers the ability to inspect an App's current rate limit quotas and usage in your App's dashboard. Alternatively, you can retrieve the API limits for your application using the API directly.

The Get Rate Limits endpoint includes the 1-minute limit, the remaining quota and the timestamp of the window reset.

This is useful for error handling. You can call this endpoint whenever your application receives a 429 status to get the timestamp of the window reset, then retry the request at that time.
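The retry-at-reset strategy described here can be sketched as follows. `get_reset_ts` stands in for however you obtain the reset timestamp (from the rate-limit headers or the Get Rate Limits endpoint); it is a placeholder, not a real API.

```python
import time

def seconds_until_reset(reset_unix_ts, now=None):
    """Seconds to wait until the rate-limit window resets (never negative)."""
    now = time.time() if now is None else now
    return max(0.0, reset_unix_ts - now)

def retry_at_reset(call, get_reset_ts):
    """On a 429, wait until the window reset and retry once.

    `call` performs the request; `get_reset_ts` extracts the reset
    timestamp from the rate-limited response.
    """
    response = call()
    if response.status_code == 429:
        time.sleep(seconds_until_reset(get_reset_ts(response)))
        response = call()
    return response
```

Waiting for the exact reset time is more efficient than blind retries, since any retry inside the current window is guaranteed to fail with another 429.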