SFU Cascading

If you are building a scalable real-time communication app, cascading can help you handle large user bases across interconnected servers, ensuring smoother performance and an improved user experience.

What is SFU Cascading?

SFU cascading refers to the process of connecting multiple Selective Forwarding Units (SFUs) in a hierarchical manner to handle audio and video streams in a real-time communication system, such as a WebRTC-based video calling or live streaming application.

An SFU is a key component of many real-time communication systems. It receives audio and video streams from participants in a session and selectively forwards these streams to other participants based on their subscriptions. This way, each participant only receives the media streams of the other participants they are interested in, rather than receiving every stream from all participants.

When a session has a large number of participants or needs to be distributed across multiple regions, a single SFU may not be able to handle the load effectively. This is where SFU cascading comes into play: multiple SFUs are set up in a cascaded network, where each SFU handles a subset of participants or a geographical region.

Here's a simplified example of how it works:

  1. The first level SFUs (referred to as "root SFUs") are responsible for managing the entire session and handling a large number of participants.
  2. Each root SFU is connected to several second-level SFUs (referred to as "branch SFUs") that are responsible for handling a smaller subset of participants or a specific geographical region.
  3. Participants connect to the nearest or most appropriate branch SFU based on their location or other criteria.
  4. The branch SFUs forward the received media streams to the root SFUs, which, in turn, forward the streams to the appropriate branch SFUs and participants as needed.

This hierarchical setup helps distribute the load across multiple SFUs, reducing the processing and bandwidth requirements on each individual SFU and improving the overall scalability and performance of your app.
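
The forwarding steps above can be sketched in a few lines of Python. This is a minimal in-memory model, not a media server: the class names `RootSFU` and `BranchSFU`, the subscription sets, and the string "packets" are all hypothetical, and the sketch assumes a single root that relays only and hosts no participants itself.

```python
from collections import defaultdict


class RootSFU:
    """Hypothetical root: relays streams between branches."""

    def __init__(self):
        self.branches = []
        self.forwarded = []  # (branch name, publisher) pairs, for inspection

    def forward(self, source, publisher, packet):
        # Send one copy to every other branch that has a subscriber for it.
        for branch in self.branches:
            if branch is not source and branch.wants(publisher):
                self.forwarded.append((branch.name, publisher))
                branch.deliver(publisher, packet)


class BranchSFU:
    """Hypothetical branch: holds local participants and their subscriptions."""

    def __init__(self, name, root):
        self.name = name
        self.root = root
        self.participants = set()
        self.subscriptions = defaultdict(set)  # subscriber -> publishers
        self.delivered = []  # (subscriber, publisher, packet), for inspection
        root.branches.append(self)

    def join(self, participant, wants=()):
        self.participants.add(participant)
        self.subscriptions[participant].update(wants)

    def wants(self, publisher):
        return any(publisher in subs for subs in self.subscriptions.values())

    def deliver(self, publisher, packet):
        # Selective forwarding: only subscribers of this publisher get it.
        for subscriber in sorted(self.participants):
            if publisher in self.subscriptions[subscriber]:
                self.delivered.append((subscriber, publisher, packet))

    def publish(self, publisher, packet):
        self.deliver(publisher, packet)                # local subscribers
        self.root.forward(self, publisher, packet)     # one copy to the root


root = RootSFU()
eu, us, asia = BranchSFU("eu", root), BranchSFU("us", root), BranchSFU("asia", root)
eu.join("alice")
eu.join("bob", wants={"alice"})
us.join("carol", wants={"alice"})
asia.join("dave")  # subscribes to nobody
eu.publish("alice", "frame-1")
```

In this run, the root forwards Alice's stream to the "us" branch only, because no participant on the "asia" branch has subscribed to it.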

What are the pros and cons of cascading SFUs?


Pros

  1. Scalability: Spreads participants across several servers, so the system can support more users per session than a single SFU can.
  2. Geographic Distribution: Optimizes latency by allowing participants to connect to nearby servers, improving overall communication quality.
  3. Redundancy: Provides backup options and fault tolerance, reducing the risk of system failures.
  4. Load Balancing: Efficiently distributes media streams across multiple servers, preventing overload on any individual server.
  5. Lower Bandwidth Usage: Reduces overall bandwidth consumption by forwarding only relevant media streams to each participant.


Cons

  1. Complexity: Implementing and managing multiple interconnected servers can be more complex than using a single server.
  2. Higher Latency: Stream forwarding through multiple servers can introduce additional latency compared to direct peer-to-peer connections.
  3. Central Point of Failure: While redundancy is provided, the failure of a critical server, such as a root SFU, can still impact the entire system.
  4. Increased Server Resources: Setting up and maintaining multiple servers requires more infrastructure and resources.
  5. Potential Stream Quality Loss: Each additional hop can add packet loss and jitter, and any transcoding or processing in the cascading setup may slightly affect media stream quality.

What are some examples of apps that would cascade SFUs?

Some typical apps that would benefit from using cascading include:

  1. Video Conferencing Platforms: Apps that facilitate online meetings, webinars, and virtual conferences with multiple participants.
  2. WebRTC-based Communication Apps: Real-time communication applications that use WebRTC for audio and video streaming.
  3. Live Streaming Services: Platforms that enable live broadcasting of events, gaming, and interactive shows to a large audience.
  4. Virtual Events and Webinars: Apps that host virtual events, workshops, and webinars with numerous attendees.
  5. E-learning and Online Classroom Platforms: Platforms that facilitate live online classes and interactive educational sessions.
  6. Social Networking Apps with Video Chat: Social media or messaging apps that offer video chat functionality for users to connect face-to-face.
  7. Telemedicine and Virtual Healthcare Apps: Platforms that enable remote medical consultations and video-based healthcare services.
  8. Interactive Gaming Apps: Multiplayer gaming apps that require real-time communication between players during gameplay.
  9. Remote Team Collaboration Tools: Software that allows distributed teams to collaborate through video meetings and screen sharing.
  10. Online Broadcasting and Content Delivery Platforms: Platforms that distribute live content to a large audience, such as live sports events or music concerts.

These applications often experience fluctuating user engagement, which makes SFU cascading a good fit: it can absorb varying loads while keeping real-time communication smooth for all participants.

What technologies and infrastructure components should I use in conjunction with SFU cascading?

Here are some essential technologies you might want to consider:

  1. WebRTC: As cascading often goes hand-in-hand with WebRTC-based applications, WebRTC is a crucial technology for enabling real-time audio and video communication between participants. It provides the necessary APIs and protocols for establishing peer connections and exchanging media streams.
  2. Signaling Server: To facilitate WebRTC peer connections, you'll need a signaling server that helps participants discover each other, negotiate session details, and exchange signaling messages. WebRTC does not mandate a signaling protocol; common transports include WebSocket and HTTP.
  3. Media Server: SFU cascading requires media servers that act as the SFUs in the hierarchy. These media servers handle media streams from participants, perform selective forwarding, and manage the cascading network.
  4. NAT Traversal (STUN/TURN): For WebRTC to work across different network topologies, you may need STUN and TURN servers to assist with NAT traversal. STUN servers help discover public IP addresses, while TURN servers act as relays when direct peer-to-peer connections are not possible.
  5. Load Balancer: When deploying multiple media servers for SFU cascading, a load balancer becomes essential to evenly distribute incoming client connections across the SFUs and ensure efficient resource utilization.
  6. Distributed Database or Caching: If your app needs to handle a large number of participants and sessions, you might need a distributed database or caching mechanism to manage user state, session information, and metadata across the cascading network.
  7. Infrastructure Orchestration: With multiple media servers, you may opt for containerization technologies like Docker or orchestration platforms like Kubernetes to manage and scale the SFUs effectively.
  8. Security Components: Implement encryption mechanisms (e.g., SRTP, DTLS) to secure media streams, and consider implementing user authentication and authorization for secure access to the app.
  9. Monitoring and Logging: Set up monitoring tools and logging infrastructure to keep track of server performance, user experience, and troubleshoot any issues that arise.
  10. Content Delivery Network (CDN): If you have a large number of viewers for live streaming scenarios, consider leveraging a CDN to distribute media content efficiently and reduce server load.

Remember that the specific technologies you need will depend on your app's requirements, scalability goals, and the architecture you choose to implement.

Frequently Asked Questions

How does it differ from other real-time communication architectures like MCU or P2P?

Unlike an MCU, which decodes and mixes streams into a single combined stream, an SFU selectively forwards media streams to the relevant participants without mixing them, which reduces server processing. Unlike P2P, where participants send media directly to each other, SFU cascading relies on a network of interconnected servers to forward streams. You can read more about these differences in this article.
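
To make the comparison concrete, the sketch below counts the media streams one participant sends and receives under each architecture. It assumes a call where every participant publishes one stream and subscribes to everyone else, and it ignores simulcast layers; the function name `streams_per_participant` is made up for this example.

```python
def streams_per_participant(architecture, n):
    """Return (upstream, downstream) stream counts for one of n participants."""
    if n < 2:
        raise ValueError("a call needs at least two participants")
    if architecture == "p2p":
        # Full mesh: one copy sent to, and one received from, every other peer.
        return (n - 1, n - 1)
    if architecture == "sfu":
        # One upload to the server; the server forwards every other stream.
        return (1, n - 1)
    if architecture == "mcu":
        # One upload; the server mixes everything into one combined stream.
        return (1, 1)
    raise ValueError(f"unknown architecture: {architecture}")


for arch in ("p2p", "sfu", "mcu"):
    print(arch, streams_per_participant(arch, 10))
```

Cascading does not change these client-side counts. It changes how the server-side work is split across machines.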

Why would I need to use SFU cascading instead of a single SFU or MCU for my application?

SFU cascading is beneficial for large user bases and distributed participants. It improves scalability by distributing the load across multiple SFUs, allowing horizontal scaling. A single SFU can become a bottleneck in large applications, and cascading avoids this. Compared to an MCU, it reduces server processing because streams are forwarded rather than mixed. You can read more about these differences in this article.

How does SFU cascading improve scalability and handle a large number of participants?

Each SFU handles a subset of participants or a specific region, ensuring efficient resource utilization. The hierarchical setup allows the system to scale and accommodate a large number of users effectively.
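
A rough back-of-the-envelope calculation shows where the savings come from. The sketch below counts streams rather than bits, and it assumes the worst case where every participant publishes one stream and subscribes to all others, a single root that only relays, and a root that sends each remote stream once to each branch. All function names are hypothetical.

```python
def single_sfu_load(total):
    """Stream counts on one SFU hosting all participants."""
    return {"ingress": total, "egress": total * (total - 1)}


def branch_load(local, total):
    """Stream counts on a branch hosting `local` of `total` participants."""
    return {
        "client_ingress": local,
        "client_egress": local * (total - 1),
        "to_root": local,             # each local stream goes up once
        "from_root": total - local,   # each remote stream comes down once
    }


def root_load(branch_sizes):
    """Stream counts on a relay-only root connecting the given branches."""
    total = sum(branch_sizes)
    return {"ingress": total, "egress": sum(total - n for n in branch_sizes)}


print(single_sfu_load(100))        # egress: 9900
print(branch_load(25, 100))        # client_egress: 2475
print(root_load([25, 25, 25, 25])) # egress: 300
```

With 100 participants split evenly over four branches, each branch sends about a quarter of the streams that a single SFU would, at the cost of a comparatively small amount of inter-server traffic.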

How do I set up and manage multiple interconnected SFUs effectively?

Effective setup and management involve choosing an appropriate network topology, implementing efficient communication protocols and signaling, employing load balancing mechanisms, ensuring redundancy, and monitoring server performance.

What network topology should I consider for cascading SFUs, and how do I ensure optimal load distribution?

Common topologies include star, mesh, and hybrid configurations. To ensure optimal load distribution, consider participant distribution, geographical proximity, server capacity, and network conditions. Employ load balancers to evenly distribute incoming client connections across the SFUs.
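
One possible assignment policy is sketched below: prefer an SFU in the client's region, and among those pick the least utilized one that still has headroom. The `SFUNode` fields, the `pick_sfu` function, and the 80% utilization cap are illustrative assumptions, not part of any particular load balancer.

```python
from dataclasses import dataclass


@dataclass
class SFUNode:
    name: str
    region: str
    capacity: int   # maximum participants this node should host
    load: int = 0   # participants currently connected

    @property
    def utilization(self):
        return self.load / self.capacity


def pick_sfu(nodes, client_region, max_utilization=0.8):
    """Pick a node for a new client: same region first, then least utilized."""
    candidates = [n for n in nodes if n.utilization < max_utilization]
    if not candidates:
        raise RuntimeError("no SFU with spare capacity")
    # False sorts before True, so same-region nodes win the first key.
    return min(candidates, key=lambda n: (n.region != client_region, n.utilization))


nodes = [
    SFUNode("eu-1", "eu", capacity=100, load=85),  # over the cap, skipped
    SFUNode("eu-2", "eu", capacity=100, load=40),
    SFUNode("us-1", "us", capacity=100, load=10),
]
print(pick_sfu(nodes, "eu").name)    # eu-2
print(pick_sfu(nodes, "asia").name)  # us-1 (no local node, least utilized wins)
```

A real deployment would likely also weigh measured network latency and server health, which this sketch leaves out.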

How can I handle failover and ensure redundancy in a cascading setup?

Failover and redundancy can be achieved by implementing mechanisms to detect SFU failures, using distributed databases or caching for session state, maintaining backup SFUs, and proactively monitoring server health.
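
As a rough illustration, failure detection can be based on heartbeat timestamps, with affected participants moved to backup SFUs. The function names, the 5-second timeout, and the round-robin reassignment below are assumptions made for this sketch; the caller is expected to pass only healthy SFUs as backups.

```python
def find_failed(last_heartbeat, now, timeout=5.0):
    """Return names of SFUs whose last heartbeat is older than `timeout` seconds."""
    return sorted(name for name, ts in last_heartbeat.items() if now - ts > timeout)


def reassign(assignments, failed, backups):
    """Move participants on failed SFUs to the backups, round-robin.

    `assignments` maps participant -> SFU name. A new mapping is returned
    and the input is left unchanged.
    """
    if not backups:
        raise ValueError("no backup SFUs available")
    updated = dict(assignments)
    moved = 0
    for participant in sorted(assignments):
        if assignments[participant] in failed:
            updated[participant] = backups[moved % len(backups)]
            moved += 1
    return updated


last_heartbeat = {"sfu-a": 100.0, "sfu-b": 92.0, "sfu-c": 99.5}
failed = find_failed(last_heartbeat, now=101.0)
assignments = {"alice": "sfu-a", "bob": "sfu-b", "carol": "sfu-b"}
print(failed)
print(reassign(assignments, failed, backups=["sfu-c", "sfu-a"]))
```

After reassignment, each moved client still has to reconnect and renegotiate its media session with the new SFU, which is why session state is usually kept in a shared store rather than on the SFU itself.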

Are there any specific security considerations when using cascading SFUs to handle media streams?

Security considerations include implementing secure communication with encryption, employing authentication and authorization mechanisms, securing signaling protocols, and keeping server software updated to address vulnerabilities.
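
As one example of authentication and authorization, a backend can issue a signed, expiring join token that every SFU in the cascade verifies before admitting a participant. The sketch below uses an HMAC from Python's standard library. The token layout and function names are hypothetical, the sketch assumes user and call IDs never contain the `|` separator, and production systems often use an established format such as JWT instead.

```python
import hashlib
import hmac


def sign_join_token(secret, user_id, call_id, expires_at):
    """Create a token of the form user|call|expiry|signature."""
    payload = f"{user_id}|{call_id}|{expires_at}"
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def verify_join_token(secret, token, now):
    """Return (user_id, call_id) if the token is valid and unexpired, else None."""
    try:
        user_id, call_id, expires_at, signature = token.split("|")
        expiry = int(expires_at)
    except ValueError:
        return None  # wrong number of fields or non-numeric expiry
    payload = f"{user_id}|{call_id}|{expires_at}"
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    if now >= expiry:
        return None
    return (user_id, call_id)


secret = b"shared-secret-for-demo-only"
token = sign_join_token(secret, "alice", "call-42", expires_at=2000)
print(verify_join_token(secret, token, now=1000))
```

Because verification needs only the shared secret, any SFU in the cascade can check a token without calling back to a central service.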

How can I monitor the performance and health of cascading SFUs?

Set up monitoring tools to track SFU performance, server load, network metrics, and participant experience. Monitor KPIs such as latency, packet loss, and server utilization. Use logging and analytics for troubleshooting.
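
A simple health check over these KPIs might look like the following. The sample field names and the thresholds (200 ms round-trip time, 2% packet loss, 85% CPU) are illustrative assumptions and should be tuned for your own deployment.

```python
def packet_loss_ratio(sent, received):
    """Fraction of packets lost, clamped to zero if counters look inconsistent."""
    if sent <= 0:
        return 0.0
    return max(0.0, (sent - received) / sent)


def check_sfu(sample, max_rtt_ms=200, max_loss=0.02, max_cpu=0.85):
    """Return the list of KPIs in `sample` that exceed their thresholds."""
    alerts = []
    loss = packet_loss_ratio(sample["packets_sent"], sample["packets_received"])
    if sample["rtt_ms"] > max_rtt_ms:
        alerts.append("latency")
    if loss > max_loss:
        alerts.append("packet_loss")
    if sample["cpu"] > max_cpu:
        alerts.append("cpu")
    return alerts


sample = {"rtt_ms": 250, "packets_sent": 1000, "packets_received": 950, "cpu": 0.6}
print(check_sfu(sample))  # ['latency', 'packet_loss']
```

In a cascaded setup it helps to run this kind of check per link as well as per server, since a degraded inter-SFU link affects every participant whose streams cross it.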

Are there any specific standards or protocols to follow when implementing SFU cascading?

SFU cascading is an architectural approach rather than a standard, so the implementation depends on underlying technologies and protocols such as WebRTC, WebSocket, HTTP, STUN, and TURN. Complying with the WebRTC standards is essential for interoperability.

Next Steps

Start by opening an account and trying out our products. We’re here to help you find the best solution for your use case. Contact us any time to learn more about Stream.
