
When To Choose Long Polling vs WebSockets for Real-Time Feeds


Choosing between long polling and WebSockets is less about “real-time” and more about tradeoffs.

Raymond F
Published February 19, 2026

Real-time feeds are practically table stakes for modern applications. Users expect instant messaging, activity streams, and collaboration throughout a product. Product managers see competitors shipping these features and add them to the roadmap. Developers reach for WebSockets or start polling an endpoint.

But real-time infrastructure that works in development often fails in production. Connections drop silently. Servers run out of file descriptors at 10,000 users. Load balancers terminate idle connections. Mobile clients drain batteries. A feature that felt simple to prototype becomes an operational burden that pages your team at 3 am.

The difference between a real-time feed that scales and one that doesn't comes down to understanding the tradeoffs before you build. This guide gives developers an intuitive understanding of the two main approaches, long polling and WebSockets, with the depth to make the right choice for your use case.

What Is a Real-Time Feed?

A real-time feed delivers a continuously updating stream of events from your backend to clients with low perceived delay. The goal is to keep the UI in sync without requiring users to manually refresh.

You constantly interact with real-time feeds, even if you don't think of them that way (for instance, Slack messages appearing as teammates type). When someone sends a message in a channel, it appears in your sidebar and message list within milliseconds, and the typing indicator ("Sarah is typing...") updates in real-time.

GitHub's activity feed is another example that updates as your team ships code. Open your organization's feed and watch as pull requests are opened, commits are pushed, and issues are commented on. Each event includes the actor, timestamp, and link to the relevant resource.

GitHub activity feed

Real-time feeds don’t always have to be text-based. Figma's multiplayer cursors track where your teammates are working. Open a shared design file and see colored cursors with names attached, moving around the canvas as collaborators select objects, draw shapes, and type text. Changes appear instantly as others make them.

Figma's multiplayer cursors track where teammates are working

These examples share common infrastructure concerns that go beyond just "push bytes fast":

  • Event identity and ordering. Each message, commit, location ping, or cursor position needs an ID. Slack messages have timestamps and channel context. GitHub events have IDs and timestamps. Without ordering, clients can't detect gaps or duplicates.
  • Resume semantics. What happens after a disconnect? Slack replays missed messages from your last-seen cursor. GitHub refreshes your feed from the server. Figma syncs the document state and replays recent operations. The reconnection strategy shapes both UX and server architecture.
  • Fanout. One event often reaches many recipients. A Slack message in a 500-person channel fans out to 500 clients. A GitHub push notification reaches everyone watching the repo. This multiplies your server load and shapes your pub/sub architecture.
  • Backpressure. What if a client can't keep up? A slow mobile connection might lag behind a fast-moving Slack channel. A Figma canvas with 20 active collaborators generates a firehose of cursor positions. Your system needs strategies for dropping, coalescing, or queuing events when delivery outpaces consumption.

Delivering these feeds to clients comes down to two main approaches: long polling and WebSockets.

Both can power real-time experiences, but they make different trade-offs in terms of latency, infrastructure complexity, and operational overhead. The right choice depends on your update frequency, the bidirectionality of your communication needs, and what your infrastructure can support.

What Is Long Polling?

Long polling is an HTTP technique that simulates server push using standard request/response cycles. The client sends a request, and instead of responding immediately, the server holds the connection open until it has something to deliver. Once the server responds (either with data or after a timeout), the client immediately sends another request. This keeps one request pending at all times, ready for the server to respond the moment an event occurs.

The key insight is that HTTP doesn't require the server to respond immediately. A request can stay open for seconds or even minutes. Long polling exploits this by treating the pending request as a "mailbox" that the server can drop messages into.

Here's the basic flow:

  1. Client sends GET /poll?cursor=123
  2. Server checks if any events exist after cursor 123
  3. If yes, respond immediately with the events
  4. If no: hold the connection open and wait
  5. When an event arrives (or a timeout hits): respond
  6. Client receives response, immediately sends GET /poll?cursor=456
  7. Repeat

Here's a demo server that implements this pattern:

javascript
function handleLongPolling(req, res, query) {
  const lastSeenId = parseInt(query.lastId) || 0;
  const timeout = parseInt(query.timeout) || 30000;

  // Check if there are already new messages
  const newMessages = feedState.messages.filter(m => m.id > lastSeenId);

  if (newMessages.length > 0) {
    // Immediate response - data already available
    sendJsonResponse(res, 200, {
      type: 'update',
      messages: newMessages,
      lastId: feedState.lastId
    });
    return;
  }

  // No new data - hold the connection open
  const client = {
    res,
    lastSeenId,
    timestamp: Date.now()
  };

  client.timeoutId = setTimeout(() => {
    longPollingClients.delete(client);
    sendJsonResponse(res, 200, {
      type: 'timeout',
      messages: [],
      lastId: feedState.lastId
    });
  }, timeout);

  longPollingClients.add(client);
}

The function first checks whether events already exist beyond the client's cursor. If so, it responds immediately. There's no reason to wait when data is already available.

If no new events exist, the server stores the response object (res) in a set of waiting clients. The response isn't sent yet, so the HTTP connection stays open. The client is literally waiting on the other end for bytes to come back.

The setTimeout ensures the server eventually responds even if no events arrive. Without this, the connection would hang indefinitely, and intermediary proxies or load balancers would eventually kill it with a 504 Gateway Timeout. Thirty seconds is a common choice because it's short enough to avoid most infrastructure timeouts while long enough to reduce unnecessary round-trips.

When a new event does arrive, the server notifies all waiting clients:

javascript
function notifyLongPollingClients(message) {
  longPollingClients.forEach(client => {
    if (message.id > client.lastSeenId) {
      clearTimeout(client.timeoutId);
      longPollingClients.delete(client);

      const latency = Date.now() - client.timestamp;
      sendJsonResponse(client.res, 200, {
        type: 'update',
        messages: [message],
        lastId: feedState.lastId,
        latency: `${latency}ms`
      });
    }
  });
}

This iterates over all waiting clients, cancels their timeouts, removes them from the waiting set, and sends the response. The client receives the event and immediately issues a new long poll request, restarting the cycle.

On the client side, the loop looks like this:

javascript
async poll() {
  const startTime = Date.now();
  const response = await fetch(`/poll?lastId=${this.lastId}&timeout=25000`);
  const data = await response.json();

  if (data.type === 'update' && data.messages) {
    data.messages.forEach(msg => {
      this.lastId = Math.max(this.lastId, msg.id);
      this.addMessageToFeed(msg);
    });
  }

  this.lastId = data.lastId || this.lastId;

  // Immediately start next poll
  this.poll();
}

The recursive call to this.poll() at the end is what keeps the cycle going. Each response triggers a new request, so there's always one pending. The client-side timeout (25 seconds) is slightly shorter than the server's (30 seconds) to account for network latency and ensure the client times out before the server does.

The cursor (lastId) is critical. It tells the server where the client left off, enabling replay of missed events after disconnects. Without a cursor, the client would either miss events or receive duplicates.

What Are WebSockets?

WebSockets provide a persistent, bidirectional communication channel between client and server. Unlike long polling's request/response cycle, a WebSocket connection stays open indefinitely, and either side can send messages at any time without waiting for the other.

The protocol starts as HTTP. The client sends a regular HTTP request with special headers asking to "upgrade" the connection:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

If the server supports WebSockets, it responds with a 101 status code, and the connection switches from HTTP to the WebSocket protocol. From that point on, both sides communicate using WebSocket frames rather than HTTP requests.
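
That 101 response looks like this (the Sec-WebSocket-Accept value is derived from the client's key; this one corresponds to the RFC 6455 sample key shown above):

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=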

The key differences from HTTP:

  • No request/response pairing. Either side can send a message whenever it wants. The server doesn't need to wait for a client request to push data.
  • Persistent connection. The TCP connection stays open for the lifetime of the session, which could be minutes or hours.
  • Minimal framing overhead. After the handshake, messages are wrapped in small frames (as little as 2-6 bytes of overhead) rather than full HTTP headers.
  • Built-in message types. The protocol defines text frames, binary frames, ping/pong for keepalives, and close frames for clean shutdown.

Here's how the demo server sets up WebSocket handling:

javascript
function setupWebSocketServer(server) {
  const wss = new WebSocketServer({ server });

  wss.on('connection', (ws, req) => {
    wsClients.add(ws);

    // Send current state on connect
    ws.send(JSON.stringify({
      type: 'init',
      messages: feedState.messages.slice(-10),
      lastId: feedState.lastId
    }));

    ws.on('message', (data) => {
      const parsed = JSON.parse(data);
      if (parsed.type === 'ping') {
        ws.send(JSON.stringify({ type: 'pong', timestamp: Date.now() }));
      }
    });

    ws.on('close', () => wsClients.delete(ws));
  });
}

When a client connects, the server adds the socket to a set of active clients and immediately sends the last 10 messages. This "init" payload provides the client with the current state without requiring a separate request. The connection then stays open, listening for incoming messages and ready to send outgoing ones.

The ws.on('message') handler demonstrates bidirectional communication. The client can send messages to the server at any time (here, a ping for keepalive), and the server can respond. This same channel could handle chat messages, typing indicators, or any other client-to-server communication.

Broadcasting to all connected clients is straightforward:

javascript
function broadcastToWebSockets(message) {
  const payload = JSON.stringify({
    type: 'update',
    message,
    lastId: feedState.lastId,
    timestamp: Date.now()
  });

  wsClients.forEach(ws => {
    if (ws.readyState === 1) { // 1 = OPEN
      ws.send(payload);
    }
  });
}

The server iterates over all connected clients and sends the payload to each. The readyState check ensures we only send to connections that are actually open (clients might be in the middle of a disconnect).

Notice there's no waiting, no holding connections, no timeouts to manage. When an event happens, the server pushes it immediately. The client receives it through its persistent connection without making a new request.

On the client side, the WebSocket API is event-driven:

javascript
connect() {
  this.ws = new WebSocket(`ws://${window.location.host}`);

  this.ws.onopen = () => {
    console.log('Connected');
  };

  this.ws.onmessage = (event) => {
    const data = JSON.parse(event.data);

    if (data.type === 'init') {
      // Initial state sync
      data.messages.forEach(msg => this.addMessageToFeed(msg));
    } else if (data.type === 'update') {
      // Real-time update
      this.addMessageToFeed(data.message);
    }
  };

  this.ws.onclose = () => {
    // Reconnect with exponential backoff
    const delay = Math.min(1000 * Math.pow(2, this.reconnectAttempts), 30000);
    this.reconnectAttempts++;
    setTimeout(() => this.connect(), delay);
  };
}

The client opens a connection once and then reacts to incoming messages via the onmessage handler. There's no polling loop, no repeated requests. Messages just arrive in real-time when the server sends them.

The onclose handler is important. WebSocket connections can drop for many reasons: network issues, server restarts, load balancer timeouts, and mobile devices going to sleep. The client needs to detect disconnection and reconnect automatically. Exponential backoff (1s, 2s, 4s, 8s...) prevents hammering the server if it's having problems.

How Do Long Polling and WebSockets Affect Latency and User Experience?

Both approaches can feel "real-time," but they have different latency profiles that matter more in some scenarios than others.

With long polling, when a request is already pending and an event arrives, latency is just one network round-trip. The server responds immediately, the client receives the event, and it's done. On a typical connection, that's 50-150ms, depending on geography and network conditions.

The problem is what happens next. After the client receives a response, it must issue a new request before the server can deliver another event. If a second event arrives during this gap, it waits. The worst case looks like this:

Diagram of long polling and websockets

That's three network transits for Event 2, potentially 150-450ms of latency for an event that happened right after Event 1. The IETF RFC on long polling (RFC 6202) specifically calls this out as a fundamental limitation.

With WebSockets, both events arrive with single-transit latency because the connection is already open. Event 2 doesn't wait for anything.

The difference matters:

  • In a GitHub activity feed that updates every few minutes, the occasional 300ms delay is invisible. Users aren't staring at the feed waiting for the next commit notification.
  • In fast-moving Slack channels, bursts of messages can expose gaps in long polling. If five messages arrive in quick succession, the first comes through immediately, but messages 2-5 queue behind reconnection cycles. Users might see messages appear in small batches rather than one at a time.
  • For Figma's multiplayer cursors, the difference is dramatic. Cursor positions are updated many times per second. Long polling would create visible jumpiness as positions batch up behind reconnection cycles. WebSockets deliver each position update as soon as it's sent, producing the smooth cursor movement users expect.

The threshold for "instant" in UI interactions is roughly 100ms. Below that, actions feel immediate. Above 200-300ms, users perceive a delay. Long polling's average latency often falls within the instant range, but its tail latency can push into perceptible territory during bursts.

Mobile Considerations

Mobile networks add another dimension. When the radio is in a low-power state, resuming an active transfer can add tens of milliseconds to 100ms or more on top of normal transit latency, depending on the network and radio state.

Long polling's repeated request cycle triggers this wake-up cost more frequently. Each poll timeout and reconnection may incur a radio wake-up penalty.

WebSockets keep the connection open, which keeps the radio in a higher-power state but avoids repeated wake-up costs. For apps with frequent updates, this can actually be more battery-efficient than long polling. For apps with rare updates, the persistent connection may drain the battery faster than periodic polling would.

The tradeoff depends on your update frequency. High-frequency feeds (chat, collaboration) favor WebSockets. Low-frequency feeds (daily digests, occasional notifications) might favor long polling or even simple periodic polling.

What are the Scalability and Infrastructure Tradeoffs?

Both approaches keep server-side state for each connected client. The differences lie in the cost of that state and how it interacts with the rest of your infrastructure.

A common misconception is that long polling is "stateless" because it uses HTTP. In practice, both approaches consume roughly the same resources per client. Long polling holds a TCP connection, an HTTP request object, and a timer for each waiting client:

javascript
const client = {
  res,                    // The response object (HTTP connection)
  lastSeenId,             // Cursor position
  timestamp: Date.now(),
  timeoutId: null         // Timer reference
};
longPollingClients.add(client);

WebSockets hold a TCP connection and a socket object:

javascript
wsClients.add(ws); // Just the socket

The per-connection memory is comparable. The difference is what happens between events.

With long polling on a thread-per-request server (traditional PHP, older Java servlet containers), each waiting request ties up a thread. If you have 10,000 connected clients, you need 10,000 threads just to hold their connections. This is why long polling historically had a reputation for poor scalability.

With async/event-loop servers (Node.js, Go, Rust, Python with asyncio, Java with virtual threads), the connection is just a file descriptor and a small state object. The event loop handles thousands of connections with a single thread. Both long polling and WebSockets scale well in this model.


The practical limit is usually the number of file descriptors. Each TCP connection consumes one. At high connection counts, you can hit memory pressure from kernel TCP buffers. Linux TCP receive buffers are autotuned; initial/default budgets are often on the order of tens to hundreds of KiB per connection (and can grow), so connection scale can become memory-bound.

Header Overhead at Scale

Each long-polling response includes all HTTP headers. A minimal response might look like:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 42
Cache-Control: no-cache
Connection: keep-alive

{"type":"update","messages":[],"lastId":5}

Header/framing overhead scales roughly with delivery count:

  • Long polling: ~150–200 bytes of HTTP headers per delivery (plus a new request to re-open the poll).
  • WebSockets: ~2 bytes (server→client small messages) or ~6 bytes (browser→server small messages).

Example: at 1,000 deliveries/sec, that’s ~0.15–0.20 MB/s vs ~0.002–0.006 MB/s.

For a notification that fires once per minute, this doesn't matter. For Figma-style cursor positions updating 30 times per second, it's the difference between a feasible system and one that saturates your network.

Load Balancers and Proxies

Long polling works with standard HTTP infrastructure. Any L7 load balancer that handles HTTP will correctly route long-poll requests. You need to ensure timeouts are aligned (more on that below), but there's no special protocol for handling them.

WebSockets require explicit support. The HTTP Upgrade mechanism uses hop-by-hop headers that proxies don't forward by default. NGINX, for example, requires explicit configuration:

location /ws {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}

Without this, the upgrade handshake fails, and clients fall back to HTTP (or just error out).

WebSockets also complicate load balancer behavior in other ways:

  • Connection draining during deploys. When you restart a server, HTTP requests complete quickly. Long poll requests either return with a timeout response or get terminated (the client reconnects to another server). WebSocket connections are long-lived, so you need to either wait for them to close naturally, forcibly terminate them (clients reconnect), or implement graceful handoff (complex). A minimal drain sketch follows this list.
  • Sticky sessions aren't the answer. For HTTP, you might use sticky sessions to route a user's requests to the same server. With WebSockets, the connection itself is sticky by nature. But if that server dies, the client must reconnect, and there's no guarantee they'll have session state on the new server. Build your app to handle reconnection to any server.
  • Health checks need care. A server might be healthy for HTTP traffic but have exhausted its WebSocket connection limit. Your health checks should account for this if WebSocket capacity is a constraint.
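
Here's a minimal drain sketch for the "forcibly terminate, clients reconnect" option, assuming the demo's HTTP server and wsClients set; the SIGTERM wiring and the five-second grace period are illustrative choices, not part of the demo:

javascript
// On SIGTERM: stop accepting new connections, then close existing
// WebSockets so clients reconnect to another instance.
process.on('SIGTERM', () => {
  server.close(); // stop accepting new HTTP requests and upgrades

  wsClients.forEach(ws => {
    ws.close(1001, 'Server restarting'); // 1001 = "going away"
  });

  // Give close frames a moment to flush, then exit
  setTimeout(() => process.exit(0), 5000);
});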

Timeout Alignment

Long-lived connections interact with every timeout in your infrastructure. Misalignment causes spurious disconnections that are hard to debug.

For long polling, the server timeout must be shorter than any intermediary timeout. Common defaults (all configurable) include:

  • AWS ALB idle timeout: 60 seconds
  • NGINX proxy_read_timeout: 60 seconds
  • Cloudflare proxied requests: 100 seconds

Thirty seconds provides a margin against all of these. If you set 90 seconds and your ALB kills the connection at 60, clients see unexplained disconnects. The client timeout should be shorter than the server timeout. This ensures the client times out and reconnects before the server gives up, avoiding a race where both sides think the other is gone.

WebSockets need application-level heartbeats to stay alive. The protocol includes ping/pong frames for this:

javascript
// Server sends ping every 30 seconds
setInterval(() => {
  wsClients.forEach(ws => {
    if (ws.isAlive === false) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

// Client responds to ping automatically (browser handles this)
// Server marks connection alive on pong
ws.on('pong', () => { ws.isAlive = true; });

The heartbeat interval must be shorter than your infrastructure's idle timeout. If AWS ALB drops idle connections at 60 seconds, ping every 30-45 seconds.

Cloudflare explicitly warns that edge restarts can terminate WebSocket connections at any time. Heartbeats don't prevent this; they only detect it faster. Your client must handle unexpected disconnection regardless.

Observability Differences

HTTP requests generate logs automatically. Most web servers log every request with timestamp, path, status code, and latency. Long polling fits this model: each poll is a request, each response is logged.

WebSocket traffic is invisible to standard HTTP logging. You see the initial upgrade request, then silence until disconnection. To understand what's happening, you need custom instrumentation:

  • Connection count (current, peak)
  • Connection churn (connects/disconnects per second)
  • Messages sent/received per second
  • Message latency (time from send to acknowledgment, if you implement acks)
  • Error rates by type (protocol errors, application errors)

Build this from day one. Debugging production WebSocket issues without metrics is painful.
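
As a minimal sketch of what those counters might look like on the demo's Node.js server (the wsMetrics object and how you export it to your metrics backend are assumptions):

javascript
// Hypothetical counters; wire these into whatever metrics backend you use
const wsMetrics = {
  connectionsCurrent: 0,
  connectionsTotal: 0,
  disconnects: 0,
  messagesReceived: 0,
  messagesSent: 0,   // bump this inside broadcastToWebSockets
  errors: 0
};

wss.on('connection', (ws) => {
  wsMetrics.connectionsCurrent++;
  wsMetrics.connectionsTotal++;

  ws.on('message', () => wsMetrics.messagesReceived++);
  ws.on('error', () => wsMetrics.errors++);
  ws.on('close', () => {
    wsMetrics.connectionsCurrent--;
    wsMetrics.disconnects++;
  });
});

// Periodically flush a snapshot (here, just to the log)
setInterval(() => console.log(JSON.stringify(wsMetrics)), 10000);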

What Reliability and Failure Modes Does Each Approach Have?

Both approaches can fail. The differences lie in how they fail and how you recover.

Long Polling Failures

Timeout misalignment (408/504 errors). If your poll timeout exceeds an intermediary's idle timeout, the proxy kills the connection before the server responds. The fix: set your poll timeout to a value below the lowest timeout in your infrastructure path (the defaults we covered earlier).

Duplicate delivery. When a connection drops mid-response, the client doesn't know if events were sent. It reconnects and might receive them again. The fix: deduplicate by event ID on the client:

javascript
const processedIds = new Set();

function handleEvent(event) {
  if (processedIds.has(event.id)) return; // Already seen
  processedIds.add(event.id);

  // Prevent unbounded growth
  if (processedIds.size > 1000) {
    const oldest = processedIds.values().next().value;
    processedIds.delete(oldest);
  }

  addMessageToFeed(event);
}

Thundering herd. If your server restarts, all clients reconnect simultaneously. The fix: jittered backoff. Add random delay to reconnection attempts:

javascript
reconnectWithJitter() {
  const baseDelay = Math.min(1000 * Math.pow(2, this.retryCount), 30000);
  const jitter = Math.random() * 1000; // 0-1 second random
  this.retryCount++;
  setTimeout(() => this.poll(), baseDelay + jitter);
}

The jitter spreads 10,000 simultaneous reconnects over 1-2 seconds instead of one instant.

Graceful degradation. Under load, long polling slows down rather than failing. Events queue, responses take longer, but clients don't see errors. This can mask problems, so you must monitor response times to catch it.

WebSocket Failures

Silent disconnects. Connections can die without either side knowing (NAT timeout, mobile sleep, network switch). The TCP socket looks open, but nothing gets through. The fix: heartbeats with timeouts. Server pings every 30 seconds; if no pong arrives within 10 seconds, terminate and let the client reconnect.

State loss on reconnection. When a WebSocket drops, in-flight messages are lost. Unlike long polling's cursor-based replay, there's no built-in resume. The fix: implement resume yourself. Track the last event ID the client processed, send it on reconnect, and have the server replay missed events. This is the same cursor pattern long polling uses, just at the application layer.
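
A minimal sketch of that resume handshake on top of the demo server might look like the following. The 'resume' message type is an application-level convention (not part of the WebSocket protocol), and replying with the client's existing 'init' shape is just one way to wire it:

javascript
// Client: after reconnecting, ask for anything missed since lastId
this.ws.onopen = () => {
  this.ws.send(JSON.stringify({ type: 'resume', lastId: this.lastId }));
};

// Server: inside the existing ws.on('message') handler, replay missed events
if (parsed.type === 'resume') {
  const missed = feedState.messages.filter(m => m.id > parsed.lastId);
  ws.send(JSON.stringify({
    type: 'init',              // reuses the client's existing "init" handling
    messages: missed,
    lastId: feedState.lastId
  }));
}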

Backpressure and memory exhaustion. The browser's WebSocket API has no flow control. If events arrive faster than the client processes them, the messages queue in memory until the tab crashes. For high-frequency updates like cursor positions, use "latest wins" coalescing on the server:

javascript
// Track latest position per user, not every position
const latestPositions = new Map();

function handleCursorMove(userId, position) {
  latestPositions.set(userId, position);
}

// Broadcast on a fixed interval, not on every move
setInterval(() => {
  const payload = JSON.stringify({
    type: 'cursors',
    positions: Object.fromEntries(latestPositions)
  });
  wsClients.forEach(ws => ws.send(payload));
}, 50); // 20 updates/sec max, regardless of input rate

This caps update frequency and ensures slow clients see current positions rather than falling behind.

Deploy disruption. Server restarts terminate all WebSocket connections instantly. Unlike HTTP requests that complete quickly, WebSocket connections might be hours old. The fix is making disconnection cheap:

  • Clients reconnect automatically with backoff
  • Clients resume from a known event ID
  • Servers are stateless; any server can handle any client
  • Application state lives in a database, not in handler memory

Here’s a comparison of the failure modes of the two options:

Failure type            Long polling                      WebSockets
Connection drops        Automatic replay via cursor       Must implement resume
Server restart          Requests fail, clients retry      Connections terminate; need reconnect logic
Intermediary timeout    504 errors if misconfigured       Silent disconnect without heartbeat
Client can't keep up    Latency increases                 Memory exhaustion risk
Network blip            Single request fails              Full reconnect needed

Long polling fails visibly (HTTP errors) and recovers automatically. WebSocket failures are often silent and require explicit detection and recovery. The tradeoff: WebSockets perform better once you've built that recovery logic.

When Is Each the Right Choice?

Choose long polling when:

  • Your feed updates at low to moderate frequency (notifications, occasional activity)
  • You need maximum compatibility with existing HTTP infrastructure and restrictive networks
  • Your organization prefers stateless HTTP operations for routing, auth, and logging
  • You're implementing a fallback transport for when WebSockets can't connect
  • Your product tolerates slightly higher tail latency and occasional catch-up bursts

Choose WebSockets when:

  • You need bidirectional interaction (typing indicators, presence, collaborative editing)
  • You expect high message frequency, where HTTP header overhead becomes wasteful
  • You need consistent low latency and smooth UX under load
  • You can support the operational requirements: connection-aware load balancing, heartbeats, reconnect/resume, and custom monitoring

But most production systems don't pick just one. Real-time frameworks like SignalR negotiate the best available transport and fall back as needed (WebSockets when possible, long polling otherwise).

If your use case is primarily server-to-client streaming (classic "feed"), Server-Sent Events (SSE) can be a strong middle ground: HTTP-based, simpler than WebSockets, and designed for streaming events via the EventSource API.
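
For a sense of how small the client side of SSE is, here's a sketch using the browser's EventSource API; the /events endpoint and addMessageToFeed are assumed from the earlier demo:

javascript
// EventSource reconnects automatically and resends the last event ID
// (via the Last-Event-ID header) when the server tags events with ids
const source = new EventSource('/events');

source.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  addMessageToFeed(msg);
};

source.onerror = () => {
  // The browser retries on its own; this is just for visibility
  console.log('SSE connection lost, retrying...');
};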

For new projects with demanding requirements, WebTransport is emerging as a potential successor, offering features such as built-in backpressure and multiplexing that WebSockets lack.

FAQs

  1. Can I start with long polling and switch to WebSockets later?

Yes, and many teams do. Long polling is easier to deploy (using standard HTTP infrastructure) and debug (requests appear in logs). Start there, measure your latency and overhead, and migrate to WebSockets if you hit limits. The cursor-based resume pattern works for both, so your event model stays the same. Libraries like Socket.IO and SignalR handle this automatically, negotiating WebSockets when available and falling back to long polling.

  2. What about Server-Sent Events (SSE)?

SSE is a good middle ground for server-to-client streaming. It's HTTP-based (works with existing infrastructure), supports automatic reconnection with last-event-ID, and is simpler than WebSockets.

The limitation is that it's unidirectional. Clients can't send messages back over the same connection. For feeds that are purely push (notifications, activity streams, live scores), SSE is worth considering. For bidirectional needs (chat, collaboration), you'll need WebSockets or a separate channel for client-to-server messages.

  3. How do I handle authentication with WebSockets?

Browsers don't let you attach custom headers to the WebSocket handshake, and there are no per-message headers after it, so you can't rely on a standard Authorization header the way you would for HTTP requests.

Common approaches:

  1. Authenticate during the HTTP upgrade handshake using cookies or a token in the query string
  2. Send an authentication message as the first frame after connection
  3. Use short-lived connection tokens that your HTTP API issues and your WebSocket server validates.

Whichever you choose, plan for token expiration and re-authentication without dropping the connection.
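
As an illustration of the second approach (auth as the first frame), here's a rough sketch; the message shape, verifyToken, getToken, and the 4001 close code are all hypothetical:

javascript
// Server: require an auth message before handling anything else
wss.on('connection', (ws) => {
  let authenticated = false;

  ws.on('message', (data) => {
    const parsed = JSON.parse(data);

    if (!authenticated) {
      if (parsed.type === 'auth' && verifyToken(parsed.token)) {
        authenticated = true;
        ws.send(JSON.stringify({ type: 'auth_ok' }));
      } else {
        ws.close(4001, 'Unauthorized'); // 4000-4999 are app-defined close codes
      }
      return;
    }

    // ...handle normal messages here...
  });
});

// Client: send the token as the first frame
ws.onopen = () => ws.send(JSON.stringify({ type: 'auth', token: getToken() }));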

  4. Do I need a separate WebSocket server?

Not necessarily. Most modern platforms (Node.js, Go, Rust, Python with ASGI) handle WebSockets alongside HTTP on the same server and port. The complexity comes at scale: WebSocket connections are long-lived, so a server restart affects all connected clients.

Some teams run dedicated WebSocket servers behind a separate load balancer to isolate deploys and scale connection capacity independently from their HTTP API tier. Start simple, separate when you have a reason to.

Picking the Right Transport for Your Real-Time Feed

Long polling and WebSockets both solve the problem of pushing events to clients without polling at a fixed interval, but they make different trade-offs.

Long polling is HTTP all the way down. It works with your existing infrastructure, shows up in your logs, and fails visibly. The cost is higher latency during bursts and more bandwidth overhead per message.

WebSockets provide lower latency and less overhead, but require infrastructure awareness (proxy config, idle timeouts, health checks) and explicit reliability logic (heartbeats, reconnection, resume). Failure modes remain silent until you build detection.

For most feeds, such as notifications, activity streams, and dashboards that update every few seconds, long polling is simpler to operate and performs well enough. For high-frequency bidirectional features, such as chat with typing indicators, collaborative editing, or multiplayer anything, WebSockets are worth the operational investment.

But the patterns matter more than the transport. Cursor-based resume, jittered reconnection, idempotent event handling: these work for both approaches and will save you when things go wrong in production.
