WebSocket

Modern applications rely on continuous, real-time data exchange to keep everyone in sync and engaged. Achieving this requires more than just HTTP, which is where WebSocket comes into play.

What Is a WebSocket?

WebSocket is a communications protocol that provides a full-duplex channel over a single TCP connection, enabling interactive, real-time exchanges between client and server.

Its persistent connection lets servers push updates instantly, avoiding HTTP's repeated client requests and added latency.

If you've ever used Discord or Slack, you've seen it in action when:

New messages and reactions appear almost immediately
A typing indicator or read receipt pops up
Coworkers' and friends' dots turn green or grey to indicate online status

How Do WebSockets Work?

Here is a step-by-step guide on how this protocol works:

1. Initial Handshake

The connection begins with an HTTP handshake. The client sends the server an HTTP/1.1 GET request, typically on port 80 or, when the connection is secured with TLS, on port 443. The request includes the following headers to initiate the protocol switch:

Upgrade: websocket
Connection: Upgrade

The handshake request might look like this (the key shown is an example value):

GET /chat HTTP/1.1
Host: idealchatapp.example.com:80
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The request can include other common HTTP headers, such as Cookie, Referer, or User-Agent.
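As a rough illustration, the opening request could be assembled in Python as shown here. The function name build_handshake_request is hypothetical, and the sketch only builds the request text; it does not open a socket or add optional headers. It assumes the key is 16 random bytes, Base64-encoded, as the WebSocket specification (RFC 6455) describes.

```python
import base64
import os


def build_handshake_request(host: str, path: str = "/chat"):
    """Return (request_text, key) for a WebSocket opening handshake."""
    # The key is a fresh 16-byte random value, Base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # HTTP header lines end with CRLF, and an empty line ends the header block.
    return "\r\n".join(lines) + "\r\n\r\n", key
```

The key is returned alongside the request so the client can later check the server's Sec-WebSocket-Accept value against it.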

2. Server Response

The server responds with an HTTP/1.1 101 Switching Protocols status and the Upgrade: websocket, Connection: Upgrade, and Sec-WebSocket-Accept headers.

Here is an example response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

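Sec-WebSocket-Accept proves that the server understood the WebSocket handshake rather than treating it as an ordinary HTTP request. The server computes it by appending a fixed GUID defined in RFC 6455 to the client's Sec-WebSocket-Key, hashing the result with SHA-1, and Base64-encoding the digest.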

If validation fails, the server rejects the upgrade with a 4xx or 5xx status code.
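The accept value in the example response can be reproduced from the example key in the handshake request. The sketch below uses only Python's standard library; the function name compute_accept is an illustrative choice, while the GUID constant is the fixed value defined in RFC 6455.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from a client's key."""
    # Concatenate key and GUID, hash with SHA-1, then Base64-encode the digest.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can run the same computation on the key it sent and close the connection if the server's header does not match.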

3. Connection Upgrade

After the 101 response, the connection switches from HTTP to the WebSocket protocol, and the client and server can communicate over the same persistent TCP connection. WebSocket endpoints are addressed with their own URL schemes: ws:// in place of http:// and, for encrypted connections, wss:// in place of https://.
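To make the scheme-to-port mapping concrete, here is a small sketch that resolves a WebSocket URL to a host, port, and TLS flag using urllib.parse from the standard library. The helper name resolve_endpoint is hypothetical; the default ports are the standard ones for each scheme (80 and 443).

```python
from urllib.parse import urlsplit

# Default ports for each WebSocket URL scheme.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def resolve_endpoint(url: str):
    """Return (host, port, use_tls) for a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.scheme == "wss"
```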

4. Data Transfer and Message Format

Once open, either side can exchange messages independently at any time.

Messages are sent in frames, with small headers that include an opcode and payload length. Every client-to-server frame must also carry a 32-bit masking key, which the server uses to unmask the payload.

Large or streaming messages can be fragmented across multiple frames, so a sender can start transmitting before it knows the full message length and does not have to buffer the entire message.

Data frame opcodes indicate text (0x1) or binary (0x2).

Control frame opcodes include close (0x8), ping (0x9), and pong (0xA). The client or server can check the connection health by sending a Ping frame, which the peer must respond to with a Pong.
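The frame layout described above can be sketched in a few lines of Python. This is a simplified encoder for a single unfragmented frame with no extensions, not a full implementation; the names apply_mask and encode_frame are illustrative. The length encoding (7 bits, or 16 or 64 extended bits) follows RFC 6455.

```python
import struct
from typing import Optional


def apply_mask(payload: bytes, mask_key: bytes) -> bytes:
    """XOR each byte with the 4-byte key; applying it twice restores the input."""
    return bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))


def encode_frame(opcode: int, payload: bytes, mask_key: Optional[bytes] = None) -> bytes:
    """Encode one unfragmented frame (FIN bit set, no extensions)."""
    first = 0x80 | opcode                  # FIN=1, RSV1-3=0, 4-bit opcode
    mask_bit = 0x80 if mask_key else 0x00  # set on client-to-server frames
    length = len(payload)
    if length < 126:
        header = bytes([first, mask_bit | length])
    elif length < 65536:
        # 126 signals that a 16-bit length follows.
        header = bytes([first, mask_bit | 126]) + struct.pack("!H", length)
    else:
        # 127 signals that a 64-bit length follows.
        header = bytes([first, mask_bit | 127]) + struct.pack("!Q", length)
    if mask_key:
        return header + mask_key + apply_mask(payload, mask_key)
    return header + payload
```

For example, encode_frame(0x9, b"") builds an empty Ping, and a server-side decoder would call apply_mask with the received key to recover a client's payload.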

5. Closing the Connection

Either the client or server can terminate the connection by sending a close frame, triggering a closing handshake.

When one party sends a closing frame, the other responds with its own. Afterwards, the underlying TCP connection is torn down, and no further messages can be sent without establishing a new connection. A close code and optional reason can be included to explain why the session ended.
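A close frame's payload, when present, starts with a two-byte status code in network byte order, optionally followed by a UTF-8 reason. The sketch below builds and parses that payload; the function names are hypothetical, and code 1000 is the standard "normal closure" value.

```python
import struct
from typing import Optional, Tuple


def build_close_payload(code: int = 1000, reason: str = "") -> bytes:
    """Pack a close code and optional reason into a close frame payload."""
    body = struct.pack("!H", code) + reason.encode("utf-8")
    if len(body) > 125:
        # Control frames may carry at most 125 bytes of payload.
        raise ValueError("close payload exceeds 125 bytes")
    return body


def parse_close_payload(payload: bytes) -> Tuple[Optional[int], str]:
    """Return (code, reason); code is None when the peer sent no payload."""
    if len(payload) < 2:
        return None, ""
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")
```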

Chat App WebSocket Example

For a messaging app, this workflow would look like:

The user opens their mobile app, causing the chat client to initiate the protocol switching handshake.
If successful, the chat server accepts the upgrade with a response.
The upgrade completes. The user's status indicator turns green to show they're online, and now they can enjoy other live features like low-latency messaging and typing indicators.
When they send in-app messages, the chat client uses 0x1 frames for text content and 0x2 for media like GIFs and PDFs.
They close their app, and the client begins the handshake to end the connection.

Extensions and Subprotocols

For more complex implementations, the client and server can negotiate optional extensions and application-level subprotocols during the handshake.

Extensions modify how your app transmits frames without changing the meaning of the data. They're enhancements that provide a broad set of features, such as multiplexing and compression.

For instance, the permessage-deflate extension compresses data when sent and decompresses it when received.

Extensions are negotiated with the Sec-WebSocket-Extensions header. If the server supports a requested extension, its handshake response includes the same extension name.

Subprotocols define the format and rules for the messages exchanged. They're application-specific and run on top of the WebSocket connection. Some examples include:

STOMP over WebSockets: Text-based messaging
SOAP over WebSockets: Structured, XML-based SOAP messaging
WAMP: RPC and pub/sub
MQTT over WebSockets: Lightweight pub/sub messaging for IoT devices

During the handshake, a client can request multiple subprotocols with the Sec-WebSocket-Protocol header. For example:

Sec-WebSocket-Protocol: soap, wamp, mqtt

The server then selects at most one of the requested subprotocols that it supports and returns it in the handshake response. For example:

Sec-WebSocket-Protocol: wamp

If no subprotocol is agreed upon, the WebSocket defaults to raw messages with no enforced format.

If you choose a subprotocol, both your client and server must implement its message format and rules. The handshake only negotiates the name.
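Server-side subprotocol selection can be sketched as follows. The function name select_subprotocol is hypothetical, and the sketch assumes the server honors the client's order of preference, which is one reasonable policy rather than a requirement.

```python
from typing import Iterable, Optional


def select_subprotocol(requested_header: str, supported: Iterable[str]) -> Optional[str]:
    """Pick the first client-requested subprotocol that the server supports."""
    supported_set = set(supported)
    # The header is a comma-separated list in the client's order of preference.
    requested = [name.strip() for name in requested_header.split(",") if name.strip()]
    for name in requested:
        if name in supported_set:
            return name
    # No overlap: the response omits Sec-WebSocket-Protocol entirely.
    return None
```

With the request header from the example above and a server that supports wamp and mqtt, this returns wamp.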

Use Cases

Below are some of the top use cases:

Data Feeds and Live Dashboards

This protocol allows all connected clients to receive new data points as soon as they're published, without each client querying the server. This makes it a great fit for continuously updating activity feeds, financial trading platforms, IoT sensor dashboards, and live sports scoreboards.

Online Gaming

Online games use this protocol to sync state and provide a near-real-time experience for players. Game studios often implement it for matchmaking, in-game chat, and notifications.

Developers rarely use it for games that require ultra-low latency because it's built on top of TCP, which retransmits lost packets and delivers data in order, so a single lost packet can delay everything queued behind it. However, you can use it for slower-paced gameplay, like a chess or card game.

Social Messaging and Calling

As discussed before, real-time chat applications use it for everything from message delivery to status updates.

Messaging platforms and APIs also use it for several other features, like unread counts, disappearing messages, polls, and cross-platform chat history synchronization.

TCP's delivery guarantees also make it a poor fit for carrying the audio and video streams of a call, but some conferencing platforms may still use this protocol for text chats, reactions, signaling, and moderation actions.

Live Collaboration Tools

This protocol is what enables live features in collaboration applications, where multiple users need to interact with shared content simultaneously.

Developers use it for many of the same features found in chat apps, like synchronizing data across devices and pushing real-time comments and notifications. They also use it for:

File sharing, including download and upload progress bars
Board updates
Whiteboard
Cursor tracking
Low-latency document editing

WebSocket vs HTTP

While both protocols enable client-server communication, they have major differences in data transmission, including:

Connection Lifecycle

In HTTP, each exchange starts with a client request and ends with the server's response. The underlying TCP connection may be new or reused, but it is typically closed once it sits idle. WebSocket maintains a persistent connection that only closes with a closing handshake or by error.

Messaging Pattern

HTTP follows a request-response pattern in which only the client can initiate an exchange. If you want to retrieve data updates, the client must send a request and wait for the server's response.

WebSocket is bidirectional, which allows the client and server to send messages independently at any time. The server can push updates as they roll in.

State

WebSocket is stateful, meaning it retains the context of a connection once it's established. A single connection can transmit many messages over time without needing to be reestablished.

In contrast, HTTP is stateless. Each request is independent, and the protocol does not retain information between them. However, you can use session management techniques to preserve state, like cookies or tokens.

Scalability

Each open WebSocket connection consumes memory, and clients are tied to the same server that holds their connection. To handle concurrent connections at scale, you may need tools like load balancers and a pub/sub messaging layer that relays messages between server instances.
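The pub/sub pattern mentioned here can be illustrated with a tiny in-process hub. This is only a stand-in for illustration: PubSubHub is a hypothetical class, each subscriber is a plain callable representing "send this message to one connected client", and a multi-server deployment would put a shared message broker in its place.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSubHub:
    """In-process stand-in for a pub/sub layer (illustrative only)."""

    def __init__(self) -> None:
        # Channel name -> callables that each deliver a message to one client.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, send: Callable[[str], None]) -> None:
        self._subscribers[channel].append(send)

    def unsubscribe(self, channel: str, send: Callable[[str], None]) -> None:
        if send in self._subscribers[channel]:
            self._subscribers[channel].remove(send)

    def publish(self, channel: str, message: str) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        receivers = list(self._subscribers[channel])
        for send in receivers:
            send(message)
        return len(receivers)
```

Each server instance would subscribe on behalf of the connections it holds, so a message published on any instance reaches clients connected to the others.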

On the other hand, HTTP connections are short-lived, making it easier to scale horizontally. Any server instance can handle an independent request, with load balancers freely distributing traffic.

Use Cases

WebSocket is well-suited for real-time applications, enabling features like live document editing or dashboard updates.

HTTP is a better fit for traditional web applications, REST APIs, and scenarios where updates are infrequent. These include retrieving database records, serving static pages, submitting forms, and handling authentication.

Best Practices for Implementation

To ensure reliability, security, and performance, you should:

Use Secure WebSockets

Always encrypt your traffic in production. WebSocket over TLS/SSL (WSS), often known as Secure WebSockets, uses TLS for security, the same encryption as HTTPS.

Using WSS helps protect data exchanges from eavesdropping and man-in-the-middle attacks. Browsers also generally block ws:// connections from pages served over HTTPS, apart from localhost.

Handle Disconnects and Errors

Configure your implementation to detect dropped connections and initiate retries. By using exponential backoff when retrying connections, you can gradually increase the reconnection interval while avoiding server overload.
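A reconnect schedule with exponential backoff might be computed as below. The function name and default values (a 1-second base and a 30-second cap) are assumptions for illustration. The optional random jitter is an addition beyond what the text describes; it is a common way to stop many clients from reconnecting at the same instant.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    # Double the delay on every attempt, but never exceed the cap.
    delay = min(cap, base * (2 ** attempt))
    if rng is not None:
        # Optional jitter: pick a random delay up to the computed ceiling.
        return rng.uniform(0, delay)
    return delay
```

Without jitter, the first six attempts wait 1, 2, 4, 8, 16, and 30 seconds. A client would reset the attempt counter to zero after a successful reconnect.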

You can also implement logic to detect stale connections using ping/pong and reconnect if no messages are received after a set duration.
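Stale-connection detection can be reduced to tracking when the peer was last heard from. In this sketch, HeartbeatMonitor is a hypothetical helper, the 20-second ping interval and 10-second pong timeout are arbitrary example values, and timestamps are passed in explicitly so the logic is easy to test; real code would likely pass time.monotonic().

```python
class HeartbeatMonitor:
    """Tracks peer activity to decide when to ping and when to give up."""

    def __init__(self, ping_interval: float = 20.0, pong_timeout: float = 10.0,
                 now: float = 0.0) -> None:
        self.ping_interval = ping_interval
        self.pong_timeout = pong_timeout
        self.last_activity = now

    def record_activity(self, now: float) -> None:
        """Call whenever any frame, including a Pong, arrives from the peer."""
        self.last_activity = now

    def should_ping(self, now: float) -> bool:
        # The peer has been quiet for a full interval: time to send a Ping.
        return now - self.last_activity >= self.ping_interval

    def is_stale(self, now: float) -> bool:
        # No frame arrived even after the Ping had time to be answered.
        return now - self.last_activity >= self.ping_interval + self.pong_timeout
```

When is_stale returns True, the client would drop the connection and start its reconnect logic.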

Implement Authentication and Authorization

This protocol bypasses the usual request-response cycle after the handshake, so you must authenticate clients during the handshake, before accepting the upgrade. You should also enforce authorization at the application level to manage permissions for messages, chat channels, dashboards, documents, and more.
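One possible shape for these checks is sketched below, assuming the client presents a signed token during the handshake. Everything here is hypothetical: the token format (user ID plus an HMAC-SHA256 signature), the function names, and the membership table. It also omits token expiry, which a production system would need.

```python
import hashlib
import hmac
from typing import Dict, Optional, Set


def sign_token(user_id: str, secret: bytes) -> str:
    """Issue a token of the form '<user_id>.<hex signature>'."""
    signature = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{user_id}.{signature}"


def authenticate(token: str, secret: bytes) -> Optional[str]:
    """Return the user ID if the token's signature is valid, else None."""
    user_id, separator, signature = token.rpartition(".")
    if not separator or not user_id:
        return None
    expected = sign_token(user_id, secret).rpartition(".")[2]
    # compare_digest avoids leaking how many characters matched.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return user_id
    return None


def can_join(user_id: str, channel: str, members: Dict[str, Set[str]]) -> bool:
    """Authorization check applied per channel after the connection is open."""
    return user_id in members.get(channel, set())
```

A server would call authenticate before sending the 101 response and reject the upgrade on None, then call can_join each time the client tries to subscribe to a channel.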

Limit Resource Usage

Set limits to prevent abuse or excessive resource usage. Some ways you can do this are:

Limiting the number of concurrent connections a single user or IP can open
Implementing timeouts on the server to close idle connections after a period of inactivity
Enforcing message size limits and rate limiting to prevent clients from sending too many messages
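Per-connection rate limiting is often done with a token bucket. The sketch below is one such approach, not the only one; TokenBucket is a hypothetical class, and the caller supplies timestamps (for example from time.monotonic()) so the behavior is deterministic in tests.

```python
class TokenBucket:
    """Allows short bursts up to `capacity`, refilled at `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        """Return True and spend a token if a message may be accepted now."""
        # Add the tokens earned since the last call, up to the bucket's capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server could keep one bucket per connection and drop the message, or close the connection, whenever allow returns False.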

Enable Message Compression

Compression can cut data size significantly, especially for text-based payloads, which reduces bandwidth usage. You can achieve this using the permessage-deflate extension, which modern WebSocket libraries typically support.
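The payload transformation behind permessage-deflate can be approximated with Python's zlib module. This sketch covers only the compression step described in RFC 7692 (raw DEFLATE, with the four trailing bytes of a sync flush removed); it does not negotiate the extension or set the frame's compression bit. The function names are illustrative.

```python
import zlib

# Bytes that a DEFLATE sync flush appends and permessage-deflate strips.
DEFLATE_TAIL = b"\x00\x00\xff\xff"


def deflate_message(payload: bytes) -> bytes:
    """Compress a message payload as raw DEFLATE without the sync-flush tail."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)  # negative = raw DEFLATE
    data = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
    return data[:-4] if data.endswith(DEFLATE_TAIL) else data


def inflate_message(data: bytes) -> bytes:
    """Restore the tail and decompress a payload made by deflate_message."""
    decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    return decompressor.decompress(data + DEFLATE_TAIL)
```

Repetitive text such as JSON tends to shrink considerably, while already-compressed media like GIFs usually gains little.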

Monitor the Connection

Continuous monitoring helps you maintain system health. You should:

Look for issues like memory leaks, abnormal disconnects, or usage spikes
Track metrics, including the number of active connections, message throughput, latency, and error rates
Log significant events to streamline troubleshooting

Frequently Asked Questions

Is WebSocket Better Than REST API?

They solve different problems, so one isn’t strictly “better” than the other.

WebSocket is more suited to scenarios that require real-time data exchange. RESTful APIs are better for request-response interactions, database lookups, and microservice communications.

You can implement both protocols in the same app, using WebSockets for live updates and REST APIs for fetching existing data.

What Are the Downsides of WebSockets?

While they enable powerful real-time features, they also have some drawbacks.

Maintaining a connection consumes server resources for each client, which can be challenging to scale to very large numbers of users. Their stateful architecture is more complex than the stateless nature of HTTP. Unlike HTTP responses, WebSocket messages also cannot be easily cached or served through CDNs.

Is WebSocket Frontend or Backend?

It isn’t tied exclusively to the frontend or the backend. It’s a communication protocol, so a connection needs both a client, which typically runs in the frontend, and a server running on the backend.

The client initiates the connection and sends messages, and the server processes them and can send messages back.

Which Is Better, WebSocket or WebRTC?

Even though both enable real-time communication, WebSockets and WebRTC are designed for different purposes:

WebSocket runs over TCP and uses a client-server model. TCP retransmits lost packets and delivers data in order, making WebSocket a better choice for use cases like live text chat, stock tickers, and syncing state in online games.
WebRTC primarily runs over UDP and is peer-to-peer. UDP does not guarantee delivery or ordering but has lower latency, which makes WebRTC the better option for features that can tolerate occasional lost or out-of-order packets, such as web conferencing and media streaming.

Do Firewalls Block WebSockets?

This protocol operates on standard ports (80 for ws:// and 443 for wss://), meaning it can often pass through firewalls that allow web traffic. WSS appears as normal HTTPS traffic to firewalls because of its TLS encryption.
