How Do You Sync Chat State After a User Goes Offline?

TL;DR

Chat state falls into three buckets (durable content, mutable derived state, and ephemeral state), and each requires a different sync strategy.
Reconnect in the right order: pull the server delta first, apply it atomically, then replay queued local writes with idempotent IDs.
Never use push notifications as a sync transport. Treat them as a wake-up that triggers a delta sync.
Last-write-wins fits only narrow cases like draft fields; read receipts, reactions, and edits each need order-sensitive merge rules.

Syncing chat state after a user goes offline means pulling a delta from the server first, applying it atomically, then replaying queued local writes, with each type of state handled according to its own durability and merge rules.

The reason it's rarely one clean fix is that chat state isn't uniform. Messages, edits, read receipts, presence, and typing indicators each behave differently on reconnect. A sync strategy that works for one will quietly break another.

Let's break it down.

Why Does My Chat Sync Break After a User Is Offline for a While?

Two kinds of mistakes cause most of it.

The first is handling every kind of state the same way. Presence and typing are only needed in the moment, so replaying them on reconnect is worse than useless, showing someone active who left twenty minutes ago. Read receipts depend on order, and merging them by whichever write landed last lets a stale update from the returning device drag the marker backward, marking something the user already read as unread.

The second is getting the reconnect sequence wrong: pushing local writes before pulling remote ones, sending without an idempotent ID, holding queued writes only in memory so they're gone when the app is killed, or treating push delivery as the source of truth.

Both come from treating chat state as one uniform thing. It is actually made up of different kinds of records, and the kind determines how long each record should last and how to merge it after a gap.

Those records fall into three buckets:

Durable content, like messages, edits, deletions, and reactions, needs to replay reliably and survive a fresh install. A reconnecting client should be able to ask for everything that changed since its last checkpoint and rebuild an accurate timeline.
Mutable derived state, like read receipts, delivery status, and thread positions, changes after the fact. It needs order-sensitive merging so that a stale update arriving late can't move a read marker backward or leave a receipt pointing at a superseded version of a message.
Ephemeral state, like presence, typing, and live cursors, is only meaningful right now. Replaying a typing indicator from twenty minutes ago is worse than dropping it.

Once you split state this way, most sync bugs are easy to name: the wrong durability for a piece of state, or the wrong merge rule for it.
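As a rough illustration of that split, the sketch below maps record kinds to buckets. The kind names and the `should_replay_on_reconnect` helper are hypothetical, chosen only to mirror the three buckets described above; a real schema would likely differ.

```python
from enum import Enum


class Bucket(Enum):
    DURABLE = "durable"      # replayed from the server log, survives a reinstall
    DERIVED = "derived"      # merged with an order-sensitive rule
    EPHEMERAL = "ephemeral"  # expired, never replayed


# Hypothetical record kinds mapped to the bucket that governs their sync.
BUCKET_BY_KIND = {
    "message": Bucket.DURABLE,
    "edit": Bucket.DURABLE,
    "deletion": Bucket.DURABLE,
    "reaction": Bucket.DURABLE,
    "read_receipt": Bucket.DERIVED,
    "delivery_status": Bucket.DERIVED,
    "thread_position": Bucket.DERIVED,
    "presence": Bucket.EPHEMERAL,
    "typing": Bucket.EPHEMERAL,
    "live_cursor": Bucket.EPHEMERAL,
}


def should_replay_on_reconnect(kind: str) -> bool:
    """Ephemeral records are dropped after a gap; everything else is reconciled."""
    return BUCKET_BY_KIND[kind] is not Bucket.EPHEMERAL
```

Keeping the mapping in one place makes the "wrong durability" class of bug visible in review: a new record kind has to be assigned a bucket before it can be synced at all.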

What's the Right Way To Reconnect a Client After It's Been Offline?

Ordering matters more than anything else. A client coming back online should learn what changed while it was offline before sending anything itself, and every write it sends has to be safe to repeat.

A reliable reconnect runs in this order:

Ask the server for everything that changed since your last committed sync token. A fresh or long-offline device gets back a full snapshot; any other device gets an incremental batch of changes.
Apply those changes to local storage in a single atomic batch: new messages, edits, tombstones, receipt moves, and thread updates.
Replay your own queued writes. Tag each one with a client-generated transaction ID so the server recognizes a retry and prevents duplicate creation.
Only then reopen the live socket and resume applying incremental deltas.

Pulling the delta first prevents you from clobbering a remote edit with a stale local send or moving a read marker backward because you replayed an old receipt before seeing the newer one.
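The reconnect order can be sketched with an in-memory stand-in for the server. This is a minimal illustration under stated assumptions, not a production client: `FakeServer`, `Client`, `delta_since`, and `submit` are hypothetical names, and the sync token is simply an index into the server log.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """In-memory stand-in for a sync endpoint (hypothetical API)."""
    log: list = field(default_factory=list)       # ordered, append-only events
    seen_txns: set = field(default_factory=set)   # transaction IDs already applied

    def delta_since(self, token: int):
        """Return the events after `token` and the new sync token."""
        return self.log[token:], len(self.log)

    def submit(self, txn_id: str, event: dict) -> bool:
        """Append once per txn_id; a retry is recognized and not duplicated."""
        if txn_id in self.seen_txns:
            return False
        self.seen_txns.add(txn_id)
        self.log.append({**event, "txn_id": txn_id})
        return True


@dataclass
class Client:
    sync_token: int = 0
    timeline: list = field(default_factory=list)
    pending: list = field(default_factory=list)   # (txn_id, event) pairs

    def queue_send(self, text: str) -> None:
        # The ID is generated once, when the write is queued, so every
        # retry of this write carries the same ID.
        self.pending.append((str(uuid.uuid4()), {"type": "message", "text": text}))

    def reconnect(self, server: FakeServer) -> None:
        # 1. Pull everything that changed since the last committed token.
        events, new_token = server.delta_since(self.sync_token)
        # 2. Apply the batch and advance the token together. A real client
        #    would do both inside a single local database transaction.
        self.timeline = self.timeline + events
        self.sync_token = new_token
        # 3. Replay queued writes, removing each only after the server took it.
        while self.pending:
            txn_id, event = self.pending[0]
            server.submit(txn_id, event)
            self.pending.pop(0)
        # 4. A real client would reopen the live socket here.
```

In this sketch the client's own writes come back through the next delta rather than being appended locally, which keeps the server log as the single source of ordering.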

Those queued writes have to live in a durable local store, not just memory, so they survive the app being killed. If a write can't be replayed safely, show it as pending in the UI rather than pretending it has already been sent.
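One way to make the queue durable is a small on-device SQLite table. The sketch below uses Python's standard `sqlite3` module; the `Outbox` class, its table layout, and its method names are assumptions made for illustration.

```python
import json
import sqlite3
import uuid


class Outbox:
    """Pending writes persisted in SQLite so they survive the process being killed."""

    def __init__(self, path: str):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
            " txn_id TEXT UNIQUE NOT NULL,"
            " payload TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, payload: dict) -> str:
        """Persist a write before any network attempt and return its transaction ID."""
        txn_id = str(uuid.uuid4())
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO outbox (txn_id, payload) VALUES (?, ?)",
                (txn_id, json.dumps(payload)),
            )
        return txn_id

    def pending(self) -> list:
        """Return unacknowledged writes in the order they were queued."""
        rows = self.db.execute(
            "SELECT txn_id, payload FROM outbox ORDER BY seq"
        ).fetchall()
        return [(txn_id, json.loads(payload)) for txn_id, payload in rows]

    def acknowledge(self, txn_id: str) -> None:
        """Remove a write once the server has confirmed it."""
        with self.db:
            self.db.execute("DELETE FROM outbox WHERE txn_id = ?", (txn_id,))
```

Anything still returned by `pending()` is what the UI would show as pending, and what the next reconnect would replay.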

[Figure: Sequence diagram of the reconnect flow. The mobile client queues a pending op, the device goes offline, and on reconnect the client fetches a delta, applies events atomically, replays pending ops, and reopens the live socket.]

Should I Use Last-Write-Wins for Chat Sync?

Last-write-wins means that when two devices update the same record, you keep the write with the later timestamp and discard the other. It's cheap to implement and stores no version history. The discarded write is simply gone, and nothing records that it ever existed. That's acceptable when the current value is all you care about and losing the older one costs nothing. It's a poor fit for anything where the order of changes, or the intent behind them, matters.

So, mostly no. It fits a few narrow cases. A draft field is the clearest one: the newest text is the version you want, and the one it replaces is safe to drop.

For most other chat state, it's the wrong tool.

Reactions and read receipts carry meaning and order. A simple timestamp comparison drops that meaning. A reaction has to stay attached to the exact message version it applied to, not just the most recent write to that record. Even a read marker, simple as it looks, wants a furthest-forward rule rather than a latest-timestamp one, so a stale write arriving late can't pull it back.
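The difference between the two rules for a read marker is easy to see side by side. In this sketch, `ReadMarker` and both merge functions are hypothetical, and `last_read_seq` is assumed to be a server-assigned position in the channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadMarker:
    last_read_seq: int   # server-assigned position of the last message read
    written_at: float    # wall-clock time attached to the write


def merge_lww(a: ReadMarker, b: ReadMarker) -> ReadMarker:
    """Last-write-wins: keep whichever write carries the later timestamp."""
    return a if a.written_at >= b.written_at else b


def merge_furthest_forward(a: ReadMarker, b: ReadMarker) -> ReadMarker:
    """Keep whichever marker is further along, regardless of arrival order."""
    return a if a.last_read_seq >= b.last_read_seq else b


# The laptop read up to message 50. The phone, offline at the time, had only
# reached message 40, but its queued receipt is stamped later when it replays.
laptop = ReadMarker(last_read_seq=50, written_at=100.0)
phone_stale = ReadMarker(last_read_seq=40, written_at=120.0)

regressed = merge_lww(laptop, phone_stale)             # marker falls back to 40
correct = merge_furthest_forward(laptop, phone_stale)  # marker stays at 50
```

The furthest-forward rule is also commutative, so the result is the same whichever write the server happens to see first.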

How Do I Handle Message Edits and Deletes When a Device Was Offline?

When a message is edited or deleted, store the change as a new record instead of rewriting the message in place. A device that was offline can then replay those records in order and reach the same state as every other client, no matter how long it was away. The server decides what that state is, whether the current text or a deletion, and clients apply it rather than working it out themselves.

Each operation gets its own kind of record:

An edit is stored as a revision: a new record linked to the original message that contains the updated text. The reconnecting client sees the original and the revision that replaces it, and can show edit history if you want it.
A delete is stored as a tombstone: a small record that marks the message as removed, with a reason if available. The client applies it, and either shows "message deleted" or hides the message, never having to guess why it left the timeline.
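Replaying those records might look like the fold below. The event shapes (`type`, `id`, `message_id`, `reason`) are assumptions for illustration, and ignoring a revision that arrives after a tombstone is one possible policy, not the only one.

```python
def replay(events: list) -> dict:
    """Fold an ordered event log into the current state of each message."""
    messages = {}
    for ev in events:
        kind = ev["type"]
        if kind == "message":
            messages[ev["id"]] = {"text": ev["text"], "history": [], "deleted": False}
            continue

        # Revisions and tombstones both point at an existing message.
        msg = messages.get(ev["message_id"])
        if msg is None or msg["deleted"]:
            # Unknown target, or already removed: nothing to apply (assumed policy).
            continue

        if kind == "revision":
            msg["history"].append(msg["text"])  # keep the replaced text for edit history
            msg["text"] = ev["text"]
        elif kind == "tombstone":
            msg["deleted"] = True
            msg["text"] = None
            msg["reason"] = ev.get("reason")
    return messages


log = [
    {"type": "message", "id": "m1", "text": "helo"},
    {"type": "message", "id": "m2", "text": "oops"},
    {"type": "revision", "message_id": "m1", "text": "hello"},
    {"type": "tombstone", "message_id": "m2", "reason": "deleted_by_author"},
    {"type": "revision", "message_id": "m2", "text": "late edit"},
]
state = replay(log)
```

Because the fold depends only on the ordered log, two clients that replay the same events reach the same state however long either was offline.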

That makes server-authoritative merging the sensible default for message semantics. The server maintains a canonical representation of edits, deletions, reactions, and thread metadata, so moderation, search, notifications, and every client stay in agreement. Clients get simpler too, applying server decisions instead of negotiating them.

The exception is genuinely collaborative content. Conflict-free Replicated Data Types (CRDTs) are designed for multiple devices editing the same object simultaneously and converging once they've all seen the same changes, with no central server arbitrating each write. A shared draft, a collaborative document, or a whiteboard inside your chat product fits that description.

The message timeline doesn't. Each message, revision, and tombstone is an immutable record once written, so the log is append-only and carries none of the write-write conflicts that CRDTs exist to resolve. Reaching for one here adds metadata and merge complexity for a problem you don't have. Use a CRDT for the collaborative surface and a server-authoritative log for the messages around it.

Can I Just Use Push Notifications to Sync Messages?

No. Push notifications are a wake-up signal, not a transport for state.

FCM and APNs are both best-effort. FCM accepting a message doesn't mean it reached the device; it may sit in storage until the connection resumes, and a TTL controls how long the service keeps trying. APNs can reorder notifications and may hold them for up to thirty days, or drop them sooner, depending on the expiration you set.

A sync layer that trusts either one as its source of truth inherits all of that: dropped messages, out-of-order state, and data loss with no error to point at.

Push should do one narrow thing: wake the app to run the same delta sync it would on any reconnect. The notification can include a hint of what changed, but the actual data comes from the sync endpoint. That keeps correctness independent of a delivery channel you don't control.
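A push handler built on that idea stays very small. In the sketch below, `LocalStore`, `on_push`, and the payload fields `channel_id` and `preview` are hypothetical, and the server log is passed in directly as a stand-in for the sync endpoint.

```python
class LocalStore:
    """Local message store that only ever changes through a delta sync."""

    def __init__(self, server_log: list):
        self.server_log = server_log  # stand-in for the real sync endpoint
        self.token = 0
        self.messages = []

    def delta_sync(self, channel_hint=None) -> None:
        # The hint could narrow the request in a real client; here every
        # sync fetches whatever was appended after the last token.
        events = self.server_log[self.token:]
        self.messages.extend(events)
        self.token = len(self.server_log)


def on_push(store: LocalStore, payload: dict):
    """Treat the push as a wake-up: sync, and never write the payload into state."""
    store.delta_sync(channel_hint=payload.get("channel_id"))
    return payload.get("preview")  # used for the notification banner only


# Two messages were sent, but only the push for the second one arrived.
server_log = [{"id": "m1", "text": "first"}, {"id": "m2", "text": "second"}]
store = LocalStore(server_log)
preview = on_push(store, {"channel_id": "general", "preview": "second"})
```

After the handler runs, the store holds both messages even though the push for the first was never delivered, and a duplicate push simply triggers an empty delta.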

The Default That Survives Bad Networks

These answers point to one simple architecture:

Keep durable content in a server-authoritative log with stable IDs and a defined order, and have clients reconcile against it with a delta sync on reconnect.
Make every client write idempotent and queue it in durable local storage, so retries are safe and nothing is lost if the app is killed.
Expire ephemeral state instead of replaying it, record deletes as tombstones, and reduce pushes to wake-ups that trigger a sync.
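Expiring ephemeral state can be as simple as attaching a time-to-live to each signal. This sketch assumes a five-second TTL and a hypothetical `TypingIndicators` class; the clock is passed in explicitly so the behavior is easy to test.

```python
class TypingIndicators:
    """Typing signals that expire on their own instead of being replayed."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._last_seen = {}  # user_id -> time of the most recent typing signal

    def mark_typing(self, user_id: str, now: float) -> None:
        self._last_seen[user_id] = now

    def active(self, now: float) -> list:
        """Drop signals older than the TTL and return who is still typing."""
        self._last_seen = {
            user: seen for user, seen in self._last_seen.items()
            if now - seen < self.ttl
        }
        return sorted(self._last_seen)


indicators = TypingIndicators(ttl_seconds=5.0)
indicators.mark_typing("alice", now=0.0)
indicators.mark_typing("bob", now=4.0)
```

A client that reconnects twenty minutes later calls `active()` with the current time and sees an empty list, with no replay logic involved.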

None of it is exotic, but it has a lot of moving parts, and the failures show up only on bad networks and across multiple devices, when they're hardest to reproduce. Stream Chat handles offline storage, reconnection, and state sync, so reconnect ordering and delta reconciliation are built in rather than rediscovered one bug at a time.

Whatever you build on, the decision that matters most is the durable, ordered log underneath. It's what lets a client reconnect after a long gap and still end up correct, and it's the hardest part to add once everything else is already in place.