Back to TechSheets
Local-FirstWebAssemblyCRDTSQLiteDistributedSystems

Beyond the Cloud: Orchestrating Deterministic Local-First State with SQLite-WASM and Hybrid CRDTs

Thanga MariappanSenior Architect
7 min read
Mar 26, 2026
Beyond the Cloud: Orchestrating Deterministic Local-First State with SQLite-WASM and Hybrid CRDTs

Beyond the Cloud: Orchestrating Deterministic Local-First State with SQLite-WASM and Hybrid CRDTs

For the past decade, web development has been dominated by the "Cloud-First" paradigm. We built thin clients that acted as mere views for centralized databases, treating every user interaction as a potential round-trip to a server located hundreds of miles away. However, the tide is shifting. A new architectural movement—Local-First Development—is emerging, promising apps that are instantly responsive, work seamlessly offline, and give users true ownership of their data.

But moving the primary data store to the browser isn't as simple as slapping an IndexDB layer over your API. It requires a fundamental rethink of state synchronization and conflict resolution. In this post, we will explore the cutting-edge combination of SQLite-WASM (leveraging the Origin Private File System) and Hybrid Conflict-free Replicated Data Types (CRDTs) to build a robust, deterministic synchronization engine.

The Architectural Shift: Why Local-First?

In a traditional SPA (Single Page Application), the UI is often a "lie." We use optimistic UI updates to mask the fact that the data hasn't actually been persisted yet. If the network fails, the lie collapses, and the user is met with "Retry" buttons or lost progress.

Local-first changes the hierarchy. The local device is the source of truth. The network becomes an asynchronous background task. This approach offers three major advantages:

  1. Latency Elimination: UI interactions happen at the speed of the local CPU and disk storage (under 1ms).
  2. Robust Offline Support: There is no "offline mode" because being offline is a natural state of the system.
  3. Collaborative Potential: By treating the database as a distributed system, multi-user collaboration becomes a synchronization problem rather than a locking problem.

The Foundation: SQLite-WASM and OPFS

Until recently, persisting large amounts of relational data in the browser was clunky. IndexedDB has a notoriously difficult API and performance bottlenecks. Enter SQLite-WASM.

With the introduction of the Origin Private File System (OPFS), browsers now provide a high-performance storage endpoint that allows WebAssembly-based SQLite to achieve near-native read/write speeds. This allows us to run a full relational database engine inside a Web Worker, providing a powerful SQL interface for our local state.

Setting up the Persistent Store

To ensure our database doesn't block the main UI thread, we instantiate SQLite within a dedicated Worker. Here is how we initialize the OPFS-backed storage:

// worker.js
import sqlite3InitModule from '@sqlite.org/sqlite-wasm';

const initializeDB = async () => {
  const sqlite3 = await sqlite3InitModule({
    print: console.log,
    printErr: console.error,
  });

  if ('opfs' in sqlite3) {
    const db = new sqlite3.oo1.OpfsDb('/my_app_db.sqlite3');
    console.log('Connected to SQLite via OPFS:', db.filename);
    return db;
  } else {
    return new sqlite3.oo1.DB('/local.sqlite3', 'ct');
  }
};

The Conflict Problem: Enter Hybrid CRDTs

Storing data locally is easy; syncing it across three different devices without losing data is hard. If User A and User B both edit the same document while offline, how do we merge their changes? Standard SQL databases use Last-Write-Wins (LWW) based on server timestamps, which often leads to data loss.

CRDTs (Conflict-free Replicated Data Types) solve this by providing data structures that are mathematically guaranteed to converge to the same state, regardless of the order in which operations are received. However, pure CRDTs (like Automerge or Yjs) can be memory-intensive when dealing with massive datasets.

The Hybrid Approach

A Hybrid CRDT approach involves using a relational database (SQLite) for high-speed querying and storage, while augmenting rows with metadata (version vectors or logical clocks) to handle synchronization.

Instead of syncing the entire database file, we sync delta-changesets. Each row in our SQLite table is treated as an element in a LWW-Element-Set. To implement this, every table must include three hidden columns:

  1. hlc: A Hybrid Logical Clock timestamp.
  2. origin: A unique ID for the device that made the change.
  3. is_deleted: A boolean flag (tombstone) to handle deletions.

Implementing Hybrid Logical Clocks (HLC)

We cannot rely on system clocks because they can drift or be manipulated by users. HLCs combine physical clock time with a logical counter to ensure a strict partial ordering of events.

class HLC {
  ts: number;
  count: number;
  node: string;

  constructor(node: string, ts = Date.now(), count = 0) {
    this.node = node;
    this.ts = ts;
    this.count = count;
  }

  // Generate a new HLC for a local mutation
  increment() {
    const now = Date.now();
    if (now > this.ts) {
      this.ts = now;
      this.count = 0;
    } else {
      this.count++;
    }
    return this.serialize();
  }

  serialize() {
    return `${this.ts.toString().padStart(15, '0')}:${this.count.toString(16).padStart(4, '0')}:${this.node}`;
  }
}

The Sync Engine: Merging Remote Changes

When a remote change arrives, our engine performs an "upsert" logic based on the HLC. If the incoming HLC is greater than the local HLC for that specific record (and column), the change is applied. If it is smaller, the change is ignored as "stale."

SQL Schema for Syncable Tables

CREATE TABLE tasks (
  id TEXT PRIMARY KEY,
  title TEXT,
  status TEXT,
  updated_at_hlc TEXT, -- The HLC of the last modification
  is_deleted INTEGER DEFAULT 0
);

The Merge Logic

async function applyRemoteChange(db, remoteRow) {
  const localRow = db.exec("SELECT updated_at_hlc FROM tasks WHERE id = ?", {
    bind: [remoteRow.id],
    returnValue: "resultRows"
  })[0];

  // Compare HLC strings lexicographically
  if (!localRow || remoteRow.updated_at_hlc > localRow[0]) {
    db.run(`
      INSERT INTO tasks (id, title, status, updated_at_hlc, is_deleted)
      VALUES (?, ?, ?, ?, ?)
      ON CONFLICT(id) DO UPDATE SET
        title = excluded.title,
        status = excluded.status,
        updated_at_hlc = excluded.updated_at_hlc,
        is_deleted = excluded.is_deleted
    `, [remoteRow.id, remoteRow.title, remoteRow.status, remoteRow.updated_at_hlc, remoteRow.is_deleted]);
  }
}

Real-World Use Case: Collaborative Project Management

Imagine a project management tool where users can rearrange tasks on a Kanban board in a subway tunnel with no signal.

  1. User A moves a task from "To Do" to "Doing." SQLite records the change locally, generates a new HLC, and flags the row for sync.
  2. User B (online) renames that same task.
  3. When User A reconnects, the sync engine compares the HLCs. If User B's rename happened physically later (or has a higher logical count), the task's title is updated, but its status remains "Doing" because those two operations affected different properties, or the status move had a higher HLC.

Because we are using SQLite, we can perform complex analytical queries (e.g., "Give me all tasks assigned to me completed in the last 7 days") locally without ever hitting the network.

Overcoming Challenges

1. Storage Limits

While OPFS allows for significant storage (often gigabytes), browsers can still evict data if the device runs low on disk space. Implementing a robust backup/restore flow from a central S3 bucket or server-side Postgres instance is still necessary for durability.

2. Schema Migrations

In a local-first world, you cannot migrate all your users' databases at once. You must design your application to handle multiple versions of the schema simultaneously or include migration scripts that run on the client-side when the WASM module initializes.

3. Security

Since the user has direct access to the SQLite file, you must treat all incoming sync data as untrusted. Server-side validation of the CRDT deltas is essential to prevent malicious actors from injecting invalid states into the global mesh.

Conclusion

The combination of SQLite-WASM and Hybrid CRDTs represents a massive leap forward for web architecture. By treating the browser as a first-class data node rather than a volatile cache, we can build applications that are more resilient, more performant, and more respectful of the user's time and connectivity.

We are moving toward a "Distributed Web" where the cloud acts as a relay and a backup, rather than a gatekeeper. As tools like cr-sqlite and wa-sqlite mature, the barrier to entry for local-first development will continue to drop, making it the standard for the next generation of professional web applications.

0Likes
Share this techsheet
Share TechSheet

Discussion

0 characters