Plaid Transactions Sync: Production Lessons

Why I Almost Switched Back to Polling

When I shipped the first version of bbbudget's Plaid integration, I went with the legacy approach: poll on a timer, reconcile against what I had stored, repeat. It worked fine in sandbox. In production with real users, it started falling apart.

Transactions showed up hours stale. Merchant names updated without my local copy noticing. Pending transactions flipped to posted in a way that created near-duplicates I had to clean up manually. And the reconciliation function had grown into something I was scared to touch.

About two months in, I rewrote the whole thing around Plaid's /transactions/sync API.

The rewrite fixed most of those problems. But it also taught me things I wish someone had written down clearly. This is that post.

What Was Wrong With the Old Pattern

The old approach — what Plaid calls the "local copy" pattern — goes like this:

1. User connects a bank via Plaid Link → you receive a public_token
2. Exchange it for an access_token and store it
3. Listen for webhooks: INITIAL_UPDATE, HISTORICAL_UPDATE, DEFAULT_UPDATE
4. On each webhook, call /transactions/get with a date range
5. Diff the returned transactions against your local database and reconcile

Steps 1–4 are manageable. Step 5 is where things break.

"Reconcile" means: upsert returned transactions by transaction_id, detect removals (things in your DB that Plaid no longer returns for that date range), and handle modifications (same ID, different amount, merchant name, or pending status).

The removal detection is the worst part. You call /transactions/get with a fixed date window — say 90 days. If Plaid removes a transaction older than that window, you'll never know. If you use too short a window and a modification happens to an older transaction, same problem. You're working from a snapshot, and snapshots lie by omission.
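To make the blind spot concrete, here is a minimal sketch of window-based removal detection. The names (`LocalTxn`, `detectRemovals`) are hypothetical and not taken from bbbudget's code. The only point is that a local row dated before the window start can never be flagged, whether or not Plaid still has it.

```typescript
// Hypothetical shape for illustration; not Plaid's full transaction object.
interface LocalTxn {
  transactionId: string;
  date: string; // ISO date, e.g. "2026-04-15"
}

// Removal detection under the snapshot model: a local row counts as removed
// only if its date falls inside the fetched window and the snapshot omits it.
function detectRemovals(
  local: LocalTxn[],
  snapshotIds: Set<string>,
  windowStart: string,
): string[] {
  return local
    // ISO dates compare correctly as plain strings.
    .filter((t) => t.date >= windowStart)
    .filter((t) => !snapshotIds.has(t.transactionId))
    .map((t) => t.transactionId);
}

const localRows: LocalTxn[] = [
  { transactionId: "old", date: "2025-11-01" },    // before the window
  { transactionId: "recent", date: "2026-04-01" }, // inside the window
];

// Suppose Plaid has dropped both transactions, so the snapshot is empty.
const flagged = detectRemovals(localRows, new Set<string>(), "2026-01-15");
console.log(flagged); // ["recent"]; "old" is never flagged and stays in the DB
```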

For a single user this is workable. For any real volume, or any user with years of history and multiple accounts, you own the diff logic and every edge case in it. And you will get it wrong. I got it wrong multiple times before I gave up and switched to the sync API.

How /transactions/sync Actually Works

The sync API is cursor-based. Think of it like a git log: you tell Plaid where you left off (your cursor), and it tells you exactly what changed since then.

Call /transactions/sync with an access_token and your stored cursor (or no cursor for the first call). Plaid returns:

added — transactions that are new since your cursor
modified — transactions that changed (amount, merchant name, pending status, etc.)
removed — transaction IDs that were deleted
has_more — whether there are more pages of changes to fetch
next_cursor — store this before your next call

Your application logic collapses to something like this:

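// The official plaid-node client returns an Axios-style response,
// so the payload lives on `.data`
const response = await plaidClient.transactionsSync({ access_token, cursor });
const { added, modified, removed, next_cursor, has_more } = response.data;

// Apply the diff and advance the cursor together.
// These writes are only atomic if they run inside one DB transaction.
await db.upsert('transactions', added);
await db.upsert('transactions', modified);
await db.deleteMany('transactions', removed.map(r => r.transaction_id));
await db.update('plaid_items', { cursor: next_cursor }, { where: { item_id } });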

If has_more is true, paginate: call again with the cursor you just received, repeat until has_more is false, then store the final cursor.
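A minimal sketch of that pagination loop, assuming a hypothetical `fetchPage` function that wraps the /transactions/sync call for one Item. `SyncPage`, `FetchPage`, and `drainSync` are illustrative names, not part of the Plaid SDK, and the response type lists only the fields used here.

```typescript
// Minimal response shape; the real Plaid response carries more fields.
interface SyncPage<T> {
  added: T[];
  modified: T[];
  removed: { transaction_id: string }[];
  next_cursor: string;
  has_more: boolean;
}

// Hypothetical fetcher: one /transactions/sync call for a given cursor.
type FetchPage<T> = (cursor: string | undefined) => Promise<SyncPage<T>>;

// Drain every page since `startCursor`, buffering the changes so the caller
// can apply them and persist the final cursor in a single write.
async function drainSync<T>(fetchPage: FetchPage<T>, startCursor?: string) {
  const added: T[] = [];
  const modified: T[] = [];
  const removed: string[] = [];
  let cursor = startCursor;
  let hasMore = true;

  while (hasMore) {
    const page = await fetchPage(cursor);
    added.push(...page.added);
    modified.push(...page.modified);
    removed.push(...page.removed.map((r) => r.transaction_id));
    cursor = page.next_cursor; // feed the new cursor into the next call
    hasMore = page.has_more;
  }
  return { added, modified, removed, cursor };
}
```

Nothing is written until the loop finishes, so a failure mid-pagination leaves the stored cursor untouched and the whole run can simply be retried from `startCursor`. That matters because Plaid documents a TRANSACTIONS_SYNC_MUTATION_DURING_PAGINATION error, after which pagination has to restart from the first page's cursor.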

You're no longer managing the diff — Plaid manages it. You just apply it.

One thing worth knowing: the cursor is per-Item, not per-account. A single Chase connection is one Item, even if it has checking, savings, and a credit card under it. A second bank (Capital One, Amex, etc.) is a separate Item with its own cursor. Manage them independently.

Two Gotchas the Docs Gloss Over

These two cost me embarrassing amounts of debugging time.

Gotcha 1: You must bootstrap the cursor before webhooks will fire.

SYNC_UPDATES_AVAILABLE webhooks are only sent for an Item after you've called /transactions/sync at least once for that Item. Plaid doesn't send a webhook to get you started — it waits for you to initialize.

My first implementation: user finishes Plaid Link, I store the access_token, then wait for a webhook. Nothing came. Every user who connected a bank and opened the app saw an empty transaction list.

The fix: immediately after exchanging the public_token for an access_token, kick off a first /transactions/sync call to bootstrap the cursor. This tells Plaid to start tracking changes for that Item.

Gotcha 2: The first sync call can be very slow. Don't make it in a request handler.

Plaid documents this, but it's easy to miss: for Items with substantial transaction history, the first call to /transactions/sync (once Plaid has finished loading historical data) can be up to 8x slower than later calls. For a user connecting a bank with years of history across multiple accounts, this can take several seconds.

If you handle this synchronously in a Next.js route handler, you'll hit serverless timeout limits. I hit Vercel's edge function limit during beta — users got error screens at exactly the wrong moment, right after connecting their bank.

The fix: return success as soon as the public_token has been exchanged and the access_token stored, then schedule the initial sync as a background job. In Convex, I do this with ctx.scheduler.runAfter(0, internal.plaid.syncItem, { itemId }). The user sees their bank show up as "syncing," then transactions appear as the background job completes, with no timeout and no error screen.
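Both gotchas come down to the order of operations right after Link finishes. The sketch below shows that order with injected dependencies so it can be read without Plaid or Convex. `LinkDeps` and `handleLinkSuccess` are hypothetical names, and in a Convex app `scheduleInitialSync` would presumably be the `ctx.scheduler.runAfter(0, ...)` call.

```typescript
// Hypothetical dependencies; none of these names come from the Plaid or Convex SDKs.
interface LinkDeps {
  exchangePublicToken(publicToken: string): Promise<{ accessToken: string; itemId: string }>;
  saveItem(item: { itemId: string; accessToken: string; status: "syncing" }): Promise<void>;
  // Must only enqueue the job, never wait for the sync itself.
  scheduleInitialSync(itemId: string): Promise<void>;
}

async function handleLinkSuccess(deps: LinkDeps, publicToken: string) {
  // 1. The exchange is a single call, so it stays on the request path.
  const { accessToken, itemId } = await deps.exchangePublicToken(publicToken);

  // 2. Persist the Item as "syncing" so the UI has something to show.
  await deps.saveItem({ itemId, accessToken, status: "syncing" });

  // 3. Enqueue the first /transactions/sync. That call bootstraps the cursor,
  //    which is what allows SYNC_UPDATES_AVAILABLE webhooks to start arriving.
  await deps.scheduleInitialSync(itemId);

  // 4. Respond right away; transactions show up when the job completes.
  return { itemId, status: "syncing" as const };
}
```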

Storing Transaction Diffs in Convex

Convex is a good fit for this specific problem. Mutations are transactional by default, which means applying a Plaid sync result (inserts, updates, and deletes) either commits in full or not at all. No partial state.

My data model:

// convex/schema.ts
import { defineSchema, defineTable } from 'convex/server';
import { v } from 'convex/values';

export default defineSchema({
  // The budgets and tags tables referenced below are omitted here.
  transactions: defineTable({
    budgetId: v.id('budgets'),
    plaidTransactionId: v.string(),
    accountId: v.string(),
    amount: v.number(),
    date: v.string(),          // ISO date, e.g. "2026-04-15"
    name: v.string(),
    merchantName: v.optional(v.string()),
    pending: v.boolean(),
    categoryTag: v.optional(v.id('tags')),
  }).index('by_budget_date', ['budgetId', 'date']),

  plaidItems: defineTable({
    budgetId: v.id('budgets'),
    itemId: v.string(),
    accessToken: v.string(),   // encrypted at rest
    cursor: v.optional(v.string()),
    status: v.union(
      v.literal('active'),
      v.literal('syncing'),
      v.literal('error'),
    ),
  }),
});

The key architectural split: a Convex action makes the external HTTP call to Plaid (actions can call external APIs), then calls a mutation to apply the diff atomically (mutations cannot call external APIs). Keeping them separate means the mutation stays fast, consistent, and transactional.
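The shape of that split can be sketched without Convex at all. Below, an in-memory `Store` stands in for the database, `applyDiff` plays the role of the mutation (synchronous, no I/O), and `syncOnce` plays the role of the action. All of these names are hypothetical, and `fetchDiff` is an assumed wrapper around the Plaid call.

```typescript
interface Txn {
  transaction_id: string;
  amount: number;
  pending: boolean;
}

interface Diff {
  added: Txn[];
  modified: Txn[];
  removed: string[]; // transaction IDs
  nextCursor: string;
}

// In-memory stand-in for the database, for illustration only.
interface Store {
  transactions: Map<string, Txn>;
  cursor?: string;
}

// "Mutation" half: no network calls, touches only the store. In Convex this
// would be the body of a mutation, which commits all-or-nothing.
function applyDiff(store: Store, diff: Diff): void {
  for (const t of [...diff.added, ...diff.modified]) {
    store.transactions.set(t.transaction_id, t); // upsert by Plaid ID
  }
  for (const id of diff.removed) {
    store.transactions.delete(id);
  }
  store.cursor = diff.nextCursor; // the cursor advances with the data, never alone
}

// "Action" half: does the slow external call, then hands off to applyDiff.
async function syncOnce(
  store: Store,
  fetchDiff: (cursor?: string) => Promise<Diff>,
): Promise<void> {
  const diff = await fetchDiff(store.cursor);
  applyDiff(store, diff);
}
```

Writing the cursor in the same step as the transactions is the design choice that matters: if the write fails, the old cursor is still in place and the next run fetches the same changes again.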

The other payoff is Convex's reactive query system. When the mutation commits, every active useQuery that touches the transactions table re-runs automatically on every subscribed client. Both people sharing a budget see new transactions appear without polling or manual refresh. That's not something I built — it's how Convex works out of the box.

Recovering From Missed Webhooks

Webhooks miss. Not often in my experience with Plaid, but a deployment restart, a network hiccup, or an upstream blip can drop a SYNC_UPDATES_AVAILABLE delivery. If that happens and you have no fallback, your users' transaction data is stuck until the next webhook fires naturally.

My safety net is a Convex scheduled function that calls /transactions/sync for every active Item every 6 hours:

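// convex/crons.ts
import { cronJobs } from 'convex/server';
import { internal } from './_generated/api';

const crons = cronJobs();
crons.interval(
  'sync-all-items',
  { hours: 6 },
  internal.crons.syncAllItems,
);

// Convex only registers cron jobs from the default export
export default crons;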

This is a fallback, not the primary path. Webhook-driven sync handles nearly all updates within minutes. The cron catches anything that got stuck.
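On the primary path, the webhook handler only has to recognize one webhook and pull out the Item it refers to. A minimal sketch, assuming the body fields `webhook_type`, `webhook_code`, and `item_id` as Plaid documents them for this webhook; `isSyncWebhook` and `itemToSync` are hypothetical helper names. Signature verification is left out here and would need to happen before any of this runs.

```typescript
// Only the fields this handler relies on; Plaid sends more.
interface SyncWebhook {
  webhook_type: "TRANSACTIONS";
  webhook_code: "SYNC_UPDATES_AVAILABLE";
  item_id: string;
}

// Narrow an untrusted JSON value to the one webhook this handler acts on.
function isSyncWebhook(body: unknown): body is SyncWebhook {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    b["webhook_type"] === "TRANSACTIONS" &&
    b["webhook_code"] === "SYNC_UPDATES_AVAILABLE" &&
    typeof b["item_id"] === "string"
  );
}

// Returns the Item to sync, or null when the webhook should be ignored.
function itemToSync(rawBody: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null; // body was not valid JSON
  }
  return isSyncWebhook(parsed) ? parsed.item_id : null;
}
```

Anything that returns null can be acknowledged and dropped; anything else gets handed to the same sync job the cron uses.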

Idempotency matters here. Because the cron and a webhook might both fire within the same window, you'll occasionally process the same Plaid response twice. Since Plaid transaction IDs are stable (the same transaction always has the same transaction_id), I handle this with an upsert pattern — if a transaction already exists with the same plaidTransactionId and identical data, the write is a no-op. No duplicates.
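A minimal sketch of that upsert, using an in-memory Map in place of the transactions table. `StoredTxn` and `upsertTxn` are hypothetical names, and only three fields are compared to keep the example short; a real version would compare every stored field.

```typescript
interface StoredTxn {
  plaidTransactionId: string;
  amount: number;
  name: string;
  pending: boolean;
}

type WriteResult = "inserted" | "updated" | "unchanged";

// Upsert keyed on plaidTransactionId; identical data results in no write.
function upsertTxn(table: Map<string, StoredTxn>, incoming: StoredTxn): WriteResult {
  const existing = table.get(incoming.plaidTransactionId);
  if (!existing) {
    table.set(incoming.plaidTransactionId, incoming);
    return "inserted";
  }
  const same =
    existing.amount === incoming.amount &&
    existing.name === incoming.name &&
    existing.pending === incoming.pending;
  if (same) return "unchanged";
  table.set(incoming.plaidTransactionId, incoming);
  return "updated";
}

const txnTable = new Map<string, StoredTxn>();
const coffee: StoredTxn = { plaidTransactionId: "txn_1", amount: 4.5, name: "Coffee", pending: true };

console.log(upsertTxn(txnTable, coffee));                        // "inserted"
console.log(upsertTxn(txnTable, coffee));                        // "unchanged" (replayed response)
console.log(upsertTxn(txnTable, { ...coffee, pending: false })); // "updated"
console.log(txnTable.size);                                      // 1
```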

One more detail: before the cron calls Plaid, it checks the Item's status field. If it's already syncing (because a webhook just fired), it skips that Item. This prevents two sync jobs running concurrently for the same Item and creating a cursor race condition.
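One way to make that status check safe is to treat it as a claim: read the status and flip it to syncing in the same transaction (in Convex, inside a single mutation). The sketch below uses a Map as a stand-in for the plaidItems table. `tryClaimItem` and `releaseItem` are hypothetical names, and only claiming Items that are currently active is an assumption based on the cron syncing "every active Item".

```typescript
type ItemStatus = "active" | "syncing" | "error";

// Claim an Item for syncing. Returns false if it is unknown, already being
// synced, or in an error state. The check and the write must happen in one
// transaction; splitting them across two steps would reopen the race.
function tryClaimItem(items: Map<string, ItemStatus>, itemId: string): boolean {
  if (items.get(itemId) !== "active") return false;
  items.set(itemId, "syncing");
  return true;
}

// Release the claim when the sync job finishes, successfully or not.
function releaseItem(items: Map<string, ItemStatus>, itemId: string, ok: boolean): void {
  items.set(itemId, ok ? "active" : "error");
}

const items = new Map<string, ItemStatus>([["item_1", "active"]]);
console.log(tryClaimItem(items, "item_1")); // true  (the webhook job wins)
console.log(tryClaimItem(items, "item_1")); // false (the cron skips this Item)
releaseItem(items, "item_1", true);
console.log(tryClaimItem(items, "item_1")); // true  (free again after release)
```

If a job dies while holding the claim, the Item stays in syncing. A real version would likely need a timeout or a claimed-at timestamp, which this sketch omits.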

What This Looks Like for Two People Sharing a Budget

All of this engineering exists so that my wife and I can open the same bbbudget and trust we're looking at the same thing.

She makes a purchase. It posts at the bank. Plaid detects it and fires SYNC_UPDATES_AVAILABLE. My webhook handler triggers a Convex action that calls /transactions/sync. The mutation commits the new transaction. Both of our clients re-render automatically.

Neither of us does anything. The budget updates.

The number that matters most in bbbudget — how much is left in the budget this month — stays accurate without either person doing a manual sync or refresh. For a couples budgeting app, this matters more than it might for a single-user tool. If there's lag, one person is making decisions on stale data while the other is spending. That's how end-of-month surprises happen.

The webhook-first approach, backed by a Convex cron as a safety net, is what keeps both of us looking at the same reality. It took more iteration than I expected, but the two gotchas above — bootstrapping the cursor and moving the first sync off the request path — are the critical pieces.

If you're building something similar and want to compare notes, the full product is at bbbudget.com.

Frequently Asked Questions

What is Plaid's /transactions/sync API and how is it different from /transactions/get?

/transactions/sync is cursor-based and returns only the changes (added, modified, removed) since your last call. /transactions/get returns a full snapshot for a date range, requiring you to diff it against your local database yourself. The sync API eliminates custom reconciliation logic and handles edge cases like late-arriving modifications automatically.

Why isn't my SYNC_UPDATES_AVAILABLE webhook firing?

SYNC_UPDATES_AVAILABLE webhooks only fire for an Item after you've called /transactions/sync at least once for that Item. You must bootstrap the cursor with an initial sync call immediately after exchanging the public_token for an access_token — Plaid won't send webhooks until you do.

How do I handle the slow first call to /transactions/sync?

For Items with substantial history, the first call to /transactions/sync can be up to 8x slower than subsequent calls. Don't make it synchronously in a serverless request handler, or you risk hitting timeout limits. Instead, return success to the frontend immediately and trigger the initial sync as a background job.

How do I avoid duplicate transactions when using the sync API?

Use an upsert pattern keyed on plaidTransactionId. Since Plaid transaction IDs are stable, inserting a transaction that already exists with the same data is a no-op. This makes your sync handler idempotent — you can safely call it from both webhooks and a polling fallback without creating duplicates.

Is Convex a good fit for storing Plaid transaction data?

Yes, for a few reasons: mutations are transactional (you can apply adds, updates, and deletes atomically), reactive queries mean clients update automatically when new transactions land, and the action/mutation split maps cleanly to the Plaid sync workflow — actions call the external API, mutations write to the database.
