Production Webhook Queue with BullMQ and Redis: The Pattern That Actually Scales

typescript dev.to

Production Webhook Queue with BullMQ and Redis: The Pattern That Actually Scales

Every SaaS eventually breaks when a webhook handler times out. Here's the architecture that prevents it.

Why Naive Webhook Handlers Fail

The typical webhook setup:

// ❌ This will hurt you in production
export async function POST(req: Request) {
  const event = await req.json();

  // These can take 5-30 seconds:
  await sendWelcomeEmail(event.user);
  await provisionStripeCustomer(event.user);
  await createDefaultWorkspace(event.user);
  await syncToCRM(event.user);

  return Response.json({ ok: true });
}
Enter fullscreen mode Exit fullscreen mode

Problems:

  • Stripe retries after 30s if no response. You'll process the same event 3 times.
  • Any single step failing kills the whole flow with no retry.
  • No visibility into what failed or why.
  • Vercel functions time out at 10s on hobby plans.

The Queue Architecture

Webhook arrives → Validate signature → Enqueue job → Return 200 immediately
                                           ↓
                                    BullMQ worker processes async:
                                    - Retry on failure (exponential backoff)
                                    - Dead-letter queue for permanent failures
                                    - Full job history for debugging
Enter fullscreen mode Exit fullscreen mode

Implementation

1. Queue Setup

// lib/queue.ts
import { Queue, Worker, QueueEvents } from 'bullmq';
import IORedis from 'ioredis';

const connection = new IORedis(process.env.REDIS_URL!, {
  maxRetriesPerRequest: null, // required for BullMQ
});

export const webhookQueue = new Queue('webhooks', {
  connection,
  defaultJobOptions: {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 2000, // 2s, 4s, 8s, 16s, 32s
    },
    removeOnComplete: { count: 1000 }, // keep last 1000 for debugging
    removeOnFail: { count: 5000 },     // keep failures longer
  },
});
Enter fullscreen mode Exit fullscreen mode

2. Webhook Route (Fast Path)

// app/api/webhooks/stripe/route.ts
import { webhookQueue } from '@/lib/queue';
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
  const body = await req.text();
  const sig = req.headers.get('stripe-signature')!;

  let event: Stripe.Event;
  try {
    event = stripe.webhooks.constructEvent(
      body,
      sig,
      process.env.STRIPE_WEBHOOK_SECRET!
    );
  } catch {
    return Response.json({ error: 'Invalid signature' }, { status: 400 });
  }

  // Enqueue and return immediately — don't await processing
  await webhookQueue.add(
    event.type,           // job name for filtering
    { event },            // job data
    { jobId: event.id }   // idempotency: same event = same job ID = no duplicate
  );

  return Response.json({ received: true }); // Stripe sees 200 in <100ms
}
Enter fullscreen mode Exit fullscreen mode

3. Worker (Actual Processing)

// workers/webhook-worker.ts
import { Worker } from 'bullmq';
import { connection } from '@/lib/queue';

const worker = new Worker(
  'webhooks',
  async (job) => {
    const { event } = job.data;

    switch (event.type) {
      case 'checkout.session.completed': {
        const session = event.data.object as Stripe.Checkout.Session;
        await handleCheckoutCompleted(session);
        break;
      }
      case 'customer.subscription.deleted': {
        const sub = event.data.object as Stripe.Subscription;
        await handleSubscriptionCancelled(sub);
        break;
      }
      default:
        // Log unknown events for observability
        console.log(`Unhandled event type: ${event.type}`);
    }
  },
  {
    connection,
    concurrency: 5, // process 5 jobs simultaneously
  }
);

worker.on('failed', (job, err) => {
  console.error(`Job ${job?.id} failed:`, err.message);
  // Alert to Slack/PagerDuty after max retries
  if (job?.attemptsMade === job?.opts.attempts) {
    notifyPermanentFailure(job);
  }
});
Enter fullscreen mode Exit fullscreen mode

4. Idempotent Handlers

Even with queuing, design handlers to be safe to run twice:

async function handleCheckoutCompleted(session: Stripe.Checkout.Session) {
  const userId = session.metadata?.userId;
  if (!userId) throw new Error('Missing userId in metadata');

  // Upsert — safe to run multiple times
  await db.subscription.upsert({
    where: { stripeCustomerId: session.customer as string },
    create: {
      userId,
      stripeCustomerId: session.customer as string,
      status: 'active',
      plan: session.metadata?.plan ?? 'starter',
    },
    update: {
      status: 'active',
    },
  });

  // Check before sending — don't double-email
  const alreadySent = await db.emailLog.findUnique({
    where: { key: `welcome-${userId}` },
  });
  if (!alreadySent) {
    await sendWelcomeEmail(userId);
    await db.emailLog.create({ data: { key: `welcome-${userId}` } });
  }
}
Enter fullscreen mode Exit fullscreen mode

Monitoring With Bull Board

// app/api/admin/queues/route.ts
import { createBullBoard } from '@bull-board/api';
import { BullMQAdapter } from '@bull-board/api/bullMQAdapter';
import { ExpressAdapter } from '@bull-board/express';

// Gives you a UI dashboard at /admin/queues
// Shows pending, active, completed, failed jobs
// Retry failed jobs with one click
Enter fullscreen mode Exit fullscreen mode

The Deployment Stack

  • Redis: Upstash (serverless, $0 for low volume) or Railway Redis
  • Worker: Long-running Node.js process (Railway, Fly.io, or a dedicated EC2)
  • Queue dashboard: Bull Board behind auth middleware
  • Alerts: Worker failed event → Slack webhook for permanent failures

This architecture handles thousands of webhooks per minute and survives any downstream outage gracefully.


Skip the Infrastructure Setup

The Workflow Automator MCP includes pre-built queue patterns, Stripe webhook templates, and BullMQ worker configurations — deploy a production-grade webhook system in an afternoon instead of a week.

$15/mo → whoffagents.com


Building on BullMQ? What pattern have you found most useful? Comments below.

Source: dev.to

arrow_back Back to Tutorials