Rate-Limited Webhook Delivery for Video Events with PHP 8.4 and SQLite

Last spring, one of our partner endpoints received roughly 1,400 webhook POSTs in under two minutes, answered with a wall of 429s, and our naive retry loop cheerfully doubled the traffic in response. The trigger was mundane: two of our regional ingest cron jobs (US and GB) overlapped for the first time after a schedule change, and every video event they produced was being delivered inline, synchronously, the moment the row was written. At TrendVidStream we run streaming-platform discovery across 8 regions, each with its own staggered cron schedule pulling trending video metadata, so "two regions finishing at once" is not an edge case — it is Tuesday.

This post walks through how we rebuilt webhook delivery as a persistent, rate-limited outbox on a deliberately boring stack: PHP 8.4, SQLite (the same file that already powers our FTS5 search index), and plain cron. No Redis, no message broker, no managed queue. Our deploy pipeline is FTP-based and targets shared LiteSpeed hosting, so every dependency we add is a dependency we have to ship, configure, and babysit on a box we don't control. The constraint turned out to be a feature: the resulting system is about 400 lines of PHP, fully inspectable with sqlite3 from a shell, and has delivered every event for the last three months without a single consumer complaint.

Why firing webhooks during ingest is a trap

Our first implementation was the obvious one. The ingest script loops over regions, upserts videos into SQLite, and for each new or materially changed row, fires a curl POST to every subscribed endpoint before moving on. It worked in development, where there was one region and one consumer. In production it failed in four distinct ways:

Ingest time became hostage to consumer latency. One partner endpoint that took 4 seconds to respond stretched a 40-second ingest run past 9 minutes. Cron schedules that were carefully staggered started overlapping, which created more simultaneous events, which made everything slower. A downstream consumer should never be able to slow down your ingest path.
Failures were ambiguous. When a request timed out, we had no idea whether the consumer had processed it. With no persisted delivery state, the safe options were "never retry" (drop events) or "always retry" (duplicate events). We were doing the second one, badly.
Bursts had no pacing. A single regional run produces 200–800 events (new videos, stat refreshes, removals). Two overlapping regions meant a thousand-plus POSTs fired in tight sequence at whatever rate PHP and the network allowed.
Retries amplified the damage. Our error handler retried failed requests immediately, three times. When an endpoint returned 429 because we were flooding it, we responded by sending each event three more times. That is a self-inflicted denial of service with extra steps.

The core realization was that delivery is its own subsystem with its own state, its own schedule, and its own failure domain. The moment an event happens, the ingest path's only job is to record that it happened. Getting it to consumers is somebody else's problem — specifically, a worker that runs on its own cron cadence and knows how to pace itself. This is the transactional outbox pattern, and it maps onto SQLite surprisingly well.

The outbox schema

Three tables: immutable event facts, subscriber endpoints with their rate configuration, and a delivery join table that carries the state machine. We run the database in WAL mode, which matters here because the ingest cron writes events at the same time the delivery worker reads and updates them; WAL lets the reader and writer proceed without blocking each other.

PRAGMA journal_mode = WAL;

CREATE TABLE webhook_events (
  id          INTEGER PRIMARY KEY,
  event_type  TEXT NOT NULL,        -- video.created | video.updated | video.removed
  video_id    TEXT NOT NULL,
  region      TEXT NOT NULL,        -- US, GB, DE, FR, IN, BR, AU, CA
  payload     TEXT NOT NULL,        -- JSON, frozen at event time
  created_at  INTEGER NOT NULL DEFAULT (unixepoch())
);

CREATE TABLE webhook_endpoints (
  id                  INTEGER PRIMARY KEY,
  url                 TEXT NOT NULL,
  secret              TEXT NOT NULL,
  rate_capacity       INTEGER NOT NULL DEFAULT 60,  -- max burst size
  rate_refill_per_sec REAL NOT NULL DEFAULT 1.0,    -- sustained rate
  tokens              REAL NOT NULL DEFAULT 60,
  tokens_updated_at   REAL NOT NULL DEFAULT 0,
  active              INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE webhook_deliveries (
  id              INTEGER PRIMARY KEY,
  event_id        INTEGER NOT NULL REFERENCES webhook_events(id),
  endpoint_id     INTEGER NOT NULL REFERENCES webhook_endpoints(id),
  status          TEXT NOT NULL DEFAULT 'pending',  -- pending | inflight | delivered | dead
  attempts        INTEGER NOT NULL DEFAULT 0,
  next_attempt_at INTEGER NOT NULL,
  last_http_code  INTEGER,
  last_error      TEXT,
  UNIQUE(event_id, endpoint_id)
);

CREATE INDEX idx_deliveries_due
  ON webhook_deliveries(status, next_attempt_at)
  WHERE status = 'pending';

Three decisions in this schema earned their keep:

The payload is frozen at event time. Consumers see the video metadata as it was when the event fired, not whatever it mutated into by delivery time. A video.updated event whose payload silently reflects three later updates is a debugging nightmare on the consumer side.
The partial index on (status, next_attempt_at) keeps the worker's "what is due?" scan fast even though the deliveries table accumulates millions of historical rows. SQLite only indexes rows where status = 'pending', so the index stays tiny no matter how much delivered history you keep.
The UNIQUE(event_id, endpoint_id) constraint makes enqueueing idempotent. If an ingest run crashes after enqueueing half its events and cron re-runs it, INSERT OR IGNORE guarantees no consumer ever gets a duplicate delivery row.

Note the unixepoch() default requires SQLite 3.38+. Every shared host we deploy to ships 3.45, so this is safe for us; check yours before copying.

A token bucket that survives between processes

The classic token bucket assumes a long-running process holding the bucket in memory. PHP under cron gives you the opposite: a process that boots, works for under a minute, and dies. So the bucket state lives in the endpoint row itself, and the refill is computed lazily from elapsed wall-clock time whenever someone asks for a token. No background refill loop needed — the math is identical.

The two knobs map directly onto real consumer requirements. rate_capacity is the burst allowance: how many requests an endpoint tolerates back-to-back. rate_refill_per_sec is the sustained rate. A partner who says "keep it under 2 per second, but bursts of 20 are fine" becomes capacity = 20, refill = 2.0. No translation layer, no guessing.

<?php
declare(strict_types=1);

final class TokenBucket
{
    public function __construct(private readonly PDO $db) {}

    /**
     * Try to take $count tokens for an endpoint.
     * Returns 0.0 if granted, otherwise seconds to wait until enough
     * tokens will have refilled.
     */
    public function take(int $endpointId, int $count = 1): float
    {
        $this->db->exec('BEGIN IMMEDIATE');
        try {
            $stmt = $this->db->prepare(
                'SELECT rate_capacity, rate_refill_per_sec, tokens, tokens_updated_at
                 FROM webhook_endpoints WHERE id = :id'
            );
            $stmt->execute([':id' => $endpointId]);
            $row = $stmt->fetch(PDO::FETCH_ASSOC);

            $now     = microtime(true);
            $elapsed = max(0.0, $now - (float) $row['tokens_updated_at']);
            $tokens  = min(
                (float) $row['rate_capacity'],
                (float) $row['tokens'] + $elapsed * (float) $row['rate_refill_per_sec']
            );

            if ($tokens < $count) {
                $wait = ($count - $tokens) / (float) $row['rate_refill_per_sec'];
                $this->db->exec('COMMIT');
                return $wait;
            }

            $upd = $this->db->prepare(
                'UPDATE webhook_endpoints
                 SET tokens = :tokens, tokens_updated_at = :now
                 WHERE id = :id'
            );
            $upd->execute([
                ':tokens' => $tokens - $count,
                ':now'    => $now,
                ':id'     => $endpointId,
            ]);
            $this->db->exec('COMMIT');
            return 0.0;
        } catch (Throwable $e) {
            $this->db->exec('ROLLBACK');
            throw $e;
        }
    }
}

The BEGIN IMMEDIATE is the load-bearing line. It takes SQLite's write lock at transaction start instead of lazily at the first write, which means the read-compute-write sequence on the token count can never interleave with another process doing the same thing. We normally run exactly one delivery worker, but "normally" is doing a lot of work in that sentence — cron overlap during a slow run is exactly the kind of thing this system exists to survive, so the bucket is correct under concurrency rather than correct by assumption.

A pleasant side effect of the lazy-refill design: when nobody is sending to an endpoint, the bucket costs nothing. There is no timer, no decay job, no row churn. The state is just two numbers that get reconciled on the next read.

The delivery worker

The worker runs every minute from cron and self-limits to about 50 seconds, so consecutive invocations never pile up. Each iteration claims a batch of due deliveries atomically, asks the bucket for permission per endpoint, sends what it can, and reschedules what it can't. Claiming uses UPDATE ... RETURNING, which SQLite has supported since 3.35 and which collapses the classic "select then mark" race into a single statement.

<?php
declare(strict_types=1);

const MAX_ATTEMPTS = 10;
const BATCH_SIZE   = 200;
const RUN_BUDGET_S = 50;

$db = new PDO('sqlite:' . __DIR__ . '/../data/app.db');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA busy_timeout = 5000');

$bucket = new TokenBucket($db);
$start  = microtime(true);

while (microtime(true) - $start < RUN_BUDGET_S) {
    $claimed = $db->query(
        "UPDATE webhook_deliveries
         SET status = 'inflight'
         WHERE id IN (
             SELECT id FROM webhook_deliveries
             WHERE status = 'pending' AND next_attempt_at <= unixepoch()
             ORDER BY next_attempt_at
             LIMIT " . BATCH_SIZE . "
         )
         RETURNING id, event_id, endpoint_id, attempts"
    )->fetchAll(PDO::FETCH_ASSOC);

    if ($claimed === []) {
        break; // queue drained, exit early
    }

    foreach ($claimed as $d) {
        $wait = $bucket->take((int) $d['endpoint_id']);
        if ($wait > 0.0) {
            requeue($db, (int) $d['id'], (int) ceil($wait));
            continue;
        }
        deliver($db, $bucket, $d);
    }
}

function requeue(PDO $db, int $id, int $delaySec): void
{
    $db->prepare(
        "UPDATE webhook_deliveries
         SET status = 'pending', next_attempt_at = unixepoch() + :d
         WHERE id = :id"
    )->execute([':d' => $delaySec, ':id' => $id]);
}

function deliver(PDO $db, TokenBucket $bucket, array $d): void
{
    [$url, $secret, $body] = loadTarget($db, $d); // joins event + endpoint

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'X-TVS-Signature: sha256=' . hash_hmac('sha256', $body, $secret),
            'X-TVS-Event-Id: ' . $d['event_id'],
        ],
    ]);
    curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    $err  = curl_error($ch);
    curl_close($ch);

    $attempts = (int) $d['attempts'] + 1;

    if ($code >= 200 && $code < 300) {
        finish($db, (int) $d['id'], 'delivered', $code, null, $attempts);
        return;
    }
    if ($attempts >= MAX_ATTEMPTS) {
        finish($db, (int) $d['id'], 'dead', $code, $err ?: null, $attempts);
        return;
    }

    // Exponential backoff with full jitter, capped at one hour.
    $delay = random_int(1, min(3600, 2 ** $attempts));
    $db->prepare(
        "UPDATE webhook_deliveries
         SET status = 'pending', attempts = :a,
             next_attempt_at = unixepoch() + :delay,
             last_http_code = :code, last_error = :err
         WHERE id = :id"
    )->execute([
        ':a' => $attempts, ':delay' => $delay,
        ':code' => $code, ':err' => $err ?: null, ':id' => $d['id'],
    ]);
}

(loadTarget and finish are three-line helpers omitted for space — a join and an UPDATE respectively.)

A few details that matter more than they look:

Full jitter, not fixed backoff. random_int(1, 2 ** attempts) instead of a deterministic 2 ** attempts. When a consumer endpoint comes back from an outage, hundreds of deliveries may have synchronized retry times. Deterministic backoff sends them all at once — the thundering herd you just spent a whole article preventing. Full jitter spreads them across the entire window.
Rate-limited deliveries are not failures. When the bucket says no, the delivery goes back to pending with next_attempt_at set to when tokens will exist — without incrementing attempts. Burning retry budget on your own rate limiter is a subtle way to dead-letter perfectly good events.
The worker exits early when the queue is empty. Most minutes there is nothing to do and the process lives for 20 milliseconds. Combined with the 50-second budget, this gives you the responsiveness of a daemon with the operational simplicity of cron — which is the whole game on hosting where you cannot run a daemon.
One small honesty note: on a 429 we also check for a Retry-After header and respect it over our computed delay if it's longer. The parsing is ten boring lines, so it's omitted above, but if you implement nothing else from this section, implement that. Consumers who bother to send Retry-After mean it.

Signing payloads and keeping consumers idempotent

Every delivery carries an HMAC-SHA256 of the exact body, plus the event id. The signature proves the payload came from us and wasn't modified; the event id lets consumers deduplicate, because our system guarantees at-least-once delivery, not exactly-once. (Nobody's webhook system guarantees exactly-once. Anyone who claims otherwise has just moved the dedup into a layer they aren't telling you about.)

The consumer side is short enough to include in our partner docs verbatim:

<?php
$raw      = file_get_contents('php://input');
$expected = 'sha256=' . hash_hmac('sha256', $raw, getenv('TVS_WEBHOOK_SECRET'));
$given    = $_SERVER['HTTP_X_TVS_SIGNATURE'] ?? '';

if (!hash_equals($expected, $given)) {
    http_response_code(401);
    exit;
}

$eventId = $_SERVER['HTTP_X_TVS_EVENT_ID'] ?? '';
$stmt = $pdo->prepare('INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)');
$stmt->execute([$eventId]);
if ($stmt->rowCount() === 0) {
    http_response_code(202); // duplicate — already handled, ack and move on
    exit;
}

// ... enqueue real processing here, then ack fast ...
http_response_code(202);

Two things we insist on in that snippet. hash_equals instead of === closes the timing side-channel on signature comparison. And acking with 202 before doing real work keeps the consumer's response time — and therefore our worker's throughput — independent of their processing time. The single best thing a webhook consumer can do for everyone involved is to accept fast and process async, which is, not coincidentally, the same lesson we learned on the producing side.

Absorbing eight regions of cron bursts

The queue smooths delivery, but we also attack burstiness at the source. Our regional fetches are deliberately staggered with offset minutes and different intervals, so region completions spread across the hour instead of clustering on the boundary:

# Regional ingest — offsets chosen so no two schedules share a start minute
20 */2 * * *  php /home/site/cron/fetch_videos.php          # primary region loop
28 */4 * * *  php /home/site/cron/fetch_videos.php --tier=2 # secondary regions
35 */7 * * *  php /home/site/cron/fetch_videos.php --tier=3 # long-tail regions

# Delivery worker — every minute, self-limits to ~50s, exits early when idle
* * * * *     php /home/site/cron/deliver_webhooks.php >> logs/webhooks.log 2>&1

The co-prime intervals (2, 4, 7 hours) mean full alignment of all three tiers happens rarely, and when it does, the outbox absorbs it — the worst case is a deeper queue that drains over a few extra minutes, not a flood at any consumer.

The other big lever was event coalescing. Video stat refreshes dominated our event volume: the same video's view count updates in consecutive ingest runs, generating a video.updated event each time. Consumers don't want five stale intermediate counts; they want the latest one. So before enqueueing a video.updated event, the ingest path checks for an existing pending delivery of the same type for the same (video_id, endpoint_id) and, if found, replaces the event's payload in place instead of inserting a new row. Delivered history is never touched — only the not-yet-sent tail collapses. This one change cut delivery volume by roughly 40% with zero information loss from the consumer's perspective.

Finally, housekeeping: a weekly cron deletes delivered rows older than 14 days and runs PRAGMA optimize. Dead-lettered rows are kept indefinitely — they're rare, and every one of them is a conversation we need to have with a partner anyway.

What three months of production looks like

Numbers from our monitoring since the rebuild went live across all regions:

429 responses from consumers: zero. Previously dozens per day during overlap windows.
p95 enqueue-to-delivery latency: 38 seconds. Driven almost entirely by the one-minute cron cadence; the queue itself is usually empty.
Ingest run time: back to ~45 seconds per region, fully independent of consumer health. One partner had a 6-hour outage in May; our ingest never noticed, and their queue drained in 11 minutes once they recovered.
Dead-letter rate: 0.3%, almost all from a single endpoint with an expired TLS certificate. The dead-letter table told us exactly which events to replay after they fixed it — one UPDATE ... SET status = 'pending' and the backlog flowed.
Total system footprint: ~400 lines of PHP, three tables, one cron line. It deploys over FTP like everything else we ship.

Closing thoughts

The pattern here is older than webhooks: separate recording an obligation from fulfilling it, make fulfillment idempotent and paced, and keep all the state somewhere you can query at 2 a.m. with standard tools. SQLite turns out to be a genuinely good queue substrate at this scale — WAL mode handles the reader/writer overlap, UPDATE ... RETURNING gives you atomic claims, and a partial index keeps the hot path fast forever. You do not need a message broker to do webhook delivery correctly; you need a durable outbox, a token bucket that respects each consumer's stated limits, jittered backoff, and the discipline to treat "rate limited" and "failed" as different states. If your delivery system can survive two cron jobs landing on the same minute, it can survive most things.