Outbox Pattern in Go Microservices

Solving the Dual Write Problem Without Losing Data

Distributed systems fail in uncomfortable ways.

Sometimes the database commit succeeds — but the Kafka publish fails.

Sometimes the event is published — but the transaction rolls back.

And sometimes everything looks successful… until downstream systems realize data is missing.

This is the dual write problem.

If your Go microservice:

  • writes to a database
  • publishes events
  • triggers async workflows
  • integrates with Kafka/RabbitMQ/NATS

then you are already dealing with it — whether you realize it or not.

This article explores how the Outbox Pattern solves this problem safely in production Go systems.


1. The Dual Write Problem

Consider a typical order flow:

HTTP Request
    ↓
Save order to DB
    ↓
Publish "OrderCreated" event

Naive implementation:

func CreateOrder(ctx context.Context, order Order) error {
    err := db.Insert(order)
    if err != nil {
        return err
    }

    err = kafka.Publish("order.created", order)
    if err != nil {
        return err
    }

    return nil
}

Looks harmless.

But what happens if:

  • DB insert succeeds
  • Kafka publish fails

Now:

  • order exists
  • no event emitted
  • downstream services never know

Your system is inconsistent.


2. Why Distributed Transactions Are Rarely the Answer

Some engineers try:

  • two-phase commit
  • distributed transactions
  • XA protocols

In practice:

  • operationally complex
  • poor performance
  • difficult to scale
  • unsupported by many systems

Modern systems usually prefer:

  • eventual consistency
  • reliable event delivery

This is where the Outbox Pattern shines.


3. Core Idea of the Outbox Pattern

Instead of:

DB write
+
Kafka publish

Do:

DB write
+
Insert event into outbox table

inside the SAME transaction.

Then:

  • background worker publishes events later

Now:

  • either both persist
  • or neither persists

Atomicity restored.


4. Outbox Table Design

Typical schema:

CREATE TABLE outbox_events (
    id UUID PRIMARY KEY,
    event_type TEXT NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP NOT NULL,
    processed_at TIMESTAMP,
    retries INT DEFAULT 0
);

Key fields:

  • payload
  • processed status
  • retry count
  • timestamps

This table becomes a durable event queue.
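On the Go side, each row of this table can be mirrored by a small struct. The sketch below is one possible shape, not a prescribed one: `OutboxEvent`, `NewOutboxEvent`, and `newEventID` are hypothetical names, and the ID is generated with the standard library only so the example has no third-party dependencies.

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"time"
)

// OutboxEvent mirrors one row of the outbox_events table above.
type OutboxEvent struct {
	ID          string          // UUID primary key
	EventType   string          // e.g. "order.created"
	Payload     json.RawMessage // stored in the JSONB column
	CreatedAt   time.Time
	ProcessedAt *time.Time // nil until the event has been published
	Retries     int
}

// newEventID returns a random version 4 UUID string using only the standard library.
func newEventID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewOutboxEvent (hypothetical helper) marshals v and builds an unprocessed event.
func NewOutboxEvent(eventType string, v any) (OutboxEvent, error) {
	payload, err := json.Marshal(v)
	if err != nil {
		return OutboxEvent{}, fmt.Errorf("marshal %s payload: %w", eventType, err)
	}
	id, err := newEventID()
	if err != nil {
		return OutboxEvent{}, err
	}
	return OutboxEvent{
		ID:        id,
		EventType: eventType,
		Payload:   payload,
		CreatedAt: time.Now().UTC(),
	}, nil
}

func main() {
	ev, err := NewOutboxEvent("order.created", map[string]any{"id": "o-1", "amount": 42})
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.EventType, string(ev.Payload), ev.ProcessedAt == nil)
}
```

A nil `ProcessedAt` plays the role of the "processed status" field: the worker only looks at rows where it is still unset.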


5. Writing to the Outbox (Go Example)

Inside transaction:

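import (
    "context"
    "database/sql"
    "encoding/json"

    "github.com/google/uuid" // assumed: the original does not name the uuid package
)

func CreateOrder(ctx context.Context, db *sql.DB, order Order) error {
    // Marshal first so a bad payload fails before a transaction is opened.
    payload, err := json.Marshal(order)
    if err != nil {
        return err
    }

    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }

    // Rollback is a no-op once Commit has succeeded.
    defer tx.Rollback()

    _, err = tx.ExecContext(ctx,
        `INSERT INTO orders(id, amount) VALUES($1, $2)`,
        order.ID,
        order.Amount,
    )
    if err != nil {
        return err
    }

    // Same transaction: the event row commits or rolls back with the order.
    _, err = tx.ExecContext(ctx,
        `INSERT INTO outbox_events(id, event_type, payload, created_at)
         VALUES($1, $2, $3, NOW())`,
        uuid.New(),
        "order.created",
        payload,
    )
    if err != nil {
        return err
    }

    return tx.Commit()
}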

Now:

  • order + event persist atomically

No dual write inconsistency.


6. Background Publisher Worker

Separate worker:

func StartOutboxPublisher(ctx context.Context, db *sql.DB) {
    ticker := time.NewTicker(2 * time.Second)
    // Release the ticker's resources when the worker exits.
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            publishPendingEvents(ctx, db)
        }
    }
}
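The worker stops when its context is cancelled, so it helps to tie that context to process shutdown. The sketch below shows one way to do it with `signal.NotifyContext`. `runEvery` and `runDemo` are hypothetical names, the counter stands in for `publishPendingEvents`, and the 200ms deadline exists only so the example terminates on its own.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runEvery calls fn on each tick until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, fn func(context.Context)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			fn(ctx)
		}
	}
}

// runDemo wires the loop to SIGINT/SIGTERM and to a short deadline,
// then returns how many passes ran before shutdown.
func runDemo() int {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Demo only: a real worker would run until a signal arrives.
	ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
	defer cancel()

	passes := 0
	runEvery(ctx, 20*time.Millisecond, func(context.Context) { passes++ })
	return passes
}

func main() {
	fmt.Println("passes before shutdown:", runDemo())
}
```

Because the loop returns as soon as the context is done, a pass that is already running finishes first; unpublished rows simply wait for the next start.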

7. Publishing Pending Events

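// Requires the standard "log" package in addition to context and database/sql.
func publishPendingEvents(ctx context.Context, db *sql.DB) {
    // Oldest first, so a backlog drains in creation order.
    rows, err := db.QueryContext(ctx,
        `SELECT id, event_type, payload
         FROM outbox_events
         WHERE processed_at IS NULL
         ORDER BY created_at
         LIMIT 100`)
    if err != nil {
        log.Printf("outbox: query pending events: %v", err)
        return
    }

    defer rows.Close()

    for rows.Next() {
        var (
            id        string
            eventType string
            payload   []byte
        )

        if err := rows.Scan(&id, &eventType, &payload); err != nil {
            log.Printf("outbox: scan row: %v", err)
            continue
        }

        if err := kafka.Publish(eventType, payload); err != nil {
            // Leave processed_at NULL so the next tick retries this event.
            log.Printf("outbox: publish %s: %v", id, err)
            continue
        }

        // If this update fails, the event is published again on the next tick.
        if _, err := db.ExecContext(ctx,
            `UPDATE outbox_events
             SET processed_at = NOW()
             WHERE id = $1`,
            id,
        ); err != nil {
            log.Printf("outbox: mark %s processed: %v", id, err)
        }
    }

    // Report errors that ended the iteration early.
    if err := rows.Err(); err != nil {
        log.Printf("outbox: iterate rows: %v", err)
    }
}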

Now failures become recoverable:

  • if Kafka fails → retry later
  • event never lost
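To see the recovery behaviour without a database or a broker, the loop can be modelled in memory. This is a simplified sketch: `event`, `flakyBroker`, `publishPending`, and `drain` are hypothetical names, and the broker is rigged to fail its first three publish calls.

```go
package main

import (
	"errors"
	"fmt"
)

// event is an in-memory stand-in for an outbox row.
type event struct {
	ID        string
	Processed bool
	Retries   int
}

// flakyBroker fails its first `failures` publish calls, then succeeds.
type flakyBroker struct {
	failures  int
	delivered []string
}

func (b *flakyBroker) Publish(id string) error {
	if b.failures > 0 {
		b.failures--
		return errors.New("broker unavailable")
	}
	b.delivered = append(b.delivered, id)
	return nil
}

// publishPending makes one pass over unprocessed events, like one tick of the worker.
// It returns how many events are still pending afterwards.
func publishPending(outbox []event, b *flakyBroker) int {
	pending := 0
	for i := range outbox {
		ev := &outbox[i]
		if ev.Processed {
			continue
		}
		if err := b.Publish(ev.ID); err != nil {
			ev.Retries++ // stays unprocessed, so the next pass retries it
			pending++
			continue
		}
		ev.Processed = true
	}
	return pending
}

// drain runs passes until the outbox is empty and returns the number of passes needed.
func drain(outbox []event, b *flakyBroker) int {
	passes := 1
	for publishPending(outbox, b) > 0 {
		passes++
	}
	return passes
}

func main() {
	outbox := []event{{ID: "e1"}, {ID: "e2"}}
	broker := &flakyBroker{failures: 3}
	fmt.Println("passes:", drain(outbox, broker), "delivered:", broker.delivered)
}
```

Both events are delivered once the broker recovers, but in this run e2 arrives before e1: retries preserve delivery, not ordering.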

8. The Hidden Problem: Duplicate Delivery

Outbox guarantees:

  • at-least-once delivery

Not:

  • exactly once

This means:

  • consumer may receive duplicates (for example, if the worker publishes an event and then crashes before marking it processed)

Consumers MUST be idempotent.

This connects directly to:

  • retries
  • idempotency keys
  • distributed consistency
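One common way to make a consumer idempotent is to remember which event IDs it has already applied. The sketch below keeps that set in memory purely for illustration. `Message`, `Deduper`, and `Handle` are hypothetical names, and the message ID is assumed to be the outbox event id. A real consumer would persist the IDs, ideally in the same transaction as the side effect.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is what a consumer receives; ID is assumed to be the outbox event id.
type Message struct {
	ID      string
	Payload string
}

// Deduper remembers which message IDs have already been applied.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// Handle runs apply at most once per message ID and reports whether it ran.
func (d *Deduper) Handle(m Message, apply func(Message) error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	if _, dup := d.seen[m.ID]; dup {
		return false, nil // duplicate delivery: skip the side effect
	}
	if err := apply(m); err != nil {
		return false, err // not recorded, so a redelivery will try again
	}
	d.seen[m.ID] = struct{}{}
	return true, nil
}

func main() {
	d := NewDeduper()
	stock := 10
	reserve := func(Message) error { stock--; return nil }

	// "e1" is delivered twice, as at-least-once delivery allows.
	for _, m := range []Message{{ID: "e1"}, {ID: "e1"}, {ID: "e2"}} {
		applied, _ := d.Handle(m, reserve)
		fmt.Println(m.ID, "applied:", applied)
	}
	fmt.Println("stock:", stock)
}
```

The duplicate e1 is ignored, so stock drops by two rather than three.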

9. Handling Retries Properly

Never retry infinitely without control.

Track retries:

retries INT DEFAULT 0

Update:

UPDATE outbox_events
SET retries = retries + 1
WHERE id = $1

Eventually:

  • dead-letter queue
  • manual inspection
  • alerting
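The retry counter can drive both the wait between attempts and the decision to give up. The sketch below uses exponential backoff with a cap. The limits (`maxRetries`, `baseDelay`, `maxDelay`) are assumptions chosen for illustration, and `nextDelay` and `shouldDeadLetter` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxRetries = 5               // assumption: give up after five failed attempts
	baseDelay  = 2 * time.Second // assumption: same as the worker's polling interval
	maxDelay   = time.Minute     // assumption: never wait longer than this
)

// nextDelay returns an exponential backoff: base, 2*base, 4*base, ... capped at maxDelay.
func nextDelay(retries int) time.Duration {
	d := baseDelay
	for i := 0; i < retries; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// shouldDeadLetter reports whether an event has exhausted its retry budget.
func shouldDeadLetter(retries int) bool {
	return retries >= maxRetries
}

func main() {
	for r := 0; r <= maxRetries; r++ {
		if shouldDeadLetter(r) {
			fmt.Printf("retries=%d -> dead-letter\n", r)
			continue
		}
		fmt.Printf("retries=%d -> wait %s\n", r, nextDelay(r))
	}
}
```

An event that reaches the limit would be moved aside for inspection instead of being retried forever, which keeps one poison message from blocking the alerts that matter.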

10. Polling vs CDC (Change Data Capture)

Simple approach:

  • polling outbox table

Advanced approach:

  • Debezium / WAL streaming
  • CDC-based event publishing

Tradeoff:

Polling           CDC
--------------    -----------------
Simple            Complex
Easier ops        Higher throughput
Slight latency    Near real-time

Most systems should start with polling.


11. Concurrency Pitfall: Multiple Workers

If multiple publisher instances run:

two workers may publish the same event.

Solution:

  • row locking

Example:

SELECT id, event_type, payload
FROM outbox_events
WHERE processed_at IS NULL
ORDER BY created_at
LIMIT 100
FOR UPDATE SKIP LOCKED

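The row locks last only until the transaction ends, so the select and the processed_at update must run in the same transaction. This matters whenever more than one publisher replica runs, as is common in Kubernetes deployments.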
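In Go, that means running the locking select and the updates on the same `*sql.Tx`. The sketch below assumes PostgreSQL and the outbox_events schema from earlier. `publishBatch` and its `publish` callback are hypothetical, and the example does not open a database, so `main` only prints the query.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// claimQuery locks up to 100 pending rows; rows locked by another worker are skipped.
const claimQuery = `
SELECT id, event_type, payload
FROM outbox_events
WHERE processed_at IS NULL
ORDER BY created_at
LIMIT 100
FOR UPDATE SKIP LOCKED`

type claimed struct {
	id, eventType string
	payload       []byte
}

// publishBatch claims a batch, publishes it, and marks it processed in one transaction.
// publish is a hypothetical callback standing in for the broker client.
func publishBatch(ctx context.Context, db *sql.DB, publish func(eventType string, payload []byte) error) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op after Commit; otherwise releases the row locks

	rows, err := tx.QueryContext(ctx, claimQuery)
	if err != nil {
		return err
	}
	// Read the whole batch before issuing more statements on this transaction.
	var batch []claimed
	for rows.Next() {
		var c claimed
		if err := rows.Scan(&c.id, &c.eventType, &c.payload); err != nil {
			rows.Close()
			return err
		}
		batch = append(batch, c)
	}
	rows.Close()
	if err := rows.Err(); err != nil {
		return err
	}

	for _, c := range batch {
		if err := publish(c.eventType, c.payload); err != nil {
			continue // stays pending; a later pass retries it
		}
		if _, err := tx.ExecContext(ctx,
			`UPDATE outbox_events SET processed_at = NOW() WHERE id = $1`, c.id); err != nil {
			return err
		}
	}
	return tx.Commit()
}

func main() {
	// Calling publishBatch needs a *sql.DB opened with a PostgreSQL driver,
	// which this sketch deliberately leaves out.
	fmt.Println(claimQuery)
}
```

One tradeoff of this shape is that the transaction stays open while the broker is being called, so slow publishes hold row locks longer. If the commit fails after publishing, the batch is sent again, which is another reason consumers need to tolerate duplicates.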


12. Observability Matters

Track:

  • pending outbox size
  • retry count
  • oldest unprocessed event
  • publish latency
  • dead-letter count

Danger signal:

growing outbox table

This means events are being written faster than they are published: the worker is stalled or the broker is unavailable.
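Several of these signals can be derived from the outbox rows themselves. The sketch below computes them over an in-memory slice to keep the example runnable. `row`, `Stats`, `summarize`, and `sample` are hypothetical names, and a real service would more likely compute the same values with a SQL aggregate and export them as gauges.

```go
package main

import (
	"fmt"
	"time"
)

// row holds the outbox columns that matter for monitoring.
type row struct {
	CreatedAt   time.Time
	ProcessedAt *time.Time // nil while the event is still pending
	Retries     int
}

// Stats holds gauges that can be derived from the table itself.
type Stats struct {
	Pending    int
	OldestAge  time.Duration // age of the oldest unprocessed event; zero if none
	MaxRetries int
}

// summarize scans the rows once and considers only unprocessed events.
func summarize(rows []row, now time.Time) Stats {
	var s Stats
	for _, r := range rows {
		if r.ProcessedAt != nil {
			continue
		}
		s.Pending++
		if age := now.Sub(r.CreatedAt); age > s.OldestAge {
			s.OldestAge = age
		}
		if r.Retries > s.MaxRetries {
			s.MaxRetries = r.Retries
		}
	}
	return s
}

// sample builds a fixed data set: one processed event and two pending ones.
func sample() Stats {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	done := now.Add(-time.Minute)
	rows := []row{
		{CreatedAt: now.Add(-10 * time.Minute), ProcessedAt: &done},
		{CreatedAt: now.Add(-5 * time.Minute), Retries: 3},
		{CreatedAt: now.Add(-30 * time.Second)},
	}
	return summarize(rows, now)
}

func main() {
	s := sample()
	fmt.Printf("pending=%d oldest=%s max_retries=%d\n", s.Pending, s.OldestAge, s.MaxRetries)
}
```

The age of the oldest pending event is often the more useful alert: a large but fast-draining backlog is healthy, while a single event stuck for minutes is not.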


13. Real Production Failure Story

Classic outage pattern:

  • Kafka degraded
  • API kept accepting writes
  • events silently failed
  • downstream inventory never updated

Without outbox:

  • permanent inconsistency

After outbox:

  • events queued safely
  • Kafka recovered later
  • system healed automatically

This is resilience.


14. Production Lessons

The outbox pattern teaches an important engineering truth:

Reliability is not preventing failure.

It’s surviving failure without losing correctness.

Distributed systems WILL:

  • retry
  • duplicate
  • reorder
  • partially fail

Your architecture must expect this.


Final Thoughts

The Outbox Pattern is one of the most important patterns in modern backend engineering.

It solves:

  • dual write inconsistency
  • event loss
  • partial failures

But it also forces you to think carefully about:

  • idempotency
  • retries
  • observability
  • operational recovery

Reliable distributed systems are not built by hoping failures won’t happen.

They are built by assuming they absolutely will.

Source: dev.to
