Graceful Degradation in Go: Fallback When a Downstream Adapter Fails

Book: Hexagonal Architecture in Go
Also by me: The Complete Guide to Go Programming — the companion book in the Thinking in Go series
Me: xgabriel.com | GitHub

Your pricing service goes down at 2pm. It is a third-party API
that returns tax rates per region. For the next eleven minutes
every product page in your Go app returns a 500, because the
handler calls the pricing adapter, the adapter's HTTP call
times out, and the error bubbles all the way up to the browser.

The tax rate you failed to fetch last changed on the first of
the month. You had a copy of it, thirty seconds old, sitting in
Redis. The page could have rendered with that number and a
small "prices may be delayed" note. Instead the whole catalog
went dark because one downstream call could not complete.

That is the gap graceful degradation fills. Some data is worth
failing the request over. Most is not. In a hexagonal Go
codebase, the place to make that call is the adapter boundary,
and the mechanism is a second adapter behind the same port.

The port stays the same

Start with a port. The domain does not know or care where tax
rates come from.

// port/pricing.go
package port

import "context"

type TaxRate struct {
    Region  string
    Percent float64
}

type PricingPort interface {
    TaxRate(
        ctx context.Context, region string,
    ) (TaxRate, error)
}

The real adapter calls the vendor over HTTP.

// adapter/pricing/http.go
package pricing

import (
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"

    "yourapp/port"
)

type HTTPClient struct {
    base string
    c    *http.Client
}

The method builds a context-aware request, then handles the
response. Escaping the region keeps a value with spaces or
ampersands from corrupting the query string:

func (h *HTTPClient) TaxRate(
    ctx context.Context, region string,
) (port.TaxRate, error) {
    // QueryEscape guards against reserved characters in region.
    endpoint := h.base + "/tax?region=" +
        url.QueryEscape(region)
    req, err := http.NewRequestWithContext(
        ctx, http.MethodGet, endpoint, nil,
    )
    if err != nil {
        return port.TaxRate{}, err
    }
    resp, err := h.c.Do(req)
    if err != nil {
        return port.TaxRate{}, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return port.TaxRate{}, fmt.Errorf(
            "pricing: status %d", resp.StatusCode,
        )
    }
    var r port.TaxRate
    if err := json.NewDecoder(resp.Body).
        Decode(&r); err != nil {
        return port.TaxRate{}, err
    }
    return r, nil
}

Nothing new so far. The question is what the handler sees when
h.c.Do times out.

A fallback adapter behind the same port

The fallback is another PricingPort. It wraps the primary and
a cache. On a healthy call it writes the result to the cache and
returns it. On a failed call it reads the last good value back.

// adapter/pricing/fallback.go
package pricing

import (
    "context"
    "log/slog"
    "time"

    "yourapp/port"
)

type Cache interface {
    Get(
        ctx context.Context, key string,
    ) (port.TaxRate, time.Time, bool)
    Set(
        ctx context.Context, key string,
        v port.TaxRate, at time.Time,
    )
}

type Fallback struct {
    primary port.PricingPort
    cache   Cache
    log     *slog.Logger
}

func NewFallback(
    p port.PricingPort, c Cache, l *slog.Logger,
) *Fallback {
    return &Fallback{primary: p, cache: c, log: l}
}

The TaxRate method holds the whole policy:

func (f *Fallback) TaxRate(
    ctx context.Context, region string,
) (port.TaxRate, error) {
    r, err := f.primary.TaxRate(ctx, region)
    if err == nil {
        f.cache.Set(ctx, region, r, time.Now())
        return r, nil
    }

    cached, at, ok := f.cache.Get(ctx, region)
    if !ok {
        return port.TaxRate{}, err
    }
    f.log.Warn("serving stale tax rate",
        "region", region,
        "age", time.Since(at).String(),
        "cause", err,
    )
    return cached, nil
}

The handler still depends on port.PricingPort. It has no idea
whether it got a fresh value or a stale one. Wiring the fallback
in is one line at startup:

// main.go
// Named rates so the variable does not shadow package pricing.
rates := pricing.NewFallback(httpClient, redisCache, log)

Swap httpClient for rates everywhere the app injects the
port, and every caller inherits the degrade behavior without a
single change to their code.

Tell the caller it is degraded

Silently returning stale data is its own bug. A checkout total
computed from a stale rate is fine to display, not fine to
charge against without knowing it is stale. Carry that fact out
of the adapter.

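Go's idiom here is a sentinel error you can wrap. Return the
value and the signal together. That bends the usual convention
that a non-nil error means the value is unusable, so document
it on the port.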

// port/pricing.go
package port

import "errors"

var ErrStale = errors.New("value is stale")

The fallback returns the cached value alongside ErrStale
instead of nil. This version wraps the error with fmt.Errorf,
so add fmt to the fallback.go import block:

func (f *Fallback) TaxRate(
    ctx context.Context, region string,
) (port.TaxRate, error) {
    r, err := f.primary.TaxRate(ctx, region)
    if err == nil {
        f.cache.Set(ctx, region, r, time.Now())
        return r, nil
    }
    cached, at, ok := f.cache.Get(ctx, region)
    if !ok {
        return port.TaxRate{}, err
    }
    // Log the cause: the returned error no longer carries it.
    f.log.Warn("serving stale tax rate",
        "region", region,
        "age", time.Since(at).String(),
        "cause", err,
    )
    return cached, fmt.Errorf(
        "%w (age %s)", port.ErrStale,
        time.Since(at),
    )
}

Now the caller decides. This is the part that matters: the
degrade-vs-fail choice does not belong in the adapter. The
adapter offers a degraded value; each use case says yes or no.

func (s *CatalogService) ShowPage(
    ctx context.Context, region string,
) (Page, error) {
    rate, err := s.pricing.TaxRate(ctx, region)
    if errors.Is(err, port.ErrStale) {
        return Page{
            Rate:    rate,
            Delayed: true,
        }, nil
    }
    if err != nil {
        return Page{}, err
    }
    return Page{Rate: rate}, nil
}

The catalog page renders the stale number and sets a banner.
Checkout makes the opposite call with the same data:

func (s *CheckoutService) Charge(
    ctx context.Context, region string, amt int64,
) error {
    rate, err := s.pricing.TaxRate(ctx, region)
    if err != nil {
        // stale OR hard error: refuse to charge
        return fmt.Errorf("checkout blocked: %w", err)
    }
    return s.chargeWith(ctx, rate, amt)
}

Same adapter, same cached value, two different answers. The
browse path degrades. The money path fails closed. errors.Is
against ErrStale is the whole switch.

Bound the staleness

Stale is a spectrum. A thirty-second-old tax rate is fine. A
tax rate from before the vendor changed it three days ago is a
wrong charge waiting to happen. Give the fallback a ceiling.

type Fallback struct {
    primary  port.PricingPort
    cache    Cache
    maxAge   time.Duration
    log      *slog.Logger
}

func NewFallback(
    p port.PricingPort, c Cache, l *slog.Logger,
    maxAge time.Duration,
) *Fallback {
    return &Fallback{
        primary: p, cache: c, log: l, maxAge: maxAge,
    }
}

func (f *Fallback) TaxRate(
    ctx context.Context, region string,
) (port.TaxRate, error) {
    r, err := f.primary.TaxRate(ctx, region)
    if err == nil {
        f.cache.Set(ctx, region, r, time.Now())
        return r, nil
    }
    cached, at, ok := f.cache.Get(ctx, region)
    if !ok {
        return port.TaxRate{}, err
    }
    // Past the ceiling, fail with the original error.
    age := time.Since(at)
    if age > f.maxAge {
        return port.TaxRate{}, err
    }
    f.log.Warn("serving stale tax rate",
        "region", region,
        "age", age.String(),
        "cause", err,
    )
    return cached, fmt.Errorf(
        "%w (age %s)", port.ErrStale, age,
    )
}

The constructor now takes the ceiling, so the wiring at startup
passes it in: NewFallback(httpClient, redisCache, log, 5*time.Minute).
Leave maxAge unset and it defaults to zero, which makes every
cached value look too old and defeats the fallback entirely.

Past maxAge, the fallback stops pretending. It returns the
original downstream error and the request fails like it would
have with no cache at all. Serving data that is too old is worse
than serving none, because nobody knows to distrust it.

Do not hammer a dead downstream

One detail the fallback above misses: it calls the primary on
every request even while the downstream is clearly down. That
turns a partial outage into a slow one, because each request
waits out the full timeout before falling back.

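Put a circuit breaker in front of the primary call. The sync
and time packages from the standard library are enough for a
minimal version:

// breaker opens for ten seconds after five consecutive
// failures. It has no half-open state: once the window passes,
// every call is allowed again.
type breaker struct {
    mu       sync.Mutex
    fails    int       // consecutive failures so far
    openTill time.Time // zero value means closed
}

// allow reports whether the primary may be called right now.
func (b *breaker) allow() bool {
    b.mu.Lock()
    defer b.mu.Unlock()
    return time.Now().After(b.openTill)
}

// record counts a failure, or resets the count on success.
func (b *breaker) record(err error) {
    b.mu.Lock()
    defer b.mu.Unlock()
    if err != nil {
        b.fails++
        if b.fails >= 5 {
            b.openTill = time.Now().Add(10 * time.Second)
            b.fails = 0
        }
        return
    }
    b.fails = 0
}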

The fallback checks allow() before touching the primary. When
the breaker is open, it skips the doomed call and goes straight
to cache, which trims the timeout wait off every request during
the outage. For production, reach for a tested library like
sony/gobreaker rather than hand-rolling the state machine, but
the shape is the same: a gate in front of the primary, the cache
behind it.

Why the boundary is the right place

You could scatter if err != nil { useCache() } through every
handler. It works until the third handler forgets, or someone
adds a fourth downstream and copies the wrong version. The
policy drifts because it lives in a dozen places.

Behind a port, it lives in one adapter. The degrade logic, the
staleness ceiling, the breaker, the stale-signal error: all in
fallback.go. Callers depend on the port and read one sentinel
error. Testing it needs no network — feed the fallback a stub
primary that returns an error and a stub cache with a known
value, then assert you got ErrStale back. The whole resilience
story is unit-testable because it sits at a seam you own.

Graceful degradation is not a library you install. It is a
decision about which data your product can serve stale and which
it cannot, expressed as code at the one boundary where the
answer is knowable.

If this was useful

Fallback adapters, sentinel errors, and errors.Is switching
are stdlib mechanics — The Complete Guide to Go Programming
covers the error-wrapping and context machinery the fallback
leans on. Keeping the degrade policy at the port boundary, so it
stays in one adapter instead of leaking into every caller, is
the spine of Hexagonal Architecture in Go.