From Learning to Shipping — Docker, Graceful Shutdown & ECS Fargate

This is the final post in the series. In part 6 I covered testing — table-driven tests, httptest, benchmarks. Now I'm taking everything built across the last six posts and shipping it: a multi-stage Docker build, graceful shutdown wired into the Gin server, and deployment to AWS ECS Fargate using Terraform — the same stack I used for rust-ai-gateway.

If you've followed along, this is where the learning project becomes something you can actually point a recruiter at.

Multi-Stage Docker Build

The biggest Go deployment win over the JVM is binary size. A Go service compiles to a single self-contained binary with no runtime dependency — no JVM to install, no classpath to assemble. That means the production image can be built in two stages: a full Go toolchain image to compile, then a minimal scratch or distroless image to run.

# ---- build stage ----
FROM golang:1.22-alpine AS builder

WORKDIR /app

COPY go.mod go.sum ./
RUN go mod download

COPY . .

RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -ldflags="-s -w" -o orders-api ./cmd/main.go

# ---- run stage ----
FROM gcr.io/distroless/static-debian12

WORKDIR /app
COPY --from=builder /app/orders-api .

EXPOSE 8080
USER nonroot:nonroot

ENTRYPOINT ["/app/orders-api"]

Three things are worth unpacking here. CGO_ENABLED=0 disables cgo, so the compiler produces a fully static binary; the static distroless image ships no libc, so a dynamically linked binary would not start there. The -ldflags="-s -w" flags strip the symbol table and DWARF debug info, shaving 20–30% off the binary size at the cost of debugger support. And distroless/static is the production choice over scratch: it has no shell (good for security) but includes timezone data and SSL certificates that your service will almost certainly need.

The resulting image typically comes in under 20MB. The equivalent Spring Boot fat jar with a JRE base image usually lands somewhere north of 300MB.

Graceful Shutdown

Out of the box, r.Run(":8080") in Gin blocks forever and stops immediately on SIGTERM — mid-request, no cleanup, no connection draining. In a containerised environment where ECS sends SIGTERM before terminating a task, that's dropped requests on every deployment. The fix is to run the server in a goroutine and listen for the OS signal yourself:

// cmd/main.go
package main

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"

    "orders-api/handler"
    "orders-api/middleware"
    "orders-api/store"

    "github.com/gin-gonic/gin"
)

func main() {
    s := store.NewInMemoryStore()
    h := handler.NewOrderHandler(s)

    r := gin.New()
    r.Use(middleware.Logger())

    api := r.Group("/api/v1")
    {
        orders := api.Group("/orders")
        orders.POST("", h.Create)
        orders.GET("", h.List)
        orders.GET("/:id", h.GetByID)
    }

    srv := &http.Server{
        Addr:    ":8080",
        Handler: r,
    }

    // start server in background goroutine
    go func() {
        log.Println("server listening on :8080")
        if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("listen error: %v", err)
        }
    }()

    // block until SIGINT or SIGTERM
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
    <-quit

    log.Println("shutdown signal received, draining connections...")

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    if err := srv.Shutdown(ctx); err != nil {
        log.Fatalf("forced shutdown: %v", err)
    }

    log.Println("server exited cleanly")
}

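srv.Shutdown(ctx) tells the HTTP server to stop accepting new connections and wait for in-flight requests to finish, up to the 30-second deadline. One thing to watch: ECS follows SIGTERM with SIGKILL once the container's stopTimeout expires, and that also defaults to 30 seconds, so in practice the drain deadline should sit a little below stopTimeout (or stopTimeout should be raised). This is the goroutine and channel pattern from part 3 applied directly to real production code: quit is a channel, <-quit blocks until a signal arrives, and then the shutdown sequence runs. No framework magic, just the primitives we've been using all along.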

Health Check Endpoint

ECS needs a health check to know when a task is ready for traffic and when it's gone unhealthy. Two routes in the router:

r.GET("/health", func(c *gin.Context) {
    c.JSON(http.StatusOK, gin.H{"status": "ok"})
})

r.GET("/ready", func(c *gin.Context) {
    // in a real service: check DB connectivity here
    c.JSON(http.StatusOK, gin.H{"status": "ready"})
})

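/health is the liveness probe: is the process running? /ready is the readiness probe: is the service ready to serve traffic (database connected, caches warm)? ECS has no built-in liveness/readiness split the way Kubernetes does, so the mapping is manual: the container health check in the task definition can hit /health, and the ALB target group health check can point at /ready. Keeping them separate distinguishes "the process is dead" from "the process is up but not ready for traffic yet", which matters during a slow startup.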

Terraform: ECS Fargate

The infrastructure follows the same pattern as rust-ai-gateway — ECR for the image, ECS Fargate for the task, an ALB in front. The key pieces:

# ecr.tf
resource "aws_ecr_repository" "orders_api" {
  name                 = "orders-api"
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }
}

# ecs.tf
resource "aws_ecs_task_definition" "orders_api" {
  family                   = "orders-api"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = aws_iam_role.ecs_task_execution.arn

  container_definitions = jsonencode([{
    name      = "orders-api"
    image     = "${aws_ecr_repository.orders_api.repository_url}:latest"
    essential = true

    portMappings = [{
      containerPort = 8080
      protocol      = "tcp"
    }]

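    # NOTE: CMD-SHELL runs through a shell inside the container, and this
    # command also needs wget. The distroless/static image from the Dockerfile
    # has neither, so this check only passes on a base image that ships both.
    healthCheck = {
      command     = ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
      interval    = 30
      timeout     = 5
      retries     = 3
      startPeriod = 10
    }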

    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = "/ecs/orders-api"
        "awslogs-region"        = var.aws_region
        "awslogs-stream-prefix" = "ecs"
      }
    }
  }])
}

resource "aws_ecs_service" "orders_api" {
  name            = "orders-api"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.orders_api.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.ecs_tasks.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.orders_api.arn
    container_name   = "orders-api"
    container_port   = 8080
  }

  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent         = 200
}

desired_count = 2 with deployment_minimum_healthy_percent = 100 means ECS will never take both tasks down simultaneously during a deployment — new tasks must become healthy before old ones are stopped, which combined with the graceful shutdown above gives zero-downtime deploys.
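One caveat on the container health check in the task definition: it uses CMD-SHELL with wget, and the distroless/static image has no shell and no wget, so that command cannot run there as written. A common workaround is to let the service binary probe itself. The sketch below assumes a hypothetical healthcheck subcommand, which the task definition could then invoke without a shell as ["CMD", "/app/orders-api", "healthcheck"]. checkHealth and statusServer are illustrative names, and the stub server exists only so the sketch can be exercised without the real service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"time"
)

// checkHealth does one GET and returns an error unless the status is 200.
func checkHealth(url string) error {
	client := &http.Client{Timeout: 3 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unhealthy: status %d", resp.StatusCode)
	}
	return nil
}

// statusServer is a stub that always answers with the given status code.
func statusServer(code int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(code)
	}))
}

func main() {
	// Hypothetical subcommand: probe the local server, exit non-zero on failure.
	if len(os.Args) > 1 && os.Args[1] == "healthcheck" {
		if err := checkHealth("http://localhost:8080/health"); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		return
	}

	// In the real service, normal server startup would go here.
	// For the sketch, exercise checkHealth against two stub servers.
	ok := statusServer(http.StatusOK)
	defer ok.Close()
	fmt.Println(checkHealth(ok.URL)) // <nil>

	bad := statusServer(http.StatusServiceUnavailable)
	defer bad.Close()
	fmt.Println(checkHealth(bad.URL)) // unhealthy: status 503
}
```

The alternative is to drop the container-level check and rely on the ALB target group health check alone, which needs nothing inside the image.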

CI/CD: GitHub Actions

The deploy pipeline that ties it together — build, push to ECR, force a new ECS deployment:

# .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}

      - name: Login to ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag and push image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/orders-api:$IMAGE_TAG .
          docker push $ECR_REGISTRY/orders-api:$IMAGE_TAG
          docker tag $ECR_REGISTRY/orders-api:$IMAGE_TAG $ECR_REGISTRY/orders-api:latest
          docker push $ECR_REGISTRY/orders-api:latest

      - name: Force ECS deployment
        run: |
          aws ecs update-service \
            --cluster main \
            --service orders-api \
            --force-new-deployment

Using github.sha as the image tag means every deploy is traceable — you can look at a running ECS task, find the image tag, and map it back to an exact commit. Tagging latest separately lets Terraform and the task definition reference a stable name, while the SHA tag is there for auditability.

What This Series Built

Six posts ago I was explaining why I was picking up Go. Here's what actually got built:

A type system that composes instead of inheriting, with interfaces that match on shape rather than declaration
A concurrency model that puts goroutines and channels in reach by default
Error handling that makes failure visible at every call site
A Gin REST API with middleware, request validation, and clean handler separation
A full test suite: table-driven unit tests, httptest handler tests, and benchmarks
A production deployment: multi-stage Docker under 20MB, graceful shutdown, ECS Fargate behind an ALB, GitHub Actions CI/CD

The thing I didn't expect: Go's simplicity compounds. Every new piece — a new handler, a new test, a new middleware — follows the same small set of patterns. There's less to remember, and the code that comes out of it is easier to read six months later than Java or Kotlin at comparable complexity. I get why teams that care about maintainability and onboarding speed keep reaching for it.

Next on my list for this codebase: swapping the in-memory store for pgx and PostgreSQL, adding OpenTelemetry tracing, and wiring in a real secrets manager. That might become another series, or it might just end up in the GitHub repo.

If you've followed along from part 1 — thanks for reading. And if you're just landing here and want the whole project, the full source is at github.com/MihirMohapatra/go-orders-api once I push it.

What should the next series be? Another Go deep-dive, or something at the intersection of Go and AI infrastructure?