Applying clig.dev to a Go CLI — With an Automated Compliance Test

clig.dev is a set of guidelines for building command-line interfaces that are consistent, predictable, and human-friendly. Most teams treat these as aspirational — a checklist someone reviews once during a code review.

We automated it. A test walks our entire command tree and verifies every guideline at CI time. If someone adds a new command without a Long description, without examples, without SilenceUsage, or without exit code documentation, the build fails.

Here's how each guideline maps to concrete Go code, and the test that enforces it.

The Compliance Test

func TestCligCompliance(t *testing.T) {
    root := getRootCmd()

    // Walk the full command tree (including nested subcommands).
    var commands []*cobra.Command
    var walk func(cmd *cobra.Command)
    walk = func(cmd *cobra.Command) {
        commands = append(commands, cmd)
        for _, child := range cmd.Commands() {
            walk(child)
        }
    }
    for _, child := range root.Commands() {
        walk(child)
    }

    for _, cmd := range commands {
        name := cmd.CommandPath()
        if cmd.RunE == nil && cmd.Run == nil {
            continue // Skip group headers
        }

        t.Run(name, func(t *testing.T) {
            t.Run("has_long_description", func(t *testing.T) { ... })
            t.Run("long_starts_with_verb", func(t *testing.T) { ... })
            t.Run("documents_exit_codes", func(t *testing.T) { ... })
            t.Run("has_examples", func(t *testing.T) { ... })
            t.Run("silence_usage_set", func(t *testing.T) { ... })
            t.Run("silence_errors_set", func(t *testing.T) { ... })
            t.Run("format_flag_if_data_command", func(t *testing.T) { ... })
        })
    }
}

Seven checks per command. The test discovers commands dynamically — no manual list to maintain. When someone adds stave inspect newcommand, the test automatically verifies it meets all seven criteria.

Run it:

make clig-check
# or: go test ./cmd/ -run "TestCligCompliance|TestCligGlobalFlags" -count=1

Let's look at each guideline and how it's implemented.

1. Help Text: Long Description Starting With a Verb

clig.dev says: "Provide a detailed help text for every command. Start with a verb phrase that describes what the command does."

The Test

t.Run("long_starts_with_verb", func(t *testing.T) {
    long := strings.TrimSpace(cmd.Long)
    if long == "" {
        t.Skip("no Long description")
    }
    first := strings.SplitN(long, " ", 2)[0]
    if first != "" && first[0] >= 'a' && first[0] <= 'z' {
        t.Errorf("%s: Long description should start with a capitalized verb, got %q", name, first)
    }
})

The Implementation

Every command follows a template:

Long: `Gate applies a CI failure policy and returns exit code 3 when the policy fails.

Supported policies:
  - fail_on_any_violation
  - fail_on_new_violation
  - fail_on_overdue_upcoming

Inputs:
  --policy          CI failure policy mode (default: from project config)
  --in              Path to evaluation JSON
  --format, -f      Output format: text or json (default: text)

Outputs:
  stdout            Gate result summary (text or JSON)
  stderr            Error messages (if any)

Exit Codes:
  0   - Policy passed; no violations detected
  2   - Invalid input or configuration error
  3   - Policy failed; violations detected
  130 - Interrupted (SIGINT)`,

The template has four sections: what it does (verb phrase), inputs (flags), outputs (stdout/stderr), and exit codes. This structure is enforced by the test — the "documents exit codes" check verifies the word "exit" appears in the Long text.

2. Exit Codes: Semantic, Not Binary

clig.dev says: "Use exit codes to indicate what happened. Don't just use 0 and 1."

The Implementation

const (
    ExitSuccess     = 0   // No issues
    ExitSecurity    = 1   // Security-audit gating failure
    ExitInputError  = 2   // Invalid input, flags, or schema validation
    ExitViolations  = 3   // Evaluation completed — findings detected
    ExitInternal    = 4   // Unexpected internal error (bug)
    ExitInterrupted = 130 // Interrupted by SIGINT (Ctrl+C)
)

Exit code 3 is the critical design decision. The tool succeeded — it ran correctly and found violations. That's not an error (exit 1 or 2). It's a policy signal. CI pipelines can branch:

stave apply --controls controls --observations observations
case $? in
    0)   echo "Clean" ;;
    2)   echo "Bad input" ; exit 1 ;;
    3)   echo "Violations found" ; exit 1 ;;
    4)   echo "Bug — report it" ; exit 1 ;;
esac

The Test

t.Run("documents_exit_codes", func(t *testing.T) {
    if cmd.RunE == nil {
        t.Skip("group command")
    }
    if !strings.Contains(strings.ToLower(cmd.Long), "exit") {
        t.Errorf("%s: Long description should document exit codes", name)
    }
})

Every leaf command must mention "exit" in its Long description. The test doesn't check the specific codes — just that exit code documentation exists.

3. --format on Data Commands

clig.dev says: "If your command produces structured data, offer --format for machine-readable output."

The Test

t.Run("format_flag_if_data_command", func(t *testing.T) {
    if !isDataCommand(cmd) {
        t.Skip("not a data command")
    }
    f := cmd.Flags().Lookup("format")
    if f == nil {
        f = cmd.InheritedFlags().Lookup("format")
    }
    if f == nil {
        t.Errorf("%s: data-producing command lacks --format flag", name)
    }
})

isDataCommand maintains an explicit list of commands that produce multi-format output:

func isDataCommand(cmd *cobra.Command) bool {
    multiFormatCommands := map[string]bool{
        "stave apply":             true,
        "stave diagnose":          true,
        "stave validate":          true,
        "stave report":            true,
        "stave security-audit":    true,
        "stave ci gate":           true,
        "stave snapshot diff":     true,
        "stave snapshot quality":  true,
        "stave snapshot upcoming": true,
        "stave controls list":     true,
    }
    return multiFormatCommands[cmd.CommandPath()]
}

Adding a new data command to this map and forgetting the --format flag fails the test.

4. SilenceUsage and SilenceErrors

clig.dev says: "Don't dump usage information on every error. It's noise."

The Problem

By default, Cobra prints the full usage text whenever a command returns an error. A missing file produces 50 lines of flag documentation followed by a one-line error message. The user has to scroll up to find the actual problem.

The Implementation

Every leaf command sets:

cmd := &cobra.Command{
    Use:           "gate",
    SilenceUsage:  true,
    SilenceErrors: true,
    RunE: func(cmd *cobra.Command, _ []string) error {
        // ...
    },
}

SilenceUsage: true — Cobra doesn't print usage on error. The error message alone is shown.
SilenceErrors: true — Cobra doesn't print the error. The executor handles error rendering with structured ErrorInfo (code, title, message, action, URL).

The Test

t.Run("silence_usage_set", func(t *testing.T) {
    if cmd.RunE == nil {
        t.Skip("no RunE")
    }
    if !cmd.SilenceUsage {
        t.Errorf("%s: SilenceUsage should be true", name)
    }
})

t.Run("silence_errors_set", func(t *testing.T) {
    if cmd.RunE == nil {
        t.Skip("no RunE")
    }
    if !cmd.SilenceErrors {
        t.Errorf("%s: SilenceErrors should be true", name)
    }
})

5. --version, --quiet, --no-color, --verbose

clig.dev says: "Provide global flags for version, quiet mode, verbose mode, and color control."

The Test

func TestCligGlobalFlags(t *testing.T) {
    root := getRootCmd()

    if root.Version == "" {
        t.Error("root command has empty Version")
    }

    requiredGlobals := []struct {
        name string
        why  string
    }{
        {"quiet", "CLIG: --quiet suppresses non-essential output"},
        {"verbose", "CLIG: -v enables verbose/debug output"},
        {"no-color", "CLIG: --no-color disables ANSI output"},
    }

    for _, rf := range requiredGlobals {
        t.Run(rf.name, func(t *testing.T) {
            f := root.PersistentFlags().Lookup(rf.name)
            if f == nil {
                t.Errorf("root command missing --%s flag (%s)", rf.name, rf.why)
            }
        })
    }
}

All four are persistent flags on the root command — available to every subcommand.

6. NO_COLOR and TTY Detection

clig.dev says: "Respect the NO_COLOR environment variable. Don't output ANSI codes when stdout is not a terminal."

The Implementation

A five-level cascade determines whether to use color:

func (r *Runtime) CanColor(out io.Writer) bool {
    // 1. Explicit --no-color flag
    if r != nil && r.NoColor {
        return false
    }
    // 2. NO_COLOR environment variable (https://no-color.org)
    if _, noColor := os.LookupEnv("NO_COLOR"); noColor {
        return false
    }
    // 3. TERM=dumb
    if strings.EqualFold(os.Getenv("TERM"), "dumb") {
        return false
    }
    // 4. Test override
    if r != nil && r.IsTTY != nil {
        return *r.IsTTY
    }
    // 5. Actual TTY detection (cached per file descriptor)
    return detectTTY(out)
}

Priority order: explicit flag > environment > terminal type > test override > runtime detection. The result is cached per file descriptor via sync.Map — checking the same os.Stdout twice doesn't make two syscalls.

7. stderr for Diagnostics, stdout for Data

clig.dev says: "Write data to stdout. Write messages, logs, and progress to stderr."

The Implementation

Progress spinners write to stderr:

func (r *Runtime) BeginProgress(label string) func() {
    errOut := r.stderr()  // Always stderr, never stdout
    if !r.isTerminal(errOut) {
        fmt.Fprintf(errOut, "Running: %s...\n", label)
        // ...
    }
    // Spinner frames go to errOut
}

Hints and next-step suggestions write to stderr:

func (r *Runtime) PrintNextSteps(steps ...string) {
    if r.Quiet || len(steps) == 0 {
        return
    }
    fmt.Fprintln(r.stderr(), "\nNext steps:")
    for _, s := range steps {
        fmt.Fprintf(r.stderr(), "  %s\n", s)
    }
}

Data (findings, reports, JSON) writes to stdout:

RunE: func(cmd *cobra.Command, _ []string) error {
    result, err := evaluate(...)
    return jsonutil.WriteIndented(cmd.OutOrStdout(), result)
}

This means stave apply --format json | jq .findings works — progress messages and hints go to stderr and don't contaminate the JSON on stdout.

8. "Did You Mean?" Suggestions

clig.dev says: "If the user makes a typo, suggest the closest valid command or flag value."

The Implementation

For unknown commands:

func SuggestCommandError(err error, commandNames []string) error {
    unknown := extractUnknownCommand(err.Error())
    if unknown == "" {
        return err
    }
    suggestion := ClosestToken(unknown, commandNames)
    if suggestion == "" || suggestion == unknown {
        return err
    }
    return fmt.Errorf("unknown command %q\nDid you mean %q?", unknown, suggestion)
}

For invalid --format values:

func ResolveFormatValue(raw string, valid []string) (OutputFormat, error) {
    normalized := NormalizeToken(raw)
    for _, v := range valid {
        if normalized == v {
            return OutputFormat(v), nil
        }
    }
    if suggestion := ClosestToken(normalized, valid); suggestion != "" {
        return "", fmt.Errorf("invalid --format %q\nDid you mean %q?", raw, suggestion)
    }
    return "", fmt.Errorf("invalid --format %q (use %s)", raw, enumList(valid))
}

The ClosestToken function uses Levenshtein edit distance — if the input is within 2 edits of a valid value, it's suggested. stave apply --format jso → Did you mean "json"?

9. Graceful SIGINT Handling

clig.dev says: "Handle Ctrl+C gracefully. Clean up resources. Use exit code 130."

The Implementation

func (a *App) installInterruptHandler() func() {
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)

    go func() {
        select {
        case <-sigCh:
            fmt.Fprintln(os.Stderr, "Interrupted")
            if a.cancel != nil {
                a.cancel()  // Cancel context, operations unwind
            }
        case <-done:
            return
        }
    }()
    // ...
}

Context cancellation, not os.Exit. Deferred cleanup runs. Exit code 130 (128 + SIGINT signal number 2) — the Unix convention that CI tools understand.

10. Examples: Realistic, Copy-Pasteable

clig.dev says: "Show examples that work. Use realistic values, not 'foo' and 'bar'."

The Implementation

Example: `  # Standard evaluation
  stave apply --controls controls --observations observations --max-unsafe 168h

  # JSON output for automation
  stave apply --format json > evaluation.json

  # Quiet mode with exit code only
  stave apply --quiet --max-unsafe 7d

  # Deterministic evaluation for CI
  stave apply --now 2026-01-11T00:00:00Z --format json`,

Real flag values (168h, 7d), real paths (controls, observations), real workflows (pipe to file, CI with --now). A user can copy-paste any example and it works with the test data that ships with the tool.

The Test

t.Run("has_examples", func(t *testing.T) {
    if cmd.RunE == nil {
        t.Skip("group command")
    }
    if strings.TrimSpace(cmd.Example) == "" {
        t.Errorf("%s: missing Example (CLIG: show realistic usage)", name)
    }
})

The Full Checklist

clig.dev Guideline	Implementation	Enforced By
Detailed help text	`Long:` field with verb-phrase + template	`TestCligCompliance/has_long_description`
Help starts with verb	Capitalized verb phrase	`TestCligCompliance/long_starts_with_verb`
Document exit codes	Exit codes section in Long text	`TestCligCompliance/documents_exit_codes`
Realistic examples	`Example:` field with copy-pasteable commands	`TestCligCompliance/has_examples`
Don't dump usage on error	`SilenceUsage: true`	`TestCligCompliance/silence_usage_set`
Custom error rendering	`SilenceErrors: true` + ErrorInfo	`TestCligCompliance/silence_errors_set`
--format on data commands	`-f` flag on multi-format commands	`TestCligCompliance/format_flag_if_data_command`
--version	Root command `Version` field	`TestCligGlobalFlags`
--quiet	Persistent flag, resolves io.Writer	`TestCligGlobalFlags`
--verbose	Persistent flag, controls slog level	`TestCligGlobalFlags`
--no-color	Persistent flag + NO_COLOR env + TTY detection	`TestCligGlobalFlags`
stderr for diagnostics	Progress, hints, errors → stderr	Code review (not automated)
"Did you mean?"	Edit distance suggestions for commands and flags	Code review (not automated)
SIGINT handling	Context cancellation, exit 130	Code review (not automated)

10 guidelines enforced by automated tests. 4 enforced by code review. The test runs in CI on every commit — a new command without proper help text, examples, or exit code documentation breaks the build.

The TestCligCompliance test walks the entire command tree of Stave, a Go CLI with 30+ commands. Adding a new command automatically triggers 7 compliance checks. make clig-check runs in under a second.