How to Build a Lightweight Rule Engine for Automated Compliance Checks


California just announced it'll start ticketing driverless cars that break traffic laws. That got me thinking — not about self-driving cars specifically, but about a problem I've hit on three different projects: how do you make an automated system respect a set of rules that change over time?

Whether you're building a CI/CD pipeline that enforces deployment policies, an API gateway with rate-limiting rules, or a workflow engine that needs to comply with business regulations, you eventually need a rule engine. And if you reach for a massive enterprise framework on day one, you'll regret it.

Here's how I build lightweight rule engines that actually hold up in production.

The Problem: Hardcoded Rules Rot Fast

Every project starts the same way. Someone says "just add an if-statement." So you do.

# This is fine... for now
def check_deployment(deploy_request):
    if deploy_request.target == "production" and not deploy_request.has_approval:
        return Denied("Production deploys require approval")
    if deploy_request.time.hour < 9 or deploy_request.time.hour >= 17:
        return Denied("No deploys outside business hours")
    return Approved()

Then the rules multiply. Then someone wants to change them without a code deploy. Then different environments need different rules. Then someone asks for an audit log of which rules fired and why.

Now your neat little function is 200 lines of nested conditionals, and every change is a production risk.

The Core Pattern: Separate Rules From Execution

The fix isn't a framework — it's a pattern. You need three things:

  1. A rule definition format (data, not code)
  2. An evaluation engine (small, testable, deterministic)
  3. A result collector (for audit trails and debugging)

Here's the minimal version I keep coming back to:

from dataclasses import dataclass, field
from typing import Any, Callable
import operator

# Map string operators to actual functions
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda val, collection: val in collection,
    "not_in": lambda val, collection: val not in collection,
    "contains": lambda collection, val: val in collection,
}

@dataclass
class Rule:
    name: str
    field: str           # dot-notation path into the context
    op: str              # operator key from OPERATORS
    value: Any           # what we're comparing against
    message: str = ""    # human-readable explanation
    severity: str = "error"  # error, warning, info

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

def resolve_field(obj: dict, path: str) -> Any:
    """Navigate nested dicts with dot notation: 'deploy.target.env'"""
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate(rules: list[Rule], context: dict) -> list[RuleResult]:
    results = []
    for rule in rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False  # type mismatch = rule not satisfied
        results.append(RuleResult(rule=rule, passed=passed, actual_value=actual))
    return results

Nothing fancy. No DSL parser, no YAML templating language, no dependency injection. Just data in, results out.
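The dot-path resolver is the piece that makes rule definitions flexible, so it's worth seeing in isolation. Here's a quick sketch (restating resolve_field from above so the snippet runs on its own; the context values are invented):

```python
from typing import Any

def resolve_field(obj: dict, path: str) -> Any:
    """Navigate nested dicts with dot notation: 'deploy.target.env'"""
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            # a miss anywhere along the path short-circuits to None
            return None
    return current

context = {"deploy": {"target": {"env": "production"}, "approved": True}}

print(resolve_field(context, "deploy.target.env"))   # production
print(resolve_field(context, "deploy.approved"))     # True
print(resolve_field(context, "deploy.missing.key"))  # None
```

A miss anywhere along the path yields None rather than an exception, which pairs with the TypeError guard in evaluate: a rule over a missing field simply fails instead of crashing the engine.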

Loading Rules From Config

The real power comes when rules live outside your code. I typically use JSON or YAML, loaded at startup or fetched from a config service.

import json

def load_rules(path: str) -> list[Rule]:
    with open(path) as f:
        raw = json.load(f)
    return [Rule(**r) for r in raw["rules"]]

# rules.json
# {
#   "rules": [
#     {
#       "name": "business_hours_only",
#       "field": "request.hour",
#       "op": "gte",
#       "value": 9,
#       "message": "Action not permitted outside business hours",
#       "severity": "error"
#     },
#     {
#       "name": "max_batch_size",
#       "field": "payload.items_count",
#       "op": "lte",
#       "value": 1000,
#       "message": "Batch size exceeds safe limit",
#       "severity": "warning"
#     }
#   ]
# }

Now your ops team can tweak compliance rules without touching application code. You can version the rule files in git, diff them in PRs, and roll them back independently.
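One lightweight way to make that rollback story concrete (this is a sketch of my own, not something the loader above requires) is to fingerprint the parsed rule document and record the digest alongside every decision:

```python
import hashlib
import json

def ruleset_version(raw: dict) -> str:
    """Deterministic fingerprint of a parsed rules document."""
    # sort_keys plus compact separators make the serialization canonical,
    # so identical rule sets always hash to the same version string
    canonical = json.dumps(raw, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

rules_doc = {
    "rules": [
        {"name": "max_batch_size", "field": "payload.items_count",
         "op": "lte", "value": 1000}
    ]
}
print(ruleset_version(rules_doc))  # short, stable fingerprint
```

Store that string with each decision and you can always reconstruct exactly which rules were in force when a request was allowed or denied.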

Adding Rule Groups and Short-Circuit Logic

In practice, you'll want to group rules. Some groups should short-circuit (stop on first failure), others should collect all violations.

@dataclass
class RuleGroup:
    name: str
    rules: list[Rule]
    mode: str = "all"  # "all" = collect everything, "first_fail" = stop early

def evaluate_group(group: RuleGroup, context: dict) -> list[RuleResult]:
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False
        result = RuleResult(rule=rule, passed=passed, actual_value=actual)
        results.append(result)
        # bail early if this group uses short-circuit mode
        if not passed and group.mode == "first_fail":
            break
    return results

def evaluate_all_groups(groups: list[RuleGroup], context: dict) -> dict:
    return {
        group.name: evaluate_group(group, context)
        for group in groups
    }

This is the 80/20 point. You've got configurable rules, grouped evaluation, short-circuit logic, and a full audit trail of what passed and what didn't. For most projects, this is enough.
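If you want to see it run end to end, here's a condensed, self-contained version of the pieces above with an invented context (the rule names and values are just illustration):

```python
import operator
from dataclasses import dataclass
from typing import Any

OPERATORS = {"eq": operator.eq, "gte": operator.ge, "lte": operator.le}

@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    severity: str = "error"

@dataclass
class RuleGroup:
    name: str
    rules: list[Rule]
    mode: str = "all"

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

def resolve_field(obj: dict, path: str) -> Any:
    current = obj
    for key in path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current

def evaluate_group(group: RuleGroup, context: dict) -> list[RuleResult]:
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        try:
            passed = OPERATORS[rule.op](actual, rule.value)
        except TypeError:
            passed = False
        results.append(RuleResult(rule, passed, actual))
        if not passed and group.mode == "first_fail":
            break  # short-circuit: later rules in this group never run
    return results

groups = [
    RuleGroup("gate", mode="first_fail", rules=[
        Rule("env_is_prod", "deploy.env", "eq", "production"),
        Rule("has_approval", "deploy.approvals", "gte", 1),
    ]),
    RuleGroup("limits", rules=[
        Rule("batch_ok", "payload.items", "lte", 1000, severity="warning"),
    ]),
]
context = {"deploy": {"env": "staging", "approvals": 0}, "payload": {"items": 50}}

for g in groups:
    for r in evaluate_group(g, context):
        print(g.name, r.rule.name, r.passed)
# gate env_is_prod False   <- first_fail stopped the group here
# limits batch_ok True
```

Note how the "gate" group stops after its first failure, so has_approval never even runs, while the "limits" group evaluates everything it has.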

When You Actually Need More

I've only outgrown this pattern twice in eight years. The signs you need something heavier:

  • Rules reference other rules ("if rule A passed, skip rule B") — now you need a dependency graph
  • Rules need temporal logic ("this value was X five minutes ago") — now you need state
  • Non-technical users need to author rules — now you need a UI and probably a real DSL

If you hit those cases, look at existing open-source rule engines for your ecosystem before building one. Python has projects like business-rules. JavaScript has json-rules-engine. Go has grule-rule-engine. They handle the graph traversal and conflict resolution that you don't want to write yourself.

But don't start there. Start with the 50-line evaluator above and see how far it takes you.

Practical Tips From Production

A few things I learned the hard way:

  • Always log the full context alongside results. When someone asks "why was this request denied at 2 AM last Tuesday," you want the exact input that was evaluated, not just the rule name that fired.
  • Version your rule sets. Every time rules change, tag the version. Store the version alongside any decision the engine made. You'll need this for audits.
  • Test rules like code. Write unit tests for your rule definitions. Feed them known contexts, assert expected outcomes. This catches typos in field names and logic inversions before production does.
  • Set up dry-run mode from day one. Before enforcing a new rule, run it in shadow mode — evaluate but don't block. This has saved me from deploying overly aggressive rules more times than I want to admit.

A dry-run-aware decision function looks like this:
def log_decision(context, results, decision, dry_run):
    # stand-in: wire this to your structured logger or audit store
    print({"decision": decision, "dry_run": dry_run})

def make_decision(groups: list[RuleGroup], context: dict, dry_run: bool = False):
    all_results = evaluate_all_groups(groups, context)
    failures = [
        r for results in all_results.values()
        for r in results
        if not r.passed and r.rule.severity == "error"
    ]
    decision = "deny" if failures and not dry_run else "allow"
    # always log regardless of mode
    log_decision(context, all_results, decision, dry_run)
    return decision, all_results
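The "test rules like code" tip can be as simple as a pytest-style check that loads the real rule file and asserts on what came back. A sketch (restating Rule and load_rules from above so it runs standalone; the file contents mirror the earlier rules.json):

```python
import json
import tempfile
from dataclasses import dataclass
from typing import Any

@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    message: str = ""
    severity: str = "error"

def load_rules(path: str) -> list[Rule]:
    with open(path) as f:
        raw = json.load(f)
    return [Rule(**r) for r in raw["rules"]]

def test_rules_load_and_have_sane_values():
    doc = {"rules": [{"name": "max_batch_size", "field": "payload.items_count",
                      "op": "lte", "value": 1000, "severity": "warning"}]}
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(doc, f)
        path = f.name
    rules = load_rules(path)
    # typos in field names or operators surface here, not in production
    assert rules[0].name == "max_batch_size"
    assert rules[0].op == "lte"
    assert rules[0].value == 1000

test_rules_load_and_have_sane_values()
```

In a real project you'd drop the explicit call at the bottom and let pytest discover the function, pointing it at your checked-in rules.json instead of a temp file.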

Wrapping Up

The pattern here isn't specific to any domain. I've used it for deployment gates, invoice validation, content moderation filters, and API request policies. The shape is always the same: define rules as data, evaluate them against a context, collect the results.

Start with the simplest evaluator that does the job. Keep rules in version-controlled config files. Log everything. Add complexity only when the current system genuinely can't express what you need.

Fifty lines of code and a JSON file will get you surprisingly far.

Source: dev.to
