Experience Engine: AI Memory That Shrinks As Your Agent Learns

Every AI coding session, my agent made the same mistakes.

DbContext as singleton — state corruption, 15 minutes debugging. Again. ILogger instead of IMLog — lost tenant context. Again. Wrong project reference path — build fails. Again.

I had 500 memory notes. My agent was still a junior with a bigger notebook.

So I built something different.

The Problem With AI Memory

Every AI memory tool — Mem0, Letta, Zep — stores facts. More sessions = more entries = more tokens = more cost. They're giving your agent a bigger notebook.

But here's the thing: a notebook doesn't make you experienced.

A junior developer with 500 notes is still a junior. A mid-level developer with 15 principles understands why things work. The difference isn't how much you remember — it's whether you can generalize.

Junior (500 notes):
  "DbContext singleton caused bug"
  "HttpClient singleton caused leak"  
  "SmtpClient singleton caused corruption"
  → Encounters RedisConnection singleton → NO MATCH → makes the mistake

Mid-level (1 principle):
  "Stateful objects must be scoped, never singleton"
  → Encounters RedisConnection singleton → MATCHES → avoids the mistake

That's what Experience Engine does. It doesn't store more facts. It evolves facts into principles, then deletes the facts.
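The gap between the two can be sketched in a few lines of Python. This is a toy illustration, not the engine's code: the `notes` dict and `stateful` set stand in for what the real system does with embedding similarity, and `RedisConnection`'s membership in `stateful` is exactly the part the engine derives semantically rather than by lookup.

```python
# Toy illustration, not engine code: the lookups below stand in for
# embedding-based matching.

notes = {            # junior: one raw note per concrete class
    "DbContext": "singleton caused state corruption",
    "HttpClient": "singleton caused a connection leak",
    "SmtpClient": "singleton caused corruption",
}

# one principle: stateful objects must be scoped, never singleton
stateful = {"DbContext", "HttpClient", "SmtpClient", "RedisConnection"}

def junior_warns(cls):
    return cls in notes          # exact recall only

def midlevel_warns(cls):
    return cls in stateful       # the principle covers unseen cases

print(junior_warns("RedisConnection"))    # False: no matching note
print(midlevel_warns("RedisConnection"))  # True: the principle generalizes
```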

How It Works

When you code with any AI agent (Claude Code, Gemini CLI, Codex CLI), Experience Engine runs silently in the background:

Before every Edit/Write/Bash:
A hook queries the experience store: "Have I seen this mistake before?" If yes, it injects a warning directly into the agent's context:

⚠️ [Experience - High Confidence (0.85)]: Stateful objects must be 
scoped, never singleton. Last time SingleInstance caused state 
corruption in DbContext.

The agent reads this warning and avoids the mistake. No human intervention needed.

After every session:
An extractor scans the session transcript for mistake patterns:

  • Retry loops (same tool call 3+ times)
  • User corrections ("no, not that", "wrong", "undo")
  • Test fail → fix cycles
  • Git reverts

Each detected mistake gets extracted into a structured Q&A entry and stored in a vector database.
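Two of those patterns, retry loops and user corrections, can be sketched in a few lines. The event shapes and marker phrases here are my assumptions for illustration, not the extractor's actual data model:

```python
from collections import Counter

# Illustrative sketch of two detection patterns: retry loops and
# user corrections. Event shapes and marker phrases are assumptions.

CORRECTIONS = ("no, not that", "wrong", "undo")

def detect_mistakes(transcript):
    mistakes = []
    # Retry loop: the same tool call repeated 3+ times
    calls = Counter(
        (e["tool"], e["args"]) for e in transcript if e["type"] == "tool_call"
    )
    for (tool, args), n in calls.items():
        if n >= 3:
            mistakes.append({"pattern": "retry_loop", "tool": tool, "args": args})
    # User correction: a human message containing a correction phrase
    for e in transcript:
        if e["type"] == "user" and any(m in e["text"].lower() for m in CORRECTIONS):
            mistakes.append({"pattern": "user_correction", "text": e["text"]})
    return mistakes

session = [
    {"type": "tool_call", "tool": "Bash", "args": "dotnet build"},
    {"type": "tool_call", "tool": "Bash", "args": "dotnet build"},
    {"type": "tool_call", "tool": "Bash", "args": "dotnet build"},
    {"type": "user", "text": "No, not that file. Undo."},
]
print([m["pattern"] for m in detect_mistakes(session)])
# ['retry_loop', 'user_correction']
```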

Weekly (automatic):
The evolution engine runs:

  1. Promote: entries confirmed 3+ times move from cache → behavioral rules
  2. Abstract: clusters of 3+ similar entries → one general principle
  3. Demote: entries ignored 3+ times get deprioritized
  4. Archive: entries unused for 90 days get cleaned up

The result: memory shrinks as capability grows.
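A toy version of that weekly pass looks like this. Field names, thresholds as code, and the simple topic-based clustering are all illustrative; the real engine clusters by vector similarity:

```python
# Toy sketch of the four weekly rules. Field names and topic-based
# clustering are illustrative; the engine clusters by vector similarity.

def evolve(entries):
    report = {"promoted": 0, "abstracted": 0, "demoted": 0, "archived": 0}
    clusters = {}
    for e in entries:
        if e["idle_days"] > 90:                        # 4. archive stale entries
            e["archived"] = True
            report["archived"] += 1
            continue
        if e["ignores"] >= 3:                          # 3. demote ignored entries
            e["priority"] = "low"
            report["demoted"] += 1
        if e["tier"] == "T2" and e["confirms"] >= 3:   # 1. promote confirmed entries
            e["tier"] = "T1"
            report["promoted"] += 1
        clusters.setdefault(e["topic"], []).append(e)
    for group in clusters.values():                    # 2. abstract clusters of 3+
        if len(group) >= 3:
            report["abstracted"] += 1
    return report

entries = [
    {"tier": "T2", "confirms": 4, "ignores": 0, "idle_days": 5,   "topic": "di"},
    {"tier": "T2", "confirms": 1, "ignores": 0, "idle_days": 3,   "topic": "di"},
    {"tier": "T2", "confirms": 0, "ignores": 3, "idle_days": 2,   "topic": "di"},
    {"tier": "T2", "confirms": 0, "ignores": 0, "idle_days": 200, "topic": "http"},
]
print(evolve(entries))
# {'promoted': 1, 'abstracted': 1, 'demoted': 1, 'archived': 1}
```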

The 4-Tier Architecture

T0 Principles  (~400 tokens)  — generalized rules, always loaded
    "Stateful objects must be scoped, never singleton"

T1 Behavioral  (~600 tokens)  — specific reflexes, always loaded
    "WHEN DbContext + DI → MUST check lifetime FIRST"

T2 QA Cache    (semantic)     — detailed Q&A, retrieved on match
    Q: "Why not singleton?" → A: "State corruption across requests"

T3 Raw         (staging)      — unprocessed mistakes, TTL 30 days

Lifecycle: T3 → extract → T2 → promote (3x confirmed) → T1 
           → generalize (cluster 3+) → T0
           Memory SHRINKS as capability GROWS

What Makes This Different

                  Mem0                  Letta                 Zep                   Experience Engine
Over time         Entries grow forever  Entries grow forever  Entries grow forever  Entries shrink into principles
Novel cases       Only exact matches    Only exact matches    Only exact matches    Principles generalize
Mistake learning  No                    No                    No                    5 detection patterns
Dependencies      Python + SDK          PostgreSQL            PostgreSQL            Zero (Node.js built-in)
Local-first       Optional              Optional              Partial               Default
Data ownership    Cloud vendor          SaaS terms            Cloud vendor          You own everything

Experience Graph

Experiences aren't isolated entries — they're linked with typed edges:

DbContext singleton ──generalizes──→ "Stateful objects: always scoped"
                    ──relates-to───→ HttpClient singleton  
                    ──supersedes───→ [old] "Use transient for DbContext"
                    ──contradicts──→ [demoted] "Singleton is fine for DbContext"

When one experience matches your current action, the engine follows edges to surface related experiences too. This is how it catches RedisConnection singleton — not because it's seen Redis before, but because it's connected to the principle about stateful objects.
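A sketch of that edge walk, with made-up ids and a hardcoded adjacency table standing in for the graph that lives alongside the vector store:

```python
# Sketch of edge-following on a match. Ids and adjacency are made up;
# the real graph is persisted with the experiences themselves.

EDGES = {
    "dbcontext-singleton": [
        ("generalizes", "principle:stateful-scoped"),
        ("relates-to", "httpclient-singleton"),
    ],
    "httpclient-singleton": [
        ("generalizes", "principle:stateful-scoped"),
    ],
}

def surface_related(matched_id, depth=2):
    """Breadth-first walk over typed edges from the matched experience."""
    seen = {matched_id}
    frontier = [matched_id]
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for _edge_type, target in EDGES.get(node, []):
                if target not in seen:
                    seen.add(target)
                    nxt.append(target)
        frontier = nxt
    return seen - {matched_id}

print(sorted(surface_related("dbcontext-singleton")))
# ['httpclient-singleton', 'principle:stateful-scoped']
```

A novel action only needs to match one node; the walk then pulls in the principle, which is what fires for cases the store has never seen verbatim.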

Temporal Reasoning

Knowledge evolves. What was true in January might be wrong in March:

January:  "Use singleton for HttpClient" (confirmed 5x)
March:    "Use IHttpClientFactory instead" (contradicts January)
          → January entry superseded, not deleted
          → March entry ranks higher (recent confirmation)
          → GET /api/timeline?topic=httpclient shows the evolution

The engine tracks confirmedAt[] arrays — not just "what was learned" but "when it was last confirmed." Stale knowledge gets penalized. Recent confirmations get boosted.
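That weighting can be sketched as a small scoring function. The multipliers below are invented; only the boost/penalty directions come from the engine's rules:

```python
from datetime import date

# Sketch of recency weighting over a confirmedAt[] history.
# Multipliers are invented; only the directions follow the engine's rules.

def temporal_score(base, confirmed_at, today, superseded=False):
    score = base
    if confirmed_at:
        age = (today - max(confirmed_at)).days    # days since last confirmation
        if age <= 7:
            score *= 1.25                         # confirmed this week: boost
        elif age > 60:
            score *= 0.5                          # stale: penalize
    if superseded:
        score *= 0.3                              # replaced knowledge ranks lower
    return score

today = date(2025, 3, 15)
january = temporal_score(0.8, [date(2025, 1, 10)], today, superseded=True)
march = temporal_score(0.8, [date(2025, 3, 12)], today)
print(january < march)  # True: the March entry outranks the one it superseded
```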

REST API

Everything is accessible via HTTP — not just CLI hooks:

# Start the server
node server.js
# Experience Engine API running on http://localhost:8082

# Check health
curl localhost:8082/health

# Query experience before a tool call
curl -X POST localhost:8082/api/intercept \
  -H "Content-Type: application/json" \
  -d '{"toolName":"Write","toolInput":{"file_path":"src/Startup.cs"}}'

# Response:
{
  "suggestions": "⚠️ [High Confidence (0.85)]: Stateful objects must be scoped",
  "hasSuggestions": true
}

# Trigger evolution
curl -X POST localhost:8082/api/evolve
# {"promoted":2,"abstracted":1,"demoted":0,"archived":3,"success":true}

# View stats
curl "localhost:8082/api/stats?since=30d"

# Knowledge timeline
curl "localhost:8082/api/timeline?topic=dependency+injection"

# Experience graph
curl "localhost:8082/api/graph?id=abc-123"

10 endpoints total. Zero dependencies — uses Node.js built-in http module. CORS enabled for browser extensions.

Python SDK

from muonroi_experience import Client

client = Client("http://localhost:8082")

# Query experience
result = client.intercept("Write", {"file_path": "app.py"})
if result["hasSuggestions"]:
    print(result["suggestions"])

# Extract lessons
client.extract("Agent tried singleton for DbContext, caused corruption...")

# Trigger evolution  
evolution = client.evolve()
print(f"Promoted: {evolution['promoted']}, Abstracted: {evolution['abstracted']}")

# Check stats
stats = client.stats(since="7d")
print(f"Mistakes avoided: {stats['suggestions']}")

# View knowledge timeline
timeline = client.timeline("dependency injection")
for entry in timeline["timeline"]:
    status = "[superseded]" if entry["superseded"] else ""
    print(f"{status}{entry['solution']}")

Zero dependencies — uses Python stdlib urllib. Python 3.8+.
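For the curious, a stdlib-only client really does fit in a few lines. This is a minimal sketch, not the published SDK's actual internals:

```python
import json
import urllib.request

# Minimal sketch of a stdlib-only client. The published SDK's internals
# may differ, but nothing beyond urllib and json is required.

class MiniClient:
    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def _build_request(self, path, payload):
        # Build a POST request with a JSON body, no third-party deps
        return urllib.request.Request(
            self.base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def intercept(self, tool_name, tool_input):
        req = self._build_request(
            "/api/intercept",
            {"toolName": tool_name, "toolInput": tool_input},
        )
        with urllib.request.urlopen(req) as resp:  # needs the server running
            return json.loads(resp.read().decode("utf-8"))

client = MiniClient("http://localhost:8082/")
req = client._build_request("/api/intercept", {"toolName": "Write"})
print(req.full_url)  # http://localhost:8082/api/intercept
```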

Multi-User Support

Multiple developers on the same machine get isolated stores:

EXP_USER=alice node server.js    # Alice's experiences
EXP_USER=bob node server.js      # Bob's (completely isolated)

Share principles without sharing personal data:

# Alice shares a principle she evolved
curl -X POST localhost:8082/api/principles/share \
  -d '{"principleId": "abc-123"}'
# Returns portable JSON — no personal data included

# Bob imports it
curl -X POST localhost:8082/api/principles/import \
  -d '{"principle":"Stateful objects must be scoped","solution":"...","confidence":0.85}'
# Bob's evolution engine manages it independently from here

Quick Start (5 minutes)

git clone https://github.com/muonroi/experience-engine.git
cd experience-engine
bash .experience/setup.sh --local   # Docker Qdrant + Ollama (100% free)

The setup wizard handles everything. After setup, your agent starts learning automatically through hooks.

Supported providers

You're not locked to Ollama. The engine supports:

Embedding: Ollama, OpenAI, Gemini, VoyageAI, SiliconFlow, or any OpenAI-compatible API

Brain (extraction): Ollama, OpenAI, Gemini, Claude, DeepSeek, SiliconFlow, or any OpenAI-compatible API

Mix and match — e.g., SiliconFlow for cheap embeddings + Ollama for free extraction.

Anti-Noise Scoring

Not all experiences are equal. Results are ranked by:

  • Hit frequency — confirmed experiences rank higher
  • Recency — recently confirmed > stale (60+ days penalty)
  • Confidence aging — new entries start lower, climb with confirmation
  • Ignore tracking — suggestions ignored 3x get demoted
  • Domain match — editing .ts file → TypeScript experiences rank higher
  • Temporal boost — confirmed in last 7 days → score boost
  • Superseded penalty — replaced knowledge ranks lower

This means your agent gets the most relevant, most trusted experience for the current context — not just the most similar vector match.
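Put together, the ranking might look like this toy scorer. Every weight here is invented; only the direction of each signal follows the list above:

```python
# Toy composite scorer for the anti-noise signals. All weights are
# invented; only each adjustment's direction follows the design.

def rank(candidates, current_domain):
    def score(e):
        s = e["similarity"]                       # base: vector similarity
        s *= 1 + 0.05 * e["confirms"]             # hit frequency
        if e["days_since_confirmed"] <= 7:
            s *= 1.2                              # temporal boost
        elif e["days_since_confirmed"] > 60:
            s *= 0.6                              # staleness penalty
        if e["ignores"] >= 3:
            s *= 0.4                              # ignored 3x: demoted
        if e["domain"] == current_domain:
            s *= 1.3                              # domain match
        if e.get("superseded"):
            s *= 0.5                              # superseded penalty
        return s
    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"id": "stale", "similarity": 0.9, "confirms": 0,
     "days_since_confirmed": 120, "ignores": 0, "domain": "cs"},
    {"id": "trusted", "similarity": 0.7, "confirms": 5,
     "days_since_confirmed": 2, "ignores": 0, "domain": "ts"},
]
print([e["id"] for e in rank(candidates, current_domain="ts")])
# ['trusted', 'stale']
```

Note the second entry wins despite a lower raw similarity: repeated confirmation, recency, and domain fit outweigh the stale vector match.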

The Philosophy

Every AI memory company stores your data on their cloud and charges you to access it. Mem0 stores your memories. Letta stores your agent state. You pay monthly to access your own knowledge.

Experience Engine is different:

  • Your data never leaves your machine (unless you choose cloud sync)
  • Zero vendor lock-in — standard formats, portable profiles
  • Zero dependencies — Node.js built-in modules only
  • The engine is open source — you pay for convenience, never for capability

"Enterprise AI replaces you. Personal AI empowers you. Same technology. Different owner."

What's Next

The engine is live and working. I'm dogfooding it on my own projects right now. After 2 weeks:

  • 47 suggestions fired
  • 12 mistakes avoided
  • 5 principles evolved from ~50 raw entries
  • Memory footprint decreased (entries compressed into principles)

Next up: dashboard for visualizing the "Saves" feed (mistakes the agent didn't make), and a browser extension that injects experience into ChatGPT/Claude/Gemini web interfaces.

Links

If you're running local LLMs with Ollama, I'd love to hear how the engine works with your setup. And if you have ideas for new mistake detection patterns — PRs welcome.


Experience Engine is MIT licensed and free forever. The core engine will never be paywalled.
