I built a context engine that saves Claude Code 73% of its tokens on large codebases
The problem

LLM coding agents on large repos burn tokens scanning files. Claude Code on an 829-file codebase consumed 45K tokens just finding the right code. By turn 3 of a conversation, context is gone. Token cost compounds: 20 questions in a session at 45K each is 900K tokens -- nearly the entire 1M window. The agent degrades before your work is done.

What Mnemosyne does

Sits between your codebase and your LLM. Indexes into SQLite, scores every chunk with 6 retrieval signals (BM25, TF
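The text cuts off mid-list, but BM25 is one of the named signals. As a rough illustration (not Mnemosyne's actual implementation), here is a minimal Okapi BM25 scorer over tokenized code chunks; the function name, parameters, and the k1/b defaults are illustrative choices, not taken from the project:

```python
import math
from collections import Counter

def bm25_scores(query_terms, chunks, k1=1.5, b=0.75):
    """Score each chunk (a list of tokens) against the query with Okapi BM25.

    Hypothetical sketch: in a real engine this would run over an inverted
    index (e.g. in SQLite), not recompute statistics per query.
    """
    n = len(chunks)
    avgdl = sum(len(c) for c in chunks) / n  # average chunk length
    # Document frequency: how many chunks contain each query term.
    df = {t: sum(1 for c in chunks if t in c) for t in query_terms}
    scores = []
    for chunk in chunks:
        tf = Counter(chunk)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue  # term appears nowhere; contributes nothing
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            freq = tf[t]
            # Term-frequency saturation, normalized by chunk length.
            score += idf * freq * (k1 + 1) / (
                freq + k1 * (1 - b + b * len(chunk) / avgdl)
            )
        scores.append(score)
    return scores
```

A chunk that mentions the query terms more often (relative to its length) scores higher, and chunks with no overlap score zero; in a multi-signal setup like the one described, this score would be one input among six.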