Claude Code re-reads whole files to use one function. I built Lumen to stop paying for it.

Why?

Lumen shows what your Claude Code sessions actually cost — and cuts token spend on large-file reads by around 87%, measured to the token. macOS app + CLI, MIT, fully local.

Here's a thing that quietly drains your Claude Code budget: the agent reads a 600-line file to use one function from it — and you pay input tokens for all 600 lines. Every time it re-reads.
Multiply that across a real session and it's the single biggest token sink in agentic coding. But Claude Code doesn't show you any of it while you work — not how full your context is, not what the session costs, not where the tokens went.
So I built Lumen. It shows you the whole picture, and it cuts the cost of those big reads.
https://github.com/HackPoint/lumen (MIT)

What it does
Two things.

It shows you the truth. Lumen watches your Claude Code session files locally and surfaces what the tool hides:

Context fill — a live gauge, green → amber at 80% → red at 95%, so compaction never blindsides you mid-flow.
Session cost — real dollars, split into input / output / cache read / cache write.
Where the money goes — per-category breakdown, plus today / this week / all-time totals.

It actually saves tokens. When Claude tries to read a large file (≥ 300 lines), Lumen steps in before the read runs and gives Claude a smarter path:

smart_read returns a map of the file — functions, classes, imports with line ranges — without the bodies. That's roughly 5–10% of the cost of reading the whole file.
recall_file then pulls just the one function Claude actually needs (resolved via a real AST, not a regex).
compress_logs collapses repeated lines and stack-trace noise in build/test output — deterministic, no information lost.

Instead of walking through the whole building every time, Claude gets the floor plan first and then opens only the door it needs.
So how much does it actually save?
On intercepted reads, about 87% fewer tokens — measured on my machine, and it varies with your codebase. Bigger, more structured files save more.
The important word is measured. Every intercepted read records the full-read cost vs. the actual cost, counted by the same kind of tokenizer Claude uses. No estimates, no extrapolation. The "saved" number starts small and grows honestly with every session.
Fewer input tokens means two wins at once: a lower bill, and more room in your context window — which means fewer compactions and sharper answers late in a session.
One honesty note I'm proud of
Lumen also shows what Claude Code's built-in prompt cache saved you — but it keeps that in a separate row and never adds it to its own number. Plenty of tools would bundle the two into one big hero stat. I don't. Small and verified beats large and invented.
GUI and CLI — same data, your choice

Menu-bar app — lives in your macOS menu bar (not the Dock). The tray icon changes color with your context fill, so the warning sits in your peripheral vision. Left-click for a quick popover; right-click to open the full window with Context and Optimizer tabs.
lumen CLI — the same live dashboard, rendered in your terminal. Reads from the same local database. If you live in the terminal, you never have to leave it.

Both read from one local SQLite file. Which brings me to:
Everything stays on your machine
No account. No telemetry. No phone-home.
The hooks are plain shell scripts. The intercept script reads no file contents and makes zero network calls. The meter counts tokens locally and writes one row to a local SQLite DB you can open, inspect, or delete yourself. That's it.
Honest status (a "not yet" beats a fake guide)
Right now Lumen runs on macOS (Apple Silicon). Windows, Linux, and Intel Mac builds are on the roadmap — but they're not shipping yet, so there are no fake install docs for them. (Need an Intel build? Open an issue.)
A couple more real edges: the app isn't notarized yet (Homebrew handles the Gatekeeper flag for you), and hook-based interception is CLI-only — the VS Code extension API doesn't support the hooks, so there it runs in a "soft mode" with the gauge and cost tracking but no enforced interception. Use the CLI for guaranteed, measurable savings.

Try it

bashbrew tap HackPoint/tap
brew install --cask HackPoint/tap/lumen

Launch it, click through the (non-destructive) setup, restart Claude Code, and watch the numbers show up.
Stack, for the curious: Rust core, daemon, and MCP server; Angular + Tauri for the app. It's MIT — issues, questions, and PRs all welcome.

https://github.com/HackPoint/lumen

If you've ever been mid-flow when Claude Code compacted on you without warning — or just wondered where your token budget actually went — this is the tool I wish I'd had. I'd love to hear what you think.