It is the ultimate flow-killer. You sit down, open your IDE, get maybe three or four good turns into a complex refactor, and then—BAM.
"You have reached your message limit until 4:00 PM."
It feels broken. You just started! How is the tank already empty? This happened to me so often—while burning through $50 worth of tokens a day—that I realized I was flying completely blind. I had no idea what was actually happening under the hood of my conversations.
So I built Intern. It’s a tool that traces your Claude interactions and provides a heuristic-based profile of exactly how you’re consuming resources.
Intern acts as a transparent proxy. You point your Claude Code traffic through it, and it keeps a persistent history of every conversation. This allows you to go back and analyze exactly what happened in a session that nuked your limits.
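The transparent-proxy idea itself is simple: sit between the client and the API, forward everything unchanged, and tee a copy of each exchange to disk. Here is a minimal sketch of that concept in Python — this is an illustration of the pattern, not Intern's actual implementation, and the trace schema here is invented:

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://api.anthropic.com"  # where the real traffic goes
TRACE_FILE = "traces.jsonl"             # persistent conversation history

class TracingProxy(BaseHTTPRequestHandler):
    """Forward each POST upstream and append a trace record as JSONL."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Forward the request unchanged to the real API.
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=body,
            headers={k: v for k, v in self.headers.items()
                     if k.lower() not in ("host", "content-length")},
            method="POST",
        )
        with urllib.request.urlopen(req) as upstream:
            status, resp_body = upstream.status, upstream.read()
        # Persist one JSON object per request/response pair.
        with open(TRACE_FILE, "a") as f:
            f.write(json.dumps({
                "path": self.path,
                "request": body.decode("utf-8", "replace"),
                "status": status,
                "response": resp_body.decode("utf-8", "replace"),
            }) + "\n")
        # Relay the upstream response back to the client untouched.
        self.send_response(status)
        self.send_header("Content-Length", str(len(resp_body)))
        self.end_headers()
        self.wfile.write(resp_body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

# To run: HTTPServer(("localhost", 11411), TracingProxy).serve_forever()
```

Because the proxy only copies bytes, the client sees exactly the responses it would have seen anyway — which is what makes this kind of tracing "transparent."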
The Heuristic Profile: A Reality Check
The core of the project is the profile command. It takes your raw conversation traces and applies a heuristic analysis to categorize the "work" being done.
When I profiled one of my "short" sessions that somehow cost me $18, this is what Intern surfaced:
=== Cost Report ===
MODEL            MSGS  INPUT    OUTPUT   CACHE READ  TOTAL
claude-opus-4-6  233   $0.0400  $1.9850  $11.2288    $17.8246

=== Complexity Breakdown ===
COMPLEXITY  COUNT  %
mechanical  196    70.0%
reasoning   69     24.6%
trivial     15     5.4%

=== Tool Usage (322 total calls) ===
TOOL  COUNT  %      BAR
----  -----  -----  ---
Bash  110    34.2%  █████████████████
Read  67     20.8%  ██████████
Edit  49     15.2%  ███████
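For context on where dollar figures like these come from: each cost column is just a token count multiplied by a per-million-token price. A sketch of that arithmetic — the prices and token counts below are illustrative assumptions, so check Anthropic's current pricing page for real numbers:

```python
# Assumed per-million-token prices in USD (illustrative, not authoritative).
PRICES = {"input": 15.00, "output": 75.00, "cache_read": 1.50}

def usage_cost(input_tokens, output_tokens, cache_read_tokens):
    """Dollar cost of one API call, given its token counts."""
    return (input_tokens * PRICES["input"]
            + output_tokens * PRICES["output"]
            + cache_read_tokens * PRICES["cache_read"]) / 1_000_000

# Hypothetical turn: tiny prompt, modest reply, huge cached-context re-read.
cost = usage_cost(input_tokens=500,
                  output_tokens=2_000,
                  cache_read_tokens=400_000)
print(round(cost, 4))
```

Even though cache reads are the cheapest token type per unit, their sheer volume is what lets them dominate a session's total — which is exactly the pattern in the report above.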
What Intern actually shows you:
Mechanical vs. Reasoning: My heuristics showed that 70% of my messages were purely mechanical (file I/O, bash commands). I was using high-tier reasoning limits for brick-laying tasks.
The Cache Tax: You can see exactly how much you are spending on CACHE READ. In this session, $11.22 was spent just on Claude re-reading context. If you change your system prompt or a large file mid-way, you can see the immediate financial and rate-limit spike here.
Trace Persistence: Instead of losing your history to the ether, Intern saves everything to .jsonl files. You can analyze patterns across weeks of work to see which projects—or which specific tools—are your "token hogs."
Offload Candidates: The profiler automatically flags messages that are "trivial" or "tool continuations." It tells you exactly how many messages could have been handled by a lighter model without sacrificing quality.
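I won't claim to know Intern's exact heuristics, but the core idea — bucketing each traced message by the kind of work it does — can be sketched in a few lines. The categories match the report above; the thresholds, field names, and tool list are my own illustrative guesses:

```python
def classify(message):
    """Bucket a traced message as trivial, mechanical, or reasoning.

    `message` is a dict with the tools it invoked and its text — a
    simplified stand-in for a real trace record, not Intern's schema.
    """
    MECHANICAL_TOOLS = {"Bash", "Read", "Edit", "Write"}  # file I/O & shell
    tools = set(message.get("tools", []))
    text_len = len(message.get("text", ""))

    if not tools and text_len < 50:
        return "trivial"      # acknowledgements, tool continuations
    if tools and tools <= MECHANICAL_TOOLS and text_len < 200:
        return "mechanical"   # brick-laying: pure file I/O and shell work
    return "reasoning"        # substantial prose, planning, or mixed work

messages = [
    {"tools": ["Bash"], "text": "Running tests."},
    {"tools": [], "text": "Done."},
    {"tools": [], "text": "Here is the refactor plan: " + "x" * 300},
]
print([classify(m) for m in messages])

# Offload candidates: everything that never needed the big model.
offload = [m for m in messages if classify(m) != "reasoning"]
```

The useful property of a scheme like this is that the non-"reasoning" buckets are, by construction, the messages you could route to a cheaper model.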
How to Profile Your Own Workflow
The setup is non-invasive and stays out of your way.
- Install the intern CLI tool:
brew tap abhishekjha17/intern
brew install intern
For platforms other than macOS, refer to the installation guide.
- Spin up the proxy:
~ intern proxy
2026/04/17 13:29:08 intern proxy listening on :11411 → https://api.anthropic.com (traces → /Users/abhi17/.intern/traces/traces.jsonl)
- Start Claude with Intern:
~ export ANTHROPIC_BASE_URL=http://localhost:11411
~ claude
Intern will silently log every request and response while you work.
- Run the heuristic report: When you hit that rate limit, don't just wait around. Run the analysis:
~ intern profile .intern/traces/traces.jsonl
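Conceptually, a profile pass over a trace file is just a fold over JSONL records. A rough sketch of that loop — the record fields here are assumptions for illustration, not Intern's actual schema:

```python
import json
from collections import Counter

def tool_usage(trace_path):
    """Count tool calls per tool name across a traces.jsonl file."""
    counts = Counter()
    with open(trace_path) as f:
        for line in f:
            record = json.loads(line)
            # Assumed schema: each record lists the tools that turn invoked.
            counts.update(record.get("tools", []))
    return counts

def report(counts):
    """Print a tool-usage table like the one in the profile output."""
    total = sum(counts.values())
    for tool, n in counts.most_common():
        bar = "█" * int(50 * n / total)
        print(f"{tool:<6} {n:>5} {100 * n / total:5.1f}% {bar}")
```

One nice consequence of the append-only JSONL format: you can concatenate weeks of trace files and run the same fold over all of them at once.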
The Choice
Tracing your conversations is about more than just cost; it’s about reclaiming control over your workflow. Once you see which parts of your session are purely mechanical, you can consciously choose to offload those branches to cheaper or local models. Instead of being forced down a single, expensive rate-limited road, you get to decide exactly which model is worth your context and when.