It is the ultimate flow-killer. You sit down, open your IDE, get maybe three or four good turns into a complex refactor, and then—BAM.
"You have reached your message limit until 4:00 PM."
It feels broken. You just started! How is the tank already empty? This happened to me so often—while burning through $50 worth of tokens a day—that I realized I was flying completely blind. I had no idea what was actually happening under the hood of my conversations.
So I built Intern. It’s a tool that traces your Claude interactions and provides a heuristic-based profile of exactly how you’re consuming resources.
Intern acts as a transparent proxy. You point your Claude Code traffic through it, and it keeps a persistent history of every conversation. This allows you to go back and analyze exactly what happened in a session that nuked your limits.
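The transparent-proxy idea itself is simple: sit between the client and the API, forward everything unchanged, and tee a copy of each exchange to disk. Here is a minimal sketch of that concept in Python — this is an illustration of the pattern, not Intern's actual implementation, and the trace schema here is invented:

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://api.anthropic.com"  # where the real traffic goes
TRACE_FILE = "traces.jsonl"             # persistent conversation history

class TracingProxy(BaseHTTPRequestHandler):
    """Forward each POST upstream and append a trace record as JSONL."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Forward the request unchanged to the real API.
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=body,
            headers={k: v for k, v in self.headers.items()
                     if k.lower() not in ("host", "content-length")},
            method="POST",
        )
        with urllib.request.urlopen(req) as upstream:
            status, resp_body = upstream.status, upstream.read()
        # Persist one JSON object per request/response pair.
        with open(TRACE_FILE, "a") as f:
            f.write(json.dumps({
                "path": self.path,
                "request": body.decode("utf-8", "replace"),
                "status": status,
                "response": resp_body.decode("utf-8", "replace"),
            }) + "\n")
        # Relay the upstream response back to the client untouched.
        self.send_response(status)
        self.send_header("Content-Length", str(len(resp_body)))
        self.end_headers()
        self.wfile.write(resp_body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

# To run: HTTPServer(("localhost", 11411), TracingProxy).serve_forever()
```

Because the proxy only copies bytes, the client sees exactly the responses it would have seen anyway — which is what makes this kind of tracing "transparent."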
The Heuristic Profile: A Reality Check
The core of the project is the profile command. It takes your raw conversation traces and applies a heuristic analysis to categorize the "work" being done.
When I profiled one of my "short" sessions that somehow cost me $18, this is what Intern surfaced:
=== Cost Report ===
MODEL            MSGS  INPUT    OUTPUT   CACHE READ  TOTAL
claude-opus-4-6  233   $0.0400  $1.9850  $11.2288    $17.8246

=== Complexity Breakdown ===
COMPLEXITY  COUNT  %
mechanical  196    70.0%
reasoning   69     24.6%
trivial     15     5.4%

=== Tool Usage (322 total calls) ===
TOOL  COUNT  %      BAR
----  -----  -----  ---
Bash  110    34.2%  █████████████████
Read  67     20.8%  ██████████
Edit  49     15.2%  ███████
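For context on where dollar figures like these come from: each cost column is just a token count multiplied by a per-million-token price. A sketch of that arithmetic — the prices and token counts below are illustrative assumptions, so check Anthropic's current pricing page for real numbers:

```python
# Assumed per-million-token prices in USD (illustrative, not authoritative).
PRICES = {"input": 15.00, "output": 75.00, "cache_read": 1.50}

def usage_cost(input_tokens, output_tokens, cache_read_tokens):
    """Dollar cost of one API call, given its token counts."""
    return (input_tokens * PRICES["input"]
            + output_tokens * PRICES["output"]
            + cache_read_tokens * PRICES["cache_read"]) / 1_000_000

# Hypothetical turn: tiny prompt, modest reply, huge cached-context re-read.
cost = usage_cost(input_tokens=500,
                  output_tokens=2_000,
                  cache_read_tokens=400_000)
print(round(cost, 4))
```

Even though cache reads are the cheapest token type per unit, their sheer volume is what lets them dominate a session's total — which is exactly the pattern in the report above.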
What Intern actually shows you:
Mechanical vs. Reasoning: My heuristics showed that 70% of my messages were purely mechanical (file I/O, bash commands). I was using high-tier reasoning limits for brick-laying tasks.
The Cache Tax: You can see exactly how much you are spending on CACHE READ. In this session, $11.22 was spent just on Claude re-reading context. If you change your system prompt or a large file mid-way, you can see the immediate financial and rate-limit spike here.
Trace Persistence: Instead of losing your history to the ether, Intern saves everything to .jsonl files. You can analyze patterns across weeks of work to see which projects—or which specific tools—are your "token hogs."
Offload Candidates: The profiler automatically flags messages that are "trivial" or "tool continuations." It tells you exactly how many messages could have been handled by a lighter model without sacrificing quality.
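I won't claim to know Intern's exact heuristics, but the core idea — bucketing each traced message by the kind of work it does — can be sketched in a few lines. The categories match the report above; the thresholds, field names, and tool list are my own illustrative guesses:

```python
def classify(message):
    """Bucket a traced message as trivial, mechanical, or reasoning.

    `message` is a dict with the tools it invoked and its text — a
    simplified stand-in for a real trace record, not Intern's schema.
    """
    MECHANICAL_TOOLS = {"Bash", "Read", "Edit", "Write"}  # file I/O & shell
    tools = set(message.get("tools", []))
    text_len = len(message.get("text", ""))

    if not tools and text_len < 50:
        return "trivial"      # acknowledgements, tool continuations
    if tools and tools <= MECHANICAL_TOOLS and text_len < 200:
        return "mechanical"   # brick-laying: pure file I/O and shell work
    return "reasoning"        # substantial prose, planning, or mixed work

messages = [
    {"tools": ["Bash"], "text": "Running tests."},
    {"tools": [], "text": "Done."},
    {"tools": [], "text": "Here is the refactor plan: " + "x" * 300},
]
print([classify(m) for m in messages])

# Offload candidates: everything that never needed the big model.
offload = [m for m in messages if classify(m) != "reasoning"]
```

The useful property of a scheme like this is that the non-"reasoning" buckets are, by construction, the messages you could route to a cheaper model.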
How to Profile Your Own Workflow
The setup is non-invasive and stays out of your way.
- Install the intern CLI tool:
brew tap abhishekjha17/intern
brew install intern
For platforms other than macOS, refer to the installation guide.
- Spin up the proxy:
~ intern proxy
2026/04/17 13:29:08 intern proxy listening on :11411 → https://api.anthropic.com (traces → /Users/abhi17/.intern/traces/traces.jsonl)
- Start Claude with Intern:
~ export ANTHROPIC_BASE_URL=http://localhost:11411
~ claude
Intern will silently log every request and response while you work.
- Run the heuristic report: When you hit that rate limit, don't just wait around. Run the analysis:
~ intern profile .intern/traces/traces.jsonl
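Conceptually, a profile pass over a trace file is just a fold over JSONL records. A rough sketch of that loop — the record fields here are assumptions for illustration, not Intern's actual schema:

```python
import json
from collections import Counter

def tool_usage(trace_path):
    """Count tool calls per tool name across a traces.jsonl file."""
    counts = Counter()
    with open(trace_path) as f:
        for line in f:
            record = json.loads(line)
            # Assumed schema: each record lists the tools that turn invoked.
            counts.update(record.get("tools", []))
    return counts

def report(counts):
    """Print a tool-usage table like the one in the profile output."""
    total = sum(counts.values())
    for tool, n in counts.most_common():
        bar = "█" * int(50 * n / total)
        print(f"{tool:<6} {n:>5} {100 * n / total:5.1f}% {bar}")
```

One nice consequence of the append-only JSONL format: you can concatenate weeks of trace files and run the same fold over all of them at once.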
The Choice
Tracing your conversations is about more than just cost; it’s about reclaiming control over your workflow. Once you see which parts of your session are purely mechanical, you can consciously choose to offload those branches to cheaper or local models. Instead of being forced down a single, expensive rate-limited road, you get to decide exactly which model is worth your context and when.