Cut LLM prompt tokens on structured data — losslessly

A small, dependency-free tool for shrinking logs, JSON, and CSV in prompts — without dropping a single byte.

Logs, JSON, and CSV are some of the bulkiest, most repetitive things we feed into LLMs. They're also where prompt-token costs quietly pile up.

The trouble with lossy compression

The usual fix is semantic compression: have a model summarize the input and drop "low-information" tokens. It works — until the question needs the data that got dropped.

Ask:

"How many errors are in this log?"
"What's the total across these 400 rows?"

…and a lossy compressor can hand back a confident, wrong answer — because the rows it discarded were exactly the ones you needed. The compression looks great. The answer is broken.

A different bet: lossless or no-op

ctxfold takes the opposite approach. Its single rule:

Lossless or no-op. Never lossy.

Instead of summarizing, it re-encodes structure. Logs, JSON arrays, and CSV are tables in disguise — the same keys, prefixes, and templates repeat on every line. ctxfold lifts those repeated parts into a one-time header and keeps only what varies per row, producing a compact, self-labeling table the model reads directly. Nothing is dropped.

The guarantee is enforced in code: every encoder ships with a decoder, and compress() verifies that decoding its output reproduces the input before returning it. If it can't, you get your original text back, untouched. It can't corrupt your data — worst case, it does nothing.

Does the model still read it?

Yes. On real data, ctxfold cuts ~35–40% of tokens on templated logs and JSON arrays, fully losslessly. And because the output is plain, labeled text, the model reads it as well as the raw input — in lookup tests against GPT-4o-mini, answers off the compressed form matched answers off the raw data, field for field.

(Readability is validated on GPT-4o-mini; the lossless guarantee is model-independent.)

Try it

npm install ctxfold

const { compress } = require("ctxfold");

const { text, stats } = compress(bigLogOrJsonOrCsv);
// send `text` instead of the original
console.log(`${(stats.tokenRatio * 100).toFixed(0)}% fewer tokens, lossless: ${stats.lossless}`);

It's a pure text transform — no API calls, no model, zero dependencies — so it works with any LLM.

Not a replacement — the other half

ctxfold isn't a competitor to semantic compression; it's the complement. Summarize to extract a subset; ctxfold to shrink repetition without losing anything. It shines on structured data, not prose.

Why I built it

This started from a simple frustration: lossy prompt compressors gave impressive token savings, but on aggregate questions — counts, totals, "find this record" — the answers came back wrong, because the data needed to answer had been summarized away. Great compression, broken results. The fix wasn't a smarter summarizer; it was to stop dropping data at all. Repetitive structured text is compressible losslessly — you just have to treat it as structure instead of prose.

If you push a lot of logs, JSON, or CSV into prompts, I'd genuinely like to know what your payloads look like and whether the lossless tradeoff fits your use case. What's eating the most tokens in your prompts right now? Questions, critique, and edge cases that break it are all welcome in the comments.

Repo & docs: https://github.com/antrixy/ctxfold · npm: npm install ctxfold · MIT licensed.