How I Built a PII Tokenization Middleware to Keep Sensitive Data Out of LLM APIs
dev.to · go
The Problem I Kept Ignoring

Every time we sent a customer transcript to an LLM API, we were sending real data in plaintext to a third-party server: credit card numbers, home addresses, full names, national IDs. Most teams I've talked to handle this in one of two ways:

1. Ignore it and hope the provider's data processing agreement covers them.
2. Prompt engineer around it ("don't repeat personal information in your response"), which does nothing about what has already been transmitted.