Policy-Locked Triage for Messy Citizen Text: A Municipal-Style Routing PoC with SFT and Preference Alignment
python
dev.to
How I stabilized noisy 311-style requests with supervised training and reviewer preferences in Python

TL;DR

This write-up is an experimental account of how I built a small routing proof of concept for synthetic municipal-style service requests. The goal was not to ship a city-wide system. From my perspective, the interesting part is the training story: start with labeled text, fit a transparent classifier, then inject reviewer-style preferences so the policy moves toward routes