Kodah: 51% on SWE-bench Lite with GPT-5-mini at $0.045/issue


I built Kodah around a simple bet: a small model with the right
approach can match frontier models on real engineering tasks,
at a fraction of the cost.

The result: with GPT-5-mini, Kodah reaches 81% of Claude Opus 4.6's
resolve rate at 1/38th of the cost.

Results from the full 300-issue evaluation on SWE-bench Lite

  • 153/300 resolved (51.0%), 280/300 generated a valid patch (93.3%)
  • psf/requests: 6/6 (100%), django/django: 67/114 (58.8%)
  • Flask: 0/3 (0%). Codebases with heavy runtime coupling are the main weakness.
  • 75% of issues cost under $0.05. 90% of issues cost under $0.10.
  • Total cost for all 300 issues: $13.59
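The headline figures can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers reported above. The variable names are my own, and the cost per resolved issue is a derived value, not one reported in the post.

```python
# Figures copied from the reported results above.
RESOLVED, VALID_PATCH, TOTAL_ISSUES = 153, 280, 300
DJANGO_RESOLVED, DJANGO_ISSUES = 67, 114
TOTAL_COST_USD = 13.59


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


resolve_rate = pct(RESOLVED, TOTAL_ISSUES)          # overall resolve rate
valid_patch_rate = pct(VALID_PATCH, TOTAL_ISSUES)   # share of issues with a valid patch
django_rate = pct(DJANGO_RESOLVED, DJANGO_ISSUES)   # django/django resolve rate

# Average cost per attempted issue (the figure quoted in the title).
cost_per_issue = round(TOTAL_COST_USD / TOTAL_ISSUES, 3)

# Derived, not reported: average spend per issue that was actually resolved.
cost_per_resolved = round(TOTAL_COST_USD / RESOLVED, 3)

print(f"resolve rate:      {resolve_rate}%")
print(f"valid patch rate:  {valid_patch_rate}%")
print(f"django/django:     {django_rate}%")
print(f"cost per issue:    ${cost_per_issue}")
print(f"cost per resolved: ${cost_per_resolved}")
```

The per-issue average of about $0.045 counts every attempt, including the ones that failed. Dividing by resolved issues only gives a higher figure, roughly double.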

Full breakdown with per-repo results and cost distribution:
https://www.silasdata.com/kodah

How it works

Send a repo URL and an issue description, get a diff back.
Available as an API at kodah.io. First 10 fixes free.
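A call to such an API might be assembled roughly as follows. This is only a sketch of the "repo URL plus issue in, diff out" shape: the post does not document the API, so the `/fix` path, the `repo_url` and `issue` field names, and the bearer-token header are all hypothetical placeholders. The example repository URL and issue text are made up as well. Check the actual API documentation before using any of them.

```python
import json
import urllib.request


def build_fix_request(
    repo_url: str,
    issue: str,
    api_base: str = "https://kodah.io",  # host named in the post
    token: str = "YOUR_API_KEY",         # hypothetical auth scheme
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a fix.

    The endpoint path and JSON field names are assumptions made for
    illustration; they are not taken from Kodah's documentation.
    """
    payload = json.dumps({"repo_url": repo_url, "issue": issue}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/fix",  # hypothetical path
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_fix_request(
    repo_url="https://github.com/example/project",  # made-up repository
    issue="TypeError when calling parse() with an empty string",
)

# Sending is left commented out because the endpoint is an assumption:
# with urllib.request.urlopen(request) as response:
#     diff_text = response.read().decode("utf-8")
```

Building the request separately from sending it makes it easy to inspect the payload before pointing it at a real endpoint.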

Source: dev.to
