One sentence becomes a native iOS app, a native Android app, and a Rails API

I still sell mobile boilerplate; AI just changed what it's for. The "save 12 weeks" pitch is dead, and what AI can't do is the part nobody automated.

For years, the pitch for a mobile boilerplate wrote itself: a real native iOS app, a real native Android app, and a Rails API — already wired together, already authenticated, already shipping. Buy it, skip twelve to sixteen weeks of setup.

AI coding tools quietly compressed those twelve to sixteen weeks down to a couple of weeks of AI-assisted work. The setup you used to pay to skip is now the part a model does cheapest. So I had to ask an uncomfortable question: in an AI-assisted world, what part of "two native apps plus an API" is actually hard anymore?

The answer wasn't any one platform. It was the seams between them.

The durable problem is coherence, not code volume

The pain of shipping a mobile product was never writing the iOS app or the Android app or the API. It was that the same domain has to be implemented three times — a Rails API, a native SwiftUI client, a native Jetpack Compose client — each in its own language, its own idioms, its own naming conventions, its own localized copy. And then you have to keep all three agreeing with each other while the product changes underneath you.

Rename a concept on the API and you've created drift in two clients until someone fixes them. Add an endpoint to iOS and forget Android, and the two apps now do different things. None of these are hard problems individually. Collectively, under iteration, they're where weeks actually disappear — and AI didn't make that go away. If anything it made it worse, because now you can generate divergence three times as fast.

That's the durable problem. Not "how do I write this code," but "how do I keep three independent implementations of one idea from drifting apart." It's a consistency problem, and consistency is exactly what a one-shot prompt is bad at.

So I stopped thinking of my boilerplate as only a product you buy once, and started also treating it as a substrate a generator operates on — the same code, two roles.

From template to generator

The result is an open-source (MIT) agent. You hand it a spec in plain English — something as informal as:

a walk-in clinic queue for small veterinary practices

and in under an hour, from that one sentence, it produces all three at once: a real native SwiftUI iOS app, a real native Kotlin/Jetpack Compose Android app, and a Rails 8.1 API — not a cross-platform shim that pretends to be both, but three separate, idiomatic codebases — renamed, adapted, and validated so they all agree with each other.

Here's the 40-second version — what the agent does, the surfaces it ships on, and a live walk of a generated app:

https://youtu.be/fsjfskPWecQ

Quick look: spec in, three validated native platforms out.

npx nativeapptemplate-agent \
  "a walk-in clinic queue for small veterinary practices" \
  --project-name="VetClinic"

Under the hood, the run is a pipeline of small, single-purpose agents. A planner parses the sentence into a structured domain. Three workers — Rails, iOS, Android — run in parallel to customize each platform. A reviewer checks the three against each other. A judge scores the result. Concretely, it does six things:

1. Parses the spec into a structured domain: entities, fields, relationships, state machines.
2. Copies a real, MIT-licensed substrate (three production-derived repos covering Rails + iOS + Android) into a fresh output directory. The substrate is read-only; the agent never mutates the source.
3. Renames the skeleton coherently. Shop → Clinic, Shopkeeper → Vet, propagated through Ruby migrations, Swift models, Kotlin data classes, authorization policies, tests, and localized copy, in lockstep across all three platforms.
4. Adapts or replaces the domain module. Some specs are queue variants and reuse the existing entity; a task tracker needs the core resource swapped out entirely. The agent picks the path.
5. Drives the build green. Every platform has to pass before the agent exits.
6. Validates the output and writes a self-contained report.
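
The shape of that pipeline (a planner, three parallel workers, then a reviewer) can be sketched in a few lines. This is an illustration only, not the agent's code: the `Domain` class and the `plan`, `customize`, `review`, and `run` functions are hypothetical stand-ins, and the planner here returns a hard-coded rename plan instead of calling a model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Hypothetical structured form of the one-sentence spec."""
    project_name: str
    renames: dict[str, str]  # substrate noun -> domain noun
    entities: list[str] = field(default_factory=list)


def plan(spec: str, project_name: str) -> Domain:
    # Stand-in for the planner; a real one would derive this from the spec.
    return Domain(project_name, {"Shop": "Clinic", "Shopkeeper": "Vet"}, ["Clinic", "Vet"])


def customize(platform: str, domain: Domain) -> dict:
    # Stand-in for one platform worker (Rails, iOS, or Android).
    return {"platform": platform, "renamed": sorted(domain.renames.values())}


def review(results: list[dict]) -> bool:
    # Stand-in reviewer: every platform must have applied the same renames.
    return len({tuple(r["renamed"]) for r in results}) == 1


def run(spec: str, project_name: str) -> dict:
    domain = plan(spec, project_name)
    platforms = ["rails", "ios", "android"]
    # The three workers do not depend on each other, so they run side by side.
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(lambda p: customize(p, domain), platforms))
    return {"domain": domain, "results": results, "coherent": review(results)}
```

The point of the structure is that the workers share one plan and are judged together afterwards; no worker gets to decide on its own that it is done.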

The interesting word in that list is coherently — and it turns out "rename a thing across three languages" is much harder than it sounds.

What "coherent" actually means, in code

Three places in the pipeline exist purely to fight drift. They're the most interesting code in the project, because each one is a consistency problem that doesn't have a clever one-line answer.

The planner won't pick a name that breaks a compiler. When it renames the substrate's generic Shop / Shopkeeper / ItemTag into your domain's nouns, it can't choose freely. Some names are landmines. Rename an entity to Task and you've just shadowed Swift Concurrency's Task — Task.isCancelled stops resolving and iOS won't compile. Result, View, Date, Set are the same trap in Swift; Unit, Flow, Result in Kotlin; Hash, String, Request in Ruby. On top of that, the substrate's own auth layer already reserves words like Account, Role, Session, Notification — reuse one and you get duplicate-identifier errors. So the planner carries an explicit blocklist of language-stdlib and substrate-reserved vocabulary, and is steered toward domain-distinctive nouns (Vet, Host, Patient, Reservation) that are safe in all three languages at once. A name has to compile on iOS and Android and the server — picking it is a three-platform decision before a single line is generated.
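
A minimal sketch of such a blocklist check, assuming nothing about how the agent stores its list. The reserved words below are only the examples named above (a real list would be much longer), and `RESERVED`, `conflicts`, and `is_safe` are hypothetical names.

```python
# Reserved names copied from the examples in the text; illustrative, not complete.
RESERVED = {
    "swift": {"Task", "Result", "View", "Date", "Set"},
    "kotlin": {"Unit", "Flow", "Result"},
    "ruby": {"Hash", "String", "Request"},
    "substrate": {"Account", "Role", "Session", "Notification"},
}


def conflicts(name: str) -> list[str]:
    """Return the sorted scopes in which `name` is already taken."""
    return sorted(scope for scope, words in RESERVED.items() if name in words)


def is_safe(name: str) -> bool:
    """A candidate is usable only if no language or the substrate reserves it."""
    return not conflicts(name)
```

For example, `conflicts("Result")` reports both `kotlin` and `swift`, while `is_safe("Vet")` is true. Returning the conflicting scopes rather than a bare boolean makes it easier to tell the planner why a name was rejected.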

The rename has to speak every casing convention. A concept like ItemTag doesn't appear in the code as one string. It's ItemTag in a Swift type, item_tag in a Rails table, itemTag in a Kotlin field accessor, item_tags in a route, ITEM_TAGS in a constant, and "Item Tag" / "item tag" / "Item tags" in three different cases of UI copy. The rename pass generates every one of those variants for each pair and rewrites file contents and file/directory names to match. It does so with hand-rolled word boundaries, because the obvious regex shortcut (\b) treats underscores as word characters: \bshop\b finds no boundary inside shop_id and silently skips it. Miss one casing and you get a half-renamed app that compiles but lies to the user. Get it right and the rename is invisible, which is the whole point.
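
Here is one way the variant generation and the custom boundaries could look. It is a sketch under stated assumptions, not the agent's implementation: `casings` and `rename` are hypothetical names, names are passed in as lists of lowercase words, pluralization is a naive trailing "s", and the boundary rule (underscores and camelCase humps count as edges) is my own reading of "hand-rolled".

```python
import re


def casings(words: list[str], plural: bool = False) -> list[str]:
    """Spell a name, given as lowercase words, in seven casing conventions."""
    ws = list(words)
    if plural:
        ws[-1] += "s"  # naive plural (assumption); irregular nouns need more
    caps = [w.capitalize() for w in ws]
    return [
        "".join(caps),                 # ItemTag   (type name)
        ws[0] + "".join(caps[1:]),     # itemTag   (field accessor)
        "_".join(ws),                  # item_tag  (snake case)
        "_".join(ws).upper(),          # ITEM_TAG  (constant)
        " ".join(caps),                # Item Tag  (title-case copy)
        " ".join(ws),                  # item tag  (lowercase copy)
        " ".join([caps[0]] + ws[1:]),  # Item tag  (sentence-case copy)
    ]


def rename(text: str, old: list[str], new: list[str]) -> str:
    """Rewrite every casing of `old` in `text` to the matching casing of `new`."""
    table: dict[str, str] = {}
    for plural in (True, False):
        for src, dst in zip(casings(old, plural), casings(new, plural)):
            table.setdefault(src, dst)  # first casing wins if one-word forms collide

    def bounded(token: str) -> str:
        # Custom boundaries: an underscore or a camelCase hump is an edge,
        # but an adjacent letter of the same case is not.
        left = "(?<![A-Z0-9])" if token[0].isupper() else "(?<![A-Za-z0-9])"
        right = "(?![A-Za-z0-9])" if token[-1].isupper() else "(?![a-z0-9])"
        return left + re.escape(token) + right

    # Longest tokens first, so plurals are tried before their singulars.
    pattern = "|".join(bounded(t) for t in sorted(table, key=len, reverse=True))
    return re.sub(pattern, lambda m: table[m.group(0)], text)
```

With this rule, `rename("shop_id workshop ShopPolicy Shopkeeper", ["shop"], ["clinic"])` rewrites `shop_id` and `ShopPolicy` but leaves `workshop` and `Shopkeeper` alone. The underscore in `shop_id` is treated as an edge here, which `\b` would not do.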

The reviewer makes "the apps must agree" a mechanical check. After the workers finish, it extracts the API contract three ways — the Rails OpenAPI spec, the iOS request layer, the Android Retrofit interfaces — normalizes their path encodings, and diffs them. The run fails if either mobile client calls an endpoint the Rails API doesn't expose (a guaranteed runtime 404), or if iOS and Android implement different subsets of the API. That last rule is the thesis compiled into a pass/fail gate: the two clients aren't allowed to drift from each other. Contract parity isn't a vibe; it's a diff that's either empty or it isn't.
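
The two failure rules reduce to set arithmetic once the paths are normalized. The sketch below assumes the endpoints have already been extracted as (method, path) pairs; the extraction itself is not shown, and the three parameter syntaxes handled in `normalize` are my assumption about what "path encodings" covers. `normalize` and `parity` are hypothetical names.

```python
import re

Endpoint = tuple[str, str]  # (HTTP method, path)


def normalize(method: str, path: str) -> Endpoint:
    """Collapse each platform's path-parameter syntax to a bare `{}`."""
    path = re.sub(r"\\\([^)]*\)", "{}", path)    # Swift interpolation: \(id)
    path = re.sub(r"\{[^}]*\}", "{}", path)      # OpenAPI / Retrofit: {id}
    path = re.sub(r":[A-Za-z_]\w*", "{}", path)  # Rails route style: :id
    return method.upper(), "/" + path.strip("/")


def parity(rails: list[Endpoint], ios: list[Endpoint], android: list[Endpoint]) -> list[str]:
    """Return readable failures; an empty list means the contract holds."""
    server = {normalize(*e) for e in rails}
    clients = {
        "ios": {normalize(*e) for e in ios},
        "android": {normalize(*e) for e in android},
    }
    failures = []
    # Rule 1: a client must not call anything the server does not expose.
    for name, calls in clients.items():
        for method, path in sorted(calls - server):
            failures.append(f"{name} calls {method} {path}, which Rails does not expose")
    # Rule 2: the two clients must implement the same subset of the API.
    for endpoint in sorted(clients["ios"] ^ clients["android"]):
        owner = "ios" if endpoint in clients["ios"] else "android"
        failures.append(f"only {owner} implements {endpoint[0]} {endpoint[1]}")
    return failures
```

The symmetric difference in the second rule is what turns "the clients must agree" into something a machine can fail a run on.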

Why I made the agent grade its own output

Anyone can write an agent that emits code and exits. The part I'm proudest of is that this one doesn't trust itself. It validates what it produced across three escalating layers, and the run fails if any layer fails.

Layer 1 — Structural. Walk the generated tree and search every source file for leftover pre-rename tokens. A forgotten Shop in a Kotlin string is a bug, and it gets reported with its file and line number — before any test runs. This is the rename-completeness check made mechanical, and it's paired with the contract-parity diff above.
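
A leftover-token sweep of this kind is short to write. The version below is a sketch, not the agent's code: `sweep` is a hypothetical name, the file extensions are an assumed subset, and the match is a plain case-sensitive substring test.

```python
from pathlib import Path


def sweep(root: Path, tokens: list[str],
          suffixes: tuple[str, ...] = (".rb", ".swift", ".kt")) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, token) for every leftover token."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for token in tokens:
                if token in line:  # case-sensitive substring match (assumption)
                    findings.append((path.relative_to(root).as_posix(), lineno, token))
    return findings
```

An empty result is the pass condition. Anything else is a list that can go straight into a report, one row per finding.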

Layer 2 — Runtime. The Rails app has to boot. The iOS and Android apps have to build — real xcodebuild build and ./gradlew assembleDebug, installed on a booted iPhone simulator and Android emulator. Turn the visual mode up and it goes further: it boots the live Rails server and drives a scripted walk-through of the actual UI — Welcome → Sign Up → email-confirm → Sign In → drill into a seeded record — tapping through the running app via mobile-mcp. Any 4xx, 5xx, or unhandled client error fails the run.

Layer 3 — Semantic. Claude acts as a judge — text and vision. It reads the actual home-screen screenshots off the simulator and emulator and scores whether the rendered UI genuinely expresses the intended domain, against a structured rubric, median of three samples per criterion. Did we build a veterinary clinic queue, or just rename some labels on the old thing?
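
Taking the median of three samples per criterion keeps one stray judgment from deciding the outcome. A small sketch, assuming numeric scores: the criterion names, the 1 to 5 scale, and the pass threshold are all hypothetical, as is the `score` function itself.

```python
from statistics import median


def score(samples: dict[str, list[float]], threshold: float = 3.0) -> tuple[dict[str, float], bool]:
    """Reduce repeated judge samples to one score per criterion, then gate."""
    per_criterion = {name: median(values) for name, values in samples.items()}
    passed = all(value >= threshold for value in per_criterion.values())
    return per_criterion, passed
```

With samples of 4, 5, and 2 for one criterion, the median is 4, so a single low outlier does not fail the run the way a minimum or a mean might.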

Every run emits one self-contained HTML report: a platform-by-layer matrix, the leftover-token findings, the build commands and their stderr, the embedded screenshots with the judge's per-criterion rationale, the contract diff, and the rename plan. You can open it in a browser, attach it to a PR, or drop it into a demo.

The receipt: a real run of "a walk-in queue for small veterinary clinics", all three layers green, screenshots and judge rationale embedded.

A note on that vision judge, because it's easy to oversell: a model grading code produced by the same model shares its blind spots — it is not an independent oracle of correctness. That's exactly why it sits on top of the structural and runtime layers rather than replacing them. Layers 1 and 2 have a source of truth outside the model's interpretation — tokens either leak or they don't, the build either compiles or it doesn't, the contract diff is either empty or it isn't. The vision judge catches a narrower, complementary thing: "this renders, but is it the right domain?" Three layers, escalating, each covering the layer below's gaps.

What I learned building it solo, in a week

This started during a Claude Code hackathon and shipped to npm afterward. A few things stuck.

Coherence is a validation problem before it's a generation problem. I spent more design effort on how to prove the three platforms agree — the token sweep, the contract diff, the vision rubric — than on the renaming itself. The renaming is mechanical. Knowing you got it right across 40,000-plus lines of Swift, Kotlin, and Ruby is not, unless you build the check that says so.

Native is harder to automate than web — and the moat is the proof, not the targets. Most "AI builds your app" demos stop at a web frontend, and many that go further emit cross-platform code: one React Native or Flutter codebase compiled to both stores, not true native Swift and true native Kotlin. A few closed builders do claim native-on-both; I won't pretend that combination is unique. What I haven't found elsewhere is the rest of the sentence: an open-source agent that emits true native iOS, true native Android, and a Rails API you actually own — and then proves the three stay coherent, with a report you can read. You can fork it, run it, and check the work. That's a different promise than "trust the output of a walled web builder."

The boilerplate didn't stop being a product — it picked up a second job. I still sell it, and it's still worth selling: it's extracted from a real queue-management app that's been live on both app stores since 2024, and that battle-testing doesn't come from a prompt. What changed is that the same code now also serves as the substrate the generator operates on. One thing you buy; the same thing the agent reshapes on demand.

Where the commercial side fits

To be straight with you: there's a paid edition — native clients with multi-tenancy, invitations, role-based permissions, organization switching — sold at nativeapptemplate.com. The agent works against either edition without code changes; the free, MIT-licensed substrate is the one this article is about and the one you can run today. My honest read is that the value of paid boilerplate in an AI era shifts away from "save time" — the part AI commoditized — and toward what stays hard: production-proven code, and ongoing maintenance as Swift, Kotlin, and Rails versions churn.

Try it

The agent is open source under MIT. You'll need Node 22+, an Anthropic API key, and local checkouts of the three free substrate repos.

npx nativeapptemplate-agent \
  "a personal task tracker with due dates"

Or, if you live in Claude Code, install it as a plugin and drive it with two slash commands — one to generate-validate-explain, one to launch the generated app on a simulator and walk its UI with screenshots inline.

If you want to watch the whole thing happen — spec in, renamed Rails API plus iOS and Android apps out, all three validated — here's the full 90-second run:

https://youtu.be/z08ueZX-02I

End-to-end: one sentence to three validated native platforms, with the report to prove it.

Repo and the full technical spec are on GitHub.

The pitch is no longer "save twelve weeks." It's: describe the app, and get three platforms that agree with each other — and a report that proves it.

Built solo in Tokyo.