Making an Open-Source CRM AI-Native: laravel/ai in Production

A year ago I wrote about building Relaticle, a free, open-source CRM on Laravel and Filament. Since then, one request kept coming back in every channel: AI.

Last week we shipped it in v3.3 — "Ask Relaticle," an in-app agent that reads and writes the CRM with human approval on every write. This is the post I wish had existed when we started: what it actually takes to put the brand-new first-party laravel/ai package into production, and where the real difficulty lives (spoiler: it's not the prompts).

Why a built-in agent when MCP exists

Relaticle already exposes an MCP server (30 tools over Sanctum with per-team isolation), so Claude or ChatGPT can drive it remotely. That covers the agent-native minority. The built-in chat exists for everyone else — the salesperson who will never open Claude Desktop but will happily press Cmd+J and type "create the company and add the contact."

The architectural rule that kept this sane: the REST API, the MCP tools, and the in-app agent all call the same action classes. One code path for business logic, activity logging, notifications, and tenant scoping, no matter who's asking — a human in a Filament form, an HTTP client, or a model proposing a change.
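
A minimal sketch of that rule in plain PHP, with no framework dependencies. The names here (Actor, CreateCompany, and the three handler functions) are hypothetical and not taken from the Relaticle codebase; the only point is that every entry point funnels into the same execute() call.

```php
<?php

// Hypothetical illustration: these names are assumptions, not Relaticle's classes.

final class Actor
{
    public function __construct(
        public readonly string $kind,   // 'human', 'api', or 'agent'
        public readonly int $teamId,
    ) {}
}

final class CreateCompany
{
    /** @var list<array{teamId:int,name:string,by:string}> */
    public array $activityLog = [];

    /** @var array<int, array{teamId:int,name:string}> */
    private array $companies = [];

    /** Validation, tenant scoping, and activity logging live here, once. */
    public function execute(Actor $actor, string $name): int
    {
        $name = trim($name);
        if ($name === '') {
            throw new InvalidArgumentException('Company name is required.');
        }

        $id = count($this->companies) + 1;
        $this->companies[$id] = ['teamId' => $actor->teamId, 'name' => $name];
        $this->activityLog[] = ['teamId' => $actor->teamId, 'name' => $name, 'by' => $actor->kind];

        return $id;
    }
}

// Three thin entry points, one code path.
function handleFormSubmit(CreateCompany $action, int $teamId, string $name): int
{
    return $action->execute(new Actor('human', $teamId), $name);
}

function handleApiRequest(CreateCompany $action, int $teamId, array $json): int
{
    return $action->execute(new Actor('api', $teamId), (string) ($json['name'] ?? ''));
}

function handleApprovedProposal(CreateCompany $action, int $teamId, array $args): int
{
    return $action->execute(new Actor('agent', $teamId), (string) ($args['name'] ?? ''));
}
```

Because the handlers only translate input into an Actor and arguments, a rule added to execute() applies to all three callers at once.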

The agent itself is the easy part

The whole agent definition is one class. laravel/ai does the heavy lifting through attributes and contracts:

    #[Provider(['anthropic', 'openai'])]
    #[MaxSteps(15)]
    #[Temperature(0.3)]
    #[Timeout(120)]
    final class CrmAssistant implements Agent, Conversational, HasMiddleware, HasProviderOptions, HasTools
    {
        use Promptable;
        use RemembersConversations;

        // ~28 tools registered: CRUD for companies, people,
        // opportunities, tasks, notes + search + summaries
    }

RemembersConversations gives you persistent history out of the box. #[Provider] makes the agent provider-agnostic — users pick Claude or GPT per conversation and, when self-hosting, bring their own key.

That's maybe a day of work. Everything below is where the other six weeks went.

Streaming that survives reality

Chat runs as a queued job (Horizon) that streams over Reverb. In the demo-video world, that's the end of the story. In production:

Users reload the page mid-stream.
Websockets drop and reconnect.
Livewire re-renders at inconvenient moments.

Every stream needs an identity so the client can reconcile what it already rendered after a reconnect, and continuations have to be resumable — a reload mid-answer should pick the stream back up, not orphan a half-written response.
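
One way to picture stream identity and resumption, as a sketch rather than our actual implementation: each chunk gets a sequence number under its stream ID, and a reconnecting client asks for everything after the last sequence number it rendered. StreamBuffer and its methods are hypothetical names, and the in-memory array stands in for whatever store (cache, database) a real deployment would use.

```php
<?php

// Hypothetical sketch: StreamBuffer is an illustrative name, not a laravel/ai class.

final class StreamBuffer
{
    /** @var array<string, list<string>> chunks per stream id, indexed by sequence number */
    private array $chunks = [];

    /** @var array<string, bool> */
    private array $finished = [];

    /** Called by the queued job for every chunk it broadcasts; returns the sequence number. */
    public function append(string $streamId, string $text): int
    {
        $this->chunks[$streamId][] = $text;

        return count($this->chunks[$streamId]) - 1;
    }

    public function finish(string $streamId): void
    {
        $this->finished[$streamId] = true;
    }

    /**
     * Called by a client after a reload or reconnect.
     * $lastSeenSeq is the highest sequence number already rendered, or -1 for none.
     *
     * @return array{chunks: array<int, string>, done: bool}
     */
    public function resume(string $streamId, int $lastSeenSeq): array
    {
        $all = $this->chunks[$streamId] ?? [];

        return [
            // preserve_keys = true so the client receives the original sequence numbers
            'chunks' => array_slice($all, $lastSeenSeq + 1, null, true),
            'done' => $this->finished[$streamId] ?? false,
        ];
    }
}
```

The client drops any chunk whose sequence number it has already rendered, so replayed and live chunks can overlap without duplicating text.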

One production gotcha that cost us real debugging time: our broadcast channel authorization silently stopped registering once routes were cached. If your channels live anywhere unusual, verify they register with route:cache on — locally everything works, and then production behaves like Echo never subscribed.

Writes you can trust: the approval pipeline

The agent never writes directly. Tools emit proposals; the user sees an approval card and decides. Sounds simple. The non-obvious parts:

Idempotent approvals. Approval is an HTTP request, and HTTP requests get retried. If approving "create Acme Robotics" runs twice, you must not get two companies. (While filming our own demo, the automation double-clicked an Approve button — the idempotency layer absorbed it. Satisfying moment.)
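
A minimal sketch of the idea, assuming the proposal ID doubles as the idempotency key. ApprovalHandler is a hypothetical name, and the in-memory array stands in for a database table with a unique index on the proposal ID.

```php
<?php

// Hypothetical sketch: ApprovalHandler is illustrative, not Relaticle's class.

final class ApprovalHandler
{
    /** @var array<string, int> proposal id => id of the record it created */
    private array $results = [];

    public int $executions = 0;

    /** @param callable(): int $createRecord runs the real write and returns the new record id */
    public function approve(string $proposalId, callable $createRecord): int
    {
        // A retry or double-click finds the stored result and returns it unchanged.
        if (isset($this->results[$proposalId])) {
            return $this->results[$proposalId];
        }

        $this->executions++;

        return $this->results[$proposalId] = $createRecord();
    }
}
```

With concurrent requests the check and the write must be atomic, for example a row lock or a unique constraint inside a transaction; the single-process array above does not show that part.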

Tenant scoping at approval time. A proposal is created in the context of a team, and it must execute in that same context — never trust ambient state when the approval lands. In a multi-tenant CRM this is the difference between a safety feature and a data breach.
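
In sketch form, assuming the proposal stores its team ID at creation time: the approval authorizes against that stored ID and scopes the write by it, ignoring whatever team happens to be current in the session. Proposal and executeProposal() are hypothetical names.

```php
<?php

// Hypothetical sketch: Proposal and executeProposal() are illustrative names.

final class Proposal
{
    public function __construct(
        public readonly string $id,
        public readonly int $teamId,    // captured when the tool emitted the proposal
        public readonly array $payload,
    ) {}
}

/**
 * @param list<int> $approverTeamIds teams the approving user belongs to
 * @return array{teamId: int, payload: array} the write that would be performed
 */
function executeProposal(Proposal $proposal, array $approverTeamIds): array
{
    // Authorize against the proposal's own team, never the ambient "current team".
    if (!in_array($proposal->teamId, $approverTeamIds, true)) {
        throw new RuntimeException("Approver does not belong to the proposal's team.");
    }

    // The write is scoped by the stored team id.
    return ['teamId' => $proposal->teamId, 'payload' => $proposal->payload];
}
```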

Batch proposals. When the model wants to create five records, that's one card and one click, not five interruptions.

Supersede, don't haunt. If the user keeps typing instead of approving, pending proposals flip to superseded and the model is told. Without this, the model happily re-proposes the same records forever — one of those behaviors you only discover with real usage.

Undo as a server-side contract. Deletes show an undo toast for 5 seconds, but the server honors the undo window for 5 minutes. The toast is UX; the window is the actual guarantee.
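
The two durations can be sketched as separate constants, with only the second one enforced on the server. The constant and function names are hypothetical, and treating the 5-minute boundary as inclusive is an assumption.

```php
<?php

// Hypothetical sketch: names are illustrative.

const UNDO_TOAST_SECONDS = 5;     // how long the client shows the toast (UX only)
const UNDO_WINDOW_SECONDS = 300;  // how long the server will honor an undo

function canUndo(DateTimeImmutable $deletedAt, DateTimeImmutable $now): bool
{
    $elapsed = $now->getTimestamp() - $deletedAt->getTimestamp();

    return $elapsed >= 0 && $elapsed <= UNDO_WINDOW_SECONDS;
}
```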

The proposal lifecycle ends up as a small state machine — pending → approved / rejected / superseded / expired — and once you model it that way, the edge cases (approve after supersede, undo after expiry) become explicit instead of accidental.
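
A sketch of that state machine as a PHP enum. The states are the ones named above; the transition rule (only pending can move, everything else is terminal) is an assumption about how they could be enforced, and the method names are hypothetical.

```php
<?php

// Hypothetical sketch: the transition rules are an assumption.

enum ProposalStatus: string
{
    case Pending = 'pending';
    case Approved = 'approved';
    case Rejected = 'rejected';
    case Superseded = 'superseded';
    case Expired = 'expired';

    /** Only a pending proposal can move; every other state is terminal. */
    public function canTransitionTo(self $next): bool
    {
        return $this === self::Pending && $next !== self::Pending;
    }

    public function transitionTo(self $next): self
    {
        if (!$this->canTransitionTo($next)) {
            throw new LogicException("Cannot move a {$this->value} proposal to {$next->value}.");
        }

        return $next;
    }
}
```

Under this rule, "approve after supersede" is no longer a race to reason about; it is a rejected transition with an error message the UI can show.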

Custom fields ruin static tool schemas

Relaticle's records have user-defined custom fields — every team's schema is different. So you can't hardcode the tool's JSON schema the way most agent demos do.

Our approach: at runtime, inline a per-tenant description of the custom-field schema into the prompt — field codes, types, and option labels — and translate option labels back to option IDs at validation time. The payoff is that adding a field in the admin UI makes it instantly usable from chat (and from MCP clients) with zero per-field code.
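
Both halves of that approach, sketched under assumptions: the shape of the field array and the two function names are hypothetical, and the case-insensitive label match is one possible policy rather than a description of what Relaticle does.

```php
<?php

// Hypothetical sketch: the field array shape and function names are illustrative.

/** @param list<array{code:string,type:string,options?:array<int,string>}> $fields */
function describeCustomFields(array $fields): string
{
    $lines = [];
    foreach ($fields as $field) {
        $line = "- {$field['code']} ({$field['type']})";
        if (!empty($field['options'])) {
            $line .= ': one of ' . implode(', ', $field['options']);
        }
        $lines[] = $line;
    }

    return implode("\n", $lines);
}

/**
 * Translate the option label the model sent back into the stored option id.
 *
 * @param array<int, string> $options option id => label
 */
function resolveOptionId(array $options, string $label): int
{
    foreach ($options as $id => $optionLabel) {
        // Compare loosely, since a model may vary case and whitespace.
        if (strcasecmp(trim($optionLabel), trim($label)) === 0) {
            return $id;
        }
    }

    throw new InvalidArgumentException("Unknown option label: {$label}");
}
```

The model only ever sees and produces labels; IDs stay on the server, and an unknown label fails validation instead of writing a bad reference.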

Provider notes from the trenches

Gemini is excluded, deliberately. The driver currently merges provider options into generationConfig, but Gemini expects function_calling_config under its tool configuration, so there is no way to set it. Without that, we can't enforce our sequential-write guard. Rather than ship an agent that behaves differently per provider, we limited the list until the driver supports it.

Anthropic prompt caching is one config flag and worth it. Multi-turn agent conversations re-send a large system prompt (especially with per-tenant schema injection). Enabling caching cut multi-turn input tokens dramatically:

'anthropic_prompt_caching' => (bool) env('CHAT_ANTHROPIC_PROMPT_CACHING', true),

Honest failures beat silent ones

Rate limits happen. Providers hiccup. The worst thing an agent can do is swallow the error and leave the user staring at a frozen cursor.

Every failure mode surfaces as an explicit state in the UI — "retrying," "failed — resume?" — and resuming continues the conversation instead of restarting it. Half-finished work disappearing silently kills trust in an agent faster than having no agent at all. This was a design decision, not an afterthought, and it shaped the event model (stream failed / stream retrying / chat paused are first-class events, not log lines).
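
A sketch of what "first-class events" can look like, assuming an enum of stream events that the UI must map exhaustively. The three failure cases follow the events named above; the other cases, the string values, and the uiStateFor() function are hypothetical.

```php
<?php

// Hypothetical sketch: enum values and UI labels are illustrative.

enum StreamEvent: string
{
    case Chunk = 'chunk';
    case Retrying = 'stream_retrying';
    case Failed = 'stream_failed';
    case Paused = 'chat_paused';
    case Completed = 'completed';
}

/** Map every event to something the user can see; no event is silent. */
function uiStateFor(StreamEvent $event): string
{
    // match throws UnhandledMatchError if a new case is added without a UI state.
    return match ($event) {
        StreamEvent::Chunk => 'streaming',
        StreamEvent::Retrying => 'retrying',
        StreamEvent::Failed => 'failed, resume?',
        StreamEvent::Paused => 'paused',
        StreamEvent::Completed => 'done',
    };
}
```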

What I'd tell you before you start

Budget most of your time for the distributed-systems hygiene: stream identity, resumability, idempotency, tenant scoping. The LLM part is the demo; this is the product.
Default to human approval for writes. After watching a model confidently propose the wrong record update, we consider requiring approval the only honest default for other people's revenue data.
Make failure states first-class. Your users will hit rate limits on day one.
If your domain has dynamic schemas, design the prompt injection early — it changes how you think about tool definitions.

Everything in this post is readable in the repo — the chat lives in packages/Chat and is genuinely an afternoon's read: https://github.com/relaticle/relaticle

Questions about laravel/ai in production welcome — there isn't much real-world material on it yet, and I'm happy to go deeper on any of the pieces above.
