How I built a switchable AI provider system in Next.js (Gemini GPT-4 Claude, one workspace setting)

I had Gemini hardcoded into every AI function in my codebase. Then a beta user told me her clients have data policies about Google. I rebuilt the whole AI layer in two days. Here's the architecture.

The wrong way (what I had)

import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

export async function summarizeProject(projectId: string) {
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
const result = await model.generateContent(prompt);
return result.response.text();
}
Works fine until you need to support another provider. Then you're touching every function.

The pattern: one factory, one interface

I defined a single AIClient interface that every provider implements:

export interface AIClient {
complete: (prompt: string, systemPrompt?: string) => Promise;
}
One factory function reads the workspace setting and returns the right client:

export async function getAIClient(workspaceId: string): Promise {
const { aiProvider } = await getWorkspaceSettings(workspaceId);

switch (aiProvider) {
case "openai": return createOpenAIClient();
case "anthropic": return createAnthropicClient();
default: return createGeminiClient();
}
}
Each factory wraps the provider SDK behind the same interface:

export async function summarizeProject(projectId: string, workspaceId: string) {
const ai = await getAIClient(workspaceId);
const data = await getProjectContext(projectId);
return ai.complete(buildPrompt(data), "You are a project analyst.");
}
Three decisions worth explaining

Workspace-level, not user-level. Project data belongs to the workspace. The whole team should have the same answer to "what AI provider processes our data?" — not "depends who's logged in."

No silent fallbacks. If the selected provider fails, the error propagates. I don't retry with another provider. If someone chose Claude for compliance reasons, silently falling back to Gemini is worse than an explicit error.

Per-request factory call, not module-level cache. Workspace settings can change. Module-level caching means the old provider keeps running until the function cold-starts. The extra DB read per AI call is negligible next to the actual inference cost.

The one thing the pattern doesn't solve

Different models behave differently on the same prompt. Claude follows system prompts very tightly. GPT-4 tends verbose without explicit length instructions. I handle this with thin per-provider prompt adjustments in the factory. It's not elegant but it works.

If you've solved the "same prompt, different model behavior" problem more cleanly — I'm curious how.

Building Melororium — pay-once project management for freelancers. Launching July 30.