30 Days of MCP in Production: What Actually Works (And What Breaks)

The Notion MCP Challenge results dropped this week. I've been running MCP servers in production for 30 days. Here's what nobody tells you before you build one.

What MCP Actually Is (Skip If You Know)

Model Context Protocol (MCP) is an open standard, introduced by Anthropic, for exposing tools to Claude and other MCP-compatible clients. You define a tool once on a server, and it can be shared across conversations and applications. Think of it as a USB standard for AI capabilities: define once, use anywhere an MCP client runs.

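// The simplest possible MCP server
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

const server = new Server({ name: "my-server", version: "1.0.0" }, {
  capabilities: { tools: {} }
});

// Advertise the available tools to the client.
// (A real server also needs a CallToolRequestSchema handler to run them.)
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "get_data",
    description: "Fetch data from our API",
    inputSchema: {
      type: "object",
      properties: { id: { type: "string" } },
      required: ["id"]
    }
  }]
}));

// Connect over stdio so a client can launch the server and talk to it
await server.connect(new StdioServerTransport());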

Lesson 1: Tool Descriptions Are Your Most Important Code

Claude decides which tools to call based on your descriptions. Not the implementation — the description string.

I spent 3 days debugging why Claude kept calling the wrong tool. The issue wasn't the logic. It was this:

// ❌ Claude picks this randomly
{ name: "get_user", description: "Gets a user" }

// ✅ Claude picks this correctly
{ 
  name: "get_user", 
  description: "Fetch a user record by ID. Use when you need profile data, preferences, or subscription status. Returns null if user doesn't exist. Do NOT use for authentication — use verify_session instead."
}

Treat tool descriptions like API documentation for a junior developer. Explicit, with examples of when NOT to use them.
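
One way to keep descriptions honest is a small check that runs over the tool list in CI. What follows is only a sketch under assumptions: `lintToolDescriptions`, the 60-character minimum, and the "use when" / "do not use" phrase checks are hypothetical conventions, not part of the MCP SDK, and you would tune them to your own tools.

```typescript
// Hypothetical lint for tool descriptions. The thresholds and required
// phrases are illustrative assumptions, not anything MCP mandates.
interface ToolDef {
  name: string;
  description: string;
}

interface LintIssue {
  tool: string;
  problem: string;
}

const MIN_DESCRIPTION_LENGTH = 60; // arbitrary cutoff, adjust to taste

function lintToolDescriptions(tools: ToolDef[]): LintIssue[] {
  const issues: LintIssue[] = [];
  for (const tool of tools) {
    const text = tool.description.toLowerCase();
    if (tool.description.length < MIN_DESCRIPTION_LENGTH) {
      issues.push({ tool: tool.name, problem: "description is too short" });
    }
    if (!text.includes("use when")) {
      issues.push({ tool: tool.name, problem: "missing 'Use when' guidance" });
    }
    if (!text.includes("do not use")) {
      issues.push({ tool: tool.name, problem: "missing 'Do NOT use' guidance" });
    }
  }
  return issues;
}
```

Run against the two `get_user` examples above, the short description trips all three checks and the long one passes cleanly. A phrase check is crude, but it is cheap enough to run on every commit.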

Lesson 2: Schema Verbosity Has a Real Cost

Tool definitions, including each tool's full JSON schema, are sent to the model as part of the context on every request. At 1,000 calls/day with a 400-token schema, that's 400,000 tokens of overhead daily.
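
The arithmetic generalizes to a server with several tools, since every schema rides along on each request. A tiny sketch, where `dailySchemaOverhead` is a hypothetical helper and the token counts are whatever you measure for your own schemas:

```typescript
// Rough daily context overhead from tool schemas, in tokens.
// schemaTokens holds one measured token count per registered tool.
function dailySchemaOverhead(requestsPerDay: number, schemaTokens: number[]): number {
  const tokensPerRequest = schemaTokens.reduce((sum, tokens) => sum + tokens, 0);
  return requestsPerDay * tokensPerRequest;
}

// The figure from the text: 1,000 calls/day with one 400-token schema.
const overhead = dailySchemaOverhead(1000, [400]); // 400000
```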

Minimize schemas:

// ❌ 120 tokens
inputSchema: {
  type: "object",
  properties: {
    userId: { 
      type: "string",
      description: "The unique identifier for the user, formatted as a UUID v4 string"
    },
    includeInactive: {
      type: "boolean", 
      description: "Whether to include inactive users in results, defaults to false"
    }
  }
}

// ✅ 35 tokens — Claude understands this fine
inputSchema: {
  type: "object",
  properties: {
    userId: { type: "string" },
    includeInactive: { type: "boolean" }
  },
  required: ["userId"]
}
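
If you would rather keep verbose schemas in source and trim them at registration time, a recursive strip is one option. This is a sketch, not an SDK feature: `stripDescriptions` is a hypothetical helper, and it only removes `description` keys whose value is a string, so a property that happens to be named `description` is left alone.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Returns a copy of the schema without per-field description strings.
function stripDescriptions(schema: JsonValue): JsonValue {
  if (Array.isArray(schema)) {
    return schema.map(stripDescriptions);
  }
  if (schema !== null && typeof schema === "object") {
    const out: { [key: string]: JsonValue } = {};
    for (const [key, value] of Object.entries(schema)) {
      // Skip annotation strings; keep a property literally named
      // "description", whose value is a schema object rather than a string.
      if (key === "description" && typeof value === "string") continue;
      out[key] = stripDescriptions(value);
    }
    return out;
  }
  return schema;
}

const verbose: JsonValue = {
  type: "object",
  properties: {
    userId: { type: "string", description: "The unique identifier for the user" },
    includeInactive: { type: "boolean", description: "Whether to include inactive users" }
  },
  required: ["userId"]
};

const minimal = stripDescriptions(verbose);
```

This only touches the descriptions inside `inputSchema`. The tool-level description from Lesson 1 lives outside the schema and is unaffected.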

Lesson 3: Error Messages Are Prompts

When your tool returns an error, Claude reads it and decides what to do next. Your error message IS a prompt.

// ❌ Claude retries randomly
throw new Error("Database error");

// ✅ Claude knows exactly what to do
throw new Error(
  "Rate limit exceeded on user API (429). " +
  "Wait 60 seconds before retrying. " + 
  "If this is urgent, use get_user_cached for a potentially stale result."
);

Structured, actionable errors reduce retry loops by ~60% in my testing.
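
A related detail: the MCP spec lets a tool report a failure by returning a result with `isError: true` and the message in `content`, rather than raising a protocol-level error. A minimal sketch of a builder for such results follows. `toolError` and `ErrorDetails` are hypothetical names, and the local `ToolResult` type is a simplified stand-in for the SDK's result type.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

// Simplified stand-in for an MCP tool result.
interface ToolResult {
  content: TextContent[];
  isError?: boolean;
}

interface ErrorDetails {
  what: string;               // what went wrong
  retryAfterSeconds?: number; // when a retry makes sense
  alternative?: string;       // what to do instead
}

// Builds an error result whose text tells the model what to do next.
function toolError(details: ErrorDetails): ToolResult {
  const parts = [details.what];
  if (details.retryAfterSeconds !== undefined) {
    parts.push(`Wait ${details.retryAfterSeconds} seconds before retrying.`);
  }
  if (details.alternative !== undefined) {
    parts.push(details.alternative);
  }
  return { isError: true, content: [{ type: "text", text: parts.join(" ") }] };
}

const rateLimited = toolError({
  what: "Rate limit exceeded on user API (429).",
  retryAfterSeconds: 60,
  alternative: "If this is urgent, use get_user_cached for a potentially stale result."
});
```

Forcing every error through one builder makes it harder to ship a bare "Database error" by accident.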

Lesson 4: Stateless Tools Only

The hardest bug I hit: my MCP server maintained session state across tool calls. It worked great in testing. In production, with concurrent Claude sessions, tools were reading each other's state.

Rule: every tool call must be fully stateless. Read from a database, write to a database, return the result. No in-memory state between calls.

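// Assumes `server` is an McpServer (from "@modelcontextprotocol/sdk/server/mcp.js"),
// `z` is zod, and `db` / `User` come from your own data layer.

// ❌ State lives in the server
let currentUser: User | null = null;

server.tool("set_user", { id: z.string() }, async ({ id }) => {
  currentUser = await db.getUser(id); // Shared across sessions!
  return { content: [{ type: "text", text: "ok" }] };
});

// ✅ State lives in the call
server.tool(
  "get_user_data",
  { userId: z.string(), dataType: z.string() },
  async ({ userId, dataType }) => {
    const user = await db.getUser(userId); // Isolated per call
    if (!user) {
      return { isError: true, content: [{ type: "text", text: `No user with ID ${userId}` }] };
    }
    // Tool results are content blocks, so serialize the requested field
    return { content: [{ type: "text", text: JSON.stringify(user[dataType] ?? null) }] };
  }
);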
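
To see why the shared variable bites, it helps to replay the interleaving by hand. The sketch below is a synchronous simulation with made-up data (`users`, `alice`, and `bob` are all hypothetical), not the production code: two sessions each "select" a user, and the first one then reads back the wrong plan.

```typescript
interface User {
  id: string;
  plan: string;
}

// Made-up records standing in for a real database.
const users: Record<string, User> = {
  alice: { id: "alice", plan: "pro" },
  bob: { id: "bob", plan: "free" }
};

// Stateful version: one variable shared by every session in the process.
let currentUser: User | null = null;

function setUser(id: string): void {
  currentUser = users[id] ?? null;
}

function getPlanStateful(): string | null {
  return currentUser ? currentUser.plan : null;
}

// Stateless version: everything the call needs arrives as an argument.
function getPlanStateless(userId: string): string | null {
  const user = users[userId];
  return user ? user.plan : null;
}

// Two sessions interleave their calls against the same server process.
setUser("alice");                             // session A selects alice
setUser("bob");                               // session B selects bob
const seenByA = getPlanStateful();            // session A gets "free" (bob's plan)
const statelessA = getPlanStateless("alice"); // "pro", whatever session B does
```

With a single test session, the second `setUser` never happens, which is why this kind of bug can stay hidden until real concurrency shows up.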

Lesson 5: Log Everything Claude Sends You

Claude's tool calls will surprise you. Log every input, every time.

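import { CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  // Log to stderr: with the stdio transport, stdout carries the protocol
  // messages, so console.log would corrupt the stream.
  console.error(JSON.stringify({
    tool: request.params.name,
    args: request.params.arguments,
    timestamp: Date.now()
  }));
  // ... handler logic
});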

After two weeks of logs, I found Claude was calling my delete_item tool when it meant to call archive_item. The descriptions were too similar. Logs caught it before it hit production data.
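
If you have more than one handler, the logging can be pulled into a wrapper so no call path skips it. This is a sketch under assumptions: `withCallLogging`, `formatCallLog`, and the simplified `ToolCallRequest` shape are hypothetical, and the default sink writes to stderr because a stdio server uses stdout for protocol messages.

```typescript
// Simplified stand-in for the request an MCP call handler receives.
interface ToolCallRequest {
  params: { name: string; arguments?: Record<string, unknown> };
}

type ToolHandler<R> = (request: ToolCallRequest) => Promise<R>;
type LogSink = (line: string) => void;

// Formats one JSON log line for a tool call.
function formatCallLog(request: ToolCallRequest, ok: boolean, timestamp: number): string {
  return JSON.stringify({
    tool: request.params.name,
    args: request.params.arguments ?? {},
    ok,
    timestamp
  });
}

// Default sink: stderr, so logs stay out of the stdout protocol stream.
const stderrSink: LogSink = (line) => console.error(line);

// Wraps a handler so every call is logged, whether it succeeds or throws.
function withCallLogging<R>(
  handler: ToolHandler<R>,
  sink: LogSink = stderrSink
): ToolHandler<R> {
  return async (request) => {
    try {
      const result = await handler(request);
      sink(formatCallLog(request, true, Date.now()));
      return result;
    } catch (err) {
      sink(formatCallLog(request, false, Date.now()));
      throw err; // rethrow so the caller still sees the failure
    }
  };
}
```

Passing the sink as a parameter keeps the wrapper testable and makes it easy to swap stderr for a log shipper later.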

The One Thing Worth Getting Right First

Before schema design, before error handling — get your tool naming right. Tool names are permanent in production (renaming breaks existing prompts and agent workflows).

Use verb-noun pairs. Be specific. Never abbreviate.

✅ get_invoice_by_id
✅ create_draft_post  
✅ mark_task_complete
❌ getData
❌ process
❌ handle_thing

Invest 20 minutes naming things correctly at the start. You'll save 20 hours of confusion later.
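
The naming rule is mechanical enough to check in code. Below is a sketch of a validator, where `checkToolName` and the `APPROVED_VERBS` list are hypothetical and the verb list is an assumption you would replace with your own. It cannot detect abbreviations, so that part still needs a human reviewer.

```typescript
// Verbs this hypothetical project allows at the start of a tool name.
const APPROVED_VERBS = new Set([
  "get", "list", "create", "update", "delete", "archive", "mark", "verify", "search"
]);

// Returns a list of problems; an empty list means the name passes.
function checkToolName(name: string): string[] {
  // Lowercase words joined by underscores, at least two words.
  if (!/^[a-z]+(_[a-z]+)+$/.test(name)) {
    return ["use lowercase snake_case with at least two words"];
  }
  const verb = name.slice(0, name.indexOf("_"));
  if (!APPROVED_VERBS.has(verb)) {
    return [`"${verb}" is not an approved verb`];
  }
  return [];
}

checkToolName("get_invoice_by_id"); // []
checkToolName("getData");           // ["use lowercase snake_case with at least two words"]
checkToolName("handle_thing");      // ['"handle" is not an approved verb']
```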

Running MCP servers at scale for whoffagents.com automations. Full source and patterns at the repo linked in my profile.