What I've Learnt Shipping Them in Production

May 2026 · 2,800 words ·

I've built a lot of MCP servers in the past 18 months. Some were throwaway prototypes. A few are running in production right now, handling real tool calls from real AI clients every day. Along the way I've made most of the mistakes worth making: tools with descriptions that confused the model, servers that leaked credentials into debug traces, a beautiful TypeScript implementation that nobody connected to because I'd picked the wrong transport.

This guide is what I wish I'd had at the start. It covers architecture, all three primitives, Python and TypeScript implementation, security, and deployment. It is opinionated. Where the official docs are neutral, I'm not.

What MCP Actually Is (and Why It Matters Now)

MCP (Model Context Protocol) is the open standard that turns isolated LLMs into agents that can act on the world. Anthropic released it in November 2024. Within eight months, OpenAI, Google DeepMind, Microsoft, and AWS had all adopted it.

The growth numbers tell the story: 2M monthly SDK downloads at launch, 22M when OpenAI adopted in April 2025, 45M when Microsoft joined in July 2025, 97M by March 2026. React took roughly three years to reach 100M. MCP did it in 16 months.

The reason is the N×M problem it solves. Before MCP, connecting 10 AI tools to 10 services required 100 custom integrations. With MCP, you build 10 servers and 10 clients and each works with all the others. That N×M to N+M collapse is why every serious AI team is building MCP servers right now.

The three-part mental model

┌─────────────────────────────────────────────────────────┐
│                      MCP HOST                           │
│          (Claude Desktop, Cursor, your agent)           │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │  MCP CLIENT  │  │  MCP CLIENT  │  │  MCP CLIENT  │  │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  │
└─────────┼─────────────────┼─────────────────┼──────────┘
          │ JSON-RPC 2.0    │                 │
          ▼                 ▼                 ▼
   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
   │ YOUR SERVER │  │   GitHub    │  │   Postgres  │
   │  (content,  │  │   server    │  │   server    │
   │  analytics) │  │             │  │             │
   └─────────────┘  └─────────────┘  └─────────────┘

MCP Host: the AI application (Claude Desktop, Cursor, your custom agent).
MCP Client: the protocol layer inside the host, managing connections to servers.
MCP Server: your service, exposing tools, resources, and prompts to any compatible host.

All communication travels over JSON-RPC 2.0. The host maintains multiple client connections simultaneously, giving the LLM a unified tool surface without caring whether a tool call hits a local file system or a remote cloud API.

Two transport options

stdio is subprocess communication. Use it for local tools, dev environments, and desktop apps. Zero network config, works immediately.

Streamable HTTP is a single HTTP endpoint for bidirectional messaging via SSE. Use it for remote servers, multi-user deployments, and cloud. Note: the older HTTP+SSE transport was deprecated in the June 2025 spec. Use Streamable HTTP for all new projects.

The Three Primitives: The Decision That Matters Most

Every MCP server exposes capabilities through exactly three primitives. Getting this distinction right is the most important architectural decision you will make.

╔══════════════════════╦══════════════════════╦══════════════════════╗
║        TOOLS         ║      RESOURCES       ║       PROMPTS        ║
╠══════════════════════╬══════════════════════╬══════════════════════╣
║ Actions the AI       ║ Data the AI reads    ║ Reusable interaction ║
║ executes             ║                      ║ templates            ║
╠══════════════════════╬══════════════════════╬══════════════════════╣
║ Named functions with ║ Read-only snapshots  ║ Pre-built structured ║
║ JSON schema inputs   ║ identified by URI    ║ message sequences    ║
╠══════════════════════╬══════════════════════╬══════════════════════╣
║ Can read AND write / ║ Loaded once when the ║ User-controlled,     ║
║ mutate state         ║ client needs context ║ surfaced as slash    ║
║                      ║                      ║ commands             ║
╠══════════════════════╬══════════════════════╬══════════════════════╣
║ search_posts         ║ docs://              ║ /summarise-article   ║
║ publish_article      ║ db://schema          ║ /review-content      ║
║ send_email           ║ content://articles   ║ /generate-brief      ║
║ query_db             ║                      ║                      ║
╠══════════════════════╬══════════════════════╬══════════════════════╣
║ Treated as arbitrary ║ Think of them as     ║ Guide the LLM's      ║
║ code execution.      ║ context you          ║ behaviour for        ║
║ Most powerful.       ║ pre-serve the model  ║ domain-specific work ║
╚══════════════════════╩══════════════════════╩══════════════════════╝

The rule I use when designing a server:

If the AI needs to do something, it's a Tool. If the AI needs to know something static, it's a Resource. If you want to give the AI a reusable workflow pattern, it's a Prompt.

Mixing these up is the most common mistake I see in early MCP implementations.

When to Build Your Own Server (and When Not To)

Build a custom MCP server when you have data, actions, or context that no existing server exposes and that would make an LLM dramatically more useful for your domain.

The clearest signals you need one

Signal	What it looks like	MCP pattern
Proprietary data	Internal APIs, databases, CMS content, analytics your LLM cannot reach	Resource + Tool server. Highest ROI.
Repeated context assembly	Your team pastes the same docs or schemas into prompts daily	Resource server eliminates this immediately
AI actions in your system	Create records, publish content, trigger workflows	Tool server
Multiple AI clients	Claude, Cursor, and your own agent all need the same integration	Maximum N+M advantage
Domain-specific workflows	Content briefs, code review patterns, customer query templates	Prompt + Tool combination
Replacing N custom integrations	Three teams built three separate integrations for the same data	One MCP server consolidates all three

When not to build one

A well-maintained public server already exists for your tool. GitHub, Notion, Google Drive, and Slack all have official servers worth checking first.

Your use case is a one-off query with no reuse. Your data is fully public and accessible via web search. You need real-time streaming data (MCP is request/response, not pub/sub).

A Content MCP Server: The Highest-ROI Custom Build

I want to be concrete. A content MCP server is one of the best custom builds I've shipped. It gives any AI client instant access to your articles, briefs, brand voice, SEO data, and publishing workflows. No copy-pasting. No context windows stuffed with stale content dumps.

Here's what it exposes across all three primitives:

CONTENT MCP SERVER
│
├── TOOLS (the AI can act)
│   ├── search_content    ── semantic/keyword search across published + drafts
│   ├── get_article       ── retrieve full article by ID or slug
│   ├── create_draft      ── create new draft with title, outline, or body
│   ├── update_article    ── edit body, SEO, tags, status
│   ├── publish_article   ── trigger the publish workflow
│   └── get_analytics     ── views, conversions, rankings per article
│
├── RESOURCES (the AI can know)
│   ├── content://categories   ── all categories and tags, served as stable context
│   ├── content://brand-voice  ── guidelines, tone of voice, writing rules
│   └── content://templates    ── article templates, brief formats, structures
│
└── PROMPTS (the AI can follow patterns)
    ├── generate_brief     ── topic + keyword → structured content brief
    ├── improve_seo        ── article slug → SEO improvement recommendations
    └── repurpose_article  ── long-form → social posts, newsletter blurb

This server transforms any MCP-compatible AI from a generic assistant into one that understands your content operation: what you've written, what performs, what your brand sounds like, and how to create more.

Building in Python with FastMCP

FastMCP is the standard Python framework for MCP. Its decorator-based API turns your functions into MCP-compliant tools automatically, with schema, validation, and documentation generated from your type hints and docstrings.

FastMCP is downloaded 1 million times a day and powers approximately 70% of MCP servers across all languages. FastMCP 1.0 was incorporated into the official MCP Python SDK.

Installation

# Recommended: use uv for dependency management
uv pip install fastmcp
# or pip
pip install fastmcp

Complete content MCP server in Python

from fastmcp import FastMCP
from typing import Optional
import httpx

mcp = FastMCP('content-server')

# TOOLS ────────────────────────────────────────────────

@mcp.tool()
async def search_content(
    query: str,
    limit: int = 10,
    status: str = 'published'
) -> list[dict]:
    '''Search articles by keyword or semantic query.

    Args:
        query: Search term or phrase
        limit: Max results to return (default 10)
        status: 'published', 'draft', or 'all'
    '''
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            'https://your-cms.com/api/search',
            params={'q': query, 'limit': limit, 'status': status},
            headers={'Authorization': f'Bearer {mcp.config.API_KEY}'}
        )
        return resp.json()['articles']

@mcp.tool()
async def get_article(slug: str) -> dict:
    '''Retrieve full article content and metadata by slug.'''
    async with httpx.AsyncClient() as client:
        resp = await client.get(f'https://your-cms.com/api/articles/{slug}')
        return resp.json()

@mcp.tool()
async def create_draft(
    title: str,
    body: str,
    category: str,
    tags: list[str] = []
) -> dict:
    '''Create a new draft article. Returns the new article ID and slug.'''
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            'https://your-cms.com/api/articles',
            json={
                'title': title,
                'body': body,
                'category': category,
                'tags': tags,
                'status': 'draft'
            }
        )
        return resp.json()

# RESOURCES ────────────────────────────────────────────

@mcp.resource('content://brand-voice')
async def brand_voice() -> str:
    '''Brand guidelines and tone of voice rules.'''
    return open('brand_guidelines.md').read()

@mcp.resource('content://categories')
async def list_categories() -> list[str]:
    '''All available article categories.'''
    async with httpx.AsyncClient() as client:
        resp = await client.get('https://your-cms.com/api/categories')
        return [c['name'] for c in resp.json()]

# PROMPTS ──────────────────────────────────────────────

@mcp.prompt()
def generate_brief(topic: str, target_keyword: str) -> str:
    '''Generate a structured content brief for a given topic.'''
    return f'''Create a detailed content brief for:
Topic: {topic}
Target keyword: {target_keyword}

Include: headline options, outline (H2s + H3s), key points per section,
target word count, internal linking suggestions, meta description draft.'''

# RUN ──────────────────────────────────────────────────

if __name__ == '__main__':
    mcp.run()  # stdio by default
    # use mcp.run('streamable-http') for remote deployment

Building in TypeScript

The TypeScript SDK is the Tier 1 official implementation: 66M+ npm downloads, 27K+ dependent packages. Use it for type-safety, the largest ecosystem, and the best IDE tooling.

Setup

mkdir content-mcp && cd content-mcp
npm init -y
npm install @modelcontextprotocol/sdk zod
npm install -D typescript @types/node tsx

Complete TypeScript server

import { McpServer } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({
  name: 'content-server',
  version: '1.0.0',
  capabilities: { tools: {}, resources: {}, prompts: {} }
});

// TOOL: search content ──────────────────────────────────
server.registerTool(
  'search_content',
  {
    description: 'Search published and draft articles by keyword.',
    inputSchema: z.object({
      query: z.string().describe('Search term or phrase'),
      limit: z.number().default(10).describe('Max results'),
      status: z.enum(['published', 'draft', 'all']).default('published')
    })
  },
  async ({ query, limit, status }) => {
    const res = await fetch(
      `https://your-cms.com/api/search?q=${query}&limit=${limit}&status=${status}`,
      { headers: { Authorization: `Bearer ${process.env.CMS_API_KEY}` } }
    );
    const data = await res.json();
    return { content: [{ type: 'text', text: JSON.stringify(data.articles) }] };
  }
);

// TOOL: create draft ────────────────────────────────────
server.registerTool(
  'create_draft',
  {
    description: 'Create a new draft article in the CMS.',
    inputSchema: z.object({
      title: z.string(),
      body: z.string(),
      category: z.string(),
      tags: z.array(z.string()).default([])
    })
  },
  async (args) => {
    const res = await fetch('https://your-cms.com/api/articles', {
      method: 'POST',
      body: JSON.stringify({ ...args, status: 'draft' })
    });
    const article = await res.json();
    return { content: [{ type: 'text', text: `Draft created: ${article.id}` }] };
  }
);

// RESOURCE: brand voice ─────────────────────────────────
server.registerResource(
  {
    uri: 'content://brand-voice',
    name: 'Brand Voice Guidelines',
    description: 'Tone, style, and writing rules for all content.'
  },
  async () => ({
    contents: [{ uri: 'content://brand-voice', text: 'Your brand guidelines here...' }]
  })
);

// PROMPT: generate brief ────────────────────────────────
server.registerPrompt(
  {
    name: 'generate_brief',
    description: 'Generate a content brief.',
    arguments: [
      { name: 'topic', description: 'The article topic', required: true },
      { name: 'keyword', description: 'Target SEO keyword', required: true }
    ]
  },
  async ({ topic, keyword }) => ({
    messages: [{
      role: 'user',
      content: {
        type: 'text',
        text: `Generate a content brief for topic: ${topic}, keyword: ${keyword}`
      }
    }]
  })
);

// START ─────────────────────────────────────────────────
const transport = new StdioServerTransport();
await server.connect(transport);
console.error('Content MCP server running');

Tool Design: Write for the Model, Not for Humans

The quality of your tool descriptions is as important as the tool logic itself. The LLM decides which tool to call based entirely on the description. This is the lesson most people learn too late.

Good tool design

One tool per action. Narrow scope, clear purpose. If you're tempted to add an action parameter that switches between modes, you have two tools, not one.

Descriptions tell the LLM when to use the tool, not just what it does. "Use this when the user wants to find an article by topic" is more useful than "searches articles." Include examples: "Use this when the user asks to find, search, or look up content."

Return structured data the model can reason about. Don't return raw API responses full of irrelevant fields. Trim to what the model actually needs to continue.

Validate all inputs. Use Zod in TypeScript, Pydantic or type hints in Python. Reject bad input clearly with actionable error messages.

Keep payloads tight. Don't return 10KB when 100 bytes answer the question.

Anti-patterns I've shipped and regretted

# The god tool — don't do this
@mcp.tool()
async def manage_content(action: str, params: dict) -> dict:
    # handles search, create, update, delete, publish, analytics...
    # the model has no idea when to use this

# The vague description — don't do this
server.registerTool('content_tool', {
  description: 'manages articles'  // What action? When?
})

Other anti-patterns worth calling out: silent failures (return errors as content, not exceptions), exposing destructive tools without a confirmation pattern, building one monolithic server with 50 tools instead of focused servers. One MCP host can maintain multiple client connections simultaneously. Use that.

FOCUSED SERVER ARCHITECTURE

┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  content-server  │  │ analytics-server │  │ publishing-server│
│                  │  │                  │  │                  │
│ search_content   │  │ get_metrics      │  │ publish_article  │
│ get_article      │  │ get_rankings     │  │ schedule_post    │
│ create_draft     │  │ get_conversions  │  │ notify_team      │
│ update_article   │  │                  │  │                  │
└──────────────────┘  └──────────────────┘  └──────────────────┘
         │                    │                      │
         └────────────────────┴──────────────────────┘
                              │
                    ┌─────────────────┐
                    │   MCP HOST      │
                    │ (Claude, Cursor) │
                    └─────────────────┘

Security: The OWASP MCP Top 10

MCP servers are a new and significant attack surface. The OWASP MCP Top 10, released in 2025, documents the most critical risks. These are not theoretical. Several incidents have already occurred in production.

OWASP MCP TOP 10 (2025)

  RISK                          MITIGATION
  ─────────────────────────     ──────────────────────────────────────
  1. Credential exposure         Short-lived scoped tokens. Vault for
                                 secrets. Never embed in descriptions.

  2. Tool description poisoning  Audit descriptions on deploy. Alert on
                                 metadata changes. Internal registry.

  3. Prompt injection            Input validation. Prompt shields.
  [GitHub incident, May 2025]    Human approval for sensitive writes.

  4. Excessive privilege         Least privilege. Scope tokens per server.
  [Supabase incident, 2025]      Regular permission audits.

  5. Tool mutation post-install  Version pinning. Signed manifests.
                                 Monitor for definition changes.

  6. Missing auth on remote      OAuth 2.1 + JWT for all remote servers.
     servers                     Never ship Streamable HTTP without auth.

  7. Insufficient input          Zod / Pydantic on all inputs.
     validation                  Treat all MCP input as untrusted.

  8. Verbose error messages      Generic errors in responses.
                                 Detailed logs server-side only.

  9. Unbounded tool calls        Max iteration limits. Rate limiting.
                                 Cost monitoring.

  10. Insecure deserialization   Strict schema validation on all
                                 incoming messages.

Authentication for remote servers

# FastMCP with OAuth 2.1 / JWT validation
from fastmcp import FastMCP
from fastmcp.auth import BearerAuthProvider
import jwt

def verify_token(token: str) -> dict:
    '''Validate JWT and return claims.'''
    return jwt.decode(
        token,
        options={'verify_exp': True},
        algorithms=['RS256'],
        audience='content-mcp-server'
    )

auth = BearerAuthProvider(verify_token=verify_token)
mcp = FastMCP('content-server', auth=auth)

# All tools are now protected.
# Unauthenticated requests return 401.

Deployment Options

Your deployment target depends on latency requirements, user count, and infrastructure preferences.

DEPLOYMENT DECISION TREE

Start here: Who uses this server?
│
├── Just me / one machine
│   └── stdio local
│       How: spawn as subprocess, add to claude_desktop_config.json
│       Scale: local only
│
├── My team (internal)
│   └── Streamable HTTP, self-hosted
│       How: Docker container, cloud provider, behind API gateway
│       Scale: multi-user, scalable
│
├── Many users, low latency needed
│   └── Cloudflare Workers
│       How: native MCP hosting, Durable Objects for session state
│       Scale: globally distributed
│
└── Auto-scaling, pay-per-use
    └── Cloud Run / Lambda
        How: containerise with Docker, deploy via gcloud / AWS
        Scale: serverless auto-scaling

Docker container for Streamable HTTP deployment

# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]

# server.py — Streamable HTTP mode
from fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

mcp = FastMCP('content-server')

# ... register tools, resources, prompts ...

app = Starlette(routes=[
    Mount('/mcp', app=mcp.streamable_http_app()),
])

Connecting to Claude Desktop (stdio)

{
  "mcpServers": {
    "content-server": {
      "command": "python",
      "args": ["/path/to/your/server.py"],
      "env": {
        "CMS_API_KEY": "your-api-key",
        "CMS_BASE_URL": "https://your-cms.com"
      }
    }
  }
}

For remote Streamable HTTP servers:

{
  "mcpServers": {
    "content-server": {
      "url": "https://your-server.com/mcp",
      "headers": { "Authorization": "Bearer your-token" }
    }
  }
}

Testing and Debugging

MCP Inspector: use this before connecting any real AI client

The MCP Inspector is the official debugging tool. It connects to any MCP server and shows you every protocol message, available tools, resources, and prompts.

# Install and run
npx @modelcontextprotocol/inspector

# What you'll see:
# - All registered tools with their full JSON schemas
# - All resources and their URIs
# - All prompts with their parameter schemas
# - Raw JSON-RPC message log for every interaction
# - Ability to call any tool manually and inspect the response

Run this before you connect Claude Desktop or any other host. It saves hours.

Unit testing your tools

import pytest
from server import search_content, create_draft

async def test_search_content():
    results = await search_content(query='machine learning', limit=5)
    assert isinstance(results, list)
    assert len(results) <= 5
    assert all('title' in r for r in results)

async def test_create_draft_validation():
    with pytest.raises(ValueError):
        await create_draft(title='', body='test', category='tech')

Production Readiness Checklist

Before shipping, I check off both columns. The left is the launch gate. The right is the ongoing discipline.

BEFORE SHIPPING                        ONGOING OPERATIONS
──────────────────────────────────     ──────────────────────────────────────
 All tools have Zod/Pydantic           Monitor for unexpected tool call
 input validation                      volumes (runaway agent loops)

 OAuth 2.1 or JWT auth on all          Alert on tool description changes
 remote endpoints                      (poisoning detection)

 Secrets in env vars or vault.         Audit access logs regularly.
 Never in code.                        Who called what, when.

 Rate limiting on the MCP server       Rotate API keys and tokens
                                       on a schedule

 Structured logging around all         Update SDK versions. The protocol
 tool invocations                      spec is still evolving.

 Tested with MCP Inspector.            Review tool permissions quarterly
 All tools return expected shapes.     against least-privilege

 Error messages generic externally,    Watch for OWASP MCP Top 10
 detailed internally                   advisories

 Destructive tools require             Test with new model versions.
 confirmation pattern                  Behaviour can differ.

 Tool descriptions clearly say         Keep resource content fresh.
 WHEN to use each tool                 Stale context degrades quality

 Server versioned. Clients can         Document breaking changes for
 pin to a stable version.              downstream clients

Real-World Use Cases by Domain

Domain	Tools (examples)	Resources / Prompts	What you get
Content and publishing	search_content, create_draft, publish_article, get_analytics	Brand voice, article templates, generate_brief	AI that writes, edits, and publishes in your voice
Developer tooling	run_tests, create_pr, get_build_status, query_logs	Codebase schema, architecture docs, style guide	AI coding assistant with full project context
Customer support	search_kb, create_ticket, lookup_customer, escalate	Product docs, FAQ, escalation policies	Support agent that knows your product
Sales and CRM	search_contacts, create_opportunity, log_activity	ICP definition, objection handling, pricing	AI SDR with real CRM access
Data and analytics	run_query, get_dashboard, export_report	Data dictionary, metric definitions, KPI targets	AI analyst that can actually query your data
E-commerce	search_products, check_inventory, update_pricing	Product catalogue, pricing rules, shipping policies	AI that can answer and act on customer queries
Internal ops	book_meeting, update_jira, get_onboarding_status	Team handbook, processes, org chart	AI ops assistant across your toolchain

SDK and Ecosystem Quick Reference

Tool / SDK	Where	Use it for	Install
FastMCP (Python)	gofastmcp.com	1M downloads/day. Powers ~70% of all MCP servers. Decorator API.	`pip install fastmcp`
@modelcontextprotocol/sdk (TS)	github.com/modelcontextprotocol/typescript-sdk	Tier 1 official SDK. 66M npm downloads. Zod schemas.	`npm install @modelcontextprotocol/sdk zod`
MCP Inspector	github.com/modelcontextprotocol/inspector	Official debugging tool.	`npx @modelcontextprotocol/inspector`
MCP Servers registry	github.com/modelcontextprotocol/servers	200+ community and official servers. Check here first.	Browse
create-mcp-server	npm: @agentailor/create-mcp-server	CLI scaffolding for TypeScript MCP servers.	`npx @agentailor/create-mcp-server`
Cloudflare Agents SDK	developers.cloudflare.com/agents	Native MCP hosting on Cloudflare Workers.	`npm install agents`

Key spec links:

MCP Specification (latest): modelcontextprotocol.io/specification/2025-11-25
OWASP MCP Top 10: owasp.org/www-project-model-context-protocol-security
FastMCP docs: gofastmcp.com

Where to Start

If you've been thinking about building an MCP server and haven't started: the content server pattern above is the fastest path to something genuinely useful. You can have a working stdio server with three tools in under an hour. The security and deployment complexity only matters when you go remote and multi-user.

Start local. Validate the value. Then harden and deploy.

The MCP Inspector before you connect anything. Focused servers over monolithic ones. Tool descriptions written for the model, not for you.

That's the 80% of what I've learnt shipping these in production.

Questions, corrections, or war stories from your own MCP builds: [your contact / social handle]

Related: RAG vs. CAG: The Definitive Guide

Building Your Own MCP Server