Building MCP Servers: What I Learned the Hard Way
I've spent the last few months deep in the Model Context Protocol ecosystem -- building servers, breaking servers, and occasionally shipping servers that actually work. MCP is the protocol that lets AI agents talk to your tools and data, and if you're building anything in the AI tooling space right now, you're going to encounter it. Here's what I wish someone had told me before I started.
The Protocol Is Simpler Than You Think (At First)
MCP uses JSON-RPC 2.0 messages. Three message types: requests (expect a response), responses (reply to requests), and notifications (fire-and-forget). Servers expose three core capabilities to clients: Tools (functions the AI can call), Resources (data the AI can read), and Prompts (templated workflows). That's the mental model. Everything else is transport and auth layered on top.
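Concretely, one tool-call exchange looks like this on the wire (a minimal sketch of the JSON-RPC framing; the tool name and arguments are invented for illustration):

```typescript
// A request/response pair as it travels over stdio, one JSON object per line.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "get_weather", arguments: { city: "Berlin" } },
};

const response = {
  jsonrpc: "2.0",
  id: 1, // must match the request id
  result: { content: [{ type: "text", text: "12°C, overcast" }] },
};

// Notifications carry no id and expect no reply.
const notification = { jsonrpc: "2.0", method: "notifications/initialized" };
```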
The real complexity isn't in understanding the protocol. It's in designing tools that an LLM can actually use well.
Your first instinct will be to take an existing API and expose every endpoint as an MCP tool. Don't. MCP is a User Interface for Agents, not a programmatic API. PagerDuty learned this building their MCP server -- traditional APIs target human developers reading docs, but MCP tools must be self-explanatory for LLMs. You need to think about how an agent reads your tool description, not how a developer reads your Swagger docs.
The stdout Mistake Will Cost You Hours
This is the single most common gotcha, and I hit it on day one. If you're building a stdio transport server -- which you will be, because that's what Claude Desktop and most IDEs use -- your server communicates over stdin/stdout using newline-delimited JSON-RPC. That means any stray console.log() in TypeScript, print() in Python, or println!() in Rust will corrupt the JSON-RPC stream and silently break everything.
// This will ruin your afternoon
console.log("Processing request...");
// This is what you want
console.error("Processing request...");
It sounds trivial. You will still do it. You will especially do it when you're debugging at 2am and you throw in a quick console.log to check a value. The server will just... stop working, with no useful error message.
In TypeScript, alias console.error to a logger. In Python, use the logging module (it writes to stderr by default). In Rust, use tracing with a stderr writer. Don't rely on discipline to avoid stdout -- make it structurally impossible to accidentally write there.
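Here's a minimal sketch of that structural guard in TypeScript (the logger helper and its prefix are mine, not part of the SDK):

```typescript
// All diagnostics go to stderr; stdout is reserved for the JSON-RPC stream.
function makeLogger(prefix: string) {
  return (...args: unknown[]): void => {
    process.stderr.write(`[${prefix}] ${args.map(String).join(" ")}\n`);
  };
}

const log = makeLogger("server");

// Belt and suspenders: redirect console.log so a stray 2am call can't reach stdout.
console.log = (...args: unknown[]) => log(...args);

log("Processing request..."); // lands on stderr; the protocol stream stays clean
```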
Choosing Your SDK
I've built MCP servers in TypeScript, Python, and experimented with Rust. Here's how they compare for different use cases:
| | TypeScript SDK | Python SDK (FastMCP) | Rust SDK (rmcp) |
|---|---|---|---|
| Setup complexity | Low -- npm install, Zod schemas | Lowest -- type hints + docstrings auto-generate schemas | Medium -- Cargo + proc macros + schemars |
| Performance | Good for most cases (event loop handles concurrency) | Good for I/O bound tools, weak for CPU-bound | 4,700+ QPS native, best for high-throughput |
| Ecosystem maturity | Most mature, v1.x stable | FastMCP is clean and ergonomic | v0.14 -- moving fast, some breaking changes |
| Best for | Production servers, Cloudflare Workers | Rapid prototyping, data/ML tools | High-concurrency infra, latency-sensitive |
| Key gotcha | Must build before testing (npm run build) | No print() -- use logging module | println!() corrupts stdio, use tracing to stderr |
For most people starting out, TypeScript is the right choice. The SDK is the most mature, Zod handles schema validation elegantly, and you can deploy to Cloudflare Workers for instant global distribution. Python's FastMCP is the fastest path to a working prototype -- it generates tool definitions from your type hints and docstrings, which is borderline magical.
The Tool Catalog Trap
Here's something that surprised me: each tool definition you expose consumes 400-600 tokens in the LLM's context window. Those tokens are consumed before the user types anything.
GitHub's official MCP server exposes 93 tools. That's roughly 55,000 tokens just for tool definitions -- before any actual work happens. At scale, a DevOps team of five people can pay $375/month just in tool definition overhead.
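The arithmetic is easy to check (a back-of-envelope sketch; 600 tokens is the top of the 400-600 range quoted above):

```typescript
// Rough context-window cost of a large tool catalog.
const toolCount = 93;       // GitHub's official MCP server
const tokensPerTool = 600;  // upper end of the 400-600 token range per definition
const catalogTokens = toolCount * tokensPerTool;
// 93 * 600 = 55,800 -- roughly the ~55,000 tokens cited above,
// spent before the user types a single word.
```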
The solution is a pattern called intent multiplexing: instead of 45 individual tools, you expose one tool with an operation enum. The LLM passes the operation it wants, and your server routes internally.
// Instead of 45 separate tools...
server.registerTool("execute_operation", {
  inputSchema: {
    operation: z.enum([
      "GET_USERS", "CREATE_USER", "DELETE_USER",
      "GET_PROJECTS", "CREATE_PROJECT"
    ]),
    params: z.record(z.unknown())
  }
}, async ({ operation, params }) => {
  const handler = commandRegistry.get(operation);
  if (!handler) {
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          error: `Unknown operation: ${operation}`,
          valid_operations: commandRegistry.listOperations(),
          suggestion: "Use GET_USERS to list available users"
        })
      }]
    };
  }
  // Each handler returns an MCP-shaped result ({ content: [...] })
  return handler.execute(params);
});
Notice the error handling -- it guides the agent toward the right next action. Bad error messages ("Something went wrong") are useless to an LLM. Good error messages tell the agent exactly what went wrong and what to try instead.
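The commandRegistry in the snippet above is just a dispatch table. A minimal sketch (the operation names and handler shape are illustrative, not from any SDK):

```typescript
// Maps operation names to handlers; unknown operations fall through
// to the guided error response shown above.
type Handler = { execute: (params: Record<string, unknown>) => Promise<unknown> };

class CommandRegistry {
  private handlers = new Map<string, Handler>();

  register(operation: string, handler: Handler): void {
    this.handlers.set(operation, handler);
  }

  get(operation: string): Handler | undefined {
    return this.handlers.get(operation);
  }

  listOperations(): string[] {
    return [...this.handlers.keys()];
  }
}

const commandRegistry = new CommandRegistry();
commandRegistry.register("GET_USERS", {
  execute: async () => ({ content: [{ type: "text", text: '["alice","bob"]' }] }),
});
```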
Validate Everything
43% of tested MCP server implementations had command injection flaws. 30% permitted unrestricted URL fetching (SSRF). Anthropic's own Git MCP server had path traversal vulnerabilities. If you're passing user-controlled input anywhere near a shell command, file path, or URL fetch -- stop and validate. Every parameter. Every time.
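For file paths, the cheapest effective guard is resolving against a fixed root and rejecting anything that escapes it -- a sketch (the function name is mine; assumes POSIX-style paths):

```typescript
import * as path from "node:path";

// Resolve a user-supplied path against a fixed root directory;
// throw if the result escapes the root (e.g. via "../" segments).
function safeResolve(root: string, userPath: string): string {
  const base = path.resolve(root);
  const resolved = path.resolve(base, userPath);
  if (resolved !== base && !resolved.startsWith(base + path.sep)) {
    throw new Error(`Path escapes allowed root: ${userPath}`);
  }
  return resolved;
}
```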
OAuth 2.1: The Part Nobody Enjoys
If your MCP server is accessible over the network (Streamable HTTP transport), you need auth. The spec mandates OAuth 2.1 with PKCE for all clients. The catch? Earlier spec revisions made MCP servers both a Resource Server AND an Authorization Server; the current spec lets you delegate authorization to an external server, but you're still on the hook for discovery, protected resource metadata, and token validation.
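PKCE itself is the small part. The client-side half looks like this (a sketch using Node's built-in crypto and the S256 challenge method from RFC 7636):

```typescript
import { createHash, randomBytes } from "node:crypto";

// base64url without padding, as RFC 7636 requires.
const base64url = (buf: Buffer): string =>
  buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// The client keeps the verifier secret and sends only the challenge in the
// authorization request; the verifier is revealed in the token exchange.
const codeVerifier = base64url(randomBytes(32)); // 43-char random string
const codeChallenge = base64url(createHash("sha256").update(codeVerifier).digest());
```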
In practice, nobody does this from scratch. Delegate to an identity provider -- Cloudflare Workers has built-in OAuth support, Auth0 works as an external authorization server, or you put an auth gateway in front of your MCP server. Fighting the auth spec alone is a losing battle.
Build and test your server using stdio transport first. It's simpler, needs no auth, and the MCP Inspector makes debugging fast. Once your tools work correctly, add Streamable HTTP transport and auth for remote deployment. Trying to get auth and tool logic working simultaneously is a recipe for frustration.
Design for the Agent, Not the Developer
Detailed, domain-specific tool descriptions are critical for the model to even notice and use your tools. Clear, intentional naming can flip a tool from ignored to most-used overnight.
The biggest mindset shift in building MCP servers is realizing your user is an LLM, not a human. The tool description is your entire documentation. The parameter names are your API surface. The error messages are your debugging interface.
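A concrete before/after (both tool definitions are invented for illustration):

```typescript
// The model reads nothing but this metadata when deciding whether to call you.
const vague = {
  name: "search",
  description: "Search things",
};

const specific = {
  name: "search_incidents",
  description:
    "Search incidents by status and time range. Returns at most 25 results, " +
    "newest first. Use status='triggered' to find unresolved incidents.",
};
```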
PagerDuty found that limiting to 20-25 tools, designing for user journeys (composite tools that handle multi-step workflows internally), and optimizing for model ergonomics made the difference between an MCP server that agents ignored and one they used effectively.
Builder.io's Blade MCP server -- a design-to-code pipeline -- achieved 75% accuracy on first generation and 3x faster shipping from Figma designs. That's what happens when you design tools around outcomes, not around API endpoints.
What I'd Do Differently
If I were starting over, I'd do three things from day one:
First, I'd write tool descriptions before writing tool implementations. The description is the product. If you can't explain what a tool does in two sentences that an LLM would understand, the tool isn't ready.
Second, I'd test with the MCP Inspector from the very first tool. Run npx @modelcontextprotocol/inspector node build/index.js and interact with your server visually. It catches protocol issues instantly that would take hours to debug through Claude Desktop logs.
Third, I'd build composite tools from the start instead of exposing primitives. Don't make the agent chain three calls when one tool can handle the workflow. The fewer round trips between agent and server, the more reliable the interaction.
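For example, one composite tool can replace a build/upload/link chain of primitives (all three primitives below are hypothetical stubs, included so the sketch is self-contained):

```typescript
// Hypothetical primitives the agent would otherwise have to chain itself.
const runBuild = async (branch: string) => `artifact-${branch}`;
const uploadArtifact = async (artifact: string) => `https://cdn.example/${artifact}`;
const createPreviewLink = async (url: string) => ({ preview: url });

// One tool call handles the whole journey; the agent never sees
// the intermediate steps, so there's nothing for it to get wrong.
async function deployPreview(branch: string) {
  const artifact = await runBuild(branch);
  const url = await uploadArtifact(artifact);
  return createPreviewLink(url);
}
```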
The MCP ecosystem is moving fast -- the protocol went from initial release to Linux Foundation governance in just over a year, with 97 million monthly SDK downloads. The spec keeps evolving (Tasks and async execution landed in November 2025), and the tooling improves every month. But the fundamentals I've described here -- design for agents, validate everything, keep your tool count lean, and never write to stdout -- those aren't going to change.
Sources
- Lessons Learned Building the PagerDuty MCP Server
PagerDuty Engineering · 2025-12
- MCP Server Gotchas We Learned the Hard Way
CloudQuery · 2025-11
- How Not to Write an MCP Server
Towards Data Science · 2025-12
- Two Essential Patterns for MCP Servers
Shaaf.dev · 2026-01
- MCP Specification (2025-11-25)
Model Context Protocol · 2025-11
- Claude Code 46.9% Token Reduction with Tool Search
Joe Njenga · 2026-01
- MCP Security Vulnerabilities
Practical DevSecOps · 2025-10