Focus
AI tooling for the agentic era
AI tooling is the infrastructure layer that makes agents usable: the protocols, tool design principles, context strategies, memory primitives, observability patterns, and security guardrails that sit between a model and the real world. This is where I spend most of my time — and where most production agent failures originate.
Protocols
Tool calling & MCP
Tool calling is the five-step loop: provide tool definitions → model emits tool calls → application executes → tool output returns → model continues reasoning. MCP (Model Context Protocol) extends this to remote servers: a host connects LLM clients to MCP servers via JSON-RPC 2.0, giving models access to tools, resources, and prompts from any conforming server.
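The five-step loop above can be sketched in a few lines. Everything here is illustrative: `fake_model` stands in for a real LLM API, and the tool definition follows the JSON-Schema shape most vendors use, but no real SDK is assumed.

```python
import json

# Hypothetical tool definition in the JSON-Schema style most vendors use.
TOOLS = [{
    "name": "get_weather",
    "description": "Return current weather for a city. Do not use for forecasts.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def fake_model(messages, tools):
    """Stand-in for a real LLM API call (assumption: not a real SDK)."""
    last = messages[-1]
    if last["role"] == "user":
        # Step 2: the model emits a tool call instead of a final answer.
        return {"tool_call": {"name": "get_weather", "arguments": {"city": "Oslo"}}}
    # Step 5: with the tool result in context, the model continues reasoning.
    temp = json.loads(last["content"])["temp_c"]
    return {"content": f"It is {temp} °C in Oslo."}

def execute_tool(call):
    """Step 3: the application, not the model, runs the tool."""
    if call["name"] == "get_weather":
        return {"temp_c": 7}  # canned result for the sketch
    raise ValueError(f"unknown tool {call['name']}")

def agent_loop(user_msg):
    messages = [{"role": "user", "content": user_msg}]  # step 1: definitions + prompt
    while True:
        reply = fake_model(messages, TOOLS)
        if "tool_call" not in reply:
            return reply["content"]                      # model is done reasoning
        result = execute_tool(reply["tool_call"])        # step 3
        messages.append({"role": "tool",                 # step 4: output returns
                         "content": json.dumps(result)})

print(agent_loop("What's the weather in Oslo?"))
```

The key design point: the model only ever *requests* a call; execution stays on the application side, which is where permissions and sandboxing belong.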
Ergonomics
Tool design as UI for agents
Tools are the agent's interface to the world. Bad tool design degrades reasoning: vague descriptions cause misuse, noisy outputs waste context, overlapping tools create ambiguity. Good tools have a single clear purpose, a precise name that implies what it does, and return only the signal the model needs to decide its next step.
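The difference between a vague and a precise definition is easiest to see side by side. Both definitions below are hypothetical, written for illustration:

```python
# A vague tool invites misuse at call time:
bad = {
    "name": "search",
    "description": "Search for things.",
}

# A precise one says what it does, when to call it, and when NOT to:
good = {
    "name": "search_order_history",
    "description": (
        "Look up a customer's past orders by customer_id. "
        "Call only when the user asks about a previous purchase; "
        "do not use for product catalogue or inventory questions."
    ),
    "parameters": {
        "type": "object",
        "properties": {"customer_id": {"type": "string"}},
        "required": ["customer_id"],
    },
}
```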
Context
Context engineering
What enters the context window at each agent step determines reasoning quality. Context engineering is the practice of shaping those inputs: write durable facts to external memory, select only relevant history via retrieval, compress verbose outputs before they enter context, and isolate tool outputs that shouldn't contaminate the main reasoning trace.
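Two of these strategies, compress and select, can be sketched minimally. Real systems summarise with a cheap model and retrieve via embeddings; truncation and keyword overlap are deliberate simplifications here:

```python
def compress(tool_output: str, limit: int = 400) -> str:
    """Compress: shrink verbose output before it enters context.
    (Production systems summarise with a cheap model; truncation keeps the sketch simple.)"""
    if len(tool_output) <= limit:
        return tool_output
    return tool_output[:limit] + " ...[truncated]"

def select(history: list[dict], query: str, k: int = 3) -> list[dict]:
    """Select: naive keyword retrieval over past turns (stand-in for a vector store)."""
    terms = set(query.lower().split())
    scored = sorted(
        history,
        key=lambda m: len(terms & set(m["content"].lower().split())),
        reverse=True,
    )
    return scored[:k]
```

Write and isolate are policy rather than code: persist durable facts to a store instead of re-injecting them, and keep untrusted tool output out of the main reasoning trace.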
Persistence
Memory & state
Agents without persistent memory restart from scratch on every run. Checkpointing serialises full agent state — tool results, decisions, conversation history — and stores it keyed by thread ID. On failure or resume, the agent reloads the checkpoint and continues. Short-term memory lives in context; long-term memory lives in a retrieval store.
Observability
Traces, evals & guardrails
You cannot debug what you cannot see. Instrument every tool call as a span: inputs, outputs, latency, model decision. Evals are regression tests for agent behaviour — a set of golden runs where you know the expected tool call sequence. Security guardrails address the OWASP LLM Top 10: prompt injection, insecure output handling, and excessive agency are the most common failure modes.
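A sketch of both ideas: a decorator that records each tool call as a span, and a golden-run check over the resulting call sequence. The span schema and in-memory `SPANS` list are stand-ins for a real tracing backend:

```python
import functools
import time

SPANS = []  # in a real system these go to your tracing backend

def traced(tool_fn):
    """Record every tool call as a span: inputs, output, latency."""
    @functools.wraps(tool_fn)
    def wrapper(**kwargs):
        start = time.perf_counter()
        output = tool_fn(**kwargs)
        SPANS.append({
            "tool": tool_fn.__name__,
            "inputs": kwargs,
            "output": output,
            "latency_ms": (time.perf_counter() - start) * 1000,
        })
        return output
    return wrapper

@traced
def get_weather(city: str) -> dict:
    return {"temp_c": 7}  # canned result for the sketch

get_weather(city="Oslo")

# A minimal golden-run eval: the recorded call sequence must match expectations.
golden = ["get_weather"]
assert [s["tool"] for s in SPANS] == golden
```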
Tool design checklist
A good tool passes all six.
Single clear purpose per tool
Name implies what it does
Description covers when NOT to call it
Response contains only decision-relevant signal
Errors return structured, actionable messages
No side-effects beyond the stated action
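The mechanically checkable items on this list can be turned into a rough lint. This is a heuristic sketch only — purpose, signal quality, and side-effects still need human review:

```python
def audit_tool(tool: dict) -> list[str]:
    """Rough lint of a tool definition against the checklist.
    Heuristic sketch: only the mechanically checkable items are covered."""
    issues = []
    name = tool.get("name", "")
    desc = tool.get("description", "").lower()
    if not name.islower() or " " in name:
        issues.append("name should be a short lowercase identifier")
    if len(desc) < 40:
        issues.append("description too thin to guide the model")
    if "not" not in desc and "only" not in desc:
        issues.append("description never says when NOT to call it")
    return issues
```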
Field notes
Posts and projects from building AI tooling in production.
What I Learned Building MCP Servers
Lessons on tool ergonomics, context poisoning, and designing MCP servers that agents actually use reliably.
Symbolic Ontology MCP (project)
A working MCP server for symbolic reasoning — tool design, resource management, and JSON-RPC in production.
Agentic AI Landscape 2026: Toolsmith Field Guide
A protocol-stack view of the agentic ecosystem: what to build on, what to avoid, and where the infrastructure gaps are.
Running AI Locally in 2026
Practical guide to local model inference — the foundation for self-hosted AI tooling stacks.
Common questions
What is MCP and why does it matter?
MCP (Model Context Protocol) is an open protocol for integrating LLM applications with external data and tools. It defines host, client, and server roles and uses JSON-RPC 2.0 for transport. It matters because it standardises how models connect to tools — instead of every team writing custom integrations, conforming servers work with any MCP-compatible client. Think USB-C for AI tooling.
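On the wire, an MCP tool invocation is a JSON-RPC 2.0 request. The method name and result shape below follow the MCP specification; the tool name and payload values are illustrative:

```python
import json

# A tools/call request as an MCP client sends it to a server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",            # illustrative tool name
        "arguments": {"city": "Oslo"},
    },
}

# The server's result wraps tool output as typed content blocks.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": '{"temp_c": 7}'}],
        "isError": False,
    },
}

print(json.dumps(request, indent=2))
```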
What's the difference between tool calling and function calling?
Functionally they're the same loop; the names differ by vendor. Function calling (OpenAI's original term) and tool calling both describe the same five steps: application provides definitions → model emits a call → application executes → result returns → model continues. 'Tools' is now the more general term; it covers built-in tools (web search, code execution), function-calling tools, and remote MCP servers.
What is context engineering?
Context engineering is the practice of deliberately shaping what enters the model's context window at each agent step. The four strategies are: write (persist facts to external memory rather than re-injecting them), select (retrieve only relevant history), compress (summarise verbose tool outputs), and isolate (keep untrusted tool outputs in a separate trace to prevent context poisoning).
How do I know if my agent tool is well designed?
Ask: does the tool name unambiguously describe what it does? Is the description precise enough that the model would call it at the right time and not call it at the wrong time? Does the response include only the signal the model needs for its next decision, or does it dump a wall of data? Can the tool be misused to produce side effects the model didn't intend? If you can't answer these confidently, the tool needs a redesign.
What are the most important security risks for agentic systems?
Per the OWASP LLM Top 10: prompt injection (malicious content in tool outputs hijacking agent behaviour), insecure output handling (agent output executed without sanitisation), and excessive agency (agents granted more permissions than tasks require). The fix for all three is bounded tool surfaces — minimum necessary permissions, explicit allowlists, and treating tool output as untrusted input to subsequent reasoning.
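A sketch of two of these bounds: an explicit allowlist that rejects tools the agent was never granted, and a quarantine wrapper that marks tool output as untrusted before it re-enters the reasoning trace. Names and the tagging scheme are illustrative:

```python
# Explicit allowlist: the agent gets only the tools the task requires.
ALLOWED_TOOLS = {"search_order_history", "get_weather"}  # illustrative names

def guard_tool_call(call: dict) -> dict:
    """Reject any tool the agent was not explicitly granted (excessive-agency guard)."""
    if call["name"] not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {call['name']!r} is not on the allowlist")
    return call

def quarantine(tool_output: str) -> str:
    """Tag tool output as untrusted before it re-enters context
    (mitigates prompt injection via tool results; tagging scheme is illustrative)."""
    return f"<untrusted_tool_output>\n{tool_output}\n</untrusted_tool_output>"
```

Delimiter tagging on its own is not a complete injection defence; it works alongside output sanitisation and human approval for high-impact actions.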
Also in Focus
Agentic Workflows →
Patterns for reliable AI agents: routing, orchestration, state, human-in-the-loop, and evals.
Sacred Technology →
Lineage-aware systems, calm technology, and knowledge infrastructure built to preserve meaning.
Audit your agent's tool surface
Let's talk MCP server design, context engineering, or agent infrastructure.