Observing Agentic AI in Production: MCP Server Tracing with OpenTelemetry and Animal Crossing
AI agents are moving into production in 2026, but when something goes wrong (a tool call fails silently, an LLM takes 13 seconds to respond, token costs spike overnight), teams struggle to diagnose issues across multi-step agentic workflows. In this hands-on tutorial, you will solve a real problem on the island in Animal Crossing by building a Model Context Protocol (MCP) server in Python using FastMCP, instrumenting it with OpenTelemetry following the emerging GenAI and MCP semantic conventions, and visualising end-to-end traces in a local Jaeger instance. Did I mention that events on the island occur in real time and are collected and processed using Apache Kafka?
You will learn how distributed tracing captures the hierarchical relationship between agent conversations, tool executions and MCP protocol messages, and how to use that visibility for debugging, cost analysis and performance optimisation (including picking the right model and checking whether you’re drowning in serialisation overhead). You will leave with a fully instrumented MCP server, a real-time observability stack that runs on Docker Compose, and the knowledge to bring production-grade observability to your own agentic AI systems.
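To make the idea of a span hierarchy concrete, here is a minimal, dependency-free sketch of how an agent conversation, a tool execution and an MCP protocol message could nest as parent and child spans. It is a toy model, not the OpenTelemetry API: `start_span`, `render`, `run_demo`, the agent name `island_helper` and the tool name `get_turnip_price` are all hypothetical, and the span names are only loosely modelled on the emerging GenAI conventions.

```python
import contextvars
import itertools
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional

# Toy stand-in for a tracer. This is NOT the OpenTelemetry API; it only
# illustrates how the "current span" becomes the parent of the next one.
_current_span = contextvars.ContextVar("current_span", default=None)
_ids = itertools.count(1)
FINISHED = []  # spans are appended here when they end


@dataclass
class Span:
    name: str
    span_id: int
    parent_id: Optional[int]
    attributes: dict = field(default_factory=dict)


@contextmanager
def start_span(name, attributes=None):
    """Open a span whose parent is whichever span is currently active."""
    parent = _current_span.get()
    span = Span(name, next(_ids), parent.span_id if parent else None,
                dict(attributes or {}))
    token = _current_span.set(span)
    try:
        yield span
    finally:
        _current_span.reset(token)  # restore the parent as the active span
        FINISHED.append(span)


def render(spans):
    """Return the span tree as indented lines, root first."""
    children = {}
    for s in sorted(spans, key=lambda s: s.span_id):
        children.setdefault(s.parent_id, []).append(s)
    lines = []

    def walk(parent_id, depth):
        for s in children.get(parent_id, []):
            lines.append("  " * depth + s.name)
            walk(s.span_id, depth + 1)

    walk(None, 0)
    return lines


def run_demo():
    """Simulate one agent turn: an LLM call, then a tool call over MCP."""
    FINISHED.clear()
    with start_span("invoke_agent island_helper"):          # agent conversation
        with start_span("chat", {"model": "hypothetical-model"}):
            pass                                            # LLM request
        with start_span("execute_tool get_turnip_price"):   # tool execution
            with start_span("tools/call get_turnip_price"): # MCP message
                pass
    return render(FINISHED)


if __name__ == "__main__":
    print("\n".join(run_demo()))
```

Running the script prints the agent span at the root, with the LLM call and the tool execution one level below it and the MCP message nested under the tool execution. A real tracing SDK propagates the active span in a similar way, and additionally records timings and exports the spans to a backend such as Jaeger.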