AI agents are moving into production in 2026, but when something goes wrong (a tool call fails silently, an LLM takes 13 seconds to respond, token costs spike overnight) teams struggle to diagnose issues across multi-step agentic workflows. In this hands-on tutorial you will build a Model Context Protocol (MCP) server in Python using FastMCP, instrument it with OpenTelemetry following the emerging GenAI and MCP semantic conventions, and visualise end-to-end traces in a local Jaeger instance.
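Before diving into FastMCP and the OpenTelemetry SDK, it helps to see the shape of the data we are after. The sketch below is a toy, stdlib-only tracer (deliberately *not* the OpenTelemetry API) that illustrates the span tree an instrumented agent produces: a conversation span containing a tool-execution span containing an MCP protocol-message span. The span names are illustrative placeholders in the spirit of the GenAI semantic conventions, not exact convention names.

```python
# Toy illustration of a trace's parent/child span hierarchy.
# This is NOT the OpenTelemetry API; it only shows the tree shape
# that the real instrumentation in this tutorial will produce.
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    parent: "Span | None" = None
    children: list = field(default_factory=list)
    start: float = 0.0
    end: float = 0.0

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000


class ToyTracer:
    def __init__(self) -> None:
        self.root: Span | None = None
        self._current: Span | None = None

    @contextmanager
    def span(self, name: str):
        s = Span(name, parent=self._current, start=time.perf_counter())
        if self._current is None:
            self.root = s          # first span becomes the trace root
        else:
            self._current.children.append(s)
        self._current = s
        try:
            yield s
        finally:
            s.end = time.perf_counter()
            self._current = s.parent  # pop back to the enclosing span


tracer = ToyTracer()
# Hypothetical workflow: one conversation, one tool call, one MCP message.
with tracer.span("invoke_agent weather-assistant"):       # conversation
    with tracer.span("execute_tool get_forecast"):        # tool execution
        with tracer.span("mcp.tools/call get_forecast"):  # protocol message
            time.sleep(0.01)                              # simulated work

conversation = tracer.root
tool = conversation.children[0]
mcp_msg = tool.children[0]
print(f"{conversation.name} -> {tool.name} -> {mcp_msg.name}")
```

Each child span's duration is nested inside its parent's, which is exactly what lets a trace viewer show where time went in a multi-step workflow.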
You will learn how distributed tracing captures the hierarchical relationship between agent conversations, tool executions and MCP protocol messages, and how to use that visibility for debugging, cost analysis and performance optimisation (including picking the right model and checking whether you’re drowning in serialisation overhead). You will leave with a fully instrumented MCP server, a Docker Compose observability stack and the knowledge to bring production-grade observability to your own agentic AI systems.
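As a preview of the observability stack, here is one possible minimal Docker Compose file for the local Jaeger instance. It uses the Jaeger all-in-one image; the pinned tag is an assumption (pick a current release), and `COLLECTOR_OTLP_ENABLED` is explicit for older images where OTLP ingestion is not on by default.

```yaml
# docker-compose.yml -- minimal sketch of a local Jaeger backend.
# Image tag is an assumption; check Docker Hub for the current release.
services:
  jaeger:
    image: jaegertracing/all-in-one:1.57
    environment:
      - COLLECTOR_OTLP_ENABLED=true   # accept OTLP traces from our MCP server
    ports:
      - "16686:16686"  # Jaeger web UI
      - "4317:4317"    # OTLP over gRPC
      - "4318:4318"    # OTLP over HTTP
```

With this running (`docker compose up -d`), an OTLP exporter pointed at `http://localhost:4318` will land traces in the Jaeger UI at `http://localhost:16686`.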