Distributed tracing across stdio MCP: same trace_id on CrewAI client and FastMCP server (SEP-414 + OpenTelemetry + Jaeger)

I put together a short walkthrough of something that tripped me up when building agentic workflows: MCP over stdio is two processes, so your usual “single-app” tracing story breaks unless you propagate W3C context explicitly.

Problem: A CrewAI agent calls MCP tools (get_order, check_inventory, …) that run in a child process, reached over a stdio pipe. Logs show that something failed; they don’t show which LLM round triggered which tool, or whether the latency sits in the model or in a specific tools/call.

Approach: Use OpenTelemetry with the MCP semantic conventions and carry SEP-414 trace context in params._meta, so client spans (MCP request: tools/call …) and server spans (MCP server handle request: tools/call) share the same trace_id even though the transport is stdio rather than HTTP.
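To make the propagation step concrete, here is a minimal stdlib-only sketch (not the code from the repo). It assumes the trace context travels as a W3C traceparent string under a "traceparent" key in params._meta; that key name and the helper names (new_traceparent, build_tool_call, handle_request) are my assumptions for illustration.

```python
import json
import secrets


def new_traceparent(trace_id=None):
    """Build a W3C traceparent: version-trace_id-span_id-flags."""
    trace_id = trace_id or secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)                # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"


def parse_traceparent(value):
    """Split a traceparent into its fields and check the field lengths."""
    version, trace_id, span_id, flags = value.split("-")
    if len(trace_id) != 32 or len(span_id) != 16:
        raise ValueError(f"malformed traceparent: {value!r}")
    return {"trace_id": trace_id, "parent_span_id": span_id, "flags": flags}


def build_tool_call(name, arguments, traceparent, request_id=1):
    """Client side: serialize a tools/call request with context in _meta.

    The "traceparent" key under _meta is an assumption for this sketch.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": name,
            "arguments": arguments,
            "_meta": {"traceparent": traceparent},
        },
    })


def handle_request(line):
    """Server side: read one stdio line and continue the caller's trace."""
    request = json.loads(line)
    meta = request["params"].get("_meta", {})
    parent = parse_traceparent(meta["traceparent"])
    # The server span gets a new span_id but keeps the client's trace_id.
    return {
        "trace_id": parent["trace_id"],
        "parent_span_id": parent["parent_span_id"],
        "server_traceparent": new_traceparent(parent["trace_id"]),
    }


client_tp = new_traceparent()
line = build_tool_call("check_inventory", {"order_id": 1842}, client_tp)
ctx = handle_request(line)
print(ctx["trace_id"] == client_tp.split("-")[1])
```

With the OpenTelemetry Python API you would not format the header by hand: opentelemetry.propagate.inject and extract read and write the same field on a plain dict carrier, which is why a _meta object works as a carrier just like HTTP headers do.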

Stack (all local, reproducible):

CrewAI agent + Ollama (llama3.2)
FastMCP incident server (synthetic slow/failing inventory for order #1842)
OTLP → Jaeger
One-command demo: ./scripts/demo.sh

What you see in Jaeger: crewai.workflow → per-round .llm spans (with gen_ai.input.messages / output when enabled) → MCP client/server spans in one waterfall. The “money shot” is opening check_inventory and reading args + backorder error on the same trace as the agent’s LLM spans.

Video (12 min, architecture + live demo):
https://www.youtube.com/watch?v=qCHK4QlPXh8

Code (MIT):
https://github.com/ekb-dev-ai/mcp-trace-demo

Fast path without Ollama: ./scripts/quick_trace_demo.sh (~5s, MCP + Jaeger only).

Happy to hear how others are handling OTel for MCP—especially HTTP vs stdio and whether you’re standardizing on _meta vs custom headers.

u/Fabulous-Art4440 — 3 days ago

▲ 3 r/LLMAgenticLearning+2 crossposts

MCP is quietly becoming the “service mesh” layer for AI agents

Been digging deeper into MCP lately and it feels like the ecosystem is finally moving from “cool demo” territory into actual production engineering.

The most interesting shift IMO is around observability + SDK integration patterns.

A few things that stood out recently:

OpenTelemetry now has draft semantic conventions specifically for MCP traffic (mcp.client.operation.duration, session spans, transport-level tracing, etc.). That’s a pretty big signal that people are starting to treat MCP infra like real distributed systems instead of toy agent wrappers. (OpenTelemetry)
The Anthropic SDK integration story is getting much cleaner too. Their SDK now supports embedding MCP servers directly into app workflows instead of only external processes/SSE setups. Makes local tool orchestration way less painful. (Claude API Docs)
One production pattern I keep seeing: agent → MCP gateway/proxy → downstream MCP servers instead of direct client/server coupling. Feels very similar to API gateway evolution in microservices. People are layering auth, tracing, retries, session management, policy enforcement, and analytics into that middle layer. (Glama)
Another underrated issue: session lifecycle differences between clients. Some devs noticed Claude reuses MCP sessions while ChatGPT may create new sessions per tool call depending on implementation. That becomes a nightmare if your tools assume statefulness. Observability is basically the only way to even notice this. (Reddit)
Also seeing more discussion around “tool poisoning” + prompt injection at the protocol layer itself, not just model prompting. Security researchers are finally treating MCP as infrastructure with actual attack surfaces. (arXiv)
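The session lifecycle point is easy to check once tool-call spans carry a session identifier. A rough sketch, assuming spans have been exported as dicts with hypothetical "client", "session_id", and "tool" fields (these names are mine, not from any exporter):

```python
from collections import defaultdict


def session_usage(spans):
    """Map each client to {session_id: number of tool calls}."""
    usage = defaultdict(lambda: defaultdict(int))
    for span in spans:
        usage[span["client"]][span["session_id"]] += 1
    return {client: dict(sessions) for client, sessions in usage.items()}


def clients_with_session_per_call(spans):
    """Flag clients whose every tool call arrived on a different session."""
    flagged = []
    for client, sessions in session_usage(spans).items():
        calls = sum(sessions.values())
        if calls > 1 and len(sessions) == calls:
            flagged.append(client)
    return sorted(flagged)


# Hypothetical span records; field names and values are illustrative only.
SPANS = [
    {"client": "client-a", "session_id": "s1", "tool": "get_order"},
    {"client": "client-a", "session_id": "s1", "tool": "check_inventory"},
    {"client": "client-b", "session_id": "s2", "tool": "get_order"},
    {"client": "client-b", "session_id": "s3", "tool": "check_inventory"},
]

print(clients_with_session_per_call(SPANS))
```

A client that shows up in that list is one where stateful tools will silently lose their state between calls.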

Honestly feels like we’re replaying the early Kubernetes/service-mesh era:
first everyone ships agents,
then everyone realizes they need tracing, policies, gateways, metrics, governance, and debugging 😅

The observability side is where things get really interesting for me. Once you can trace:

which tool was selected
latency per tool
token usage per tool chain
retries/failures
context propagation across agents
hallucinated tool calls
schema drift

…you stop building “AI demos” and start building actual systems.
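A few of the items above (latency per tool, retries/failures, hallucinated tool calls) fall out of a simple aggregation over exported spans. A minimal sketch, assuming each span is a dict with hypothetical "tool", "duration_ms", and "error" fields, and that you know which tools the server actually registered:

```python
from collections import defaultdict
from statistics import mean


def summarize_tool_spans(spans):
    """Per-tool call count, failure count, and mean/max latency."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["tool"]].append(span)
    summary = {}
    for tool, items in grouped.items():
        durations = [s["duration_ms"] for s in items]
        summary[tool] = {
            "calls": len(items),
            "failures": sum(1 for s in items if s["error"]),
            "mean_ms": mean(durations),
            "max_ms": max(durations),
        }
    return summary


def hallucinated_calls(spans, known_tools):
    """Tool names the model asked for that the server never registered."""
    return [s["tool"] for s in spans if s["tool"] not in known_tools]


# Hypothetical span records; the numbers are made up for the example.
SPANS = [
    {"tool": "get_order", "duration_ms": 40, "error": False},
    {"tool": "check_inventory", "duration_ms": 3000, "error": True},
    {"tool": "check_inventory", "duration_ms": 3200, "error": True},
    {"tool": "lookup_customer", "duration_ms": 5, "error": True},
]

summary = summarize_tool_spans(SPANS)
print(summary["check_inventory"])
print(hallucinated_calls(SPANS, {"get_order", "check_inventory"}))
```

Repeated failing calls to the same tool within one trace are a reasonable proxy for retries, though a real setup would tag retries explicitly.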

u/Fabulous-Art4440 — 6 days ago

▲ 2 r/ContextEngineering

Context Engineering Explained: What Actually Goes Into an LLM’s Context Window

System prompts, RAG, tool results, and memory — how to design context for agents.

youtu.be

u/Fabulous-Art4440 — 8 days ago

▲ 2 r/LLMAgenticLearning+1 crossposts

MCP in action: local agents calling official MCP tools with Ollama — video + code

I put together a hands-on walkthrough of Model Context Protocol (MCP) with CrewAI and a local Ollama model—no paid API required for the core demos.

Video: https://www.youtube.com/watch?v=zLYx3YnPkZo

Code (GitHub): https://github.com/ekb-dev-ai/mcp-demo

What the video covers

What MCP is in practice (client ↔ server, tools over stdio / HTTP)
Running official MCP servers: filesystem, git, fetch, memory, time, Playwright, Context7
Attaching those tools to a CrewAI agent and letting the model call them (ReAct-style)
A gotcha I hit with Playwright: a second tool call can spawn a new MCP subprocess and reset the browser—plus a thin adapter server pattern that fixes it
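The Playwright gotcha comes down to backend lifetime: if every tool call starts a fresh backend, state from the previous call is gone. The sketch below illustrates the idea with a stand-in class instead of a real browser or MCP subprocess; FakeBrowser and PersistentAdapter are hypothetical names, and this is not the adapter server from the repo.

```python
class FakeBrowser:
    """Stand-in for a stateful backend such as a browser session."""

    def __init__(self):
        self.url = None

    def handle(self, tool, args):
        if tool == "navigate":
            self.url = args["url"]
            return f"opened {self.url}"
        if tool == "current_url":
            return self.url
        raise ValueError(f"unknown tool: {tool}")


def call_with_fresh_backend(tool, args):
    """The failure mode: a new backend per call, so state never survives."""
    return FakeBrowser().handle(tool, args)


class PersistentAdapter:
    """Starts the backend once, lazily, and reuses it for every call."""

    def __init__(self, factory):
        self._factory = factory
        self._backend = None
        self.starts = 0

    def call(self, tool, args):
        if self._backend is None:
            self._backend = self._factory()
            self.starts += 1
        return self._backend.handle(tool, args)


# Fresh backend per call: the second call has forgotten the navigation.
call_with_fresh_backend("navigate", {"url": "https://example.com"})
print(call_with_fresh_backend("current_url", {}))

# Persistent adapter: the second call sees the page opened by the first.
adapter = PersistentAdapter(FakeBrowser)
adapter.call("navigate", {"url": "https://example.com"})
print(adapter.call("current_url", {}))
```

In the real case the factory would start the MCP subprocess and the adapter would hold its session open for the lifetime of the agent run.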

Stack

Python + Poetry, mcp Python SDK, CrewAI, Ollama (llama3.2 in the README), Node/npx for several reference servers.

Who it’s for

Beginners who’ve heard “MCP” but want to see end-to-end wiring, or anyone building local agent tooling.

Happy to answer setup questions in the comments (Poetry, Ollama model choice, sandbox paths, etc.).

What are you using MCP with—Claude Desktop, Cursor, CrewAI, or custom Python?

u/Fabulous-Art4440 — 8 days ago