Advanced Architectures for Vertical AI Agents

Lesson 69: Agent Observability & Logging

May 20, 2026

Highlights

What we build

An ObservableAgent wrapping Gemini 2.0 Flash, in which every execution step (prompt construction, LLM call, post-processing) is a named OpenTelemetry (OTEL) span with semantic attributes
A SQLiteSpanExporter that persists spans to a WAL-mode SQLite database, removing the need for an external collector during development
A metric_snapshots table that records per-trace latency and cost against configurable thresholds, the direct feed for L70’s alerting system
A FastAPI service exposing trace list, span waterfall, and percentile metrics over REST and WebSocket
A React dashboard with a Jaeger-style span waterfall, real-time cost tracker, and p50/p95/p99 latency display
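
To make the storage and percentile pieces concrete, here is a minimal standard-library sketch of a spans table in a WAL-mode SQLite file, with p50/p95/p99 latency computed from it. It is not the lesson's SQLiteSpanExporter and does not use the OpenTelemetry SDK. The column names, the helper names (open_span_db, record_span, latency_percentiles), and the span name 'llm_call' are all assumptions made for illustration.

```python
import sqlite3
import statistics

# Hypothetical schema: column names are illustrative assumptions,
# not the lesson's actual DDL.
SPANS_DDL = """
CREATE TABLE IF NOT EXISTS spans (
    span_id   TEXT PRIMARY KEY,
    trace_id  TEXT NOT NULL,
    parent_id TEXT,              -- NULL for the root span of a trace
    name      TEXT NOT NULL,     -- e.g. 'prompt_construction', 'llm_call'
    start_ns  INTEGER NOT NULL,
    end_ns    INTEGER NOT NULL
)
"""


def open_span_db(path):
    """Open (or create) the database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect
    # ('wal' for an on-disk database).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(SPANS_DDL)
    conn.commit()
    return conn, mode


def record_span(conn, span_id, trace_id, parent_id, name, start_ns, end_ns):
    """Persist one finished span; timestamps are nanoseconds since the epoch."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO spans VALUES (?, ?, ?, ?, ?, ?)",
            (span_id, trace_id, parent_id, name, start_ns, end_ns),
        )


def latency_percentiles(conn, span_name):
    """Return p50/p95/p99 latency in milliseconds for spans with the given name."""
    rows = conn.execute(
        "SELECT (end_ns - start_ns) / 1e6 FROM spans WHERE name = ?",
        (span_name,),
    ).fetchall()
    durations_ms = [row[0] for row in rows]
    if len(durations_ms) < 2:
        raise ValueError("need at least two spans to compute percentiles")
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Computing percentiles in Python rather than in SQL keeps the query trivial and is adequate for development-sized tables. A service handling larger volumes would likely precompute or sample these values.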

Connection to L68
L68 introduced the drift_snapshots table and feature snapshot API. L69 extends the same SQLite WAL database — adding traces, spans, and metric_snapshots tables — so a single query can correlate agent performance degradation (drift signals from L68) with the specific LLM calls that caused it (spans from L69).
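
The kind of single query described above might look like the following sketch. The table names drift_snapshots and spans come from the lessons. Every column name, the time-window join, the span name 'llm_call', and the helper drifted_llm_calls are assumptions made for illustration, since the actual schemas are not shown here.

```python
import sqlite3

# Hypothetical schemas: only the table names come from the lessons.
SCHEMA = """
CREATE TABLE drift_snapshots (
    id              INTEGER PRIMARY KEY,
    feature         TEXT NOT NULL,
    drift_score     REAL NOT NULL,
    window_start_ns INTEGER NOT NULL,
    window_end_ns   INTEGER NOT NULL
);
CREATE TABLE spans (
    span_id  TEXT PRIMARY KEY,
    trace_id TEXT NOT NULL,
    name     TEXT NOT NULL,
    start_ns INTEGER NOT NULL,
    end_ns   INTEGER NOT NULL
);
"""

# Join each drifted window to the LLM-call spans that started inside it.
CORRELATE_SQL = """
SELECT d.feature,
       d.drift_score,
       s.trace_id,
       (s.end_ns - s.start_ns) / 1e6 AS latency_ms
FROM drift_snapshots AS d
JOIN spans AS s
  ON s.start_ns BETWEEN d.window_start_ns AND d.window_end_ns
WHERE d.drift_score >= ?
  AND s.name = 'llm_call'
ORDER BY latency_ms DESC
"""


def drifted_llm_calls(conn, min_drift_score):
    """Return (feature, drift_score, trace_id, latency_ms) rows for drifted windows."""
    return conn.execute(CORRELATE_SQL, (min_drift_score,)).fetchall()
```

Joining on a time window is one plausible correlation key. If the real tables share an explicit identifier, joining on that would be simpler and cheaper.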

Enables L70
The metric_snapshots table carries threshold, breached, and trace_id columns. L70’s alerting engine will query this table, match breached rows to span attributes, and fire notifications without any new instrumentation work.
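
As a rough sketch of how such a table could be written and read, the example below stores one row per metric per trace and sets breached at insert time. The threshold, breached, and trace_id columns are named in the text. The metric and value columns, the helper names, and the "value greater than threshold" rule are assumptions made for illustration.

```python
import sqlite3

# threshold, breached and trace_id come from the lesson text;
# metric and value are assumed columns added for this sketch.
METRICS_DDL = """
CREATE TABLE IF NOT EXISTS metric_snapshots (
    id        INTEGER PRIMARY KEY,
    trace_id  TEXT NOT NULL,
    metric    TEXT NOT NULL,      -- e.g. 'latency_ms' or 'cost_usd'
    value     REAL NOT NULL,
    threshold REAL NOT NULL,
    breached  INTEGER NOT NULL    -- 1 if value exceeded threshold, else 0
)
"""


def record_metric(conn, trace_id, metric, value, threshold):
    """Store one measurement and return whether it breached its threshold."""
    breached = value > threshold
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO metric_snapshots "
            "(trace_id, metric, value, threshold, breached) "
            "VALUES (?, ?, ?, ?, ?)",
            (trace_id, metric, value, threshold, int(breached)),
        )
    return breached


def breached_snapshots(conn):
    """Return the rows an alerting engine would act on, oldest first."""
    return conn.execute(
        "SELECT trace_id, metric, value, threshold "
        "FROM metric_snapshots WHERE breached = 1 ORDER BY id"
    ).fetchall()
```

Storing breached alongside the threshold that was in force at write time means a later change to the threshold does not silently rewrite history, and the alerting query stays a simple filter.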

Architecture Context

Place in the 90-lesson path
Lessons 61–68 built the MLOps control plane: CI/CD, model versioning, feature stores, continuous training, and drift detection. L69 closes the observability loop by making individual agent invocations inspectable. Without trace-level visibility, the drift signals from L68 are actionable only in aggregate: you know something degraded, but not which prompts and spans were involved or what they cost.

Module 5 alignment
The module requires production-grade operability: the ability to diagnose problems quickly enough to meet enterprise SLAs. OTEL-based tracing is the industry-standard mechanism for this in distributed systems, and L69 adapts its primitives for the unique shape of LLM workloads (variable latency, token-denominated cost, prompt sensitivity).
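
Token-denominated cost can be attached to an LLM-call span as ordinary attributes. The sketch below derives a cost figure from token counts. The two token attribute keys are modeled on OpenTelemetry's GenAI semantic conventions, which are still evolving. The agent.cost_usd key, the TokenRates type, and the helper name are hypothetical, and the rates passed in are placeholders, not real Gemini pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Hypothetical per-model price sheet, in USD per one million tokens."""
    input_usd_per_million: float
    output_usd_per_million: float


def llm_span_attributes(input_tokens, output_tokens, rates):
    """Build the attribute dict to attach to an LLM-call span."""
    cost_usd = (
        input_tokens * rates.input_usd_per_million
        + output_tokens * rates.output_usd_per_million
    ) / 1_000_000
    return {
        # Keys modeled on the OTEL GenAI semantic conventions.
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        # Hypothetical key: cost is an assumed custom attribute.
        "agent.cost_usd": cost_usd,
    }
```

With the OpenTelemetry Python API, a dict like this can be applied to an active span with span.set_attributes(...), which keeps the pricing logic separate from the instrumentation.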


Hands On AI Agent Mastery Course
