Hands On AI Agent Mastery Course

Advanced Architectures for Vertical AI Agents

Lesson 64: Continuous Delivery (CD) & Containerization

May 19, 2026

Highlights

What we build in L64:

  • A production multi-stage Dockerfile that shrinks the VAIA agent image to ~180 MB while keeping all Gemini, ChromaDB, and FastAPI dependencies intact.

  • A Docker Compose stack that runs the full local dev environment (agent + Redis + ChromaDB) with a single command, sharing the same service contracts the CI tests from L63 verify.

  • A GitHub Actions CD workflow that chains L63’s CI jobs → docker build → Trivy CVE scan → registry push → kubectl apply, with hard failure gates at each stage.

  • Kubernetes Deployment, Service, ConfigMap, and HPA manifests that form the serving target for L65’s locust load tests.
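The hard failure gates in the CD workflow can be pictured as a chain that stops at the first failing stage. The sketch below is illustrative Python, not the workflow file itself: the stage names mirror the bullet above, while run_pipeline and the stand-in stage callables are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Tuple

# A stage is a (name, step) pair; the step returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and return the names of those that completed.

    The loop stops at the first stage that reports failure, so no later
    stage (for example a registry push) ever runs after a failed gate.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in callables; a real workflow would invoke the actual tools.
    stages: List[Stage] = [
        ("ci", lambda: True),
        ("docker build", lambda: True),
        ("trivy scan", lambda: False),  # simulate a failed CVE scan
        ("registry push", lambda: True),
        ("kubectl apply", lambda: True),
    ]
    print(run_pipeline(stages))
```

With the simulated scan failure, only the first two stages complete, and neither the push nor the deploy is attempted.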

Connection to L63: The CD pipeline imports ci.yml as a prerequisite job. Docker build only triggers after L63’s pytest suite — tool schema checks, prompt robustness tests, agent logic assertions — passes clean. The same GEMINI_API_KEY and CHROMA_HOST values verified in CI become Kubernetes Secrets consumed by the Deployment.

Enables L65: The containerized pod exposes /health, /metrics (Prometheus), and /v1/agent/infer at a fixed port, giving L65’s FastAPI serving layer a stable, load-testable surface. The HPA stub in k8s/hpa.yaml is intentionally left at a low CPU threshold so L65’s locust traffic immediately triggers scale-out.
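To see why a low CPU target makes scale-out easy to trigger, it helps to work through the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a simplified Python rendering of that formula. It ignores the controller's tolerance band and stabilization behavior, and the utilization figures and replica bounds are hypothetical, not values taken from k8s/hpa.yaml.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float,
    min_replicas: int = 2,   # hypothetical bound
    max_replicas: int = 10,  # hypothetical bound
) -> int:
    """Simplified HPA calculation: ceil(current * observed / target), clamped."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two replicas, as in the Deployment described in this lesson,
    # with an assumed 60% average CPU utilization under load.
    print(desired_replicas(2, 60, 20))  # low target: scales to 6
    print(desired_replicas(2, 60, 70))  # higher target: stays at 2
```

The same observed load triples the replica count under a 20% target and leaves it unchanged under a 70% target, which is why a deliberately low threshold makes autoscaling behavior visible during a load test.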


Architecture Context

L64 sits at the hinge point of Module 5. L61–L63 built the code quality and testing machinery; L64 packages the agent as an immutable, scannable artifact and installs the delivery rails. L65–L67 assume a running, containerized agent and focus entirely on serving characteristics — throughput, latency, and autoscaling.

The architecture follows an immutable-image model: every commit produces a uniquely tagged image (sha-<git_sha>, plus a semantic version on release). No environment-specific code lives inside the image; environment differences are injected at runtime via ConfigMap (non-sensitive env vars) and Secret (API keys). This pattern removes image drift as a source of “works on my machine” failures: the same image, bit for bit, runs locally via Compose, in staging, and in prod.
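The two halves of this model, a per-commit tag and configuration read only at runtime, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: image_tag, Settings, and load_settings are hypothetical names, and the only environment variables used are the GEMINI_API_KEY and CHROMA_HOST mentioned earlier in this lesson.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def image_tag(git_sha: str) -> str:
    """Return the per-commit tag in the sha-<git_sha> form."""
    return f"sha-{git_sha}"


@dataclass(frozen=True)
class Settings:
    """Runtime configuration; nothing here is baked into the image."""
    gemini_api_key: str  # expected to come from a Secret
    chroma_host: str     # expected to come from injected env vars


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required settings from the environment, failing fast if absent."""
    required = ("GEMINI_API_KEY", "CHROMA_HOST")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        chroma_host=env["CHROMA_HOST"],
    )


if __name__ == "__main__":
    print(image_tag("0123abc"))  # placeholder SHA for illustration
    # Placeholder values; a real pod receives these from Kubernetes.
    print(load_settings({"GEMINI_API_KEY": "dummy", "CHROMA_HOST": "chroma"}))
```

Because the process reads its settings from the environment at startup, the same image can run under Compose, staging, or prod with only the injected values changing. Failing fast on a missing variable surfaces a misconfigured Deployment at pod start rather than on the first request.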

The Kubernetes layer is intentionally minimal at this stage: one Deployment with two replicas, a ClusterIP Service, an Ingress stub, and the ConfigMap and HPA stub listed in the highlights. L65 upgrades the HPA and adds load-balanced serving strategies on top of this base.

