Ship new versions to a few users first — analyze, then promote or auto-roll-back.
Advanced A4 · Canary & blue-green for the notes app with Argo Rollouts.
Advanced · Delivery · ~65 min · Rollout
Prerequisites: A1, A2, A3. A running k3d cluster with the notes app.
A normal Kubernetes Deployment does a rolling update: it swaps old pods for new ones until 100% of traffic hits the new version. If v2 has a bug, everyone gets it at once — you find out from your users (or your pager).
Rolling updates have no concept of "try it on 5% first." There's no automatic health check on real traffic, and rollback is a manual scramble once it's already affecting everyone.
Progressive delivery rolls a new version out gradually: send it a small slice of traffic, watch the metrics, and only proceed if it's healthy — otherwise roll back automatically, before most users ever notice. Argo Rollouts adds this to Kubernetes with a drop-in Rollout resource.
Bad releases are caught at 5–25% blast radius and reverted automatically. You ship more often and more safely — the two goals that usually fight each other.
| Strategy | How it works | Trade-off |
|---|---|---|
| Recreate | Kill all old, start all new | Downtime |
| Rolling | Replace pods gradually (K8s default) | No traffic-based safety; both versions serve during the roll |
| Blue-Green | Run v2 alongside v1, flip 100% at once after testing | Instant switch + easy rollback, but 2× resources |
| Canary | Shift a small % of traffic to v2, increase in steps | Safest; needs metrics to judge each step |
Argo Rollouts replaces your Deployment with a Rollout (same pod spec) plus a strategy: block. It manages two ReplicaSets — stable and canary — and steps through setWeight / pause / analysis stages you define.
For precise traffic percentages you add a traffic provider (Istio, NGINX, or SMI). Without one, Argo Rollouts approximates the weight by replica count — e.g. 25% ≈ 1 of 4 pods. That's perfect for learning the workflow on k3d; the manifests gain a few lines when you add a real mesh (A9).
We'll convert the web app to a canary Rollout, then watch a v2 release roll out step by step. Use your k3d cluster.
A Rollout is a Deployment with a strategy. Save as k8s-prod/web-rollout.yaml — the canary advances 25% → 50% → 75% → 100% with pauses:
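A minimal sketch of such a manifest follows. The replica count (four, so that 25% maps to one pod), the app: web label, the container name web, and the notes-app:1.0 image are all assumptions for illustration; copy the real pod template from your A1 Deployment rather than relying on these placeholder values.

```yaml
# k8s-prod/web-rollout.yaml (sketch; pod template values are assumed, not taken from A1)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web
  namespace: notes
spec:
  replicas: 4                 # assumption: 4 pods, so each 25% step is one pod
  selector:
    matchLabels:
      app: web                # assumption: must match the labels your web Service selects
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web           # assumption: container name
          image: notes-app:1.0   # assumption: the v1 image tag imported into k3d
          imagePullPolicy: IfNotPresent
  strategy:
    canary:
      steps:
        - setWeight: 25            # 1 of 4 pods runs the new version
        - pause: {duration: 30s}   # timed pause, advances on its own
        - setWeight: 50
        - pause: {}                # indefinite pause, waits for a manual promote
        - setWeight: 75
        - pause: {duration: 30s}
        # after the last step the rollout goes to 100% and the canary becomes stable
```

Apply it with kubectl apply -f k8s-prod/web-rollout.yaml; the namespace is already set in the metadata.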
If the A1 web Deployment is still applied, delete it first (kubectl delete deploy web -n notes) so the Rollout takes over the web pods. The existing web Service keeps routing to them as long as the pod labels match its selector.
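Make any visible change to the notes app (e.g. add an /about route), rebuild the image as notes-app:2.0, and import it into the cluster with k3d image import notes-app:2.0.
In one terminal, watch live progress with kubectl argo rollouts get rollout web -n notes --watch. In a second terminal, point the rollout at v2 with kubectl argo rollouts set image web <container>=notes-app:2.0 -n notes, where <container> is the container name from your pod template.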
Watch it move to 25% (1 canary pod), wait 30s, advance to 50%, then pause indefinitely at step 4 — exactly where you told it to stop for a human decision.
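v2 is live for a fraction of traffic while v1 still serves the rest. If v2 were broken, only the canary's share of users would be affected (a quarter at the first step, half at this pause), and you haven't committed to it yet.
v2 looks good? Run kubectl argo rollouts promote web -n notes to advance past the pause and through the remaining steps. Looks bad? Run kubectl argo rollouts abort web -n notes and the rollout reverts to 100% stable (v1).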
kubectl argo rollouts dashboard serves a local web UI at localhost:3100 — a visual view of canary weight, steps, and revisions.
A human gate is fine, but the real win is automated analysis: query a metric at each step and auto-abort if it's bad. This AnalysisTemplate checks the success rate from Prometheus (from beginner Module 11 / advanced A8):
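A minimal sketch of such a template follows. The Prometheus address, the http_requests_total metric, and its service and status labels are assumptions; substitute whatever your Prometheus install and the notes app actually expose. The template name success-rate and the service-name argument are also illustrative.

```yaml
# k8s-prod/success-rate-analysis.yaml (sketch; address, metric and label names are assumed)
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: notes
spec:
  args:
    - name: service-name        # supplied by the Rollout that references this template
  metrics:
    - name: success-rate
      interval: 30s             # take one measurement every 30 seconds
      successCondition: result[0] >= 0.95   # a measurement passes at 95% success or better
      failureLimit: 1           # one failed measurement is tolerated; the second fails the run
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc.cluster.local:9090   # assumption
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[1m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[1m]))
```

With no traffic the ratio has nothing to measure and the condition cannot pass, so generate some load before relying on the result.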
Wire it into the canary so it runs alongside the steps — if success rate drops below 95% twice, the rollout aborts itself:
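One way to wire it in is a background analysis on the canary strategy, sketched below as a replacement for the strategy block in k8s-prod/web-rollout.yaml. It assumes an AnalysisTemplate named success-rate with a service-name argument; both names are illustrative and must match whatever your template defines.

```yaml
# Fragment of the Rollout spec (sketch; template and argument names are assumed)
strategy:
  canary:
    analysis:
      templates:
        - templateName: success-rate   # assumption: name of your AnalysisTemplate
      startingStep: 1                  # begin measuring once the first setWeight has been applied
      args:
        - name: service-name
          value: web
    steps:
      - setWeight: 25
      - pause: {duration: 30s}
      - setWeight: 50
      - pause: {}
      - setWeight: 75
      - pause: {duration: 30s}
```

On k3d without a traffic provider, a query scoped to the web Service measures stable and canary pods together, so treat the result as an overall health signal rather than a canary-only one.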
Now a bad deploy is detected by metrics, not a human watching a dashboard, and rolled back before it spreads. This is the heart of progressive delivery.
Prefer an instant switch with a preview environment? Swap the strategy block. Blue-green keeps v2 fully running on a preview Service; you test it, then flip the active Service to it:
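A minimal sketch of the blue-green strategy block, assuming the live Service is named web and the preview Service is named web-preview:

```yaml
# Fragment of the Rollout spec: replaces the canary strategy block (sketch)
strategy:
  blueGreen:
    activeService: web             # Service that carries live traffic
    previewService: web-preview    # Service that points at the new version before promotion
    autoPromotionEnabled: false    # wait for a manual promote instead of switching automatically
```

Setting autoPromotionEnabled to false keeps v2 on the preview Service until you decide to promote it.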
You'd add a second Service named web-preview. Test v2 via the preview Service, then kubectl argo rollouts promote web to flip 100% of live traffic over — instant, with an instant rollback if needed.
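The preview Service can be sketched as a copy of the existing web Service under a new name. The selector label and the port numbers below are assumptions; copy the real values from your web Service.

```yaml
# k8s-prod/web-preview-service.yaml (sketch; selector and ports are assumed)
apiVersion: v1
kind: Service
metadata:
  name: web-preview
  namespace: notes
spec:
  selector:
    app: web            # assumption: same pod label as the web Service
  ports:
    - port: 80          # assumption: copy from your existing web Service
      targetPort: 3000  # assumption: the port the notes app listens on
```

Argo Rollouts adds a pod-template-hash selector to the active and preview Services at runtime, which is how each one ends up pointing at a single version.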
This stays fully declarative. Put the Rollout in your Helm chart (A2) and let Argo CD (A3) manage it. A release then becomes: bump the image tag in Git → Argo CD syncs the new Rollout spec → Argo Rollouts runs the canary → analysis promotes or aborts. No imperative commands in steady state.
A2 packaged it, A3 syncs it from Git, A4 rolls it out safely. That's a real continuous-delivery pipeline — built entirely from declarative YAML.
| Command | What it does |
|---|---|
| kubectl argo rollouts get rollout N --watch | Live, colorized rollout status |
| kubectl argo rollouts set image N c=img | Start a rollout to a new image |
| kubectl argo rollouts promote N | Advance past a pause |
| kubectl argo rollouts promote N --full | Skip all remaining steps |
| kubectl argo rollouts abort N | Stop & revert to stable |
| kubectl argo rollouts undo N | Roll back to the previous revision |
| kubectl argo rollouts retry rollout N | Retry an aborted rollout |
| kubectl argo rollouts dashboard | Open the local web UI |
| kubectl argo rollouts status N | One-line health/phase |
Canary: gradual, metric-driven, lowest risk — great for user-facing web traffic. Blue-green: instant cutover with a tested preview — great when you can't run two versions live at once or need an atomic switch.
| Symptom | Likely cause & fix |
|---|---|
| Rollout stuck at a step | A pause: {} waits forever by design — promote it, or add a duration. |
| Traffic % looks off | No traffic provider on k3d — weight is approximated by replica count. Expected. |
| Analysis always fails | Prometheus address/query wrong, or no traffic to measure — generate load; verify the query in Prometheus. |
| web pods not managed | Old Deployment still owns them. Delete it so the Rollout takes over. |
| New image won't appear | Forgot k3d image import notes-app:2.0. |
| Can't run kubectl argo rollouts | Plugin not on PATH. Reinstall to /usr/local/bin. |
Deploy a broken v3 (e.g. crash on start) and confirm the canary aborts automatically.
… prePromotionAnalysis for blue-green).
Put the Rollout into your Helm chart and deploy it via Argo CD (A2 + A3).
Add a setCanaryScale step to control how many canary pods run independent of weight.

Convert a Deployment to a Rollout, run canary and blue-green releases, promote/abort/undo, and add automated metric analysis that automatically rolls back bad versions, all GitOps-friendly. Releases are now low-risk.
Next up: A5 — Advanced Terraform, where you'll level up your IaC with reusable modules, remote state, and workspaces — and provision a real cluster instead of a local one.