
Progressive Delivery

Ship new versions to a few users first — analyze, then promote or auto-roll-back.

Advanced A4 · Canary & blue-green for the notes app with Argo Rollouts.

Advanced Delivery ~65 min

What You'll Learn

  • Compare deployment strategies: recreate, rolling, blue-green, canary
  • Install Argo Rollouts and convert a Deployment into a Rollout
  • Run a canary release with traffic steps and pauses
  • Promote, abort, and undo a rollout
  • Add automated analysis that auto-rolls-back on a bad metric
  • See how blue-green works — and how this all stays GitOps-driven

Prerequisites: A1, A2, A3. A running k3d cluster with the notes app.

Why Progressive Delivery

A normal Kubernetes Deployment does a rolling update: it swaps old pods for new ones until 100% of traffic hits the new version. If v2 has a bug that its readiness probe doesn't catch, the update still runs to completion and every user ends up on it. You find out from your users (or your pager).

All-or-nothing is risky

Rolling updates have no concept of "try it on 5% first." There's no automatic health check on real traffic, and rollback is a manual scramble once it's already affecting everyone.

Progressive delivery rolls a new version out gradually: send it a small slice of traffic, watch the metrics, and only proceed if it's healthy — otherwise roll back automatically, before most users ever notice. Argo Rollouts adds this to Kubernetes with a drop-in Rollout resource.

The payoff

Bad releases are caught at 5–25% blast radius and reverted automatically. You ship more often and more safely — the two goals that usually fight each other.

Deployment Strategies

Strategy | How it works | Trade-off
Recreate | Kill all old, start all new | Downtime
Rolling | Replace pods gradually (K8s default) | No traffic-based safety; both versions serve during the roll
Blue-Green | Run v2 alongside v1, flip 100% at once after testing | Instant switch + easy rollback, but 2× resources
Canary | Shift a small % of traffic to v2, increase in steps | Safest; needs metrics to judge each step

The Rollout resource

Argo Rollouts replaces your Deployment with a Rollout (same pod spec) plus a strategy: block. It manages two ReplicaSets — stable and canary — and steps through setWeight / pause / analysis stages you define.

Traffic on k3d

For precise traffic percentages you add a traffic provider (Istio, NGINX, or SMI). Without one, Argo Rollouts approximates the weight by replica count — e.g. 25% ≈ 1 of 4 pods. That's perfect for learning the workflow on k3d; the manifests gain a few lines when you add a real mesh (A9).
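The replica-count approximation can be worked out by hand. The sketch below assumes the controller rounds replicas × weight / 100 up to a whole pod; the function name canary_pods is made up for illustration and is not part of any CLI.

```shell
# Sketch: approximate canary pod count for a weight when no traffic router is used.
# Assumption: replicas * weight / 100 is rounded up to a whole pod.
canary_pods() {
  replicas=$1
  weight=$2
  # integer ceiling of replicas * weight / 100
  echo $(( (replicas * weight + 99) / 100 ))
}

# The lab's Rollout runs 4 replicas and steps through these weights.
for w in 25 50 75 100; do
  echo "weight ${w}% -> $(canary_pods 4 "$w") of 4 pods"
done
```

With 4 replicas every step of the lab lands on a whole pod, which is why the percentages look exact here. With other replica counts the rounding means the real split can differ from the requested weight.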

Hands-on Lab: Canary the Notes App

We'll convert the web app to a canary Rollout, then watch a v2 release roll out step by step. Use your k3d cluster.

1. Install Argo Rollouts + the kubectl plugin

    kubectl create namespace argo-rollouts
    kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

    # kubectl plugin (macOS arm64 shown; pick your OS/arch)
    curl -sLO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-darwin-arm64
    chmod +x kubectl-argo-rollouts-darwin-arm64
    sudo mv kubectl-argo-rollouts-darwin-arm64 /usr/local/bin/kubectl-argo-rollouts
    kubectl argo rollouts version
2. Convert the web Deployment into a Rollout

A Rollout is a Deployment with a strategy. Save this as k8s-prod/web-rollout.yaml; the canary advances 25% → 50% → 75% → 100% with pauses:

    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    metadata:
      name: web
      namespace: notes
    spec:
      replicas: 4
      selector:
        matchLabels: { app: web }
      template:
        metadata:
          labels: { app: web }
        spec:
          containers:
            - name: web
              image: notes-app:1.0
              imagePullPolicy: IfNotPresent
              env:
                - name: DATABASE_URL
                  value: postgresql://notes:secret@db:5432/notesdb
              ports:
                - containerPort: 5000
              readinessProbe:
                httpGet: { path: /health, port: 5000 }
      strategy:
        canary:
          steps:
            - setWeight: 25
            - pause: { duration: 30s }
            - setWeight: 50
            - pause: {}   # pause indefinitely; wait for a manual promote
            - setWeight: 75
            - pause: { duration: 30s }

Replace, don't duplicate

If the A1 web Deployment is still applied, delete it first (kubectl delete deploy web -n notes). The Rollout then takes over managing the web pods, and the existing web Service keeps selecting them by label.

    kubectl apply -f k8s-prod/web-rollout.yaml
    kubectl argo rollouts get rollout web -n notes   # initial rollout (all stable)
3. Build a v2 image to roll out

Make any visible change to the notes app (e.g. add an /about route), rebuild, and import it as 2.0:

    docker build -t notes-app:2.0 .
    k3d image import notes-app:2.0 -c notes
4. Trigger the canary & watch it

Watch the live status in one terminal, then point the rollout at v2 from a second:

    # terminal 1: live, colorized rollout view
    kubectl argo rollouts get rollout web -n notes --watch

    # terminal 2: start the canary
    kubectl argo rollouts set image web web=notes-app:2.0 -n notes

Watch it move to 25% (1 canary pod), wait 30s, advance to 50% (2 canary pods), then pause indefinitely at the fourth step (pause: {}), exactly where you told it to stop for a human decision.
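If you want to check the position from a script rather than the live view, the Rollout's status records which step it is on. The sketch below reads that field with plain kubectl; the helper name current_step is made up. It assumes the index counts from zero, so the indefinite pause (the fourth step) would show as 3.

```shell
# Hypothetical helper: print the step index the rollout is currently on.
# Assumption: .status.currentStepIndex is zero-based.
# Usage: current_step ROLLOUT NAMESPACE
current_step() {
  kubectl get rollout "$1" -n "$2" -o jsonpath='{.status.currentStepIndex}'
}
```

For the lab, call it as current_step web notes while the canary is paused.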

The "aha!" moment

v2 is live for a fraction of traffic while v1 still serves the rest. If v2 were broken, only a quarter of users would be affected at the first step, and at most half at this pause. You haven't committed to it yet.

5. Promote, or abort

v2 looks good? Promote it through the remaining steps. Looks bad? Abort and it snaps back to 100% stable (v1) instantly:

    kubectl argo rollouts promote web -n notes   # continue past the pause

    # or, if something's wrong:
    kubectl argo rollouts abort web -n notes     # instant rollback to stable
    kubectl argo rollouts undo web -n notes      # roll back to the previous revision

Open the dashboard UI

kubectl argo rollouts dashboard serves a local web UI at localhost:3100 — a visual view of canary weight, steps, and revisions.

6. Automate the decision: analysis

A human gate is fine, but the real win is automated analysis: query a metric at each step and auto-abort if it's bad. This AnalysisTemplate checks the success rate from Prometheus (from beginner Module 11 / advanced A8):

    apiVersion: argoproj.io/v1alpha1
    kind: AnalysisTemplate
    metadata:
      name: success-rate
      namespace: notes
    spec:
      metrics:
        - name: success-rate
          interval: 30s
          failureLimit: 2
          successCondition: result[0] >= 0.95
          provider:
            prometheus:
              address: http://prometheus.notes:9090
              query: |
                sum(rate(flask_http_request_total{status=~"2.."}[1m]))
                /
                sum(rate(flask_http_request_total[1m]))

Wire it into the canary so it runs alongside the steps. failureLimit: 2 tolerates two failed measurements, so if the success rate drops below 95% a third time, the rollout aborts itself:

    strategy:
      canary:
        analysis:
          templates:
            - templateName: success-rate
          startingStep: 1   # begin analysis after the first setWeight
        steps:
          - setWeight: 25
          - pause: { duration: 1m }
          - setWeight: 50
          - pause: { duration: 1m }
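To make the failure threshold concrete, here is a small simulation of how failureLimit might be applied to a series of measurements. It is a simplified sketch, not the controller's code: it assumes the metric fails once the number of failed measurements exceeds failureLimit, it ignores inconclusive and errored measurements, and it uses whole percentages to keep the shell arithmetic simple. The function name analysis_verdict is made up.

```shell
# Simplified model of failureLimit (not the controller's implementation).
# Usage: analysis_verdict FAILURE_LIMIT THRESHOLD_PCT RATE_PCT...
analysis_verdict() {
  limit=$1
  threshold=$2
  shift 2
  failed=0
  for rate in "$@"; do
    # a measurement fails when the success rate is below the threshold
    if [ "$rate" -lt "$threshold" ]; then
      failed=$((failed + 1))
    fi
    # assumption: the metric fails once failures exceed the limit
    if [ "$failed" -gt "$limit" ]; then
      echo "Failed"
      return 0
    fi
  done
  echo "Successful"
}

analysis_verdict 2 95 99 90 98 91   # two failures, within the limit: prints Successful
analysis_verdict 2 95 99 90 91 92   # third failure exceeds the limit: prints Failed
```

Under this model, setting failureLimit: 0 would abort on the first bad measurement, which is stricter but more sensitive to a single noisy sample.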

Hands-off safety

Now a bad deploy is detected by metrics, not a human watching a dashboard, and rolled back before it spreads. This is the heart of progressive delivery.

7. Alternative: blue-green

Prefer an instant switch with a preview environment? Swap the strategy block. Blue-green keeps v2 fully running on a preview Service; you test it, then flip the active Service to it:

    strategy:
      blueGreen:
        activeService: web              # live traffic
        previewService: web-preview     # test v2 here first
        autoPromotionEnabled: false     # require a manual promote

You'd add a second Service named web-preview. Test v2 via the preview Service, then run kubectl argo rollouts promote web -n notes to flip 100% of live traffic over: an instant switch, with an instant rollback if needed.
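The preview Service could look like the sketch below. The name web-preview and the notes namespace come from the lab. The port numbers are assumptions (port 80 forwarding to the container's port 5000), as is reusing the app: web selector from the Rollout's pod template. The helper name apply_preview_service is made up.

```shell
# Hypothetical helper: create the web-preview Service for the blue-green strategy.
# Assumptions: Service port 80 forwards to containerPort 5000, and the
# selector matches the app: web label from the Rollout's pod template.
apply_preview_service() {
  kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-preview
  namespace: notes
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 5000
EOF
}
```

Both Services start with the same selector on purpose: Argo Rollouts adjusts the selectors itself so that the active Service points at the stable ReplicaSet and the preview Service at the new one.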

8. Keep it GitOps

This stays fully declarative. Put the Rollout in your Helm chart (A2) and let Argo CD (A3) manage it. A release then becomes: bump the image tag in Git → Argo CD syncs the new Rollout spec → Argo Rollouts runs the canary → analysis promotes or aborts. No imperative commands in steady state.
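The "bump the image tag in Git" step can be as small as a one-line change to the chart's values file. The sketch below assumes the values file holds the tag on a line of the form "tag: 1.0" under image:. The helper name bump_image_tag, the file path, and the commit message are all placeholders.

```shell
# Hypothetical helper: rewrite the image tag in a Helm values file.
# Assumes exactly one line of the form "  tag: <value>" in the file.
# Usage: bump_image_tag VALUES_FILE NEW_TAG
bump_image_tag() {
  values_file=$1
  new_tag=$2
  # write to a temp file and move it into place (portable across sed variants)
  sed "s/^\([[:space:]]*tag:\).*/\1 \"${new_tag}\"/" "$values_file" > "${values_file}.tmp" &&
    mv "${values_file}.tmp" "$values_file"
}

# Typical use (path and commit message are placeholders):
#   bump_image_tag chart/values.yaml 2.0
#   git commit -am "notes-app 2.0" && git push
```

Once the commit is pushed, the sync and the canary run on their own, so no kubectl argo rollouts set image command is needed.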

The pieces click together

A2 packaged it, A3 syncs it from Git, A4 rolls it out safely. That's a real continuous-delivery pipeline — built entirely from declarative YAML.

9. Clean up

    kubectl delete rollout web -n notes
    kubectl delete analysistemplate success-rate -n notes 2>/dev/null

Argo Rollouts Cheat Sheet

Command | What it does
kubectl argo rollouts get rollout N --watch | Live, colorized rollout status
kubectl argo rollouts set image N c=img | Start a rollout to a new image
kubectl argo rollouts promote N | Advance past a pause
kubectl argo rollouts promote N --full | Skip all remaining steps
kubectl argo rollouts abort N | Stop & revert to stable
kubectl argo rollouts undo N | Roll back to the previous revision
kubectl argo rollouts retry rollout N | Retry an aborted rollout
kubectl argo rollouts dashboard | Open the local web UI
kubectl argo rollouts status N | One-line health/phase

Canary vs. blue-green — pick by need

Canary: gradual, metric-driven, lowest risk; great for user-facing web traffic. Blue-green: instant cutover with a tested preview; great when two versions can't serve live traffic at the same time or you need an atomic switch.

Troubleshooting

Symptom | Likely cause & fix
Rollout stuck at a step | A pause: {} waits forever by design; promote it, or add a duration.
Traffic % looks off | No traffic provider on k3d, so weight is approximated by replica count. Expected.
Analysis always fails | Prometheus address/query wrong, or no traffic to measure. Generate load; verify the query in Prometheus.
web pods not managed | Old Deployment still owns them; delete it so the Rollout takes over.
New image won't appear | Forgot k3d image import notes-app:2.0 -c notes.
Can't run kubectl argo rollouts | Plugin not on PATH; reinstall to /usr/local/bin.
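For the "Analysis always fails" case, one way to verify the query is to send it to Prometheus's HTTP query API directly and inspect the JSON that comes back. This sketch assumes Prometheus is reachable on localhost:9090, for example through a port-forward. The Service name in the comment is inferred from the template's address and may differ in your cluster. The helper name check_success_rate is made up.

```shell
# Hypothetical helper: run the AnalysisTemplate's query against Prometheus.
# Assumes Prometheus is reachable at the given base URL, for example after
#   kubectl port-forward svc/prometheus -n notes 9090:9090
# (the Service name is a guess based on the template's address).
check_success_rate() {
  base_url=${1:-http://localhost:9090}
  curl -sG "${base_url}/api/v1/query" \
    --data-urlencode 'query=sum(rate(flask_http_request_total{status=~"2.."}[1m])) / sum(rate(flask_http_request_total[1m]))'
}
```

An empty result list usually means there was no traffic in the last minute, so the analysis has nothing to measure. Generate some load and run it again.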

Your Challenge

  • Ship a deliberately broken v3 (e.g. crash on start) and confirm the canary aborts automatically.
  • Add an analysis before the first weight (prePromotionAnalysis for blue-green).
  • Move the Rollout into your Helm chart and deploy it via Argo CD (A2 + A3).
  • Add a setCanaryScale step to control how many canary pods run independent of weight (this needs a traffic router, since without one the pod count is the weight).
  • Bonus: wire an NGINX/Istio traffic router for real percentage-based splitting (preview of A9).
    # point the rollout at an image that fails its readiness probe:
    kubectl argo rollouts set image web web=notes-app:broken -n notes
    # the canary pod never becomes Ready -> the rollout degrades.
    # With an AnalysisTemplate attached, it aborts back to stable on its own.
    kubectl argo rollouts get rollout web -n notes --watch   # watch it revert

Recap & What's Next

You can now

Convert a Deployment to a Rollout, run canary and blue-green releases, promote/abort/undo, and add automated metric analysis that auto-rolls-back bad versions — all GitOps-friendly. Releases are now low-risk.

Next up: A5 — Advanced Terraform, where you'll level up your IaC with reusable modules, remote state, and workspaces — and provision a real cluster instead of a local one.
