
Production Kubernetes

Run a real multi-node cluster the way production does — limits, probes, RBAC, autoscaling, durable storage.

Advanced A1 · The notes app on a 3-node k3d cluster. Free · Local.

Advanced Kubernetes ~75 min

What You'll Learn

  • Spin up a real multi-node cluster locally with k3d
  • Set resource requests & limits so pods are scheduled and capped correctly
  • Add the three probes (liveness, readiness, startup) for self-healing & safe rollouts
  • Make the database durable with a StatefulSet + PersistentVolumeClaim
  • Autoscale the app under load with a HorizontalPodAutoscaler
  • Lock down access with namespaces and RBAC (least privilege)

Prerequisites: the beginner Kubernetes module and Docker. You should already know Pods, Deployments, and Services.

Intro K8s vs. Production K8s

In the beginner course you ran the app on a single-node Minikube with bare Deployments. That proves the concepts, but a real cluster does much more — and assumes you've configured the things Minikube let you skip.

Beginner (Minikube)           | Production (this module)
One node                      | Multiple nodes; pods are scheduled across them
No resource limits            | Requests & limits so one pod can't starve the node
One probe (readiness)         | Liveness + readiness + startup for self-healing
DB as a throwaway Deployment  | StatefulSet + PVC so data actually survives
Fixed replica count           | Autoscaling on real CPU load
Full cluster-admin            | Namespaces + RBAC, least privilege

Why k3d?

k3d runs lightweight k3s clusters inside Docker, so you can create a genuine multi-node cluster on your laptop in seconds, for free. The manifests carry over to EKS/GKE/AKS largely unchanged; the k3d-specific parts are cluster creation, k3d image import (a managed cluster pulls images from a registry instead), and the default storage class behind the PVC. Bonus: k3s ships with metrics-server built in, so autoscaling works out of the box.

Hands-on Lab: Productionize the Notes App

We'll deploy the same notes app onto a 3-node cluster and harden it one production concern at a time. Put these manifests in a k8s-prod/ folder.

Step 1. Create a multi-node cluster with k3d

Install k3d (brew install k3d, or the install script below), then create 1 server + 2 agents:

    curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
    k3d cluster create notes --servers 1 --agents 2 --port "8080:80@loadbalancer"
    kubectl get nodes   # 3 nodes, all Ready

You now have a real multi-node Kubernetes cluster. k3d wired kubectl to it automatically.

Step 2. Namespace + load the image

Namespaces isolate workloads. Create one, then import the notes-app:1.0 image you built in the beginner course into the cluster:

    kubectl create namespace notes
    k3d image import notes-app:1.0 -c notes   # push local image into the cluster

Set a default namespace (optional)

Save typing -n notes on every command: kubectl config set-context --current --namespace=notes.

Step 3. Durable database: StatefulSet + PVC

The beginner DB was a Deployment with no storage, so every restart wiped the data. Production databases typically use a StatefulSet with volumeClaimTemplates, which gives each replica a stable identity and its own persistent disk. Save as k8s-prod/db.yaml:

    apiVersion: v1
    kind: Secret
    metadata:
      name: db-secret
      namespace: notes
    stringData:
      POSTGRES_USER: notes
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: notesdb
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: db
      namespace: notes
    spec:
      serviceName: db
      replicas: 1
      selector:
        matchLabels: { app: db }
      template:
        metadata:
          labels: { app: db }
        spec:
          containers:
            - name: postgres
              image: postgres:16
              envFrom:
                - secretRef: { name: db-secret }
              ports:
                - containerPort: 5432
              volumeMounts:
                - name: data
                  mountPath: /var/lib/postgresql/data
                  subPath: pgdata
      volumeClaimTemplates:
        - metadata: { name: data }
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests: { storage: 1Gi }
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: db
      namespace: notes
    spec:
      clusterIP: None   # headless; required for StatefulSets
      selector: { app: db }
      ports:
        - port: 5432
          targetPort: 5432

Apply it and check the claim:

    kubectl apply -f k8s-prod/db.yaml
    kubectl get pvc -n notes   # a PersistentVolumeClaim is now Bound

Why StatefulSet, not Deployment?

StatefulSets give pods stable names (db-0, db-1…) and bind each to its own volume that survives rescheduling. A Deployment treats pods as interchangeable: with no PVC the data is lost along with the pod, and with one, every replica shares the same claim. That is fine for stateless web apps and fatal for databases.

Step 4. The web app: limits + all three probes

This is the production-grade web Deployment. Save as k8s-prod/web.yaml and note the resources block and the three probes:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
      namespace: notes
    spec:
      replicas: 2
      selector:
        matchLabels: { app: web }
      template:
        metadata:
          labels: { app: web }
        spec:
          containers:
            - name: web
              image: notes-app:1.0
              imagePullPolicy: IfNotPresent
              env:
                - name: DATABASE_URL
                  value: postgresql://notes:secret@db:5432/notesdb
              ports:
                - containerPort: 5000
              resources:
                requests: { cpu: "100m", memory: "128Mi" }
                limits: { cpu: "500m", memory: "256Mi" }
              startupProbe:      # gives slow starts time before liveness kicks in
                httpGet: { path: /health, port: 5000 }
                failureThreshold: 30
                periodSeconds: 2
              readinessProbe:    # only send traffic when ready
                httpGet: { path: /health, port: 5000 }
                periodSeconds: 5
              livenessProbe:     # restart if it hangs
                httpGet: { path: /health, port: 5000 }
                periodSeconds: 10
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: web
      namespace: notes
    spec:
      selector: { app: web }
      ports:
        - port: 5000
          targetPort: 5000

Apply it and see where the pods landed:

    kubectl apply -f k8s-prod/web.yaml
    kubectl get pods -n notes -o wide   # see pods spread across the agent nodes
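The web manifest above writes the database password directly into DATABASE_URL, even though db-secret already holds the same values. One possible alternative, sketched here as an assumption rather than the course's own approach, is to pull the three values from the Secret and let Kubernetes expand them into the URL. It relies on Kubernetes' dependent environment variables, where $(VAR) refers to a variable defined earlier in the same list, and it assumes the password contains no characters that would need URL-encoding.

```yaml
# Fragment: a drop-in replacement for the env: block of the web container
# in k8s-prod/web.yaml. Key names match the db-secret from Step 3.
env:
  - name: POSTGRES_USER
    valueFrom:
      secretKeyRef: { name: db-secret, key: POSTGRES_USER }
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef: { name: db-secret, key: POSTGRES_PASSWORD }
  - name: POSTGRES_DB
    valueFrom:
      secretKeyRef: { name: db-secret, key: POSTGRES_DB }
  # $(VAR) is expanded by Kubernetes using the variables declared above,
  # so DATABASE_URL must come after them in the list.
  - name: DATABASE_URL
    value: postgresql://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@db:5432/$(POSTGRES_DB)
```

With this in place the credentials live in one object, so rotating the password means editing the Secret and restarting the pods rather than editing two manifests.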

The "aha!" moment

Run kubectl get pods -n notes -o wide and check the NODE column: your web pods will most likely have landed on different nodes. The scheduler placed them for you, picking nodes with enough free capacity for the CPU/memory requests you declared. That's real Kubernetes.

Step 5. Understand requests vs. limits & the probes

Setting         | What it does
requests        | What the container is guaranteed. The scheduler uses it to choose a node.
limits          | The hard ceiling. Exceed CPU: throttled. Exceed memory: the container is OOM-killed.
startupProbe    | Protects slow-booting apps; liveness and readiness checks are held off until it passes once.
readinessProbe  | Gates traffic. Failing = removed from the Service, but not restarted.
livenessProbe   | Detects a hung process and restarts the container.
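As a rough worked example, the probe settings in web.yaml translate into time budgets like the ones below. The startup figures come straight from the manifest; the liveness line assumes the Kubernetes default failureThreshold of 3, which the manifest does not set explicitly. Probe timeouts can add a little on top, so treat these as approximations.

```latex
\begin{align*}
T_{\text{startup}}  &\approx \text{failureThreshold} \times \text{periodSeconds}
                     = 30 \times 2\,\text{s} = 60\,\text{s} \\
T_{\text{liveness}} &\approx 3 \times 10\,\text{s} = 30\,\text{s}
\end{align*}
```

In other words, the app gets about a minute to answer /health for the first time, and once running, a hung container is restarted after roughly half a minute of failed checks.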

No requests = no autoscaling

The HorizontalPodAutoscaler measures usage against the CPU request. Without a CPU request set (Step 4), HPA has no baseline to compute a percentage from and won't scale. Always set requests.
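To make the dependency on requests concrete, here is the scaling rule from the Kubernetes HPA documentation applied to this lab's numbers. The 80m average usage is an invented figure for illustration; the 100m request comes from Step 4 and the 50% target from Step 6.

```latex
\begin{align*}
\text{utilization} &= \frac{\text{average CPU usage per pod}}{\text{CPU request}}
                    = \frac{80\text{m}}{100\text{m}} = 80\% \\[4pt]
\text{desiredReplicas} &= \left\lceil \text{currentReplicas} \times
                          \frac{\text{utilization}}{\text{target}} \right\rceil
                        = \left\lceil 2 \times \frac{80}{50} \right\rceil
                        = \lceil 3.2 \rceil = 4
\end{align*}
```

Without a CPU request the denominator in the first line is undefined, which is why the HPA reports an unknown utilization instead of scaling.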

Step 6. Autoscale under load: HPA

Add a HorizontalPodAutoscaler that keeps CPU near 50%, scaling between 2 and 10 pods. Save as k8s-prod/hpa.yaml:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web
      namespace: notes
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50

Apply it, then generate load from inside the cluster and watch it scale:

    kubectl apply -f k8s-prod/hpa.yaml

    # terminal 1: watch the autoscaler react
    kubectl get hpa -n notes -w

    # terminal 2: hammer the service from a throwaway pod
    kubectl run -n notes load --image=busybox --restart=Never -- \
      /bin/sh -c "while true; do wget -q -O- http://web:5000/ >/dev/null; done"

Within a minute or two the REPLICAS column climbs past 2 as CPU rises. Stop the load (kubectl delete pod load -n notes) and it scales back down after the cool-down; by default the HPA waits about five minutes before removing replicas.
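If waiting for the scale-down feels slow in a lab, the cool-down can be tuned through the behavior field of autoscaling/v2. The fragment below is a sketch; the 60-second value is an arbitrary choice for a laptop demo, not a recommendation for production, where the longer default helps avoid flapping.

```yaml
# Fragment: optional addition under spec: in k8s-prod/hpa.yaml
behavior:
  scaleDown:
    # How long the HPA looks back before removing replicas.
    # The Kubernetes default is 300 seconds; 60 is an illustrative lab value.
    stabilizationWindowSeconds: 60
```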

The platform is reacting to reality

You didn't change replica counts by hand — the cluster grew the app to meet demand and shrank it when idle. That's production autoscaling, running free on your laptop.

Step 7. Least privilege: RBAC

In production, not everything runs as cluster-admin. RBAC grants the minimum permissions needed. Here's a ServiceAccount that can only read pods in the notes namespace. Save as k8s-prod/rbac.yaml:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: notes-reader
      namespace: notes
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: pod-reader
      namespace: notes
    rules:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: notes-reader-binding
      namespace: notes
    subjects:
      - kind: ServiceAccount
        name: notes-reader
        namespace: notes
    roleRef:
      kind: Role
      name: pod-reader
      apiGroup: rbac.authorization.k8s.io

Apply it, then verify the boundaries with kubectl auth can-i:

    kubectl apply -f k8s-prod/rbac.yaml
    SA=system:serviceaccount:notes:notes-reader
    kubectl auth can-i list pods -n notes --as=$SA     # yes
    kubectl auth can-i delete pods -n notes --as=$SA   # no
    kubectl auth can-i get secrets -n notes --as=$SA   # no

Role vs. ClusterRole

A Role + RoleBinding grant permissions within one namespace. Their cluster-wide cousins are ClusterRole + ClusterRoleBinding. Prefer namespaced Roles — narrower blast radius.
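For contrast, here is a sketch of the cluster-wide pair. It grants the same notes-reader ServiceAccount read access to nodes, which are cluster-scoped and therefore cannot be covered by a namespaced Role. The names node-reader and notes-reader-nodes are hypothetical and not part of the lab; apply something like this only when a workload genuinely needs cluster-scoped access.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader              # hypothetical name; ClusterRoles have no namespace
rules:
  - apiGroups: [""]
    resources: ["nodes"]         # nodes are cluster-scoped resources
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: notes-reader-nodes       # hypothetical name
subjects:
  - kind: ServiceAccount
    name: notes-reader
    namespace: notes
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
```

The shape mirrors the Role and RoleBinding above; the difference is scope, since the binding applies across the whole cluster instead of inside the notes namespace.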

Step 8. Verify the whole production setup

    kubectl get all,pvc,hpa,sa -n notes   # the full picture in one view
    kubectl describe pod -n notes -l app=web | grep -A6 "Limits\|Liveness\|Readiness"

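To open the app through the k3d load balancer at http://localhost:8080, route the web Service through an Ingress; none has been created in this lab so far, so that URL won't serve the app until you add one. For a quick check without an Ingress, run kubectl port-forward -n notes svc/web 5000:5000 and browse to http://localhost:5000 (host port 8080 is already taken by the k3d load balancer).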
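The Ingress route mentioned above could look like the following minimal sketch. It assumes the cluster was created as in Step 1 without disabling the Traefik ingress controller that k3s bundles by default; the resource name web and a file name such as k8s-prod/ingress.yaml are choices made here, not something the lab prescribes.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                  # assumed name, chosen to match the Service
  namespace: notes
spec:
  rules:
    - http:
        paths:
          - path: /          # send all paths to the web Service
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 5000
```

After kubectl apply, requests to http://localhost:8080 should travel from the k3d load balancer (host port 8080 to cluster port 80) through the ingress controller to the web Service on port 5000.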

Step 9. Tear down

    k3d cluster delete notes   # removes the whole cluster (and its containers)

Cheat Sheet

k3d

Command                         | What it does
k3d cluster create N --agents 2 | Create a multi-node cluster
k3d cluster list / delete N     | List / delete clusters
k3d image import IMG -c N       | Load a local image into the cluster
k3d node list                   | List the cluster's nodes

Production kubectl

Command                                | What it does
kubectl get pods -o wide               | See which node each pod runs on
kubectl get hpa -w                     | Watch the autoscaler live
kubectl top pods / nodes               | Live CPU/memory usage (metrics-server)
kubectl get pvc                        | List persistent volume claims
kubectl auth can-i VERB RES --as=SA    | Test RBAC permissions
kubectl rollout restart deploy/web     | Restart pods (e.g. after a new image)
kubectl describe pod P                 | Events, limits, probe status
kubectl scale deploy/web --replicas=N  | Manual scale (HPA overrides this)

Troubleshooting

Symptom                           | Likely cause & fix
HPA shows <unknown> for CPU       | No CPU request set, or metrics-server not ready yet. Wait, and confirm requests exist.
Pod ImagePullBackOff              | Forgot k3d image import, or imagePullPolicy not IfNotPresent.
Pod Pending                       | No node has enough free CPU/memory for the requests. Lower them or add an agent.
Pod OOMKilled                     | Hit the memory limit. Raise it or fix the leak.
DB data lost on restart           | Using a Deployment, or the volume isn't mounted at the data path. Use the StatefulSet + PVC.
Liveness keeps restarting the pod | Probe too aggressive for a slow start. Add/extend the startupProbe.

Your Challenge

  • Add a PodDisruptionBudget so at least 1 web pod stays up during node drains.
  • Spread web pods across nodes with a topologySpreadConstraint (or pod anti-affinity).
  • Drain a node (kubectl drain) and watch pods reschedule to the others.
  • Switch the HPA to also scale on memory, not just CPU.
  • Bonus: give the web container a ConfigMap-driven setting and roll it out with kubectl rollout.

A starting point for the first item:

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: web-pdb
      namespace: notes
    spec:
      minAvailable: 1
      selector:
        matchLabels: { app: web }
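For the second challenge item, a topology spread constraint might be sketched as below. This is one possible answer rather than the only one; ScheduleAnyway is chosen here as a soft preference so pods still start if a node is full, and switching it to DoNotSchedule would make the spread a hard requirement.

```yaml
# Fragment: goes under spec.template.spec in k8s-prod/web.yaml,
# at the same level as containers:
topologySpreadConstraints:
  - maxSkew: 1                            # per-node pod counts may differ by at most 1
    topologyKey: kubernetes.io/hostname   # treat each node as its own spread domain
    whenUnsatisfiable: ScheduleAnyway     # soft preference (an assumption; see note above)
    labelSelector:
      matchLabels: { app: web }
```

After re-applying the Deployment, kubectl get pods -n notes -o wide should show the web pods distributed more evenly across the nodes.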

Recap & What's Next

You can now

Run a multi-node cluster, set resource requests/limits, configure all three probes, persist data with a StatefulSet + PVC, autoscale under load, and enforce least-privilege RBAC. This is the baseline every production workload needs.

Next up: A2 — Helm, where you'll package all these manifests into a single reusable chart with per-environment values, instead of juggling raw YAML files.
