Production Kubernetes (k3d) - DevOps Lab Advanced

What You'll Learn

Spin up a real multi-node cluster locally with k3d
Set resource requests & limits so pods are scheduled and capped correctly
Add the three probes (liveness, readiness, startup) for self-healing & safe rollouts
Make the database durable with a StatefulSet + PersistentVolumeClaim
Autoscale the app under load with a HorizontalPodAutoscaler
Lock down access with namespaces and RBAC (least privilege)

Prerequisites: the beginner Kubernetes module and Docker. You should already know Pods, Deployments, and Services.

Intro K8s vs. Production K8s

In the beginner course you ran the app on a single-node Minikube with bare Deployments. That proves the concepts, but a real cluster does much more — and assumes you've configured the things Minikube let you skip.

Beginner (Minikube)	Production (this module)
One node	Multiple nodes — pods are scheduled across them
No resource limits	Requests & limits so one pod can't starve the node
One probe (readiness)	Liveness + readiness + startup for self-healing
DB as a throwaway Deployment	StatefulSet + PVC so data actually survives
Fixed replica count	Autoscaling on real CPU load
Full cluster-admin	Namespaces + RBAC, least privilege

Why k3d?

k3d runs lightweight k3s clusters inside Docker, so you can create a genuine multi-node cluster on your laptop in seconds, for free. The manifests here use only standard Kubernetes APIs, so they should carry over to EKS/GKE/AKS with little or no change; provider-specific pieces such as storage classes and ingress controllers are the usual exceptions. Bonus: k3s ships with metrics-server built in, so autoscaling works out of the box.

Hands-on Lab: Productionize the Notes App

We'll deploy the same notes app onto a 3-node cluster and harden it one production concern at a time. Put these manifests in a k8s-prod/ folder.

Create a multi-node cluster with k3d

Install k3d (brew install k3d, or the install script below), then create 1 server + 2 agents:

curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
k3d cluster create notes --servers 1 --agents 2 --port "8080:80@loadbalancer"
kubectl get nodes   # 3 nodes, all Ready

You now have a real multi-node Kubernetes cluster. k3d wired kubectl to it automatically.
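The same cluster can also be described declaratively. A minimal sketch, assuming a recent k3d v5 release that accepts the k3d.io/v1alpha5 config schema (older releases use an earlier schema version) and a hypothetical file name k3d-notes.yaml:

```shell
# Write a k3d config file equivalent to the flags used above.
# Assumptions: the file name is arbitrary, and your k3d version
# understands the k3d.io/v1alpha5 schema (check with `k3d version`).
cat > k3d-notes.yaml <<'EOF'
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: notes
servers: 1
agents: 2
ports:
  - port: 8080:80            # host port 8080 -> load balancer port 80
    nodeFilters:
      - loadbalancer
EOF
```

With this file in place, k3d cluster create --config k3d-notes.yaml should build the same 3-node cluster, and the file can be kept in version control next to the manifests.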

Namespace + load the image

Namespaces isolate workloads. Create one, then import the notes-app:1.0 image you built in the beginner course into the cluster:

kubectl create namespace notes
k3d image import notes-app:1.0 -c notes   # push local image into the cluster

Set a default namespace (optional)

Save typing -n notes on every command: kubectl config set-context --current --namespace=notes.

Durable database — StatefulSet + PVC

The beginner DB was a Deployment with no storage: restart it and the data vanished. Production databases use a StatefulSet with volumeClaimTemplates, giving each replica a stable identity and its own persistent disk. Save as k8s-prod/db.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
  namespace: notes
stringData:
  POSTGRES_USER: notes
  POSTGRES_PASSWORD: secret
  POSTGRES_DB: notesdb
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
  namespace: notes
spec:
  serviceName: db
  replicas: 1
  selector:
    matchLabels: { app: db }
  template:
    metadata:
      labels: { app: db }
    spec:
      containers:
        - name: postgres
          image: postgres:16
          envFrom:
            - secretRef: { name: db-secret }
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
              subPath: pgdata
  volumeClaimTemplates:
    - metadata: { name: data }
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests: { storage: 1Gi }
---
apiVersion: v1
kind: Service
metadata:
  name: db
  namespace: notes
spec:
  clusterIP: None        # headless — required for StatefulSets
  selector: { app: db }
  ports:
    - port: 5432
      targetPort: 5432

kubectl apply -f k8s-prod/db.yaml
kubectl get pvc -n notes   # a PersistentVolumeClaim is now Bound

Why StatefulSet, not Deployment?

StatefulSets give pods stable names (db-0, db-1…) and bind each to its own volume that survives rescheduling. Deployments treat pods as interchangeable: replicas get random names, and without a volume (as in the beginner module) the data disappears with the pod. That is fine for stateless web apps, fatal for databases.

The web app — limits + all three probes

This is the production-grade web Deployment. Save as k8s-prod/web.yaml — note the resources block and the three probes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: notes
spec:
  replicas: 2
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: notes-app:1.0
          imagePullPolicy: IfNotPresent
          env:
            - name: DATABASE_URL
              value: postgresql://notes:secret@db:5432/notesdb
          ports:
            - containerPort: 5000
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits:   { cpu: "500m", memory: "256Mi" }
          startupProbe:          # gives slow starts time before liveness kicks in
            httpGet: { path: /health, port: 5000 }
            failureThreshold: 30
            periodSeconds: 2
          readinessProbe:        # only send traffic when ready
            httpGet: { path: /health, port: 5000 }
            periodSeconds: 5
          livenessProbe:         # restart if it hangs
            httpGet: { path: /health, port: 5000 }
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: notes
spec:
  selector: { app: web }
  ports:
    - port: 5000
      targetPort: 5000

kubectl apply -f k8s-prod/web.yaml
kubectl get pods -n notes -o wide   # the NODE column shows where each pod landed

The "aha!" moment

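Run kubectl get pods -n notes -o wide and check the NODE column: your web pods have most likely landed on different nodes. The scheduler placed them for you, using the CPU/memory requests you declared to find nodes with room. It spreads replicas by default but does not guarantee it, and in k3d the server node accepts workloads too. That's real Kubernetes.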

Understand requests vs. limits & the probes

Setting	What it does
`requests`	What each container is guaranteed. The scheduler sums them to choose a node with enough free capacity.
`limits`	The hard ceiling. Exceed CPU → throttled; exceed memory → the container is OOM-killed and restarted.
`startupProbe`	Protects slow-booting apps — liveness/readiness wait until it passes once.
`readinessProbe`	Gates traffic. Failing = removed from the Service, but not restarted.
`livenessProbe`	Detects a hung process and restarts the container.
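
How long each probe tolerates failure follows from failureThreshold × periodSeconds. A rough sketch of that arithmetic for the values in web.yaml; it assumes the Kubernetes default failureThreshold of 3 where the manifest sets none, and it ignores probe timeouts and initial delays:

```shell
# Seconds of consecutive failures a probe tolerates before it acts:
# failureThreshold * periodSeconds.
probe_budget() {
  failure_threshold=$1
  period_seconds=$2
  echo $(( failure_threshold * period_seconds ))
}

probe_budget 30 2    # startupProbe: 60s allowed for the app to boot
probe_budget 3 5     # readinessProbe (default threshold 3): 15s
probe_budget 3 10    # livenessProbe (default threshold 3): 30s
```

So the app gets up to about 60 seconds to start. After that, a pod that stops answering is pulled from the Service after roughly 15 seconds, and a hung container is restarted after roughly 30 seconds.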

No requests = no autoscaling

The HorizontalPodAutoscaler measures usage against the CPU request. Without the CPU request set in web.yaml above, the HPA has no baseline to compute a percentage from and won't scale. Always set requests.
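
Utilization here means usage divided by the request, not by the limit or the node's capacity. A small sketch of that calculation in shell integer arithmetic, working in millicores; the usage figures are made up for illustration, and the real HPA averages this value across all pods of the Deployment:

```shell
# CPU utilization as the HPA sees it: usage / request, as a percentage.
# Both arguments are in millicores (the 100m request in web.yaml -> 100).
cpu_utilization_pct() {
  usage_m=$1
  request_m=$2
  if [ "$request_m" -le 0 ]; then
    # No request set: there is no baseline, so no percentage exists.
    echo "unknown"
    return 1
  fi
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_pct 50 100          # 50m used of a 100m request -> 50
cpu_utilization_pct 150 100         # usage can exceed the request -> 150
cpu_utilization_pct 150 0 || true   # missing request -> unknown
```

Because the limit (500m) is five times the request (100m), a busy pod can report up to 500% utilization, which is what gives the autoscaler room to react.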

Autoscale under load — HPA

Add a HorizontalPodAutoscaler that keeps CPU near 50%, scaling between 2 and 10 pods. Save as k8s-prod/hpa.yaml:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
  namespace: notes
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50

Apply it, then generate load from inside the cluster and watch it scale:

kubectl apply -f k8s-prod/hpa.yaml

# terminal 1: watch the autoscaler react
kubectl get hpa -n notes -w

# terminal 2: hammer the service from a throwaway pod
kubectl run -n notes load --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://web:5000/ >/dev/null; done"

Within a minute or two the REPLICAS column climbs past 2 as CPU rises. Stop the load (kubectl delete pod load -n notes) and it scales back down once the scale-down stabilization window has passed (5 minutes by default).
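
The replica count the autoscaler aims for comes from the documented formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A simplified sketch of it in shell, using made-up utilization readings; it deliberately leaves out the HPA's tolerance band and stabilization logic:

```shell
# desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization),
# then clamped to the HPA's minReplicas..maxReplicas range.
desired_replicas() {
  current=$1; utilization=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Hypothetical readings against the 50% target in hpa.yaml:
desired_replicas 2 150 50 2 10   # under load: 2 pods at 150% -> 6
desired_replicas 6 20 50 2 10    # load gone: 6 pods at 20% -> 3
desired_replicas 2 400 50 2 10   # extreme load: capped at maxReplicas -> 10
```

This is why the replica count can jump by several pods at once instead of climbing one at a time.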

The platform is reacting to reality

You didn't change replica counts by hand — the cluster grew the app to meet demand and shrank it when idle. That's production autoscaling, running free on your laptop.
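
If waiting for scale-down gets tedious in the lab, the autoscaling/v2 behavior field can shorten the window. A sketch under assumptions: the patch file name is hypothetical, and 60 seconds is only an illustrative value (production setups often keep the longer default to avoid flapping):

```shell
# Write a merge patch that shortens the HPA scale-down window for the lab.
# Assumptions: the file name is arbitrary; 60 is an illustrative value
# (the Kubernetes default is 300 seconds).
mkdir -p k8s-prod
cat > k8s-prod/hpa-scaledown-patch.yaml <<'EOF'
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
EOF
```

Apply it with kubectl patch hpa web -n notes --type merge --patch-file k8s-prod/hpa-scaledown-patch.yaml, then stop the load again and watch the replicas drop sooner.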

Least privilege — RBAC

In production, not everything runs as cluster-admin. RBAC grants the minimum permissions needed. Here's a ServiceAccount that can only read pods in the notes namespace. Save as k8s-prod/rbac.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: notes-reader
  namespace: notes
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: notes
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: notes-reader-binding
  namespace: notes
subjects:
  - kind: ServiceAccount
    name: notes-reader
    namespace: notes
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Apply it, then verify the boundaries with kubectl auth can-i:

kubectl apply -f k8s-prod/rbac.yaml
SA=system:serviceaccount:notes:notes-reader
kubectl auth can-i list pods   -n notes --as=$SA      # yes
kubectl auth can-i delete pods -n notes --as=$SA      # no
kubectl auth can-i get secrets -n notes --as=$SA      # no

Role vs. ClusterRole

A Role + RoleBinding grant permissions within one namespace. Their cluster-wide cousins are ClusterRole + ClusterRoleBinding. Prefer namespaced Roles — narrower blast radius.
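
For contrast, here is a sketch of the cluster-wide pair. Nodes are not namespaced, so read access to them can only be granted through a ClusterRole. The names node-reader and notes-reader-nodes and the file name are hypothetical, chosen for illustration:

```shell
# Write a ClusterRole + ClusterRoleBinding that lets the notes-reader
# ServiceAccount read nodes (a cluster-scoped resource).
# Assumption: all names below except notes-reader are made up for this example.
mkdir -p k8s-prod
cat > k8s-prod/rbac-cluster.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader            # no namespace: ClusterRoles are cluster-scoped
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: notes-reader-nodes
subjects:
  - kind: ServiceAccount
    name: notes-reader
    namespace: notes
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```

After kubectl apply -f k8s-prod/rbac-cluster.yaml, kubectl auth can-i list nodes --as=system:serviceaccount:notes:notes-reader should answer yes, while deleting pods or reading secrets stays denied.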

Verify the whole production setup

kubectl get all,pvc,hpa,sa -n notes   # the full picture in one view
kubectl describe pod -n notes -l app=web | grep -A6 "Limits\|Liveness\|Readiness"

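The k3d load balancer listens at http://localhost:8080, but it only serves the app once an Ingress routes traffic to the web Service. Until then, use a temporary port-forward on a different local port, since 8080 is already taken by k3d: kubectl port-forward -n notes svc/web 8081:5000, then open http://localhost:8081.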
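
A minimal Ingress for the route mentioned above might look like the following. This is a sketch, assuming the Traefik ingress controller that k3s installs by default has not been disabled; the file name is hypothetical:

```shell
# Write an Ingress that sends all HTTP traffic arriving at the
# k3d load balancer (host port 8080) to the web Service on port 5000.
# Assumption: k3s's bundled Traefik ingress controller is running.
mkdir -p k8s-prod
cat > k8s-prod/ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  namespace: notes
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 5000
EOF
```

Apply it with kubectl apply -f k8s-prod/ingress.yaml; once Traefik picks it up, http://localhost:8080 should reach the notes app without any port-forward.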

Tear down

k3d cluster delete notes # removes the whole cluster (and its containers)

Cheat Sheet

k3d

Command	What it does
`k3d cluster create N --agents 2`	Create a multi-node cluster
`k3d cluster list / delete N`	List / delete clusters
`k3d image import IMG -c N`	Load a local image into the cluster
`k3d node list`	List the cluster's nodes

Production kubectl

Command	What it does
`kubectl get pods -o wide`	See which node each pod runs on
`kubectl get hpa -w`	Watch the autoscaler live
`kubectl top pods / nodes`	Live CPU/memory usage (metrics-server)
`kubectl get pvc`	List persistent volume claims
`kubectl auth can-i VERB RES --as=SA`	Test RBAC permissions
`kubectl rollout restart deploy/web`	Restart pods (e.g. after a new image)
`kubectl describe pod P`	Events, limits, probe status
`kubectl scale deploy/web --replicas=N`	Manual scale (HPA overrides this)

Troubleshooting

Symptom	Likely cause & fix
HPA shows `<unknown>` for CPU	No CPU `request` set, or metrics-server not ready yet — wait, and confirm requests exist.
Pod `ImagePullBackOff`	Forgot `k3d image import`, or `imagePullPolicy` not `IfNotPresent`.
Pod `Pending`	No node has enough free CPU/memory for the `requests` — lower them or add an agent.
Pod `OOMKilled`	Hit the memory `limit` — raise it or fix the leak.
DB data lost on restart	Using a Deployment, or the volume isn't mounted at the data path — use the StatefulSet + PVC.
Liveness keeps restarting the pod	Probe too aggressive for a slow start — add/extend the `startupProbe`.

Your Challenge

Add a PodDisruptionBudget so at least 1 web pod stays up during node drains.
Spread web pods across nodes with topologySpreadConstraints (or pod anti-affinity).
Drain a node (kubectl drain) and watch pods reschedule to the others.
Switch the HPA to also scale on memory, not just CPU.
Bonus: give the web container a ConfigMap-driven setting and roll it out with kubectl rollout.

Starter manifest for the first item:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
  namespace: notes
spec:
  minAvailable: 1
  selector:
    matchLabels: { app: web }
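
For the second item, one possible starting point is a patch that adds a spread constraint to the web Deployment. This is a sketch rather than the only solution: the file name is hypothetical, and ScheduleAnyway makes spreading a preference instead of a hard rule (DoNotSchedule would enforce it).

```shell
# Write a patch that asks the scheduler to balance web pods across nodes.
# Assumption: the file name is arbitrary; tune maxSkew and
# whenUnsatisfiable to taste.
mkdir -p k8s-prod
cat > k8s-prod/web-spread-patch.yaml <<'EOF'
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                            # node pod counts differ by at most 1
          topologyKey: kubernetes.io/hostname   # treat each node as its own domain
          whenUnsatisfiable: ScheduleAnyway     # prefer spreading, never block
          labelSelector:
            matchLabels: { app: web }
EOF
```

Try it with kubectl patch deployment web -n notes --patch-file k8s-prod/web-spread-patch.yaml, or copy the topologySpreadConstraints block into web.yaml under spec.template.spec and re-apply.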

Recap & What's Next

You can now

Run a multi-node cluster, set resource requests/limits, configure all three probes, persist data with a StatefulSet + PVC, autoscale under load, and enforce least-privilege RBAC. This is the baseline every production workload needs.

Next up: A2 — Helm, where you'll package all these manifests into a single reusable chart with per-environment values, instead of juggling raw YAML files.
