
Monitoring & Logging

See what your app is actually doing — before your users tell you it's broken.

Module 11 · Prometheus + Grafana on the notes app. Free · Local.

Beginner+ · Observability · ~55 min

What You'll Learn

By the end of this module you will be able to:

  • Explain why monitoring is non-negotiable, and what "observability" means
  • Know the three pillars — metrics, logs, traces — and the four golden signals
  • Instrument the notes app to expose a /metrics endpoint
  • Run Prometheus to scrape those metrics and query them with PromQL
  • Build a live Grafana dashboard from your metrics
  • Read logs and understand where alerting fits

Prerequisites: Module 7 (Compose) — we extend the same stack.

Why Monitoring Exists

Your app is deployed and running. But is it healthy? How many requests is it serving? How fast? Are any failing? Without monitoring, you're flying blind — and the first you'll hear of an outage is an angry user.

You can't fix what you can't see

"It's slow" — how slow? Since when? For everyone or just some? Without data you're guessing. Monitoring turns guesses into facts and surfaces problems before they become outages.

Observability is the goal: being able to understand what's happening inside your system from the outside. It rests on three pillars:

Pillar | What it answers | Tool here
Metrics | "How much / how fast / how many?" (numbers over time) | Prometheus + Grafana
Logs | "What exactly happened at 14:32?" (event records) | docker logs / Loki
Traces | "Where did this request spend its time?" (request paths) | Jaeger / Tempo (advanced)

The four golden signals

A simple framework for what to watch on any service: Latency (how slow), Traffic (how much demand), Errors (how many fail), and Saturation (how full your resources are). Master these four and you cover most real incidents.
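
To make the four signals concrete, here is a minimal Python sketch that summarises one window of traffic. The Request record, the worker counts, and the sample numbers are all hypothetical; a real setup would read these values from Prometheus rather than compute them by hand.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One handled request (hypothetical record shape, for illustration)."""
    duration_s: float  # how long the request took, in seconds
    status: int        # HTTP status code returned


def golden_signals(requests, window_s, busy_workers, total_workers):
    """Summarise one observation window as the four golden signals."""
    count = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    total_time = sum(r.duration_s for r in requests)
    return {
        "latency_avg_s": total_time / count if count else 0.0,  # latency
        "traffic_rps": count / window_s,                        # traffic
        "error_ratio": errors / count if count else 0.0,        # errors
        "saturation": busy_workers / total_workers,             # saturation
    }


# Made-up sample: four requests seen in a 60-second window.
sample = [Request(0.10, 200), Request(0.30, 200), Request(0.20, 500), Request(0.20, 200)]
signals = golden_signals(sample, window_s=60, busy_workers=3, total_workers=4)
print(signals)
```

Here one request in four failed, so the error ratio is 0.25, and three of four workers are busy, so saturation is 0.75.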

How Prometheus & Grafana Fit Together

Piece | Its job
Instrumentation | Code in your app that exposes metrics at a /metrics URL.
Prometheus | A time-series database that pulls (scrapes) /metrics on a schedule and stores the numbers.
PromQL | Prometheus's query language for slicing those metrics (rates, averages, totals).
Grafana | Dashboards: turns Prometheus queries into graphs anyone can read.
Alertmanager | Sends notifications (email, Slack) when a Prometheus alert rule fires, for example when a metric crosses a threshold.

The flow in one line

Your app exposes /metrics → Prometheus scrapes & stores it → Grafana queries Prometheus & draws graphs → Alertmanager pages you when something's wrong.

Pull, not push

Unlike push-based monitoring tools, where the app sends its numbers to a collector, Prometheus pulls metrics from your app on a timer. Your app's only job is to expose /metrics; it doesn't need to know Prometheus exists. That simplicity is a big part of why Prometheus became so widely adopted.
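
The pull model can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not how Prometheus is implemented: the App class, the demo_requests_total metric name, and the in-memory storage dict are all invented for illustration, and a function call stands in for the HTTP request to /metrics.

```python
class App:
    """Stands in for the notes app: counts requests, knows nothing about scrapers."""

    def __init__(self):
        self.requests_served = 0

    def handle_request(self):
        self.requests_served += 1

    def metrics_text(self):
        # Shape of the Prometheus text format: a TYPE hint, then "name value".
        return (
            "# TYPE demo_requests_total counter\n"
            f"demo_requests_total {self.requests_served}\n"
        )


def parse_metrics(text):
    """Turn 'name value' lines into a dict, skipping '#' comment lines."""
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples


def scrape(fetch, timestamp, storage):
    """One scrape: the collector pulls, parses, and stores timestamped samples."""
    for name, value in parse_metrics(fetch()).items():
        storage.setdefault(name, []).append((timestamp, value))


app = App()
storage = {}
for tick in range(3):        # three scrape intervals, 5 seconds apart
    for _ in range(10):      # traffic arriving between scrapes
        app.handle_request()
    scrape(app.metrics_text, timestamp=tick * 5, storage=storage)

print(storage)
```

The app never calls the collector; all the scheduling and storage lives on the scraping side, which is the point of the pull design.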

Hands-on Lab: Monitor the Notes App

We'll instrument the notes app, then extend the Compose stack from Module 7 with Prometheus and Grafana. Work in your notes-app folder.

Step 1. Instrument the app: expose /metrics

Add two lines to app.py. The prometheus-flask-exporter library automatically tracks request count, latency, and errors, and serves them at /metrics.

from prometheus_flask_exporter import PrometheusMetrics  # new: import the exporter

app = Flask(__name__)             # already in app.py
metrics = PrometheusMetrics(app)  # new: adds /metrics automatically

Add the library to requirements.txt:

flask==3.0.3
psycopg2-binary==2.9.9
prometheus-flask-exporter==0.23.1

Step 2. Tell Prometheus what to scrape: prometheus.yml

This config says "every 5 seconds, scrape the web service on port 5000." The target web:5000 works because Compose's network resolves service names (Module 7).

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: notes-app
    static_configs:
      - targets: ["web:5000"]

Step 3. Extend the stack: add to docker-compose.yml

Add Prometheus and Grafana as two new services under services:, alongside web and db from Module 7.

  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus

Step 4. Launch the whole stack

docker compose up --build

Four containers now run together: the app, its database, Prometheus, and Grafana. First, confirm your app is emitting metrics — open http://localhost:8080/metrics and you'll see raw counters like flask_http_request_total.

Step 5. Generate some traffic

Metrics are boring with no activity. Hit the app a few dozen times (refresh the browser, or run this):

for i in $(seq 1 50); do curl -s http://localhost:8080/ > /dev/null; done

Step 6. Query in Prometheus

Open http://localhost:9090. Under Status → Targets confirm notes-app is UP. Then in the query box, try:

# total requests served
flask_http_request_total

# requests per second over the last minute (traffic)
rate(flask_http_request_total[1m])

The "aha!" moment

You're seeing live, queryable data about your running app — request rates, response times, error counts — all without changing your app logic. That's instrumentation paying off.

Step 7. Connect Grafana to Prometheus

Open http://localhost:3000 and log in with admin / admin (it'll ask you to set a new password). Then:

  • Go to Connections → Data sources → Add data source → Prometheus
  • Set the URL to http://prometheus:9090 (the service name — same network!)
  • Click Save & test — you should see "Successfully queried".

Step 8. Build your first dashboard panel

Go to Dashboards → New → New dashboard → Add visualization, pick your Prometheus data source, and enter this query:

rate(flask_http_request_total[1m])

Generate more traffic (Step 5) and watch the graph climb in real time. Add a second panel for latency:

rate(flask_http_request_duration_seconds_sum[1m]) / rate(flask_http_request_duration_seconds_count[1m])
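
The latency query above divides "seconds spent" by "requests served" over the same window, which gives the mean duration per request. A minimal Python sketch of that arithmetic, using made-up scrape values (the function name is hypothetical):

```python
def average_latency(sum_then, count_then, sum_now, count_now):
    """Mean request duration between two scrapes of a histogram's _sum and _count."""
    new_requests = count_now - count_then
    if new_requests == 0:
        return None  # no traffic in the window, so the mean is undefined
    return (sum_now - sum_then) / new_requests


# A minute ago: 100 requests took 12.0 s in total. Now: 130 requests, 15.0 s.
avg = average_latency(sum_then=12.0, count_then=100, sum_now=15.0, count_now=130)
print(avg)  # 30 new requests took 3.0 s in total
```

Both rates cover the same one-minute window, so the window length cancels out and only the two differences matter.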

You built an observability stack

App → metrics → Prometheus → Grafana, all wired together with Compose. This is genuinely how production systems are monitored.

Shortcut: import a pre-built dashboard

Instead of building panels by hand, go to Dashboards → New → Import and paste a dashboard ID from grafana.com/dashboards. The community has thousands ready to go.

Step 9. Don't forget logs

Metrics tell you something's wrong; logs tell you what. You already have them:

docker compose logs -f web # follow the app's logs

Scaling logs up

For many containers, you graduate from docker logs to a log aggregator — Loki (pairs with Grafana) or the ELK stack (Elasticsearch + Logstash + Kibana) — so all logs are searchable in one place.
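
Aggregators are easier to search when each log line is structured. As one possible approach (an assumption, not something the notes app already does), this sketch uses Python's standard logging module to emit one JSON object per line; the logger name "notes" and the field names are hypothetical.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


handler = logging.StreamHandler()     # writes to stderr, which container logs capture
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("notes")   # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("note created id=%s", 42)
```

Each line can then be filtered by field (for example, level or logger) instead of by free-text search.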

The Metric Types You'll Meet

Type | What it is | Example
Counter | Only goes up; resets to zero on restart. Use rate() to get a per-second value. | total requests served
Gauge | Goes up and down; a current value. | memory in use, active connections
Histogram | Buckets observations to compute averages & percentiles. | request duration
Summary | Like a histogram, but quantiles are computed client-side, in the app. | response size
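
To see how bucketed observations can yield a percentile, here is a simplified Python sketch of the idea behind histogram_quantile: find the bucket that contains the target rank, then interpolate linearly inside it. The bucket bounds and counts are made up, and this is an illustration of the technique rather than Prometheus's exact implementation.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, count_below = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # past the last finite bucket: clamp to it
            in_bucket = cumulative - count_below
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - count_below) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, cumulative
    return math.nan


# Cumulative counts: 50 requests took <= 0.1 s, 90 took <= 0.25 s, and so on.
buckets = [(0.1, 50), (0.25, 90), (0.5, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)
print(p95)
```

The 95th request falls in the 0.25 to 0.5 second bucket, so the estimate lands inside that range. The result is only as precise as the bucket boundaries allow.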

Counters: always wrap in rate()

A raw counter just climbs forever — not useful on a graph. rate(metric[1m]) turns it into "per second over the last minute," which is what you actually want to see.
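
A rough Python sketch of what rate() does with a counter's samples, including the restart case. It is deliberately simplified: Prometheus also extrapolates to the edges of the window, which this version skips. The function name and sample values are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase across (timestamp_s, value) samples of one counter."""
    if len(samples) < 2:
        return None  # need at least two samples to measure a change
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # A drop means the process restarted and the counter began again
            # at 0, so everything counted since the restart is new.
            increase += cur
    return increase / elapsed


# Scrapes every 15 s; the app restarts between t=30 and t=45.
samples = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
print(simple_rate(samples))  # 100 requests over 60 s
```

Because the reset is detected and compensated for, a restart does not show up as a huge negative spike on the graph.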

Cheat Sheet

Ports, queries, and commands you'll reach for. Bookmark this.

Default ports

Service | URL | Login
App metrics | localhost:8080/metrics | none
Prometheus | localhost:9090 | none
Grafana | localhost:3000 | admin / admin

PromQL starters

Query | Shows
up | Which targets are reachable (1 = up)
flask_http_request_total | Total requests (a counter)
rate(flask_http_request_total[1m]) | Requests per second (traffic)
sum by (status) (rate(...[1m])) | Rate split by HTTP status (find errors)
histogram_quantile(0.95, ...) | 95th-percentile latency

Troubleshooting

Symptom | Likely cause & fix
Prometheus target is DOWN | Wrong target. Use the service name + container port: web:5000, not localhost:8080.
/metrics is 404 | PrometheusMetrics(app) not added, or library missing from requirements.txt.
Grafana "data source not working" | URL must be http://prometheus:9090 (service name), not localhost.
Empty graphs | No traffic yet: generate some (Step 5). Counters also need rate().
Prometheus won't start | Check prometheus.yml indentation (2 spaces, no tabs) and the volume path.
Port 3000/9090 in use | Change the host port, e.g. "3001:3000".

Your Challenge

Deepen it before the Capstone:

  • Add a Grafana panel for error rate — requests with status 500 per second.
  • Add a 95th-percentile latency panel with histogram_quantile.
  • Import a community Flask/Prometheus dashboard by its ID.
  • Bonus: add a Prometheus alert rule that fires when no requests arrive for 1 minute.

# requests per second that returned a 500 error
rate(flask_http_request_total{status="500"}[1m])

# error ratio (fraction of all requests that failed)
sum(rate(flask_http_request_total{status="500"}[1m]))
  / sum(rate(flask_http_request_total[1m]))
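
The error-ratio query wraps both sides in sum() because the rate comes back as one series per label combination. A small Python sketch of the same arithmetic, with made-up per-status rates and a hypothetical function name:

```python
def error_ratio(rates_by_status, error_status="500"):
    """Share of all requests per second that returned the given status."""
    total = sum(rates_by_status.values())  # like sum(rate(...)) over every series
    if total == 0:
        return None                        # no traffic, so the ratio is undefined
    return rates_by_status.get(error_status, 0.0) / total


# Per-second request rates, one entry per status label (made-up numbers).
rates_by_status = {"200": 4.5, "404": 0.3, "500": 0.2}
ratio = error_ratio(rates_by_status)
print(ratio)  # 0.2 failing requests per second out of 5.0 in total
```

Without the sums, the division would try to match individual series against each other instead of comparing the failing traffic with the total.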

Recap & What's Next

You can now

Instrument an app, scrape it with Prometheus, query metrics with PromQL, visualize them in Grafana, and find logs — the full observability loop. You can finally see what your deployments are doing.

Next up: Module 12 — the Capstone, where everything from Modules 6–11 comes together: code → container → cluster → CI/CD → monitored, end to end.
