Prometheus


Intermediate ⭐ 80 XP ⏱ 18 min #observability #prometheus #metrics

Collect and query time-series metrics with Prometheus and PromQL.

📖Theory

Prometheus is a time-series database and monitoring system. Its defining design choice is the pull model: at a configured interval, Prometheus scrapes an HTTP endpoint (by default /metrics) on each target. Applications expose metrics through client libraries; exporters (e.g. node_exporter for host metrics) do the same on behalf of systems that cannot expose Prometheus metrics themselves.
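To make the pull model concrete, here is a minimal sketch of what a target does when it is scraped, using only the Python standard library. The metric store, the values in it, and the port are hypothetical; a real application would normally use an official client library rather than rendering the text format by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process metric store: name -> (type, {labels: value}).
METRICS = {
    "http_requests_total": ("counter", {
        (("code", "200"), ("method", "GET")): 1027,
        (("code", "500"), ("method", "POST")): 3,
    }),
    "queue_length": ("gauge", {(): 12}),
}


def render_metrics(metrics):
    """Render the store in the Prometheus text exposition format."""
    lines = []
    for name, (mtype, samples) in metrics.items():
        lines.append(f"# TYPE {name} {mtype}")
        for labels, value in samples.items():
            label_str = ",".join(f'{k}="{v}"' for k, v in labels)
            series = f"{name}{{{label_str}}}" if labels else name
            lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus issues a plain HTTP GET against the metrics path.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on http://localhost:<port>/metrics."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and opening /metrics in a browser shows the same plain-text page that Prometheus would fetch on every scrape; the target never pushes anything.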

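Each time series is identified by a metric name and a set of labels (dimensions such as method="GET"); every sample is a numeric value with a timestamp. There are four metric types: counter (a cumulative value that only increases, resetting to zero on restart), gauge (a value that can go up or down), histogram (observations counted into configurable buckets, plus a sum and a count), and summary (quantiles calculated on the client, plus a sum and a count). You query with PromQL; alerting rules are evaluated by Prometheus, and the resulting alerts are routed by Alertmanager.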
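The difference between the types is easiest to see in code. The classes below are an illustrative sketch of the semantics only, not the API of any real client library; summary is omitted for brevity, and all names and values are made up for the example.

```python
class Counter:
    """Cumulative total: can only be incremented."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """Point-in-time reading: may be set, raised, or lowered freely."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations into cumulative buckets keyed by upper bound."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.bounds)
        self.sum = 0.0

    def observe(self, value):
        self.sum += value
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                # Cumulative: increment every bucket the value fits under.
                self.bucket_counts[i] += 1


requests_total = Counter()   # e.g. total requests served
requests_total.inc()
requests_total.inc()

queue_length = Gauge()       # e.g. jobs currently waiting
queue_length.set(5)
queue_length.dec(2)

latency = Histogram([0.1, 0.5, 1.0])  # upper bounds in seconds
for seconds in (0.05, 0.3, 0.7, 2.0):
    latency.observe(seconds)
```

Because the buckets are cumulative, the last (+Inf) bucket always equals the total number of observations.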

🌍Real-World Example
# Requests per second over the last 5 minutes
rate(http_requests_total[5m])

# 95th percentile latency per series (wrap rate() in sum by (le) (...) to aggregate across instances)
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Error ratio
sum(rate(http_requests_total{code=~"5.."}[5m]))
  / sum(rate(http_requests_total[5m]))

# Scrape config (prometheus.yml)
scrape_configs:
  - job_name: api
    static_configs:
      - targets: ['api:9090']
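The 95th-percentile query above works on cumulative buckets, not raw observations, so the result is an estimate. The sketch below shows the idea behind that estimate: find the bucket containing the target rank and interpolate linearly inside it. It is a simplified illustration, not Prometheus's implementation; it assumes q is in (0, 1], non-negative bucket bounds, and a hypothetical set of bucket counts.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket counts every observation
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Beyond the last finite bucket: report that bucket's bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical latency buckets: (upper bound in seconds, cumulative count).
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, example_buckets)
```

In PromQL the bucket values are per-second rates rather than raw counts, but only their proportions matter, so the interpolation is the same. The accuracy of the estimate depends on how well the bucket bounds match the real latency distribution.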
✍️Hands-On Exercise
  1. Explain the pull model and what a /metrics endpoint is.
  2. Distinguish a counter from a gauge with an example of each.
  3. Write a PromQL query for the per-second request rate.
  4. What does an exporter do?
🧾Cheat Sheet
| Concept | Detail |
| --- | --- |
| Model | Pull (scrape /metrics) |
| Counter | Monotonically increasing |
| Gauge | Goes up and down |
| Histogram | Bucketed distributions |
| Labels | Metric dimensions |
| PromQL | Query language |
| Exporter | Exposes metrics for a system |
| Alertmanager | Routes alerts |
💬Common Interview Questions
How does Prometheus collect metrics?

With a pull model: it scrapes HTTP /metrics endpoints on targets at a set interval. Apps expose metrics via client libraries; exporters expose them on behalf of systems that can't do so natively.

Why query rate() on a counter?

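Counters only increase, so the raw value mostly reflects how long the process has been running and isn't meaningful on its own. rate() gives the average per-second increase over a window and corrects for counter resets, which is what you actually want to graph or alert on.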
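A rough sketch of that calculation, including the reset handling, might look like the following. It is deliberately simplified: it divides by the time between the first and last sample, whereas Prometheus's rate() also extrapolates towards the edges of the window. The function name and sample values are hypothetical.

```python
def simple_rate(samples):
    """Average per-second increase over (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        return None  # not enough data to compute a rate
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr < prev:
            # Counter reset (e.g. process restart): counting began again
            # from zero, so the whole current value is new increase.
            increase += curr
        else:
            increase += curr - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Scrapes every 15 s; the process restarts between t=30 and t=45.
scrapes = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
per_second = simple_rate(scrapes)  # 100 requests over 60 s
```

Without the reset correction, the drop from 160 to 10 would look like a large negative rate, which is why graphing or alerting on the raw counter is unreliable.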

What's the difference between a counter and a gauge?

A counter only goes up (e.g. total requests); a gauge can rise and fall (e.g. memory in use, queue length).
