Prometheus and Grafana: Observability Stack From Scratch

Amit Nepal
Security Engineer · Linux & Infrastructure · Offensive Security
Jun 25, 2025 · 1 min read
Infrastructure

Why Observability Matters for Security

I got into monitoring as a sysadmin. I stay in it as a security engineer because an observability stack doubles as an anomaly detection platform. CPU spikes, unusual network connections, and floods of failed authentication attempts all show up in Prometheus metrics, provided you've instrumented the right things.

Prometheus Configuration

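global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alert rules are evaluated

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - /etc/prometheus/rules/*.yml

scrape_configs:
  - job_name: 'node'
    static_configs:
      # This name must resolve from wherever Prometheus runs. With the
      # exporter on host networking, the host's address may be needed instead.
      - targets: ['node-exporter:9100']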

Node Exporter

docker run -d --name node-exporter \
  --net="host" --pid="host" \
  -v "/:/host:ro,rslave" \
  prom/node-exporter \
  --path.rootfs=/host
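
If the stack is managed with Docker Compose, the same container can be declared in a compose file. This is a sketch that assumes Compose is in use (the command above is the only form shown in this post), and the service name node-exporter is chosen here for illustration.

```yaml
# Hypothetical compose equivalent of the docker run command above.
services:
  node-exporter:
    image: prom/node-exporter
    network_mode: host      # same as --net="host"
    pid: host               # same as --pid="host"
    volumes:
      - /:/host:ro,rslave   # read-only view of the host filesystem
    command:
      - --path.rootfs=/host
```

Because the container shares the host's network namespace, the exporter listens on the host's own port 9100 rather than on a Docker network address.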

Security-Relevant Alert Rules

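groups:
- name: security
  rules:
  # node_netstat_Tcp_CurrEstab counts every established TCP connection,
  # inbound and outbound alike. Tune the 500 threshold against each
  # host's normal baseline.
  - alert: UnusualOutboundConnections
    expr: node_netstat_Tcp_CurrEstab > 500
    for: 5m  # the condition must hold for 5 minutes before the alert fires
    annotations:
      summary: "High TCP connections on {{ $labels.instance }}"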
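
A rule like this can be exercised before it reaches production by using promtool's rule unit tests. The sketch below assumes the rule group above is saved as security.yml next to the test file. That file name and the web-1:9100 instance are made up for illustration.

```yaml
# Hypothetical unit test for the UnusualOutboundConnections rule.
rule_files:
  - security.yml          # assumed file name for the rule group above

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # 600 established connections, held constant for 10 minutes
      - series: 'node_netstat_Tcp_CurrEstab{instance="web-1:9100"}'
        values: '600+0x10'
    alert_rule_test:
      # After 3 minutes the alert is still pending, so nothing fires.
      - eval_time: 3m
        alertname: UnusualOutboundConnections
        exp_alerts: []
      # After 6 minutes the 5m "for" window has elapsed and the alert fires.
      - eval_time: 6m
        alertname: UnusualOutboundConnections
        exp_alerts:
          - exp_labels:
              instance: web-1:9100
            exp_annotations:
              summary: "High TCP connections on web-1:9100"
```

Running promtool test rules against this file should report whether the rule behaves as expected, which also makes the effect of the for clause easy to see.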

Defensive Takeaways

  • Instrument application-level metrics, not just host metrics
  • Set up Alertmanager with PagerDuty/Slack routing before you need it
  • Retain Prometheus data for at least 90 days for incident retrospectives (the default retention is 15 days)
  • Secure Prometheus endpoints, because they expose sensitive system information
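
The retention and endpoint points can be sketched as a compose service. This assumes Prometheus runs from the prom/prometheus image under Docker Compose. The file paths, the volume name, and the web.yml file are illustrative, not taken from this post.

```yaml
# Hypothetical Prometheus service: 90-day retention, port kept off public interfaces.
services:
  prometheus:
    image: prom/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.retention.time=90d           # default is 15d
      - --web.config.file=/etc/prometheus/web.yml   # TLS / basic auth settings live here
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./web.yml:/etc/prometheus/web.yml:ro        # assumed to exist; contents not shown
      - prometheus-data:/prometheus
    ports:
      - "127.0.0.1:9090:9090"                       # reachable from the host only

volumes:
  prometheus-data:
```

Binding the published port to 127.0.0.1 still lets other containers on the same compose network, such as Grafana, reach Prometheus by service name. Ninety days of samples at a 15s scrape interval can take significant disk space, so size the volume accordingly.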