Four Golden Signals of Monitoring

junior

beginner

Monitoring

Question

What are the four golden signals of monitoring and why are they important?

Answer

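The Four Golden Signals, described in Google's Site Reliability Engineering book, are latency (how long a request takes to be served), traffic (how much demand the service is handling, such as requests per second), errors (the rate of requests that fail), and saturation (how full the service is, measured against its most constrained resource such as CPU, memory, or I/O). Together they give a quick read on service health and help narrow down where a problem lies. They are a sensible minimum set of metrics for any user-facing service.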

Why This Matters

Each signal answers a different question: latency reflects what users experience, traffic reflects demand, errors reflect reliability, and saturation reflects how much capacity is left. No single signal is enough on its own. For example, low latency means little if most requests are failing quickly. Start with these four before adding more specific metrics.

Code Examples

Prometheus alerting rules
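
A minimal sketch of one alerting rule per golden signal. The metric names (`http_requests_total` with a `status` label, `http_request_duration_seconds_bucket`, and node_exporter's `node_cpu_seconds_total`), the alert names, and every threshold and `for` duration are assumptions chosen for illustration; substitute whatever your services actually export and tune the numbers against real traffic.

```yaml
groups:
  - name: golden-signals   # hypothetical group name
    rules:
      # Latency: p99 request duration stays above 500 ms (assumed threshold).
      - alert: HighLatencyP99
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
          ) > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "p99 latency above 500 ms for 10 minutes"

      # Traffic: request rate falls below 1 req/s (assumed floor).
      # Note: if the metric disappears entirely, this expression returns
      # no data and the alert does not fire.
      - alert: TrafficDrop
        expr: sum(rate(http_requests_total[5m])) < 1
        for: 10m
        labels:
          severity: ticket
        annotations:
          summary: "Request rate below 1 req/s for 10 minutes"

      # Errors: more than 5% of requests return a 5xx status (assumed threshold).
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "5xx error ratio above 5% for 5 minutes"

      # Saturation: per-instance CPU utilization above 90% (assumed threshold).
      - alert: HighCpuSaturation
        expr: |
          1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
        for: 15m
        labels:
          severity: ticket
        annotations:
          summary: "CPU utilization above 90% on {{ $labels.instance }}"
```

A rule file like this can be syntax-checked with `promtool check rules <file>` before it is loaded. CPU is used here only as one example of a constrained resource; memory, disk, or connection-pool usage may be the better saturation signal for a given service.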

Common Mistakes

Monitoring only one signal (e.g., just CPU usage)
Setting alert thresholds too tight (causing alert fatigue) or too loose (missing real issues)
Tracking only average latency instead of percentiles (p50, p95, p99), which hides slow tail requests
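
To illustrate the percentile point above, here is a hedged sketch of Prometheus recording rules that precompute p50, p95, and p99 latency from a histogram. The metric name `http_request_duration_seconds_bucket`, the `job` label, and the recorded series names are assumptions, not something this page specifies.

```yaml
groups:
  - name: latency-percentiles   # hypothetical group name
    rules:
      # Median latency: what a typical request experiences.
      - record: job:http_request_duration_seconds:p50
        expr: |
          histogram_quantile(0.50, sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))

      # p95: the slowest 5% of requests take at least this long.
      - record: job:http_request_duration_seconds:p95
        expr: |
          histogram_quantile(0.95, sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))

      # p99: tail latency, often where user-visible problems show up first.
      - record: job:http_request_duration_seconds:p99
        expr: |
          histogram_quantile(0.99, sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))
```

Comparing the three series side by side shows whether a slowdown affects every request or only the tail, which an average alone cannot reveal. Accuracy depends on how well the histogram's bucket boundaries match the latencies being measured.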

Follow-up Questions

Interviewers often ask these as follow-up questions

How do you distinguish between latency for successful vs failed requests?
What's the difference between metrics, logs, and traces?
How do you set appropriate alert thresholds?

Also worth your time on this topic

Interview

Monitoring and Alerting Strategy

How do you design a monitoring and alerting strategy? What metrics would you track and how do you avoid alert fatigue?

mid

Checklist

Monitoring & Observability Checklist

Comprehensive checklist for implementing monitoring, logging, tracing, and alerting across your infrastructure and applications.

60-90 minutes

Article

SLOs, SLIs, and Error Budgets: A Practical Implementation Guide

Your service went down at 2 AM and nobody could agree on whether it was "bad enough" to page someone. SLOs, SLIs, and error budgets fix that. Here is how to define, measure, and act on them with real Prometheus queries and alerting rules.
