We analyzed 1,000 AWS cost anomaly alerts across our customers last quarter. 53% were from resources a developer spun up and forgot about. Here's the breakdown!
We run cloud cost management for mid-market AWS customers and pulled data from our anomaly detection across accounts last quarter.
The results were honestly embarrassing - and familiar:
- 53% of anomalies: forgotten dev/test resources (EC2s, EBS volumes, NAT gateways left running after a sprint ended)
- 21%: data transfer costs nobody budgeted for, usually cross-AZ or egress to the internet
- 14%: RDS instances over-provisioned for a traffic peak and never right-sized afterward
- 12%: everything else (Lambda timeouts, misconfigured S3 lifecycle rules, etc.)
The wild part? Most of these weren't caught natively by AWS Cost Anomaly Detection (CAD) - they were caught by threshold alerts we set manually.
AWS CAD is free and a good starting point, but it's optimized for sudden spikes. It's terrible at catching slow-burning waste, where costs creep up 5–10% a week instead of jumping.
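To make the spike vs. drift distinction concrete, here's a rough sketch of the kind of check that catches gradual creep. It's not our production logic and not how AWS CAD works internally. The function names, the 5% floor, the 3-week streak, and the 50% spike cutoff are all assumptions I picked for illustration, and the weekly totals are made-up numbers.

```python
from typing import List


def growth_rates(weekly_costs: List[float]) -> List[float]:
    """Week-over-week fractional change (0.07 means +7%).

    Weeks that follow a zero-cost week are reported as 0.0 to avoid
    dividing by zero.
    """
    return [
        (curr - prev) / prev if prev > 0 else 0.0
        for prev, curr in zip(weekly_costs, weekly_costs[1:])
    ]


def detect_spike(weekly_costs: List[float], threshold: float = 0.5) -> bool:
    """True if any single week jumped by at least `threshold` (assumed 50%)."""
    return any(rate >= threshold for rate in growth_rates(weekly_costs))


def detect_drift(
    weekly_costs: List[float], min_growth: float = 0.05, min_weeks: int = 3
) -> bool:
    """True if cost grew by at least `min_growth` for `min_weeks` weeks in a row."""
    streak = 0
    for rate in growth_rates(weekly_costs):
        if rate >= min_growth:
            streak += 1
            if streak >= min_weeks:
                return True
        else:
            streak = 0  # growth paused, so start counting again
    return False


# Hypothetical weekly totals in USD: roughly +7% each week, no single big jump.
creeping = [1000.0, 1070.0, 1145.0, 1225.0, 1310.0]
# Hypothetical one-off jump that then stays flat.
jumping = [1000.0, 1000.0, 1900.0, 1900.0]

print(detect_spike(creeping), detect_drift(creeping))
print(detect_spike(jumping), detect_drift(jumping))
```

In the creeping series no single week looks alarming, but the total is up 31% after four weeks. A spike-only check stays quiet on it, while the streak check flags it by week three.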
This is an open discussion. Is your biggest cost leak dev waste, over-provisioning, or something else entirely?