What actually gets lost when on-call rotates isn’t in the runbooks
When you're handing off on-call, the runbook covers the basics: where the dashboards are, who to page if something’s really bad, all that standard stuff.
But there's this other layer, the things you actually say when passing it on. Like:
- "Hey, if Alert X fires, everyone thinks ‘check Y,’ but don’t bother with Y at all. Just go straight to Z."
- "If logging for Service A stops suddenly, it's usually not a logging problem at all. Something upstream probably died quietly."
That kind of info gets shared in conversations during handoffs but not documented anywhere official.
You say it once while sitting across from someone and hope they remember later, because if they roll off rotation soon after, that knowledge just evaporates.