We rebuilt infrastructure from backups as a DR test. The restore worked. The environment didn’t.
Recently we rebuilt infrastructure from backups while setting up a new environment.
Part of the goal was to see how recovery would actually go in a real disaster and what hidden problems would show up along the way.
Luckily this wasn’t a production outage, so nobody was panicking and we could take our time digging through issues properly.
We thought it would take maybe a couple of days. It ended up taking weeks...
Every few hours we discovered something new: forgotten settings, incompatible software versions, undocumented dependencies, unexplained errors, or some component nobody had touched in years.
The good part is that the next test restore was dramatically faster because we already understood most of the weak spots and had documentation for the recovery process.