
Backup and Restore… No, Really.

I cannot tell you how many webcasts, articles, sessions and such have been done about backup and restore.  But here we are again.  Saying similar things, with different icing on the cake I suppose.

Essentially – make sure they work, make sure you know how to use them, and test your “assuredness.”

This applies whether you’re on-premises, in the cloud, both, or on some other setup that “just does it for you.”

I’ve been talking with a couple of folks lately who are in need of recovery – one in a development environment and one in a partial production environment – and both made some huge assumptions about what was happening “automagically” behind the scenes.  They figured that since they were with a provider that did automatic backups, the systems and solutions to USE those backups would be simple enough to figure out.

Now they’re figuring it out in real-time.  As in the world waits while you determine how best to restore.  Just not cool.  Yes, the backups are there, but figuring out the restore process, figuring out how to return to a point in time, and then getting that data into your production system – all of those bits and pieces are fairly important to understand.

It’s logical that you’ll have to manage it yourself for your on-premises solutions.  I think a lot of people get that.  Still – test your backups.  Check that the backups are working and that you know where the files are, what you have stored, how long a window you have available and what the process is for getting those files accessible to a server you’re recovering.  Of course, being familiar with the restore process is critical too – what happens after a restore?  What if you only need a single table?  What if it has dependencies?  All of these come into play, and all are extremely important to know and have actual experience with BEFORE you need to do the deed.
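As one example of a check you could put in place for on-premises backup files, here is a minimal sketch in Python.  It only confirms that the newest file in a backup folder is recent enough; it says nothing about whether that file can actually be restored.  The folder path, the `*.bak` pattern and the 24-hour window are all assumptions for illustration, so swap in whatever matches your own setup.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def newest_backup_age(backup_dir, pattern="*.bak", now=None):
    """Return the age of the most recently modified backup file, or None if none exist."""
    now = now or datetime.now(timezone.utc)
    files = [p for p in Path(backup_dir).glob(pattern) if p.is_file()]
    if not files:
        return None
    newest = max(files, key=lambda p: p.stat().st_mtime)
    modified = datetime.fromtimestamp(newest.stat().st_mtime, tz=timezone.utc)
    return now - modified


def backups_are_fresh(backup_dir, max_age=timedelta(hours=24), pattern="*.bak", now=None):
    """True only if at least one backup file exists and the newest is within max_age."""
    age = newest_backup_age(backup_dir, pattern, now)
    return age is not None and age <= max_age


if __name__ == "__main__":
    # Hypothetical folder name; point this at wherever your backups really land.
    if not backups_are_fresh("backups"):
        print("WARNING: no recent backup file found")
```

Run something like this on a schedule and have it alert someone, rather than relying on a person remembering to look.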

The same is true, though, for your cloud-based installations.  Do a restore.  Find out how it works.  Don’t assume your backup window is what you need (though it may well be) – verify.  Taking the time now, instead of when things are going critical, will mean you have a plan, know what pieces go where, and know how the restore process works at the provider.  Some restore ONLY to a new database or instance.  Some support different options for the restore in terms of point-in-time recovery.  Find out what’s happening – but more importantly, DO IT.  Actually sit down and do the thing.  Find out how long it takes.
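To make “find out how long it takes” concrete, here is a small sketch of timing a restore drill in Python.  The `restore` argument is a stand-in for whatever actually performs the restore in your environment (a script, a provider CLI call, and so on), and `target_seconds` is a hypothetical recovery-time goal; neither comes from any particular provider.

```python
import time
from dataclasses import dataclass


@dataclass
class DrillResult:
    seconds: float        # how long the restore step took
    within_target: bool   # whether it finished inside the time you can tolerate


def run_restore_drill(restore, target_seconds):
    """Time a restore callable and compare the duration against a target."""
    start = time.monotonic()
    restore()  # assumption: raises an exception if the restore fails
    elapsed = time.monotonic() - start
    return DrillResult(seconds=elapsed, within_target=elapsed <= target_seconds)


if __name__ == "__main__":
    # Placeholder restore step; replace with your real restore procedure.
    result = run_restore_drill(lambda: time.sleep(0.1), target_seconds=60)
    print(f"Restore took {result.seconds:.1f}s, within target: {result.within_target}")
```

Keeping the numbers from each drill gives you something to compare against the next time you run it.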

The last suggestion is to know what success looks like.  You need to know how to test the process and make sure all things are running as they should.  It might be calling a specific user or supervisor to test, or it might be something you can run or check to make sure all the pieces are happy again.  Whatever it is, you need a clear point at which you can sit back in your chair, grin, and know that you did it, recovered things and life is good again.
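If “something you can run” appeals to you, one way to sketch it is a named list of checks that all have to pass before you call the recovery done.  This is only an illustration in Python; the check names and the lambdas below are made up, and your real checks would query your own systems.

```python
def run_checks(checks):
    """Run each named check and return the names of the ones that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failures = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


if __name__ == "__main__":
    # Hypothetical checks; replace the lambdas with real queries or calls.
    checks = {
        "orders table has rows": lambda: True,
        "latest record is after the restore point": lambda: True,
        "application login works": lambda: True,
    }
    failed = run_checks(checks)
    print("Recovered, life is good" if not failed else f"Still broken: {failed}")
```

The value is less in the code than in having written the list down before you need it.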

Just don’t assume.  Don’t assume your backups are happening.  Put checks in place.  Don’t assume you can figure out the provider’s processes, no matter how good they are about building interfaces and making great tools.  Test it.  Do it.  Know it.  Have a process for revisiting it – remember that things change very, very quickly with all of the providers – menu options, processes, etc.  Have a means of revisiting the process every 6 months or so, so you know what’s what.

It can save all sorts of headaches and stress – things that will already be in abundant supply if the stuff hits the fan and you are the one to pick up the pieces and put them together again.