
Is an SLA Passé? (Spoiler: Yes.)

Whether SLAs are helpful, at the server-hosting level anyway, is, I believe, a settled question. They’re not helpful, they’re not useful, and they simply give you false hopes about your infrastructure, when you could instead be putting in place more robust options that take care of failures by themselves.

Here’s that article. And then I noticed an ironic related story – that cloud providers must guarantee zero downtime.

I have to say, I agree with the premise that the SLA is the wrong way to get it done. You may get the promise, but what the provider is guaranteeing is UPTIME, not usually application stability and availability. Those are very different beasts. Think about what you’re really after – you want your applications and solutions to be available and online, ready to do business.

Your real SLA is properly set-up infrastructure, as the article touches on: redundancy, failover, recoverability, whatever keywords you want to associate with it. The fact is that availability at the hardware and OS level is more easily managed than availability of the application, particularly if something should go awry.

Take a step back and look at the layers that make up availability for your applications: the hardware bits you’re using, the OS, the systems applications (database, utilities, etc.), hosting and so on. These are things that are likely "on you." It’s your responsibility to make sure they’re running as needed.
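
To put rough numbers on those layers, here is a minimal sketch in Python. It assumes the layers are serial (the application is only available when every layer is) and that they fail independently, which real systems rarely do. The layer names and percentages are made-up figures for illustration, not measurements from any provider.

```python
from math import prod


def composite_availability(layers):
    """Availability of a serial stack: every layer must be up at the same time."""
    return prod(layers.values())


def downtime_hours_per_year(availability, hours_per_year=24 * 365):
    """Expected hours of unavailability in a non-leap year."""
    return (1 - availability) * hours_per_year


# Hypothetical per-layer availability figures, for illustration only.
layers = {
    "hosting": 0.999,      # the part a hosting SLA would cover
    "hardware_os": 0.999,
    "database": 0.995,
    "application": 0.99,
}

total = composite_availability(layers)
print(f"hosting alone: {downtime_hours_per_year(layers['hosting']):.1f} hours down per year")
print(f"whole stack:   {downtime_hours_per_year(total):.1f} hours down per year")
```

With these made-up numbers, a stack whose hosting layer delivers 99.9 percent comes out at roughly 98.3 percent end to end. The hosting promise alone says little about what your users actually see.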

I’d suggest setting aside the time and energy you’d spend on a hosting SLA and instead picking a solid provider, learning their infrastructure options and figuring out how to apply them to your needs. Not all infrastructure guidance is created equal, so make sure you know what happens when… then have your known scenarios ready to address.

For us data types, this means databases, accessibility, recovery and security items. We have to figure out what happens when the database goes away or becomes unstable for whatever reason. Do you have failover options in place? Backups? What’s the recovery path like? Does it work? Those are some powerfully important areas to worry about that will have bigger payoffs than an SLA that has gummi bears for teeth.
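
The "Does it work?" question is the one most worth automating. Here is a minimal sketch of a restore drill in Python. SQLite from the standard library stands in for whatever database you actually run, and the function name `run_restore_drill` and the `orders` table are hypothetical. The idea is simply: take a backup, open the copy as if it were the recovered database, and check that it is intact and complete.

```python
import os
import sqlite3
import tempfile


def run_restore_drill(live, table):
    """Back up `live`, reopen the copy, and check that it is usable.

    `table` must be a trusted name; it is interpolated into SQL directly.
    """
    expected = live.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "drill.db")

        # Step 1: take the backup using SQLite's online backup API.
        copy = sqlite3.connect(path)
        try:
            live.backup(copy)
        finally:
            copy.close()

        # Step 2: open the backup file fresh, as a recovery would.
        restored = sqlite3.connect(path)
        try:
            status = restored.execute("PRAGMA integrity_check").fetchone()[0]
            actual = restored.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        finally:
            restored.close()

    # The drill passes only if the copy is intact and no rows went missing.
    return status == "ok" and actual == expected


# Stand-in "production" database with a couple of rows.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
live.executemany("INSERT INTO orders (total) VALUES (?)", [(9.5,), (20.0,)])
live.commit()

drill_passed = run_restore_drill(live, "orders")
print("restore drill passed:", drill_passed)
```

A row count is a crude completeness check. For a real system you would swap in your own backup tooling and a check that matters to your business, but the shape of the drill stays the same: back up, restore somewhere else, verify.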

How do you approach uptime for your end-users? Do you make internal SLAs? What works for you?