So much has been written about DBAs, the cloud, and the impact cloud-based resources and services have had on the role of the DBA. There's no doubt the impact has been substantial, and in many cases it's been a good thing. The mundane "noise" of managing databases has become a game of options.
You have the option of a fully managed process for backing up your databases, assuring their recoverability, and keeping up their performance. It's a great opportunity. All of this power and support brings with it the chance to do bigger things with the data. /End of Sermon
One of the biggest things we continually hear now when talking with people is that the bill for living in the cloud becomes, itself, a living, breathing thing. The bill gains wings. The charges continually change and grow, and they have to be actively managed.
It used to be that physical constraints drove the financial ones: procurement processes stepped in to put a fence around resources. Sure, disk got cheaper, and servers in general did too. BUT even though they were cheaper and made it simpler to add capacity, there was still an operational process for spending money. It didn't happen "organically."
With cloud-based resources, you're paying for usage, you have nearly unlimited capacity, and many times you may not even have control over the processes that demand more resources. If your company runs a successful sale, your systems scale up as they should; you may not even have been aware it was happening until the systems started responding to the load.
This is a great opportunity for data professionals. We hold the biggest sway over the fire-breathing piece of the cloud bill: the processing and storage associated with data. We've had great success working with companies to pull systems together, to remove redundant data sources and storage mechanisms, and to get a grip on storage usage and processing power requirements.
By standing back and taking a bigger-picture look at the processing, you can consolidate database instances, rein in backup storage requirements, even consolidate processing. It's also worth reviewing auto-scale operations. Now that many systems have been in place for quite some time, you have real metrics about what it actually took to handle those sales and crush times.
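For instance, here's a minimal sketch (Python with boto3, assuming an AWS/RDS setup; the instance name and the 60% threshold are hypothetical) of pulling real utilization history so you're tuning auto-scale settings against actual peaks rather than launch-day guesses:

```python
# Sketch: review actual CPU history before trusting old auto-scale thresholds.
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Pull 30 days of hourly CPU utilization for one database instance.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "orders-db"}],  # hypothetical name
    StartTime=datetime.utcnow() - timedelta(days=30),
    EndTime=datetime.utcnow(),
    Period=3600,
    Statistics=["Average", "Maximum"],
)

# How often did the instance actually cross the current scale-out threshold?
datapoints = sorted(stats["Datapoints"], key=lambda d: d["Timestamp"])
over_threshold = [d for d in datapoints if d["Maximum"] > 60]
print(f"{len(over_threshold)} of {len(datapoints)} hours peaked above 60% CPU")
```

If that count is a handful of hours out of a month, the current trigger is probably scaling you up (and billing you) for noise.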
You may find now that you don't need to scale on a 50-60% utilization spike. Perhaps the threshold can be higher and has to be sustained for some period before you act. Fire an alert as usage passes through unusual levels, so you're aware that scaling may be coming, but don't actually move to new resources until you know you'll need them.
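A minimal sketch of that alert-first, scale-later pattern (again Python with boto3 against CloudWatch; the alarm names, topic ARNs, and thresholds are all hypothetical):

```python
# Sketch: notify a human on a brief spike; only treat SUSTAINED load as a scale signal.
import boto3

cloudwatch = boto3.client("cloudwatch")

# Early warning: a single 5-minute period above 60% just sends a heads-up.
cloudwatch.put_metric_alarm(
    AlarmName="orders-db-cpu-heads-up",
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "orders-db"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=60.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:dba-alerts"],  # hypothetical ARN
)

# Scale signal: fires only after 80%+ is sustained for six consecutive
# periods (30 minutes), so a brief spike never provisions new resources.
cloudwatch.put_metric_alarm(
    AlarmName="orders-db-cpu-sustained",
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "orders-db"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=6,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:scale-up-runbook"],  # hypothetical ARN
)
```

The `EvaluationPeriods` setting is what separates "be aware" from "spend money."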
As a data professional, you'll know the patterns. You'll know the storage requirements. You also need to make it a point to know about the processing resources. We've been bitten both internally and at accounts we work with by automated processes that spun up, did a bit of work, then spun down.
“Look! See?! We only bring them up when they have to do their thing! Isn’t that cool?”
Yeah, except you're spinning up three XL-class instances in anticipation of lots of data, and no one has revisited whether there's a better way: a way to skip those spot instances, a way to process with existing resources. Perhaps there's even an opportunity to do some real-time processing rather than batches.
These jobs were super difficult to trace: where was the initiation coming from, why did a seemingly simple request need this kind of horsepower, and why did people running reporting-style processes have this level of access to the cloud provider's resources?
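One way to start answering the "where is this coming from" question, sketched here in Python with boto3, assuming AWS with CloudTrail enabled (the field names follow CloudTrail's event schema; nothing here is specific to our accounts):

```python
# Sketch: list recent instance-launch calls and who issued them.
import json
from datetime import datetime, timedelta

import boto3

cloudtrail = boto3.client("cloudtrail")

events = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "RunInstances"}],
    StartTime=datetime.utcnow() - timedelta(days=7),
    EndTime=datetime.utcnow(),
)

# Each record carries the caller identity and source address, which is
# usually enough to find the scheduled job (or person) behind the launch.
for event in events["Events"]:
    detail = json.loads(event["CloudTrailEvent"])
    identity = detail.get("userIdentity", {})
    print(
        event["EventTime"],
        identity.get("arn", "unknown-caller"),
        detail.get("sourceIPAddress", "?"),
    )
```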
Lots of good questions, but they all come back to data: where it's coming from, what it's used for, how it's processed, how it's stored. Data professionals are the ones who know and understand that broader map, and they can save big headaches and help manage that bill. Huge data-architectural opportunities abound (and don't even get me started on all of the learning to be done from the data in our systems); both pay huge dividends for the company.