We’ve recently moved a heavily used SQL Server instance from a VM-based installation to an RDS (Amazon, SaaS) solution.
The server instance was not dedicated (don’t get me started) and, under load, would feel the pain of that type of installation. We would see loading that would approach 90% with frequent spikes to 100% depending on what was happening, jobs, etc. We knew we had an issue and we have the luxury of it being our own solution, so wanted to experiment a bit.
We decided to move to an RDS instance with essentially the same memory resources, same processing resources. We also wanted to upgrade to the latest version of SQL Server coming off a rather old version. We expected some good benefits from this update but were most anticipating seeing what happens simply moving to a dedicated SQL Server instance.
Now, of course, I understand this isn’t quite apples :: apples in this comparison – we’re changing instance versions, etc. But it was as close as we were going to get in this experiment with this opportunity to monitor loading.
The prior instance broke all of the standard rules for setting up a SQL Server – it wasn’t dedicated to SQL Server, had accumulated many support roles over the years – from small job to database intensive jobs that relied on applications and toolsets on the server itself to process. It also jumped servers in the sense that we had a linked server setup to go from MySQL to SQL Server.
In short, it was a mess, but a good candidate to “eat our own dog food” in terms of putting in place the things we always tell others they should be considering.
We moved to SQL Server 2016. We used all the sizing tools, compared Azure and RDS, the decision point really came down to just the fact that the solution was already hosted on AWS, so the move was simpler to accomplish there. But using the tools to determine instance size, we set everything up accordingly, chose the instance, etc.
We migrated over to the new server, let it run in “don’t give it too much to eat” mode for a bit so we could get our arms around new monitoring tools and new things that we needed to be monitoring with those tools.
Then we turned it loose with the normal workload of doing the day’s business.
The difference, moving to an instance that could essentially focus on the tasks at hand, was astonishing.
CPU utilization dropped to somewhere between 5 and 15%.
Performance increase and connections and requests were serviced much more quickly with no resource contention.
During peak load times, the processing capability was so starkly different that we didn’t actually see a spike in CPU in servicing those peak times. This part was really quite incredible. Our costs are down because RDS is more finely tuned for the instance size we need for SQL Server, rather than a bigger non-dedicated box that happened to have SQL Server on it.
Our performance is up as mentioned, meaning our growth options are much more clearly defined and available to us now.
It was one of those upgrades where you just sort of scratch your head and say “I knew it would be good, but not THIS good.”
It was *that* significant in terms of cost, performance, and utilization. If anyone would like to talk about exactly what we saw or did, just shoot me a note and I’d be happy to discuss.