Editorials

Our SQL Server Issue Resolved

Featured Article(s)
Oracle 2-Day DBA in x Days, Post 15: Temporary Tablespaces
Chapter 6 only brushes over temporary tablespaces, so let's take a deeper look.

Great Microsoft Virtualization Whitepaper (Direct Link)
EMC is working very closely with Microsoft on many fronts. One is the whole area of virtualization, and their support for it, for SQL Server, and for much more in the NetWorker product is worth a look. If you're looking for a comprehensive tool for managing not only BACKUP but RECOVERY (which, let's face it, is the important piece), check it out when you get a minute.

The Solution… Our Solution, At Least.
It turns out that the solution we ended up with was pretty straightforward. Quick summary: intermittent, but regular, performance issues on the SQL Server. No real indication of jobs running or any kind of task firing on a regular basis, and no real correlation to workloads or other activity on the system that looked like it should be causing the problem.

Our ability to report on the problem objectively came down to disk queue wait times – the counter in Performance Monitor. This counter would skyrocket when the issue was occurring. On the user side, the application/site would time out or come close to it, with wait times approaching several minutes if the full timeout didn't trip before that point.
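If you want to corroborate the Performance Monitor counter from inside SQL Server itself, here's a minimal sketch (assuming SQL Server 2005 or later, where the sys.dm_io_virtual_file_stats DMV is available) that shows cumulative I/O stall time per database file. High and climbing stall times on a particular file tend to line up with the disk queue spikes:

-- Cumulative I/O stalls per database file since the instance started
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms,
       vfs.num_of_reads,
       vfs.num_of_writes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;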

We worked a LOT on understanding the root issue. We had a huge array of options from everyone who submitted ideas. Many centered on disk issues – from errors to utilization to simple overload on the disk. We ended up using a performance monitoring tool to see the issue. It uses some really nice graphics that help you see what's happening; you simply highlight the timeframe you're interested in and you can see what statements are running, both with and without the parameters. More on the tool in a later column (write to me directly if you're interested in more information in the meantime).
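If you don't have a dedicated monitoring tool on hand, a rough, stripped-down equivalent (a sketch only, using the standard SQL Server 2005+ DMVs, not the tool we used) is to snapshot what's executing at the moment the disk queue spikes:

-- Currently executing requests and the statement text behind them
SELECT r.session_id,
       r.status,
       r.wait_type,
       r.total_elapsed_time,
       SUBSTRING(t.text, r.statement_start_offset / 2 + 1,
                 (CASE WHEN r.statement_end_offset = -1
                       THEN DATALENGTH(t.text)
                       ELSE r.statement_end_offset END
                  - r.statement_start_offset) / 2 + 1) AS running_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id > 50;  -- skip system sessions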

This is not an ad for them – it’s just the tool we ended up using. Using this, we narrowed down our issues to two things. First, full-text index searches seemed to be present every time the issue surfaced. Second, we were indeed seeing SQL Server work really hard to utilize indexes, but they weren’t having the desired effect.

We decided to first move the full-text indexes to a different drive altogether. As soon as we did this, the disk queue dropped off to about 25% of its problematic levels. Still really spiky, but not system-stopping to be sure. Good, but not "done."
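For anyone wanting to try the same thing, here's a minimal sketch of one way to do it (assuming SQL Server 2005, where full-text catalogs live on the file system; the drive letter, catalog, table, and key index names – FTCatalog_D, dbo.Articles, PK_Articles – are made up for illustration):

-- Create a new full-text catalog on a separate drive (D:\ is an assumption)
CREATE FULLTEXT CATALOG FTCatalog_D
    IN PATH 'D:\FTData';
GO

-- Drop the existing full-text index and rebuild it in the new catalog
DROP FULLTEXT INDEX ON dbo.Articles;
GO

CREATE FULLTEXT INDEX ON dbo.Articles (Body)
    KEY INDEX PK_Articles
    ON FTCatalog_D
    WITH CHANGE_TRACKING AUTO;
GO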

Next, I had received several notes suggesting we run sp_updatestats on the database, manually updating the internal statistics SQL Server uses. We did this. Our numbers dropped off the charts to the point where we wondered if they were still being gathered. We had auto-update statistics on the database objects, but sp_updatestats had a definitive impact on our issues. It shouldn't have been this significant; even several people we spoke to at Microsoft doubted the impact. But I have to tell you, it was clear as day in our testing and results.
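If you want to see how stale your own statistics are before (and after) running it, here's a minimal sketch; sys.stats and STATS_DATE are available in SQL Server 2005 and later, and sp_updatestats is the same procedure we ran:

-- How old are the statistics on user tables right now?
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY last_updated;

-- Update statistics across the whole database
EXEC sp_updatestats;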

So, the fix was two-fold. First, move the full-text indexes. Second, run sp_updatestats. "Simple" in retrospect, but the investigation just didn't add up while we were in the thick of it.

I’ll get that article put together this week with everyone’s troubleshooting suggestions. I think the power is in the process of debugging, testing and troubleshooting. Though you might not have this specific issue, the processes people suggested were great tools for digging into your system. Thanks for all of the suggestions and assistance!

Featured White Paper(s)
SQL Server Virtualization Management
Virtualization is becoming the popular practice in many environments, allowing organizations of all sizes to utilize resource… (read more)

The Shortcut Guide to SQL Server Infrastructure Optimization
In The Shortcut Guide to SQL Server Infrastructure Optimization, the new eBook from Realtime Publishers, leading IT author Do… (read more)