Editorials

Tips to Optimize SQL Server Statistics

While researching statistics for yesterday's editorial, I came across some characteristics of how SQL Server works with non-materialized objects and query plans that I felt were worth passing along. Many of you probably already know these tips; others may have learned them through trial and error, in which case this may confirm your experience.

First of all, objects that are not materialized do not have statistics. This includes objects such as table variables held in memory and table-valued functions. That affects the query optimizer when it builds a plan involving non-materialized objects, much as it does when a query spans databases, where statistics are not always available from the remote database. SQL Server has to make the best guess it can when producing a query plan against non-materialized objects, and most often it will assume a table scan.
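
As a quick illustration, here is a minimal sketch, assuming hypothetical dbo.Orders and dbo.Customers tables (and their columns). The table variable has no statistics, so the optimizer falls back on a fixed, very low row estimate when costing the join:

DECLARE @RecentOrders TABLE (OrderID INT PRIMARY KEY, CustomerID INT)

INSERT INTO @RecentOrders (OrderID, CustomerID)
SELECT OrderID, CustomerID
FROM dbo.Orders                              -- hypothetical source table
WHERE OrderDTM >= DATEADD(day, -30, GETDATE())

-- No statistics exist on @RecentOrders, so the join below is costed
-- from a fixed guess rather than actual row counts.
SELECT c.CustomerName, o.OrderID
FROM @RecentOrders o
JOIN dbo.Customers c                         -- hypothetical lookup table
    ON c.CustomerID = o.CustomerID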

Note: SQL Server memory-optimized tables, introduced in SQL Server 2014, are not the same as in-memory table variables. Memory-optimized tables do have statistics, although maintaining them differs from disk-based tables. Still, the optimizer is able to use statistics on memory-optimized tables when producing a query plan.
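
If you are curious, here is a minimal sketch of a memory-optimized table; the table name is hypothetical, and it assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup. Statistics on it can also be refreshed manually:

CREATE TABLE dbo.SessionCache                -- hypothetical table name
(
    SessionID INT NOT NULL PRIMARY KEY NONCLUSTERED,
    UserName NVARCHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

-- Refresh statistics by hand; older versions require FULLSCAN (or
-- RESAMPLE) together with NORECOMPUTE for memory-optimized tables.
UPDATE STATISTICS dbo.SessionCache WITH FULLSCAN, NORECOMPUTE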

What can you do in those situations? Sometimes you can use a #Temp table as an intermediary. Copying data from a non-materialized source into a temp table gives you an object that does have statistics. You can then filter or join the #Temp table against the materialized tables in your database, taking advantage of the statistics created on the temp table.
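
Here is a minimal sketch of that approach, assuming a hypothetical table-valued function dbo.fnGetActiveAccounts() and a hypothetical dbo.Transactions table. Because the #Temp table is materialized in tempdb, SQL Server can create statistics on it and cost the join realistically:

-- Materialize the non-materialized source into a temp table.
SELECT AccountID, AccountName
INTO #ActiveAccounts
FROM dbo.fnGetActiveAccounts()               -- hypothetical TVF

-- An index (and its statistics) can help the join that follows.
CREATE CLUSTERED INDEX IX_ActiveAccounts ON #ActiveAccounts (AccountID)

SELECT t.TransactionID, a.AccountName, t.Amount
FROM #ActiveAccounts a
JOIN dbo.Transactions t                      -- hypothetical table
    ON t.AccountID = a.AccountID

DROP TABLE #ActiveAccounts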

The second tip I found while studying statistics concerns input parameters. When you use an input parameter as a value in the WHERE clause of a stored procedure or function, the query optimizer builds the plan based on the value that parameter had when you called the procedure or function (often referred to as parameter sniffing). If you change the value of the input parameter inside the procedure or function before selecting data with it, the query plan may not be the best match, because the statistics were evaluated against the original value.

If you need to modify the value of an input parameter, the suggestion was to break the procedure or function into two separate objects. In the first object, modify the values as needed. Then have the first procedure or function call a second one, passing the modified values.

I frequently modify input parameters in my own code, for example to set a parameter to a default value when NULL is passed. Other times I build new values inside a procedure or function from the initial input parameters.
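
Here is a minimal sketch of that null-default pattern; prsFindRecent is a hypothetical name, and SomeTable is the same placeholder table used in the example below. The plan is built from the value the caller passed (possibly NULL), not from the defaulted value, so this is exactly the situation the split addresses:

CREATE PROCEDURE prsFindRecent (@StartDTM DATETIME = NULL)   -- hypothetical procedure
AS
BEGIN
    -- Default the parameter when NULL is passed; the optimizer sniffed
    -- the caller's value, not this defaulted one.
    SET @StartDTM = ISNULL(@StartDTM, DATEADD(day, -7, GETDATE()))

    SELECT *
    FROM SomeTable
    WHERE CreateDTM >= @StartDTM
END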

Here is an example of splitting a procedure as was recommended.

Original procedure not taking advantage of SQL Server statistics:

CREATE PROCEDURE prsFindByDate (@StartDTM DATETIME)
AS
BEGIN
    SET @StartDTM = CONVERT(VARCHAR(10), @StartDTM, 101) -- drop time data
    DECLARE @EndDTM DATETIME = DATEADD(ms, -3, DATEADD(day, 1, @StartDTM))

    SELECT *
    FROM SomeTable
    WHERE CreateDTM BETWEEN @StartDTM AND @EndDTM
END

Two new procedures implementing the same functionality, taking advantage of statistics:

CREATE PROCEDURE prsFindByDateRun (@StartDTM DATETIME, @EndDTM DATETIME)
AS
BEGIN
    SELECT *
    FROM SomeTable
    WHERE CreateDTM BETWEEN @StartDTM AND @EndDTM
END
GO

CREATE PROCEDURE prsFindByDate (@StartDTM DATETIME)
AS
BEGIN
    SET @StartDTM = CONVERT(VARCHAR(10), @StartDTM, 101) -- drop time data
    DECLARE @EndDTM DATETIME = DATEADD(ms, -3, DATEADD(day, 1, @StartDTM))

    EXEC prsFindByDateRun @StartDTM, @EndDTM
END
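
The caller invokes the outer procedure exactly as before; the date below is just an illustrative value:

EXEC prsFindByDate '20160315 14:30:00'

prsFindByDate normalizes the dates, and prsFindByDateRun compiles its plan with the final values, so the statistics on CreateDTM are evaluated against the range actually used in the WHERE clause.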

Perhaps these two tips will assist you in optimizing your queries to take advantage of statistics and indexes on your materialized objects.

Cheers,

Ben