Editorials

SQL Server Statistics

Statistics, in SQL Server, are very powerful. The query optimizer utilizes statistics to determine the most efficient sequence to process a query. This is pretty common knowledge. Maybe you are not aware of some of the cool capabilities Microsoft has built into statistics.

Statistics maintain the distribution of data in a table. They maintain multiple ranges or bands of data for the contents of a statistic, tracking the min value, max value and the number of records included in that range. When you search based on the value of the contents of a statistic, it can quickly determine the range, and know how many rows must be reviewed for that attribute.

Like indexes, you can create statistics on a single column, multiple columns, and even use a filter for a statistic. Using multiple columns in a statistic follows the same rules as composite indexes when determining if it can be useful to the query optimizer when building a plan.

Microsoft recommends keeping auto updating of statistics turned on for a database. They have optimized the maintenance of the statistics by having a quick algorithm to determine if updating statistics may enhance performance. Statistics are not updated with every modification.

You can turn on auto-create statistics for your database. Be aware that turning this on results in the creation of many statistics that may not be useful to maintain constantly. If you execute a query filtering on a column that is rarely used as a filter, it is possible SQL Server will create statistics on that column, resulting adding the load of maintaining that new statistic, which may be more painful to maintain than the value it provides for infrequent use.

Are you a wizard at the utilization of statistics? Share your experience here or by email to btaylor@sswug.org.

Cheers,

Ben