Editorials

Compensating for Bad Schema

Most of the items from the “bad things to do in a database” survey were pretty straightforward. I find it interesting that the item receiving the most votes was also the one I received the most questions about: “Compensating for Bad Schema.”

There are a number of ways you can compensate for a bad, or poorly performing, schema. For example, a table containing both BLOB (binary large object) data and relational data may perform poorly because the BLOB data is included. The reason is that BLOB data has a different storage method than other data types.

Using the Varbinary(MAX) data type will cause the data to be stored inline with all the other columns as long as the sum of the bytes for all columns is less than 8K (the SQL Server 2000 and later page size). If the BLOB data will not fit within this constraint, it is placed in a series of separate pages and accessed through pointers kept in the data row.
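
As a minimal sketch (the dbo.Document table and its columns are hypothetical), this is the kind of table the paragraph above describes, along with the sp_tableoption setting that forces large value types out of the data row when you want the relational columns packed densely:

-- Hypothetical table mixing relational columns and BLOB data.
CREATE TABLE dbo.Document
(
    DocumentID int IDENTITY(1,1) PRIMARY KEY,
    Title      varchar(200) NOT NULL,
    CreatedOn  datetime NOT NULL DEFAULT GETDATE(),
    Contents   varbinary(MAX) NULL -- in-row only while the whole row fits in the 8K page
);

-- Optional: push all large value types off the data pages so the
-- relational columns stay densely packed (SQL Server 2005 and later).
EXEC sp_tableoption 'dbo.Document', 'large value types out of row', 1;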

Regardless of your good practice of not using SELECT * when accessing the table, performance for this schema can suffer if a large amount of your BLOB data fits inside the 8K page boundary. Why is this? Because the normalized data is frequently wanted without the BLOB data, yet each row still fills most of a page, so roughly one record fits per page and a single read of the disk (or buffer cache) can only access about 8 records (8 pages * 8K = one 64K extent).
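
You can check how dense your pages really are with sys.dm_db_index_physical_stats (SQL Server 2005 and later; dbo.Document is the hypothetical table from the sketch above):

-- How many rows actually fit on a page for this table?
SELECT
    page_count,
    record_count,
    avg_record_size_in_bytes,
    1.0 * record_count / NULLIF(page_count, 0) AS rows_per_page
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.Document'), NULL, NULL, 'DETAILED')
WHERE index_level = 0; -- leaf level only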

If, instead, you had placed your BLOB data in a separate table, with a one-to-one relationship, you would be able to read many more records with a single buffer or disk read. One way this poorly performing schema is compensated for is by creating a covering index containing only the relational columns.
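
A minimal sketch of that split, reworking the hypothetical dbo.Document table from above so the BLOB lives in a one-to-one companion table:

-- Relational columns only: many rows now fit on each 8K page.
CREATE TABLE dbo.Document
(
    DocumentID int IDENTITY(1,1) PRIMARY KEY,
    Title      varchar(200) NOT NULL,
    CreatedOn  datetime NOT NULL DEFAULT GETDATE()
);

-- BLOB contents in a one-to-one companion table keyed on the same ID.
CREATE TABLE dbo.DocumentContents
(
    DocumentID int NOT NULL PRIMARY KEY
               REFERENCES dbo.Document (DocumentID),
    Contents   varbinary(MAX) NOT NULL
);

-- Join only when the BLOB is actually needed.
SELECT d.Title, c.Contents
FROM dbo.Document AS d
JOIN dbo.DocumentContents AS c ON c.DocumentID = d.DocumentID
WHERE d.DocumentID = 42;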

The problem with this compensating solution is that you end up creating multiple covering indexes, one for each query pattern, because you can’t create an index on an index (unless you create an indexed view, another form of compensation).
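
For reference, a covering index of this kind is just a nonclustered index on the original wide table from the first sketch, carrying the relational columns a query needs so the BLOB pages are never touched (index and column names hypothetical):

-- Covers queries on Title/CreatedOn without reading the BLOB pages.
-- DocumentID rides along automatically as the clustered key.
CREATE NONCLUSTERED INDEX IX_Document_Relational
ON dbo.Document (Title)
INCLUDE (CreatedOn);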

A relational database will do a much better job of joining this data when it is desired than of separating it out when it is not. I recommend this principle: store the BLOB data separately, preferably in a different, related table, stored in a different file object, residing on a different disk drive (a sketch of that physical layout follows the list), when:

  • The normalized data is frequently returned without the BLOB data
  • The normalized data is updated more frequently than the BLOB data
  • The BLOB data is updated more frequently than the Normalized data
  • The clustered index is rebuilt or reorganized frequently
  • There are many indexes with lots of columns performing covering of queries
  • You need to optimize database backups using file or filegroup backups instead of backing up the whole database
  • The contents of the BLOB data greatly reduces the number of records in an 8K page
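
Here is a hedged sketch of that physical layout, assuming a database named MyDatabase and an E: drive for the second disk; it recreates the companion BLOB table from the earlier sketch on its own filegroup so it can be backed up (or skipped) independently:

-- A second filegroup with a file on a different disk drive.
ALTER DATABASE MyDatabase ADD FILEGROUP BlobData;
ALTER DATABASE MyDatabase ADD FILE
(
    NAME = N'MyDatabase_BlobData1',
    FILENAME = N'E:\SQLData\MyDatabase_BlobData1.ndf',
    SIZE = 100MB
) TO FILEGROUP BlobData;

-- Place the BLOB table (and its large-object pages) on that filegroup.
CREATE TABLE dbo.DocumentContents
(
    DocumentID int NOT NULL PRIMARY KEY
               REFERENCES dbo.Document (DocumentID),
    Contents   varbinary(MAX) NOT NULL
) ON BlobData TEXTIMAGE_ON BlobData;

-- The normalized data can then be backed up without the BLOBs.
BACKUP DATABASE MyDatabase FILEGROUP = 'PRIMARY'
TO DISK = N'E:\Backups\MyDatabase_Primary.bak';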

Otherwise, integrate the BLOB data with the Normalized data.

Next time I’ll dig into the use of Triggers to compensate for a bad schema.

You can share your experience with schema compensation techniques. Write to btaylor@sswug.org.

Cheers,

Ben

Featured Article(s)
Reporting Services for the DBA (Part 2 of 3)
Take proactive DBA work to the next level. Put your administration queries to work in SSRS and have the answers to the hard questions at your fingertips. Identify, log, and summarize critical data sources on server performance and health. Learn how to plug in to historic data from the job, performance, and application logs so you can analyze and trend over time. Provide easy access to important real-time and semi-historic data from the DMVs. Provide access to data that was previously off limits to management, developers, and Jr. DBAs. You will also learn how to set up access to critical real-time performance indicators through the Web, e-mail, and mobile devices. All demos will be done on SQL Server 2008. Most will be compatible with SQL Server 2005.

Featured White Paper(s)
Storage Optimization
Written by AvePoint

Microsoft SharePoint Server 2010, the latest release of M… (read more)