SQL Server Disaster Recovery Expo
Tomorrow, June 17th
Today is the last day to register for this free expo and gain access to the insights and experiences of Eight renowned experts in SQL Server Disaster Recover. Our speakers include Alan Berry, Sarah Barela, Keven Kline and others. Additional content is available in the optional premium track for those wishing to dig even deeper. I hope to see you there!
Click here for more info
Reader Feedback on High/Low
Following are a couple of emails I received regarding the HighLow external key assignment technique from the editorial yesterday, "Surrogate Key Assignment Techniques."
Paul Writes:
So, how’s the multiple user roll-your-own sequence working for your page split count? Inserts a bit slow? This method has always been on my 10 most evil SQL Server dev ideas list.
For those of you not familiar with how SQL Server physically stores data, whenever there is not enough room on the page where data should be placed in a Clustered index, a new page is created from available space in the database. If the new page is not physically adjacent to the page of the previous record, this results in what is known as a page split.
Using the HIGH/Low method for a clustered index may result in more page splits than simply using a sequential increasing number, such as an identity column (SQL Server best practice). SQL Server has been tuned over the years to work best with a single sequential number as the clustered index. It works best if the numbers are actually sequential as they are inserted into the database.
There are a few SQL Server options to Paul’s accurate assessment:
- Live with it. Re-index and defragment during off hours. This probably won’t work if there are many records added frequently since page splits impact not only insert queries, but also update and select queries.
- Leave more space in the index for additional records by reducing the fill factor of the clustered index. Of course this can result in less performance if it reduces the number of populated records per page. But inserts will work more quickly without splitting pages.
- Reduce page splits by using an Identity column on the table as well. Create a clustered index on the identity column. Address the table with the HighLow value as the primary key. This method has some performance loss due to the fact that the primary key is not a clustered index, requiring some additional overhead.
- Reduce page splits by using an Identity column on the table as well. Create a UNIQUE clustered index on the identity column. This allows the identity column to perform as an alternate key. Therefore, the identity column or the HighLow column may be used to assert referential integrity through foreign keys. The Identity column will have the best performance due to the fact that it is the Clustered index. The HighLow value will have better performance for foreign key assignment because the key value may be assigned prior to inserting data into the database.
This methodology of using an alternate key works with any kind of externally assigned key methodology. If you wish to use natural keys, UIDS, GUIDS, you don’t have to give up all performance to maintain other key sources.
Paul reminds us:
Using an identity for the clustered index and then a second sequence for the PK is also a huge perf problem. Then all the query access that’s by the PK will be forced to do a bookmark lookup.
I agree. For this reason we have option 4. There is nothing that states you can’t have more than one value in a foreign (child) table referencing a key value in the parent table. A child table can have both a primary key and an alternate key value of its parent table. It does waste a little space,
It is easy enough to find the identity value for a row with HHHHLLLL and insert that alongside of HHHHLLLL in a foreign table. This provides you with the ability to assign a key value outside of your storage engine, and then retrieve the value when storing the records in the child table(s), alleviating the need for round trips to determine and assign the primary key value assigned in the database to each child record.
Since SQL Server is the target for this discussion, this technique allows you to insert multiple child records in a table with multiple foreign key values in a single insert statement due to the fact that stored procedures now accept table parameters as input for stored procedures. Assigning HighLow values outside the database may be used to assign the IDENTITY value assigned within the database in a single query, resulting in the bookmark lookup only happening for inserts of child records.
Now you have both worlds; SQL Server performance and optimized inserts. Universal construct that works well in other engines without the fuss.
Many thanks to Paul for keeping me honest. Just because something is cool, doesn’t mean it is always good. My intention is not to be defensive; rather, I’m attempting to demonstrate that both needs may be accomplished, and that there are times when it is appropriate.
Do you have other comments or suggestions. Share them with us by sending them to btaylor@sswug.org.
Cheers,
Ben
$$SWYNK$$
Featured Article(s)
Game-changing Features in the Silverlight 5 Beta (Part 3)
In the second part of my “Game-changing Features” series, I investigated how to create multiple windows in a trusted Silverlight 5 application. Now, it is time to explore another set of features: SoundEffect Class for Low-Latency, Supporting Double- and Triple-Mouse Clicks and Linked Text Containers.
Featured White Paper(s)
Hybrid Management White Paper
As organizations worldwide continue to look for ways to stay ahead of a hyper-competitive business landscape, cloud computing… (read more)
Featured Script
check_schema_differences.sql
To report on the differences in table structures between two databases … (read more)