Editorials

SQL Server Join Techniques

SQL Server has three different join techniques we’re used to hearing about.

A Merge Join takes two ordered sets and walks through them, keeping the matching rows. This join only looks at records sequentially in both tables it only moves forward, never backward. If no match is found in table A, and the value in table B is Lower, then Table B moves to the next row. If Table B is > Table A, then table A moves to the next row. If you have sets with pretty much the same number of rows, or a similar number, where you will almost always, or always have a match for every row, the Merge join is usually the fastest. It’s even faster if the index you are using on both tables is already sorted in the same order.

A Loop Join walks through all the records in one table, and seeks the matching records from another table for a match. If an index is on the loop criteria, this is faster when you have a small table joining a very large table, because it doesn’t process every row in the large table, it uses a index to get the matching data. So, a table with 20 records, joining a large table with millions of records, where only a few thousand will match may use a loop join.

The final type of join is executed when there aren’t any indexes to implement the join, such as those instances where you are joining data from multiple tables to a single table In this case, SQL Server generates a hash for each record, and then does a join on the Hash value to simplify the comparison. This is a really simple explanation.

I went down this path today to build a foundation for the editorial on Monday demonstrating why # Tables in tempdb can outperform memory table variables.

Please leave a comment if you want to clarify or correct this simple description. Then we can pick up on this foundation Monday.

Cheers,

Ben