Editorials

The Power of the CROSS JOIN

The Power of the CROSS JOIN
I have been intrigued with the power of a Cartesian query ever since the days when we had to pull a power cord to kill a query someone had written doing a Cartesian query on two very large tables.

A Cartesian query, implemented in TSQL by using a CROSS JOIN, results in joining all rows in table A to all rows in table B. In this case, you return the number of rows in Table A * the number of rows in Table B. With a CROSS JOIN there is no join criteria in the JOIN clause; by design. A cartesian join of two tables with 100 rows each would be 100 * 100 = 10,000 rows. Guess what happens to your server when you run a query like that on tables with thousands or even millions of rows? Start looking for the power cord.

This can be a very useful behavior, and I have often taken advantage of the CROSS join. One of my favorite examples is when I want to take two dimensions, get all possible permutations of those dimensions, and then find facts meeting in the intersection. Here is an example:

Dimension 1 – Sales Region
Dimension 2 – Time in Quarters
Fact – Sales

I can do a cross join of Sales Region to Quarters to get my matrix. Then, I can do an outer join to the Sales table and sum the sales amounts. Those regions with no sales in a specific quarter will simply have zero for the sales total. However, without my CROSS JOIN I would only see those intersections where sales occurred.

Here is what that could look like in SQL.

SELECT
Quarter.Name as [Quarter]
,Region.Name as Region
,ISNULL(SUM(Sales.Total), 0) as TotalSales
FROM Region
CROSS JOIN Quarters
LEFT JOIN Sales ON Sales.QuarterID = Quarters.QuarterID
AND Sales.RegionID = Region.RegionID
GROUP BY Quarter.Name, Region.Name

Maurice has another good example demonstrating the power of CROSS JOIN.

-- This query demonstrate that you can generate some iterations on data
-- This one finds work days.
-- The nums CTE generate some "virtual rows" inside the access plan
-- Cross join make this iteration.
-- Then the CTE monthStart get the date that starts the month
-- Then by cross joining nums values with month start it is
-- possible to generate all month dates from current month
-- The last LEFT JOIN eliminates all dates for which there is some event
-- There is no need to have all existing dates in the @event calendar.
-- WeekEnds are also removed. So all that remains are workdays.

declare @Events table (d datetime, Holiday nvarchar(50))

insert into @Events (d, Holiday)
values ('20130508', 'Holiday0'), ('20130523', 'Holiday1'), ('20130524', 'Holiday2')

;With
L0 AS (select 1 as c union all Select 1 )
, L1 as (select 1 as C From L0 A Cross JOIN L0 B )
, L3 as (select 1 as C From L1 A Cross JOIN L1 B )
, L4 as (select 1 as C From L3 A Cross JOIN L3 B )
, L5 as (select 1 as C From L4 A Cross JOIN L4 B )
, nums as (Select ROW_NUMBER() OVER (Order by C) as nb from L5)
, monthStart as (Select dateadd(mm, datediff (mm, '19000101', GETDATE()), '19000101') as Start)
, NbOfDayInAMonth as (Select nb-1 as DayOrderInTheMonth from nums where nb <= 31)
Select
Start + DayOrderInTheMonth as DayGeneratedFromNums, tc.d, tc.Holiday
From
MonthStart M
cross join
NbOfDayInAMonth NDM
left join
@Events tc
on (tc.d = M.Start + NDM.DayOrderInTheMonth)
Where tc.d is null
And (DATEPART(dw, M.Start + NDM.DayOrderInTheMonth) + @@DATEFIRST) % 7 not in (0,6 )
order by NDM.DayOrderInTheMonth

Share your favorite use for the CROSS JOIN behavior by dropping me an Email at btaylor@sswug.org, or if available you can enter comments below. Get into the conversation.

Cheers,

Ben

$$SWYNK$$

Featured Article(s)
Preparing for Database Application Design Reviews
By establishing and following a systematic approach to database application design reviews, the opportunity for implementing optimal applications is expanded. As database development can be very complex, successful systems can be created only by managing and documenting the implementation process. The design review process is an effective way to encourage a rigorous and systematic pre and post-implementation review of database applications.

Featured White Paper(s)
Ten Tips for Optimizing SQL Server Performance
read more)

Featured Script
dba3_Calendar_Tables_and_Views_Article
Modeling Date Logic III: Implementing Business Calendars (by example in MS Sql Server 2000) http://bitonthewire.wpengine.com/see/17379 C… (read more)