Editorials

Dot Notation Hierarchy

Today I am sharing one last hierarchy technique I have seen in rare instances. It is not the most pure form of an hierarchy and has a number of issues. Still, implementation is simple and quick, and for smaller amounts of data, this works really slick.

This last hierarchy method simply uses a single VARCHAR column. In the column, the hierarchy is formed by using dot notation similar to that found in a document for outlining. Here is a sample:

Example Hierarchy Data
NodeId Description
001. Food
001.001, Main Dish
001.001.001. Lazagna
001.001.002 Calzone
001.002. Breakfast
001.002.001. Scrambled Eggs
001.002.002. Waffles
001.002.003. Pancakes
001.003. Salads
001.004. Bread
001.004.001. Quick Breads
001.004.001.001 Corn Muffins

As you can see, this implementation handles jagged data of any type quite easily. The hierarchy is clear to see and calculate in the data. Simply select records, using order by of the NodeID column, and data is returned in hierarchical order.

Find all descendants for the 001.001. Main Dish Node:

SELECT NodeId, Description

FROM Hierarchy

WHERE NodeId LIKE ‘001.001.{0-9]%’

Notice I included [0-9] after current node key in my where clause. This filters out the node which is my point of reference by requiring that the first character after your current node key string must be a number between 0 and 9. Your current node will not have a number following the last period. If you wish to include the current node, and all descendants change the where clause to WHERE NodeId LIKE ‘001.001.%’.

Using the LIKE operator is not generally a performant operation. However, since we are only using a wild card at the end of the comparison string, SQL Server still takes advantages of an index created on the NodeId column.

Find the parent of 001.004.001.001. Corn Muffins where @NodeID is an input for the query with the value = ‘001.004.001.001.’:



DECLARE @Level INT = LEN(@NodeID)/4

DECLARE @SearchKey VARCHAR(128) = LEFT(@NodeID,(@Level-1)*4)


SELECT NodeId, Description

FROM Hierarchy

WHERE NodeId = @SearchKey

Find all Ancestors of 001.004.001.001. Corn Muffins where @NodeID is an input for the query with the value = ‘001.004.001.001.’:



DECLARE @Level INT = LEN(@NodeID)/4

DECLARE @SearchFilter TABLE (NodeID VARCHAR(128) NOT NULL UNIQUE)


INSERT INTO @SearchFilter

SELECT LEFT(@Node, NUMBER * 4)

FROM TALLY

WHERE Number > 0

AND Number < @Level


SELECT NodeId, Description

FROM @SearchFilter

JOIN Hierarchy ON @SearchFilter.NodeID = Hieararchy.NodeId

I am using a TALLY table I maintain in all my databases, which simple has a column named NUMBER, containing a sequence of records from 0 to 100,000. In my query to insert into the @SearchFilter table, it selects records 1 thru 3, which result in LEFT(4) = ‘001.’, LEFT(8) = ‘001.004.’, and LEFT(12) = ‘001.004.001.’. Those three values are the ancestor keys for which I am searching.

The issues with this technique that I have experienced:

  1. If you want data to sorted correctly you must use leading zeros, and the number of nodes at any level are restricted to the number of digits used for each level. Preceding zeros are not required to make the hierarchy work
  2. Not relationships for descendants are easy. All others require a length function , if using a fixed number of digits for each level as demonstrated here, or a charnindex function if using a variable length node id to determine a part of the current node to compare to others
  3. All comparisons use string functions which are not the best performing functions found in SQL Server
  4. Walking relationships from parent to descendants is easy. Relationships from descendants to ancestors is painful




That pretty much exhausts my experience with hierarchies. Tomorrow we’ll start on something new. Does anything interest you? Share topics you’d like to see. Drop an email to btaylor @sswug.org.

Cheers,

Ben