Today I would like to revisit a post of mine that is rather old. More precisely, the script in that post needs revisiting. This is one of my more favorite scripts and I still have more ideas to implement with it. The post/script in question can be found here.
In revisiting this script, I simplified it a bit. I also had to work on a problem with it that occurs in Hierarchies with circular dependencies. Quite frankly, that was a huge pain in the butt. There are some tricks out there to try and help with it – but I was having a hard time getting any of them to work in this scenario. I also updated the script to better handle self-referencing objects.
When you have circular references, and are trying to recurse the tree via a CTE, an instant blocker comes into play. You can only reference the anchor of the Recursive CTE once. Fixing a circular reference would be many times easier if you could reference the anchor twice.
In the end, the biggest hint to getting this to work came from this post. For it to work, I needed to find which combination of fields would work best. I finally settled on using the Object_ID to help reduce my pain. I settled on using the following in the anchor:
[codesyntax lang=”tsql”]
CAST('.'+CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(Max))+'.' AS VARCHAR(max))
[/codesyntax]
and the following in the recursive definition.
[codesyntax lang=”tsql”]
STUFF(('.' + F.SortCol + CAST(CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(max)) + '.' AS VARCHAR(max)) ),1,1,'')
[/codesyntax]
You can see that I am concatenating into a string for this column. This seems to work well for the purpose of eliminating those circular references.
Other adjustments to the script are not quite as significant but there is a small performance gain to be seen by these subtle changes. The most notable is the change to remove the two joins out to sys.columns in order to get the column names of the Parent and Child objects. In lieu of these joins, I am using the COL_NAME() function. This little change came to me thanks to a little work done last week on my statistics script that you can read here.
The final notable change comes in the naming of the CTEs in this script. I decided to rename the CTEs to something a bit more meaningful. In this case, Hierarchy and Ancestry are much more appropriate.
Without further adieu, here is the next major revision of that script.
[codesyntax lang=”tsql”]
DECLARE@StartParentTableVarchar(256) SELECT @StartParentTable = 'yourtable' ;With Hierarchy (ParentTable,ParentColumn, ChildTable,ChildColumn, FKLevel, SortCol) as ( Select OBJECT_NAME(sfk.referenced_object_id) as ParentTable ,COL_NAME(sfkc.referenced_object_id,sfkc.referenced_column_id) as ParentColumn ,OBJECT_NAME(sfk.Parent_object_id) as ChildTable ,COL_NAME(sfkc.parent_object_id,sfkc.parent_column_id) as ChildColumn ,0 as FKLevel ,CAST('.'+CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(Max))+'.' AS VARCHAR(max)) From sys.foreign_key_columns sfkc Inner Join sys.foreign_keys sfk On sfkc.constraint_object_id = sfk.object_id Where OBJECT_NAME(sfk.referenced_object_id) = @StartParentTable Union All Select OBJECT_NAME(sfk.referenced_object_id) as ParentTable ,COL_NAME(sfkc.referenced_object_id,sfkc.referenced_column_id) as ParentColumn ,OBJECT_NAME(sfk.Parent_object_id) as ChildTable ,COL_NAME(sfkc.parent_object_id,sfkc.parent_column_id) as ChildColumn ,FKLevel + 1 ,STUFF(('.' + F.SortCol + CAST(CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(max)) + '.' AS VARCHAR(max)) ),1,1,'') From sys.foreign_keys sfk Inner Join Hierarchy F On OBJECT_NAME(sfk.referenced_object_id) = f.ChildTable And F.SortCol NOT LIKE '%'+CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(max))+'%' Inner Join sys.foreign_key_columns sfkc On sfkc.constraint_object_id = sfk.object_id Where OBJECT_NAME(sfk.referenced_object_id) <> @StartParentTable And sfk.referenced_object_id <> sfk.parent_object_id ), Ancestry (ParentTable,ParentColumn, ChildTable,ChildColumn, FKLevel, SortCol) as ( Select distinct OBJECT_NAME(sfk.referenced_object_id) as ParentTable ,COL_NAME(sfkc.referenced_object_id,sfkc.referenced_column_id) as ParentColumn ,OBJECT_NAME(sfk.Parent_object_id) as ChildTable ,COL_NAME(sfkc.parent_object_id,sfkc.parent_column_id) as ChildColumn ,-1 as FKLevel ,CAST('.'+CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(Max))+'.' AS VARCHAR(max)) From Hierarchy F Inner Join sys.foreign_keys sfk On f.ChildTable = OBJECT_NAME(sfk.parent_object_id) And F.ParentTable <> OBJECT_NAME(sfk.referenced_object_id) Inner Join sys.foreign_key_columns sfkc On sfkc.constraint_object_id = sfk.object_id Union All Select OBJECT_NAME(sfk.referenced_object_id) as ParentTable ,COL_NAME(sfkc.referenced_object_id,sfkc.referenced_column_id) as ParentColumn ,OBJECT_NAME(sfk.Parent_object_id) as ChildTable ,COL_NAME(sfkc.parent_object_id,sfkc.parent_column_id) as ChildColumn ,f.FKLevel -1 ,STUFF(('.' + SortCol + CAST(CAST(OBJECT_NAME(sfk.referenced_object_id) AS VARCHAR(max)) + '.' AS VARCHAR(max)) ),1,1,'') From Ancestry F Inner Join sys.foreign_keys sfk On f.parentTable = OBJECT_NAME(sfk.parent_object_id) Inner Join sys.foreign_key_columns sfkc On sfkc.constraint_object_id = sfk.object_id Where f.parentTable not in (select ParentTable from Hierarchy) And sfk.referenced_object_id <> sfk.parent_object_id And f.ChildTable not IN (select ChildTable from Hierarchy) ) Select ParentTable,ParentColumn, ChildTable,ChildColumn, FKLevel, SortCol From Hierarchy Union All Select ParentTable,ParentColumn, ChildTable,ChildColumn, FKLevel, SortCol From Ancestry Order By SortCol asc option (maxrecursion 500)
[/codesyntax]
I hope you will play with this script, test it out and make recommendations or even find bugs with it and let me know.