detecting cycles in a hierarchy using t-sql

  • I am tasked with identifying all the cycles (node relationships that have a circular reference, i.e. A is child of B, B is child of A) within a hierarchy. I have found a way which works under various tests, but is confusing to understand. I'm wondering if there's a better way. It's a quite a bit of code to look at, but if you're interested, thanks for checking it out...

    /*

    --

    the table has 2 cycles and one valid path

    --

    A child of B child of A (cycle)

    C child of D child of C (cycle)

    E child of F child of NULL (valid)

    --

    child parent

    A B

    B A

    C D

    D C

    E F

    F null

    --

    the goal is to identify which rows are part of cycles,

    group those rows in the cycles to which they belong,

    and then arbitrarily choose a cycle row and set its

    parent to null.

    --

    the output should look like this:

    --

    child parent is_cycle cycle_group

    A B 1 A (must appear for every row in cycle)

    B A 1 A

    C D 1 C

    D C 1 C

    --

    */

    --

    set nocount on

    --

    -- load table

    --

    if object_id('tempdb..#Temp_cte') is not null drop table #Temp_cte

    --

    if object_id('tempdb..#Tree') is not null drop table #Tree

    --

    create table #Tree (

    child varchar(10),

    parent varchar(10),

    )

    --

    insert #Tree (child, parent)

    select 'A', 'B'

    union all

    select 'B', 'A'

    union all

    select 'C', 'D'

    union all

    select 'D', 'C'

    union all

    select 'E', 'F'

    union all

    select 'F', null

    --

    select * from #Tree;

    --

    -- identify and group all cycles

    --

    -- identify the cycles by building the traversing the

    -- tree UP one child at a time and searching for the

    -- parent within the growing lineage. if the next parent

    -- already exists in the lineage, the original child

    -- is part of a cycle.

    --

    with cte as (

    select

    child,

    parent,

    convert(varchar(100), child) as lineage,

    0 as is_cycle,

    child as start_child

    from #Tree

    --

    union all

    --

    select

    t.child,

    t.parent,

    convert(varchar(100), c.lineage + '.' + t.child ),

    case when c.lineage like '%' + t.parent + '%' then 1 else 0 end,

    c.start_child

    from #Tree t

    join cte c on t.child = c.parent

    where c.lineage not like '%' + t.child + '%' -- check the next child to avoid the cycle

    )

    select *

    into #temp_cte

    from cte

    --

    -- show all records to see results of cte

    --

    select *

    from #temp_cte

    --

    -- find the rows that are part of cycles

    -- by grouping all cycle rows based on the

    -- max(start_child) value. every row in a

    -- a cycle will appear in a group whenever

    -- that cycle is generated by the cte.

    -- using the max(start_child) value will

    -- filter out the distinct rows and give

    -- them an identifier to show how they are

    -- related

    --

    select

    child,

    parent,

    max(is_cycle) as is_cycle,

    min(start_child) as start_child_group

    from #temp_cte

    group by

    child,

    parent

    having max(is_cycle) = 1

  • If your query does not require to check multi-level circular loops you can do this:

    SELECT

    t1.*

    FROM #Tree t1

    JOIN #Tree t2 ON t1.child = t2.parent AND t1.parent = t2.child

    ORDER BY child

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply