query help

  • Dear,

    Can you guys please help me to create following op?

    create table test222(sid bigint, scode nvarchar(50), parentid bigint, sname nvarchar(50))

    insert into test222 values (1, '2323', 0, 'iam a boy')

    insert into test222 values (2, '23231000', 1, 'boy')

    insert into test222 values (3, '23232', 1, 'boo')

    insert into test222 values (4, '232321', 3, 'bo')

    insert into test222 values (5, '23232110', 4, 'boyy')

    insert into test222 values (6, '23232190', 4, 'gril')

    insert into test222 values (7, '232329', 3, 'body')

    insert into test222 values (8, '23232910', 7, 'girll')

    insert into test222 values (9, '23232990', 7, 'boy')

    insert into test222 values (10, '23233000', 1, 'bo')

    insert into test222 values (11, '232390', 1, 'nh')

    insert into test222 values (12, '23239010', 10, 'ui')

    insert into test222 values (13, '23239020', 10, 'dert')

    insert into test222 values (14, '23239030', 10, 'hyui')

    insert into test222 values (15, '23239040', 10, 'nji')

    insert into test222 values (16, '23239090', 10, 'vfr')

    insert into test222 values (17, '2345', 0, 'hhh')

    insert into test222 values (18, '23455', 17, 'kkk')

    insert into test222 values (19, '2345', 0, 'ddd')

    insert into test222 values (20, '23455', 19, 'eee')

    insert into test222 values (21, '23456', 20, 'fff')

    insert into test222 values (22, '23457', 21, 'ggg')

    I have above records and

    i want following op when i search for iam a boy(from top)

    sid scode Parentid sName Hierarchy Level

    1 2323 0 iam a boy 2323 1

    223231000 1 boy 2323\23231000 2

    323232 1 boo 2323\23232 2

    4232321 3 bo 2323\23232\232321 3

    523232110 4 boyy 2323\23232\232321\23232110 4

    623232190 4 gril 2323\23232\232321\23232190 4

    7232329 3 body 2323\23232\232329 3

    823232910 7 girll 2323\23232\232329\23232910 4

    923232990 7 boy 2323\23232\232329\23232990 4

    1023233000 1 bo 2323\23233000 2

    11232390 1 nh 2323\232390 2

    122323901010 ui 2323\23233000\23239010 3

    132323902010 dert 2323\23233000\23239020 3

    142323903010 hyui 2323\23233000\23239030 3

    152323904010 nji 2323\23233000\23239040 3

    162323909010 vfr 2323\23233000\23239090 3

    i want following op when i search for kkk(from bottom)

    sid scode Parentid sName Hierarchy Level

    17 2345 0 hhh 2345 1

    1823455 17 kkk 2345\23455 2

    i want following op when i search for fff(from middle)

    sid scode Parentid sName Hierarchy Level

    19 2345 0 ddd 2345 1

    20 23455 19 eee 2345\23455 2

    21 23456 20 fff 2345\23455\23456 3

    22 23457 21 ggg 2345\23455\23456\23457 4

    Can you guys please help me?

    Thanks

    Nick

  • can some one here please help me?

  • peterausger (8/15/2015)


    can some one here please help me?

    Help is on the way. I need about another half hour or so to finish my answer.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes. And I'm going to do this in a manner that no one else might because the queries end up being nasty fast and it prepares your data for other things that you might want to do using the Hierarchy.

    First, a bit of an explanation.

    What a lot of people would do in this case (more simply described as being able to find an entire tree based on given branch info) would be to run a recursive CTE to traverse the UPLINE of a hierarchy to find the root node and then run yet another recursive CTE to traverse the entire DOWNLINE of the hierarchy. Both are fairly expensive to do.

    So, riddle me this. How often does the hierarchy change? Once a day? Once an hour? Every 10 minutes? And in that time, how many times will you need to do a lookup as you've requested? The bottom line here is that the recursive CTE's are expensive, the recalculate the whole hierarchy for the given tree every time a lookup is performed, and the hierarchy probably changes less frequently than the lookups occur.

    With that in mind, we're going to pre-calculate all of the trees just once and only when something in the hierarchy changes. The lookups that occur will be indexed based lookups instead using recursive CTEs (which are actually worse than most While loops for performance and resource usage). In other words, we're going to convert the lookups to being "set based" and we're going to do it with a thing called "Nested Sets".

    Now, the traditional method for using Nested Sets is to convert all of the data to them instead of preserving the original, easy to maintain and troubleshoot "Adjacency List", which is a fancy name for the parent/child list that you have. The traditional method is also hopelessly slow because it uses a "push stack" method to do the conversion which can be hundreds of times slower than even a recursive CTE (rCTE).

    The first thing we need is a special type of "Tally Table" (nothing more than a table containing a single column of very well indexed numbers) to help us with our code. Basically, it replaces the need for a While loop or rCTE using set based technology. We need this one to split up the sort path that will eventually be made. It starts at 1 and increments by 8 because you use BIGINT for your IDs and that takes 8 bytes to store in the sort path column that we'll make.

    Here's the code to make the "hTally" table that I just spoke of.

    --===========================================================================

    -- Build a special purpose Tally Table which will be used to split

    -- the SortPath column when we need to.

    --===========================================================================

    --===== Create the HTally table to be used for splitting SortPath

    SELECT TOP 500 --(8 * 500 = VARBINARY(4000) in length)

    N = ISNULL(CAST(

    (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1)*8+1

    AS INT),0)

    INTO dbo.HTally

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    --===== Add the quintessential PK for performance.

    ALTER TABLE dbo.HTally

    ADD CONSTRAINT PK_HTally

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    ;

    Rather than make any changes to the original table, we're going to build a "hierarchy" table that contains what we need from the original table and also what we need to build the Nested Sets. The table is actually built on the first pass without having to predefine it. Of course, when you go to rebuild the table when the original table suffers a change, you'll need to drop this dbo.Hierarchy table and rerun the code.

    --===========================================================================

    -- Build a new "Hierarchy" on-the-fly including some place holders

    --===========================================================================

    --===== Build a new "Hierarchy" on-the-fly including some place holders

    WITH cteBuildPath AS

    ( --=== This is the "anchor" part of the recursive CTE.

    -- The only thing it does is load the Root Nodes.

    SELECT anchor.sid,

    anchor.parentid,

    anchor.scode,

    anchor.sname,

    HLevel = 1,

    SortPath = CAST(

    CAST(anchor.sid AS BINARY(8))

    AS VARBINARY(4000)) --Up to 500 levels deep.

    FROM dbo.test222 AS anchor

    WHERE parentid = 0 --Only the Root Nodes have a 0 parentid

    UNION ALL

    --==== This is the "recursive" part of the CTE that adds 1 for each level

    -- and concatenates each level of sid's to the SortPath column.

    SELECT recur.sid,

    recur.parentid,

    recur.scode,

    recur.sname,

    HLevel = cte.HLevel + 1,

    SortPath = CAST( --This does the concatenation to build SortPath

    cte.SortPath + CAST(Recur.sid AS BINARY(8))

    AS VARBINARY(4000))

    FROM dbo.test222 AS recur

    INNER JOIN cteBuildPath AS cte

    ON cte.sid = recur.parentid

    ) --=== This final SELECT/INTO creates the Node # in the same order as a

    -- push-stack would. It also creates the final table with some

    -- "reserved" columns on the fly. We'll leave the SortPath column in

    -- place because we're still going to need it later.

    -- The ISNULLs make NOT NULL columns.

    SELECT sid = ISNULL(sorted.sid,0),

    sorted.parentid,

    RootNode = ISNULL(CAST(0 AS INT),0), --Place holder

    sorted.scode,

    sorted.sname,

    HLevel = ISNULL(sorted.HLevel,0),

    LeftBower = ISNULL(CAST(0 AS INT),0), --Place holder

    RightBower = ISNULL(CAST(0 AS INT),0), --Place holder

    NodeNumber = ROW_NUMBER() OVER (ORDER BY sorted.SortPath),

    NodeCount = ISNULL(CAST(0 AS INT),0), --Place holder

    SortPath = ISNULL(sorted.SortPath,sorted.SortPath)

    INTO dbo.Hierarchy

    FROM cteBuildPath AS sorted

    OPTION (MAXRECURSION 100) --Change this IF necessary

    ;

    --===========================================================================

    -- Using the information created in the table above, create the

    -- NodeCount column and the LeftBower and RightBower columns to create

    -- the Nested Sets hierarchical structure.

    --===========================================================================

    --===== Declare a working variable to hold the result of the calculation

    -- of the LeftBower so that it may be easily used to create the

    -- RightBower in a single scan of the final table.

    DECLARE @LeftBower INT

    ;

    --===== Create the Nested Sets from the information available in the table

    -- and in the following CTE. This uses the proprietary form of UPDATE

    -- available in SQL Serrver for extra performance.

    WITH cteCountDownlines AS

    ( --=== Count each occurance of EmployeeID in the sort path

    SELECT sid = CAST(SUBSTRING(h.SortPath,t.N,8) AS INT),

    NodeCount = COUNT(*) --Includes current node

    FROM dbo.Hierarchy h,

    dbo.HTally t

    WHERE t.N BETWEEN 1 AND DATALENGTH(SortPath)

    GROUP BY SUBSTRING(h.SortPath,t.N,8)

    ) --=== Update the NodeCount and calculate both Bowers

    UPDATE h

    SET @LeftBower = LeftBower = 2 * NodeNumber - HLevel,

    h.NodeCount = downline.NodeCount,

    h.RightBower = (downline.NodeCount - 1) * 2 + @LeftBower + 1,

    h.RootNode = CAST(SUBSTRING(SortPath,1,8) AS BIGINT)

    FROM dbo.Hierarchy h

    JOIN cteCountDownlines downline

    ON h.sid = downline.sid

    ;

    --===== Let's see what we have.

    SELECT *

    FROM dbo.Hierarchy

    ;

    The Hierarchy table now has the original "Adjacency List" (parent child) hierarchy, a "Hierarchical Path" (sorted path) hierarchy (almost like what you'd work with using the HierarchyID datatype), and the "Bowers" (anchors) of the very high performance "Nested Sets" hierarchy. There's also some extra information in there such as Level, RootNode (the root node at the top level of each tree) and the number of nodes contained in the downline of every node (includes that node).

    The world is now your nut to crack and your queries now become incredibly simple (even simpler if you create an iTVF (inline table valued function) for the common query).

    --===== Find the tree for 'iam a boy'

    SELECT *

    FROM dbo.Hierarchy h

    WHERE RootNode = (SELECT RootNode FROM dbo.Hierarchy WHERE sname = 'iam a boy')

    ORDER BY LeftBower

    ;

    --===== Find the tree for 'kkk'

    SELECT *

    FROM dbo.Hierarchy h

    WHERE RootNode = (SELECT RootNode FROM dbo.Hierarchy WHERE sname = 'kkk')

    ORDER BY LeftBower

    ;

    --===== Find the tree for 'fff'

    SELECT *

    FROM dbo.Hierarchy h

    WHERE RootNode = (SELECT RootNode FROM dbo.Hierarchy WHERE sname = 'fff')

    ORDER BY LeftBower

    ;

    While that seems like a lot of work, it's well worth it because, if you have one query about hierarchical data, you're bound to eventually have more and your table is ready for almost anything and everything especially with the left and right bowers of the Nested Sets being available.

    I'll also add that some indexes on the dbo.Hierarchy table will make just about any code played against it absolutely fly. I'll let you figure that out to suit your own purposes but I strongly recommend that you make the clustered index on the LeftBower column.

    I'd explain how all of this works and can be used for a whole host of different types of queries against the hierarchical structure but I've already done that in great detail. Please see the following two articles.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    http://www.sqlservercentral.com/articles/T-SQL/94570/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • i did some research and got the following quries

    --bottom to top

    ;WITH

    CTE_Search AS

    (

    SELECT

    sId,

    scode,

    ParentId,

    sName

    FROM test222

    WHERE

    sname = 'eee'

    UNION ALL

    SELECT

    t.sId,

    t.scode,

    t.ParentId,

    t.sName

    FROM CTE_Search as s

    INNER JOIN test222 as t

    ON t.sId = s.parentid

    ),

    CTE_Hierarchy as

    (

    select distinct

    *,

    CONVERT(nvarchar(800), scode) AS Hierarchy,

    1 as LevelNo

    from CTE_Search

    where

    parentid = 0

    union all

    select

    s.*,

    CONVERT(nvarchar(800), (h.Hierarchy +'\' + CONVERT(nvarchar(800), s.scode))),

    h.LevelNo + 1

    from CTE_Hierarchy as h

    inner join CTE_Search as s

    on s.parentid = h.sid

    )

    select * from CTE_Hierarchy

    order by Hierarchy

    the op of the above query is

    sid scode parentid sname hierarchy levelno

    202345519eee234551

    212345620fff23455\234562

    222345721ggg23455\23456\234573

    --top to bottom

    ;WITH SInfo AS

    (

    SELECT

    sId,

    scode,

    ParentId,

    sName,

    CONVERT(nvarchar(800), scode) AS Hierarchy,

    1 as LevelNo

    FROM test222

    WHERE

    sname = 'eee'

    UNION ALL

    SELECT

    TH.sId,

    TH.scode,

    TH.ParentId,

    TH.sName,

    CONVERT(nvarchar(800), (SInfo.Hierarchy +'\' + CONVERT(nvarchar(800), TH.scode))),

    LevelNo + 1

    FROM test222 TH

    INNER JOIN SInfo

    ON SInfo.sId = TH.ParentId

    )

    select t.*

    from SInfo as t

    --where

    -- not exists (select 1 from SInfo as s

    -- where

    -- s.sid = t.sid and

    -- s.LevelNo > t.LevelNo)

    order by Hierarchy

    sid scode parentid sname hierarchy levelno

    1923450ddd23451

    202345519eee2345\234552

    i want to merge above code and want the following op if i search for 'eee'

    sid scode parentid sname hierarchy levelno

    192345 0 ddd 2345 1

    202345519 eee 2345\23455 2

    212345620 fff 2345\23455\23456 3

    222345721 ggg 2345\23455\23456\23457 4

    thanks

    peter

  • Yep. Exactly what I was talking about. That uses recursive CTEs and is going to be slower and more resource intensive over time, especially if you have a lot of folks hitting similar queries. It also means that if a different hierarchical requirement comes up (and it usually does), then you'll need more recursive CTEs.

    If it works for you, that's fine. Just be aware that it creates an unnecessary load on the server because of not only the rCTEs but because they keep recalculating the same row returns over and over and ... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply