Recursively select records in single table when having main categoryid and textvalue of subcategory

  • I want to recursively select all records within a hierarchy, using the main parentid and a textvalue on level 1 OR level 2 of the subcategories.

    My data:

    CREATE TABLE [dbo].[articlegroups](

    [id] [int] NOT NULL,

    [parentid] [int] NOT NULL,

    [catlevel] [tinyint] NOT NULL,

    [slug_en] [nvarchar](50) NOT NULL,

    CONSTRAINT [PK_globos_articlegroups] PRIMARY KEY CLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (129, 0, 0, N'baby-and-birth')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (130, 129, 1, N'nursery')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (135, 130, 2, N'mattresses')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (136, 130, 2, N'cradles')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (139, 129, 1, N'pillows')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (140, 139, 2, N'headrest')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (141, 0, 0, N'computers')

    INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (142, 141, 1, N'laptops')

    When selecting rows I always have the main parentId (so catlevel 0) and the slug_en value.

    In my example case I have id 129 and slug_en='cradles'. I want my query to then return:

    idparentidcatlevel

    12900

    1301291

    1361302

    If I have id 129 and slug_en='pillows'. I want my query to then return:

    idparentidcatlevel

    12900

    1391291

    How can I do this?

    I'm new to SQL Server. I was reading here https://msdn.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396 on recursive SQL, but I have no idea how to implement this as I just have one table and I also have 2 selection criteria (main category id and a textvalue on either level 1 or 2).

  • Do you have any other entries in your table that have a ParentID of "0"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oeh, good question:) Yes I do actually! I've updated the INSERT queries in my post with some extra data.

  • Then you need to know the answer to the question before the question is known. Sorry, don't mean to sound cryptic but it's true and I can demonstrate how to do that. Before I do, I need to know another thing... is the slug_en column constrained to be unique or can there be duplicates in that column? For example, could the word "pillows" theoretically occur more than once in that column?

    Also, how often does the content of the ArticleGroups table get inserted, updated, or deleted?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Then you need to know the answer to the question before the question is known.

    🙂 No idea what you mean there 🙂

    To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

    Also

    , how often does the content of the ArticleGroups table get inserted, updated, or deleted?

    Not often...new articlegroups may be added over time, but this would occur like once every few months when new categories are added.

  • What I meant was that the answer to all the possible questions must already be present in the data or we'll be stuck with double (or more) recursion. Recursion is bad enough for performance and resource usage especially in the form of rCTEs (recursive Common Table Expressions).

    Because of your requirements and the fact that slug_en is not unique, one way to do this would be to recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want.

    Another way would be to find all the child rows and recurse all of them towards their respective parent, pick the correct parent, and then pick the correct set of rows.

    ... over and over and over.

    Or, we could calculate all of the answers for the entire table just one time when there's a change to the content of the table and not only easily solve this problem but also solve nearly any other hierarchical query that may arise from this hierarchical table.

    What's your pleasure?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah, makes sense now 🙂

    Based on my total set of data I'd say that your suggestion to

    recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want

    would best fit my needs.

    However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?

    Thanks!

  • sqlserverstarter (8/16/2015)


    Ah, makes sense now 🙂

    Based on my total set of data I'd say that your suggestion to

    recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want

    would best fit my needs.

    Can do.

    However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?

    Thanks!

    Heh... data duplication within a table is certainly a bad practice but if data duplication in general were a bad practice, a whole lot of people in BI would be out of a job when their data warehouses went away. 😀 Audit tables would also be banned as a bad practice as would any form of reporting table.

    To be sure, the original table could be modified to take a couple of new columns but the disadvantage there is that the original table would be locked up for several seconds while the "answers" to most of the possible hierarchical questions were pre-calculated. There are several options including having an "A" table and a "B" table where one is built while the other continues to be used and then flop a synonym to nearly instantly bring the new table online while taking the old table offline for the next update.

    With the understanding that I didn't do such a redirection in the post, please see the post at the following link for how I solved a very similar problem to yours. It also has a couple of links to some articles that explain how it all works.

    http://www.sqlservercentral.com/Forums/FindPost1712086.aspx

    Shifting gears back to your problem, let me see if I can tap out some code for you in the manner you requested that uses the great readily consumable data you were good enough to provide.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The following code will work as requested against the data you posted. It uses 144 logical reads because of the recursive CTEs and takes 1 to 5 ms duration. Adding another index would help eliminate the scans but it's still hitting the table 4 times. Let's hope that you don't need to run it often. 🙂

    --===== These would be the parameters for, say, an iTVF function

    DECLARE @MainParentID INT

    ,@Slug_En NVARCHAR(50)

    ;

    SELECT @MainParentID = 129

    ,@Slug_En = N'cradles'

    ;

    --===== Build the "Hierarchy" on-the-fly

    WITH cteBuildPath AS

    ( --=== This is the "anchor" part of the recursive CTE.

    -- The only thing it does is load the Root Nodes.

    SELECT anchor.id

    ,anchor.parentid

    ,catlevel = 0

    ,anchor.slug_en

    FROM dbo.articlegroups AS anchor

    WHERE id = @MainParentID

    UNION ALL

    --==== This is the "recursive" part of the CTE that adds 1 for each level

    SELECT recur.id

    ,recur.parentid

    ,catlevel = cte.catlevel + 1

    ,recur.slug_en

    FROM dbo.articlegroups AS recur

    JOIN cteBuildPath AS cte

    ON cte.id = recur.parentid

    )

    ,

    cteReturnUpline AS

    ( --=== This is the anchor for the upline return

    SELECT anchor.id

    ,anchor.parentid

    ,anchor.catlevel

    --,anchor.slug_en

    FROM cteBuildPath anchor

    WHERE anchor.slug_en = @Slug_En

    UNION ALL

    --==== This is the recursive part that step through the upline.

    SELECT recur.id

    ,recur.parentid

    ,recur.catlevel

    --,recur.slug_en

    FROM cteBuildPath AS recur

    JOIN cteReturnUpline AS cte

    ON cte.parentid = recur.id

    )

    --===== This produces the final output in the correct sorted order

    SELECT id,parentid,catlevel

    FROM cteReturnUpline

    ORDER BY catlevel

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sqlserverstarter (8/16/2015)


    To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

    Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.

  • serg-52 (8/17/2015)


    sqlserverstarter (8/16/2015)


    To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

    Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.

    Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/17/2015)


    serg-52 (8/17/2015)


    sqlserverstarter (8/16/2015)


    To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

    Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.

    Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.

    Heh... another reason to hate recursive CTEs...

    If you need it all in one query, like this...

    --===== These would be the parameters for, say, an iTVF function

    DECLARE @MainParentID INT

    ,@Slug_En NVARCHAR(50)

    ;

    SELECT @MainParentID = 129

    ,@Slug_En = N'cradles'

    ;

    --===== Build the "UPLINE Hierarchy" on-the-fly

    WITH

    cteReturnUpline AS

    ( --=== This is the anchor for the upline return

    SELECT anchor.id

    ,anchor.parentid

    ,catlevel=0

    ,ChildID = anchor.id

    --,anchor.slug_en

    FROM dbo.articlegroups anchor

    WHERE anchor.slug_en = @Slug_En

    UNION ALL

    --==== This is the recursive part that step through the upline.

    SELECT recur.id

    ,recur.parentid

    ,cte.catlevel+1

    ,cte.ChildID

    --,recur.slug_en

    FROM dbo.articlegroups AS recur

    JOIN cteReturnUpline AS cte

    ON cte.parentid = recur.id

    )

    --===== This produces the final output in the correct sorted order

    SELECT id

    ,parentid

    ,catlevel=ROW_NUMBER()OVER(PARTITION BY ChildID ORDER BY catlevel DESC)-1

    FROM cteReturnUpline

    WHERE ChildID = (SELECT ChildID FROM cteReturnUpline WHERE parentid = @MainParentID)

    ORDER BY catlevel

    ;

    ... it still consists of a double recursion because, like a view, every time you call it, it recalculates. The performance is still a little better than the previous code because it only has to calculate two direct uplines instead of a fanout downline and an upline but it's still double recursive and it still means that you're recalculating the upline over and over even for identical calls.

    I still think a Nested Sets table would be better in the long run because it would only require a relatively simple lookup.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.

    declare @item varchar(50) = 'headrest'

    ;with cte (id,parentid,catlevel,slug_en)

    as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item

    union all

    select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a

    Join [dbo].[articlegroups] a1 on a1.id = a.parentid

    )

    select * from cte

    order by catlevel

  • AKIMwilliams (8/17/2015)


    Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.

    declare @item varchar(50) = 'headrest'

    ;with cte (id,parentid,catlevel,slug_en)

    as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item

    union all

    select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a

    Join [dbo].[articlegroups] a1 on a1.id = a.parentid

    )

    select * from cte

    order by catlevel

    Works fine except that slug_en isn't unique and the "upline" search code above will return multiple paths to multiple "main" parent_id's if that happens.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is my query modified. I created an additional cte that uses the initial result set. It will filter through and only show the records that uses the original id.

    declare @item varchar(50) = 'mattresses',

    @baseid int = 129

    ;with cte (id,parentid,catlevel,slug_en)

    as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item

    union all

    select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a

    Join [dbo].[articlegroups] a1 on a1.id = a.parentid

    )

    , cteresultset ( id,parentid,catlevel)

    as (select id,parentid,catlevel from cte

    where parentid = 0 And id = @baseid

    union all

    select a1.id,a1.parentid,a1.catlevel from cteresultset a

    Join cte a1 On a.id = a1.parentid)

    select * from cteresultset

    order by catlevel

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply