Recursively select records in single table when having main categoryid and textvalue of subcategory

Question

Post reply

Recursively select records in single table when having main categoryid and textvalue of subcategory

sqlserverstarter

SSC Enthusiast

Points: 155
More actions
August 16, 2015 at 5:43 pm

#306819

I want to recursively select all records within a hierarchy, using the main parentid and a textvalue on level 1 OR level 2 of the subcategories.
My data:
CREATE TABLE [dbo].[articlegroups](
[id] [int] NOT NULL,
[parentid] [int] NOT NULL,
[catlevel] [tinyint] NOT NULL,
[slug_en] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_globos_articlegroups] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (129, 0, 0, N'baby-and-birth')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (130, 129, 1, N'nursery')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (135, 130, 2, N'mattresses')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (136, 130, 2, N'cradles')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (139, 129, 1, N'pillows')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (140, 139, 2, N'headrest')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (141, 0, 0, N'computers')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (142, 141, 1, N'laptops')
When selecting rows I always have the main parentId (so catlevel 0) and the slug_en value.
In my example case I have id 129 and slug_en='cradles'. I want my query to then return:
idparentidcatlevel
12900
1301291
1361302
If I have id 129 and slug_en='pillows'. I want my query to then return:
idparentidcatlevel
12900
1391291
How can I do this?
I'm new to SQL Server. I was reading here https://msdn.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396 on recursive SQL, but I have no idea how to implement this as I just have one table and I also have 2 selection criteria (main category id and a textvalue on either level 1 or 2).

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 1

Do you have any other entries in your table that have a ParentID of "0"?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sqlserverstarter SSC Enthusiast Points: 155 More actions · Answer 2

Oeh, good question:) Yes I do actually! I've updated the INSERT queries in my post with some extra data.

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 3

Then you need to know the answer to the question before the question is known. Sorry, don't mean to sound cryptic but it's true and I can demonstrate how to do that. Before I do, I need to know another thing... is the slug_en column constrained to be unique or can there be duplicates in that column? For example, could the word "pillows" theoretically occur more than once in that column?

Also, how often does the content of the ArticleGroups table get inserted, updated, or deleted?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sqlserverstarter SSC Enthusiast Points: 155 More actions · Answer 4

Then you need to know the answer to the question before the question is known.

🙂 No idea what you mean there 🙂

To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

Also

, how often does the content of the ArticleGroups table get inserted, updated, or deleted?

Not often...new articlegroups may be added over time, but this would occur like once every few months when new categories are added.

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 5

What I meant was that the answer to all the possible questions must already be present in the data or we'll be stuck with double (or more) recursion. Recursion is bad enough for performance and resource usage especially in the form of rCTEs (recursive Common Table Expressions).

Because of your requirements and the fact that slug_en is not unique, one way to do this would be to recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want.

Another way would be to find all the child rows and recurse all of them towards their respective parent, pick the correct parent, and then pick the correct set of rows.

... over and over and over.

Or, we could calculate all of the answers for the entire table just one time when there's a change to the content of the table and not only easily solve this problem but also solve nearly any other hierarchical query that may arise from this hierarchical table.

What's your pleasure?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sqlserverstarter SSC Enthusiast Points: 155 More actions · Answer 6

Ah, makes sense now 🙂

Based on my total set of data I'd say that your suggestion to

recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want

would best fit my needs.

However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?

Thanks!

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 7

sqlserverstarter (8/16/2015)
Ah, makes sense now 🙂
Based on my total set of data I'd say that your suggestion to
recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want
would best fit my needs.

Can do.

However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?
Thanks!

Heh... data duplication within a table is certainly a bad practice but if data duplication in general were a bad practice, a whole lot of people in BI would be out of a job when their data warehouses went away. 😀 Audit tables would also be banned as a bad practice as would any form of reporting table.

To be sure, the original table could be modified to take a couple of new columns but the disadvantage there is that the original table would be locked up for several seconds while the "answers" to most of the possible hierarchical questions were pre-calculated. There are several options including having an "A" table and a "B" table where one is built while the other continues to be used and then flop a synonym to nearly instantly bring the new table online while taking the old table offline for the next update.

With the understanding that I didn't do such a redirection in the post, please see the post at the following link for how I solved a very similar problem to yours. It also has a couple of links to some articles that explain how it all works.

http://www.sqlservercentral.com/Forums/FindPost1712086.aspx

Shifting gears back to your problem, let me see if I can tap out some code for you in the manner you requested that uses the great readily consumable data you were good enough to provide.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 8

The following code will work as requested against the data you posted. It uses 144 logical reads because of the recursive CTEs and takes 1 to 5 ms duration. Adding another index would help eliminate the scans but it's still hitting the table 4 times. Let's hope that you don't need to run it often. 🙂

--===== These would be the parameters for, say, an iTVF function

DECLARE @MainParentID INT

,@Slug_En NVARCHAR(50)

;

SELECT @MainParentID = 129

,@Slug_En = N'cradles'

;

--===== Build the "Hierarchy" on-the-fly

WITH cteBuildPath AS

( --=== This is the "anchor" part of the recursive CTE.

-- The only thing it does is load the Root Nodes.

SELECT anchor.id

,anchor.parentid

,catlevel = 0

,anchor.slug_en

FROM dbo.articlegroups AS anchor

WHERE id = @MainParentID

UNION ALL

--==== This is the "recursive" part of the CTE that adds 1 for each level

SELECT recur.id

,recur.parentid

,catlevel = cte.catlevel + 1

,recur.slug_en

FROM dbo.articlegroups AS recur

JOIN cteBuildPath AS cte

ON cte.id = recur.parentid

)

,

cteReturnUpline AS

( --=== This is the anchor for the upline return

SELECT anchor.id

,anchor.parentid

,anchor.catlevel

--,anchor.slug_en

FROM cteBuildPath anchor

WHERE anchor.slug_en = @Slug_En

UNION ALL

--==== This is the recursive part that step through the upline.

SELECT recur.id

,recur.parentid

,recur.catlevel

--,recur.slug_en

FROM cteBuildPath AS recur

JOIN cteReturnUpline AS cte

ON cte.parentid = recur.id

)

--===== This produces the final output in the correct sorted order

SELECT id,parentid,catlevel

FROM cteReturnUpline

ORDER BY catlevel

;

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 9

sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.

Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 10

serg-52 (8/17/2015)
sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.
Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.

Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 11

Jeff Moden (8/17/2015)
serg-52 (8/17/2015)
sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.
Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.
Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.

Heh... another reason to hate recursive CTEs...

If you need it all in one query, like this...

--===== These would be the parameters for, say, an iTVF function

DECLARE @MainParentID INT

,@Slug_En NVARCHAR(50)

;

SELECT @MainParentID = 129

,@Slug_En = N'cradles'

;

--===== Build the "UPLINE Hierarchy" on-the-fly

WITH

cteReturnUpline AS

( --=== This is the anchor for the upline return

SELECT anchor.id

,anchor.parentid

,catlevel=0

,ChildID = anchor.id

--,anchor.slug_en

FROM dbo.articlegroups anchor

WHERE anchor.slug_en = @Slug_En

UNION ALL

--==== This is the recursive part that step through the upline.

SELECT recur.id

,recur.parentid

,cte.catlevel+1

,cte.ChildID

--,recur.slug_en

FROM dbo.articlegroups AS recur

JOIN cteReturnUpline AS cte

ON cte.parentid = recur.id

)

--===== This produces the final output in the correct sorted order

SELECT id

,parentid

,catlevel=ROW_NUMBER()OVER(PARTITION BY ChildID ORDER BY catlevel DESC)-1

FROM cteReturnUpline

WHERE ChildID = (SELECT ChildID FROM cteReturnUpline WHERE parentid = @MainParentID)

ORDER BY catlevel

;

... it still consists of a double recursion because, like a view, every time you call it, it recalculates. The performance is still a little better than the previous code because it only has to calculate two direct uplines instead of a fanout downline and an upline but it's still double recursive and it still means that you're recalculating the upline over and over even for identical calls.

I still think a Nested Sets table would be better in the long run because it would only require a relatively simple lookup.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

AKIMwilliams SSC Enthusiast Points: 118 More actions · Answer 12

Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.

declare @item varchar(50) = 'headrest'

;with cte (id,parentid,catlevel,slug_en)

as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item

union all

select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a

Join [dbo].[articlegroups] a1 on a1.id = a.parentid

)

select * from cte

order by catlevel

Jeff Moden SSC Guru Points: 1004313 More actions · Answer 13

AKIMwilliams (8/17/2015)
Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.
declare @item varchar(50) = 'headrest'
;with cte (id,parentid,catlevel,slug_en)
as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item
union all
select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a
Join [dbo].[articlegroups] a1 on a1.id = a.parentid
)
select * from cte
order by catlevel

Works fine except that slug_en isn't unique and the "upline" search code above will return multiple paths to multiple "main" parent_id's if that happens.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

AKIMwilliams SSC Enthusiast Points: 118 More actions · Answer 14

Here is my query modified. I created an additional cte that uses the initial result set. It will filter through and only show the records that uses the original id.

declare @item varchar(50) = 'mattresses',

@baseid int = 129

;with cte (id,parentid,catlevel,slug_en)

as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item

union all

select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a

Join [dbo].[articlegroups] a1 on a1.id = a.parentid

)

, cteresultset ( id,parentid,catlevel)

as (select id,parentid,catlevel from cte

where parentid = 0 And id = @baseid

union all

select a1.id,a1.parentid,a1.catlevel from cteresultset a

Join cte a1 On a.id = a1.parentid)

select * from cteresultset

order by catlevel