August 16, 2015 at 5:43 pm
I want to recursively select all records within a hierarchy, using the main parentid and a textvalue on level 1 OR level 2 of the subcategories.
My data:
CREATE TABLE [dbo].[articlegroups](
[id] [int] NOT NULL,
[parentid] [int] NOT NULL,
[catlevel] [tinyint] NOT NULL,
[slug_en] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_globos_articlegroups] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (129, 0, 0, N'baby-and-birth')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (130, 129, 1, N'nursery')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (135, 130, 2, N'mattresses')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (136, 130, 2, N'cradles')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (139, 129, 1, N'pillows')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (140, 139, 2, N'headrest')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (141, 0, 0, N'computers')
INSERT [dbo].[articlegroups ([id], [parentid], [catlevel], [slug_en]) VALUES (142, 141, 1, N'laptops')
When selecting rows I always have the main parentId (so catlevel 0) and the slug_en value.
In my example case I have id 129 and slug_en='cradles'. I want my query to then return:
idparentidcatlevel
12900
1301291
1361302
If I have id 129 and slug_en='pillows'. I want my query to then return:
idparentidcatlevel
12900
1391291
How can I do this?
I'm new to SQL Server. I was reading here https://msdn.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396 on recursive SQL, but I have no idea how to implement this as I just have one table and I also have 2 selection criteria (main category id and a textvalue on either level 1 or 2).
August 16, 2015 at 6:17 pm
Do you have any other entries in your table that have a ParentID of "0"?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2015 at 6:29 pm
Oeh, good question:) Yes I do actually! I've updated the INSERT queries in my post with some extra data.
August 16, 2015 at 6:37 pm
Then you need to know the answer to the question before the question is known. Sorry, don't mean to sound cryptic but it's true and I can demonstrate how to do that. Before I do, I need to know another thing... is the slug_en column constrained to be unique or can there be duplicates in that column? For example, could the word "pillows" theoretically occur more than once in that column?
Also, how often does the content of the ArticleGroups table get inserted, updated, or deleted?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2015 at 6:47 pm
Then you need to know the answer to the question before the question is known.
🙂 No idea what you mean there 🙂
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.
Also
, how often does the content of the ArticleGroups table get inserted, updated, or deleted?
Not often...new articlegroups may be added over time, but this would occur like once every few months when new categories are added.
August 16, 2015 at 7:00 pm
What I meant was that the answer to all the possible questions must already be present in the data or we'll be stuck with double (or more) recursion. Recursion is bad enough for performance and resource usage especially in the form of rCTEs (recursive Common Table Expressions).
Because of your requirements and the fact that slug_en is not unique, one way to do this would be to recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want.
Another way would be to find all the child rows and recurse all of them towards their respective parent, pick the correct parent, and then pick the correct set of rows.
... over and over and over.
Or, we could calculate all of the answers for the entire table just one time when there's a change to the content of the table and not only easily solve this problem but also solve nearly any other hierarchical query that may arise from this hierarchical table.
What's your pleasure?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2015 at 7:34 pm
Ah, makes sense now 🙂
Based on my total set of data I'd say that your suggestion to
recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want
would best fit my needs.
However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?
Thanks!
August 16, 2015 at 9:33 pm
sqlserverstarter (8/16/2015)
Ah, makes sense now 🙂Based on my total set of data I'd say that your suggestion to
recursively find all the children of the top level parent and then pick the child row(s) with the matching slug_en and recurse back to the parent to return only the rows you want
would best fit my needs.
Can do.
However, I AM also curious to how you would go about the option "calculate all of the answers for the entire table". I'm not asking you to solve it, but am curious where and how you would store the results...a new table? If so, would that not be data duplication, which is considered a bad practice?
Thanks!
Heh... data duplication within a table is certainly a bad practice but if data duplication in general were a bad practice, a whole lot of people in BI would be out of a job when their data warehouses went away. 😀 Audit tables would also be banned as a bad practice as would any form of reporting table.
To be sure, the original table could be modified to take a couple of new columns but the disadvantage there is that the original table would be locked up for several seconds while the "answers" to most of the possible hierarchical questions were pre-calculated. There are several options including having an "A" table and a "B" table where one is built while the other continues to be used and then flop a synonym to nearly instantly bring the new table online while taking the old table offline for the next update.
With the understanding that I didn't do such a redirection in the post, please see the post at the following link for how I solved a very similar problem to yours. It also has a couple of links to some articles that explain how it all works.
http://www.sqlservercentral.com/Forums/FindPost1712086.aspx
Shifting gears back to your problem, let me see if I can tap out some code for you in the manner you requested that uses the great readily consumable data you were good enough to provide.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2015 at 10:23 pm
The following code will work as requested against the data you posted. It uses 144 logical reads because of the recursive CTEs and takes 1 to 5 ms duration. Adding another index would help eliminate the scans but it's still hitting the table 4 times. Let's hope that you don't need to run it often. 🙂
--===== These would be the parameters for, say, an iTVF function
DECLARE @MainParentID INT
,@Slug_En NVARCHAR(50)
;
SELECT @MainParentID = 129
,@Slug_En = N'cradles'
;
--===== Build the "Hierarchy" on-the-fly
WITH cteBuildPath AS
( --=== This is the "anchor" part of the recursive CTE.
-- The only thing it does is load the Root Nodes.
SELECT anchor.id
,anchor.parentid
,catlevel = 0
,anchor.slug_en
FROM dbo.articlegroups AS anchor
WHERE id = @MainParentID
UNION ALL
--==== This is the "recursive" part of the CTE that adds 1 for each level
SELECT recur.id
,recur.parentid
,catlevel = cte.catlevel + 1
,recur.slug_en
FROM dbo.articlegroups AS recur
JOIN cteBuildPath AS cte
ON cte.id = recur.parentid
)
,
cteReturnUpline AS
( --=== This is the anchor for the upline return
SELECT anchor.id
,anchor.parentid
,anchor.catlevel
--,anchor.slug_en
FROM cteBuildPath anchor
WHERE anchor.slug_en = @Slug_En
UNION ALL
--==== This is the recursive part that step through the upline.
SELECT recur.id
,recur.parentid
,recur.catlevel
--,recur.slug_en
FROM cteBuildPath AS recur
JOIN cteReturnUpline AS cte
ON cte.parentid = recur.id
)
--===== This produces the final output in the correct sorted order
SELECT id,parentid,catlevel
FROM cteReturnUpline
ORDER BY catlevel
;
--Jeff Moden
Change is inevitable... Change for the better is not.
August 17, 2015 at 3:26 am
sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.
Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.
August 17, 2015 at 7:31 am
serg-52 (8/17/2015)
sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.
Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 17, 2015 at 7:57 am
Jeff Moden (8/17/2015)
serg-52 (8/17/2015)
sqlserverstarter (8/16/2015)
To your questions: the slug_en can contain duplicates, but never within the same hierarchy, so "pillows" could occur multiple times but will always have a different parentid.Based on this rule only one recursion is needed. Start with slug_en= "pillows" and climb up along the hierachy . Then just check requierd ParentId is in the result set.
Good idea. Would only need to keep track of the child ID to replay the correct downline if more than one "pillows" occurred.
Heh... another reason to hate recursive CTEs...
If you need it all in one query, like this...
--===== These would be the parameters for, say, an iTVF function
DECLARE @MainParentID INT
,@Slug_En NVARCHAR(50)
;
SELECT @MainParentID = 129
,@Slug_En = N'cradles'
;
--===== Build the "UPLINE Hierarchy" on-the-fly
WITH
cteReturnUpline AS
( --=== This is the anchor for the upline return
SELECT anchor.id
,anchor.parentid
,catlevel=0
,ChildID = anchor.id
--,anchor.slug_en
FROM dbo.articlegroups anchor
WHERE anchor.slug_en = @Slug_En
UNION ALL
--==== This is the recursive part that step through the upline.
SELECT recur.id
,recur.parentid
,cte.catlevel+1
,cte.ChildID
--,recur.slug_en
FROM dbo.articlegroups AS recur
JOIN cteReturnUpline AS cte
ON cte.parentid = recur.id
)
--===== This produces the final output in the correct sorted order
SELECT id
,parentid
,catlevel=ROW_NUMBER()OVER(PARTITION BY ChildID ORDER BY catlevel DESC)-1
FROM cteReturnUpline
WHERE ChildID = (SELECT ChildID FROM cteReturnUpline WHERE parentid = @MainParentID)
ORDER BY catlevel
;
... it still consists of a double recursion because, like a view, every time you call it, it recalculates. The performance is still a little better than the previous code because it only has to calculate two direct uplines instead of a fanout downline and an upline but it's still double recursive and it still means that you're recalculating the upline over and over even for identical calls.
I still think a Nested Sets table would be better in the long run because it would only require a relatively simple lookup.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 17, 2015 at 8:33 am
Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.
declare @item varchar(50) = 'headrest'
;with cte (id,parentid,catlevel,slug_en)
as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item
union all
select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a
Join [dbo].[articlegroups] a1 on a1.id = a.parentid
)
select * from cte
order by catlevel
August 17, 2015 at 12:18 pm
AKIMwilliams (8/17/2015)
Below is a little script I created which seems to work the way you want. Check it out and see if it works for you.declare @item varchar(50) = 'headrest'
;with cte (id,parentid,catlevel,slug_en)
as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item
union all
select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a
Join [dbo].[articlegroups] a1 on a1.id = a.parentid
)
select * from cte
order by catlevel
Works fine except that slug_en isn't unique and the "upline" search code above will return multiple paths to multiple "main" parent_id's if that happens.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 17, 2015 at 7:10 pm
Here is my query modified. I created an additional cte that uses the initial result set. It will filter through and only show the records that uses the original id.
declare @item varchar(50) = 'mattresses',
@baseid int = 129
;with cte (id,parentid,catlevel,slug_en)
as (select id,parentid,catlevel,slug_en From [dbo].[articlegroups] Where slug_en = @item
union all
select a1.id,a1.parentid,a1.catlevel,a1.slug_en From cte a
Join [dbo].[articlegroups] a1 on a1.id = a.parentid
)
, cteresultset ( id,parentid,catlevel)
as (select id,parentid,catlevel from cte
where parentid = 0 And id = @baseid
union all
select a1.id,a1.parentid,a1.catlevel from cteresultset a
Join cte a1 On a.id = a1.parentid)
select * from cteresultset
order by catlevel
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply