April 9, 2012 at 4:35 pm
Greetings all.
I have used CTEs so I am familiar with their usage however, I need to create a recursive CTE.
Before I dive in and pull whats left of my hair out, I want to make sure this is the correct situation to use this in.
Currently I am creating data for my company's forecasting software. The data on this works fine and is as expected by the end users however, we have run into a snag regarding super-session (new items superseding old items).
I will try to create some fake data here soon so we can test this out but I like I said earlier I want to make sure this is the right situation for a recursive CTE, or perhaps I need a different method.
So lets say our current Item is abc10. Last years Item is abc9, and the year prior is Abc8. Next years will be abc11. For our forecasting software, we want all history (invoice data) 'rolled up' into Abc10. Additionally, once it is all rolled up into Abc10, we will create an additional data set that will roll up Abc10 data into Abc11 for future forecasting. So we have 2 data sets, all invoices summarized at Abc10 level, then that summary at Abc11 level.
The way the data is listed now is there is one item master record for each item, and there is a column stating which item it supersedes. So for instance Abc10's record would have Abc9's item code in the supersedes column.
So now this is the part that I think requires the recursive cte...
Would this be possible via a recursive cte? If so how would I go about building that?
Thanks in advance for the help!
I will try to make some fake data here to work with shortly.
Link to my blog http://notyelf.com/
April 9, 2012 at 4:49 pm
Can you set up sample tables and sample data for us to work on your query?
Also, are u asking how to design the table structure or how to retreive the values u alreay have in your system?
April 9, 2012 at 4:52 pm
You might want to take a good look at the following articles before you use a recursive CTE.
http://www.sqlservercentral.com/articles/Data+Generation/87901/
http://www.sqlservercentral.com/articles/T-SQL/74118/
--Jeff Moden
Change is inevitable... Change for the better is not.
April 9, 2012 at 5:35 pm
I will put along some sample data here today. Sorry this is not the only critical thing I have to do today! π
I can put together some sample data and the expected results along here shortly.
In regards to the structure vs how to retrieve...all the tables are in place and structured, I just need to return the data in an expected state π
Link to my blog http://notyelf.com/
April 9, 2012 at 5:38 pm
Good, with the sample data, sample structure, and the desired result, im sure we will be able to help you out!
And don forget to take a peek at the article Jeff points to. Those are awesome creations π
April 9, 2012 at 5:57 pm
Here is the code that creates the sample data
IF (SELECT OBJECT_ID('TEMPDB..#FakeItems')) is not null
DROP TABLE #FakeItems
CREATE TABLE #FakeItems (Item_Code nvarchar(20), Superseeds nvarchar(20))
GO
INSERT #FakeItems
SELECT 'Abc8','' UNION ALL
SELECT 'Abc9','Abc8' UNION ALL
SELECT 'Abc10','Abc9' UNION ALL
SELECT 'Abc11','Abc10'
GO
IF (SELECT OBJECT_ID('TEMPDB..#FakeInvoice')) is not null
DROP TABLE #FakeInvoice
CREATE Table #FakeInvoice (Item_Code nvarchar(20), Period int, Sales int, qty int)
GO
INSERT #FakeInvoice
SELECT 'Abc8',201201,904,5 UNION ALL
SELECT 'Abc8',201202,430,3 UNION ALL
SELECT 'Abc8',201203,812,10 UNION ALL
SELECT 'Abc9',201201,993,5 UNION ALL
SELECT 'Abc9',201202,736,7 UNION ALL
SELECT 'Abc9',201203,285,9 UNION ALL
SELECT 'Abc10',201201,330,2 UNION ALL
SELECT 'Abc10',201202,560,5 UNION ALL
SELECT 'Abc10',201203,802,4
GO
-- just returning the above data
SELECT * FROM #FakeItems
SELECT * FROM #FakeInvoice
--------------------------------------
The expected results are as follows
1st data set returns
Item Code PeriodSalesQty
abc10201201222712
abc10201202172615
abc10201203189923
2nd data set returns
abc11201201222712
abc11201202172615
abc11201203189923
It should be noted that Abc11 is 'known' to be a future item as it has no data in the invoice table.
Hopefully that is clear!
I have read the other 2 articles that Jeff posted before but I read them again just in case :). If there is a better method for this than a recursive CTE, then I am all for it!
Link to my blog http://notyelf.com/
April 9, 2012 at 6:12 pm
One last question, how to determice abc10 and abc11 are base for the data sets 1 and 2 respectively ?
April 9, 2012 at 6:58 pm
CELKO (4/9/2012)
This is more of a temporal problem than a recursive problem.A useful idiom is a report period calendar. It gives a name to a range of dates.
CREATE TABLE Report_Periods
(report_name VARCHAR(30) NOT NULL PRIMARY KEY,
report_start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
report_end_date DATE NOT NULL,
CONSTRAINT date_ordering
CHECK (report_start_date <= report_end_date),
etc);
These report periods can overlap; a fiscal quarter will be contained in the range of its fiscal year. There can be gaps between them; we have to wait a year between each βAnnual Going out Of Business Sale!β and there might be long stretches of time without any special sales. But we want to know their ranges so that the table is fairly constant once it is created.
I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL.
Can you get it from here or do you need more help?
Unfortunately, Mr. Celko, this isn't MySQL. This is MS SQL Server and it doesn't have the same capabilities as MySQL or Oracle.
April 9, 2012 at 8:17 pm
My apologies Mr. Celko but I disagree with you, not on the grounds that you're applying MySQL to a SQL Server problem, but rather to agree with the OP that this is a classic recursive CTE problem as it applies to the hierarchy of superceded items.
My solution illustrates this.
DECLARE @FakeItems TABLE (Item_Code nvarchar(20), Superseeds nvarchar(20))
INSERT @FakeItems (Item_Code, Superseeds)
SELECT 'Abc8',NULL UNION ALL
SELECT 'Abc9','Abc8' UNION ALL
SELECT 'Abc10','Abc9' UNION ALL
SELECT 'Abc11','Abc10'
DECLARE @FakeInvoice TABLE (Item_Code nvarchar(20), Period int, Sales int, qty int)
INSERT @FakeInvoice
SELECT 'Abc8',201201,904,5 UNION ALL
SELECT 'Abc8',201202,430,3 UNION ALL
SELECT 'Abc8',201203,812,10 UNION ALL
SELECT 'Abc9',201201,993,5 UNION ALL
SELECT 'Abc9',201202,736,7 UNION ALL
SELECT 'Abc9',201203,285,9 UNION ALL
SELECT 'Abc10',201201,330,2 UNION ALL
SELECT 'Abc10',201202,560,5 UNION ALL
SELECT 'Abc10',201203,802,4
;WITH ItemHier AS (
SELECT Item_Code, Superseeds
FROM @FakeItems
UNION ALL
SELECT h.Item_Code, i.Superseeds
FROM @FakeItems i
INNER JOIN ItemHier h ON h.Superseeds = i.Item_Code
)
SELECT h.Item_Code, Period, SUM(Sales) AS Sales, SUM(Qty) AS Qty
FROM @FakeInvoice i
INNER JOIN ItemHier h ON i.Item_Code = h.Superseeds OR
(h.item_code = i.item_code and h.Superseeds IS NULL)
WHERE h.Item_Code = 'abc10'
GROUP BY h.Item_Code, Period
ORDER BY h.Item_Code, Period
To get the second results set, just substitute h.item_code = 'abc10' with h.item_code = 'abc11' or if you want the both, make it h.item_code IN ('abc10', 'abc11')
Edit: Note: Being overly picky here but superseeds should be correctly spelled supercedes.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 9, 2012 at 8:26 pm
dwain.c (4/9/2012)
My solution illustrates this.
DECLARE @FakeItems TABLE (Item_Code nvarchar(20), Superseeds nvarchar(20))
INSERT @FakeItems (Item_Code, Superseeds)
SELECT 'Abc8',NULL UNION ALL
SELECT 'Abc9','Abc8' UNION ALL
SELECT 'Abc10','Abc9' UNION ALL
SELECT 'Abc11','Abc10'
DECLARE @FakeInvoice TABLE (Item_Code nvarchar(20), Period int, Sales int, qty int)
INSERT @FakeInvoice
SELECT 'Abc8',201201,904,5 UNION ALL
SELECT 'Abc8',201202,430,3 UNION ALL
SELECT 'Abc8',201203,812,10 UNION ALL
SELECT 'Abc9',201201,993,5 UNION ALL
SELECT 'Abc9',201202,736,7 UNION ALL
SELECT 'Abc9',201203,285,9 UNION ALL
SELECT 'Abc10',201201,330,2 UNION ALL
SELECT 'Abc10',201202,560,5 UNION ALL
SELECT 'Abc10',201203,802,4
;WITH ItemHier AS (
SELECT Item_Code, Superseeds
FROM @FakeItems
UNION ALL
SELECT h.Item_Code, i.Superseeds
FROM @FakeItems i
INNER JOIN ItemHier h ON h.Superseeds = i.Item_Code
)
SELECT h.Item_Code, Period, SUM(Sales) AS Sales, SUM(Qty) AS Qty
FROM @FakeInvoice i
INNER JOIN ItemHier h ON i.Item_Code = h.Superseeds OR
(h.item_code = i.item_code and h.Superseeds IS NULL)
WHERE h.Item_Code = 'abc10'
GROUP BY h.Item_Code, Period
ORDER BY h.Item_Code, Period
To get the second results set, just substitute h.item_code = 'abc10' with h.item_code = 'abc11'
My solution was very vry similar to this one! So im refraining from posting it π
April 9, 2012 at 8:50 pm
Cold Coffee: ITem 10 is the current most item that has data in the invoice table
I will create seperate scripts one for summing everything up that is 'current', and the second one will take that and do the same thing to 'future' planned items.
Celko: no it is not a temporal problem. For the sake of this sample set I added things in this simple matter but these calculations will be over time. I can safely say this is not a temporal problem. Also it is not a MsSql problem like Lynn had mentioned π
Dwain: that works as expected thank you! One caveat however, is the filter is going to be an issue since this will span across some 50k items. I just used this as a sample set :). I DO however have a sample set of all the 'current' items so I believe this might put me on the right path and I am pretty sure I can work my way from here! I will let you know if I can't though!
Link to my blog http://notyelf.com/
April 9, 2012 at 9:06 pm
One caveat however, is the filter is going to be an issue since this will span across some 50k items. I just used this as a sample set :). I DO however have a sample set of all the 'current' items so I believe this might put me on the right path and I am pretty sure I can work my way from here! I will let you know if I can't though!
I don't want to come across as an expert on recursive CTEs here (believe me, I am not) but I think that the issue that will impact performance is not so much the number of items, but rather the number of items that are superceded.
As you said, you'll need to run some tests to see what happens.
I did happen to take a look at the Execution Plan and I noticed that the solution I proposed does 2 table scans on the Fake_Items table and one on the Fake_Invoices table. Make sure you look into proper indexing if the query runs slow to see if those table scans can be converted to index seeks.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 9, 2012 at 9:49 pm
shannonjk (4/9/2012)
I will put along some sample data here today. Sorry this is not the only critical thing I have to do today! π
Not to worry. Wasn't critical for me either. π
--Jeff Moden
Change is inevitable... Change for the better is not.
April 9, 2012 at 10:10 pm
I've not sussed the problem yet but the recursive method is going to get pretty darned slow as Dwain indicated. The recursive CTE essentially causes a cross join (Cartesian Product) between the number of unique items in the Item_Code column and the number of rows in the fake invoice table.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 9, 2012 at 11:09 pm
Using the provided data except as Temp Tables instead of Table Variables, here's a recursive CTE solution without the Cartesian Product problem. Notice the "HRoot" column which allowed me to greatly simplify the outer query, as well. Just add the correct indexes and this should really fly for you. This will work for any value of Item_Code from the "FakeItems" table. The sums will go up to and including that Item_Code.
CREATE TABLE #FakeItems(Item_Code nvarchar(20), Superseeds nvarchar(20))
INSERT #FakeItems (Item_Code, Superseeds)
SELECT 'Abc8',NULL UNION ALL
SELECT 'Abc9','Abc8' UNION ALL
SELECT 'Abc10','Abc9' UNION ALL
SELECT 'Abc11','Abc10'
CREATE TABLE #FakeInvoice (Item_Code nvarchar(20), Period int, Sales int, qty int)
INSERT #FakeInvoice
SELECT 'Abc8',201201,904,5 UNION ALL
SELECT 'Abc8',201202,430,3 UNION ALL
SELECT 'Abc8',201203,812,10 UNION ALL
SELECT 'Abc9',201201,993,5 UNION ALL
SELECT 'Abc9',201202,736,7 UNION ALL
SELECT 'Abc9',201203,285,9 UNION ALL
SELECT 'Abc10',201201,330,2 UNION ALL
SELECT 'Abc10',201202,560,5 UNION ALL
SELECT 'Abc10',201203,802,4
;
WITH ItemHier AS
(
SELECT Item_Code, Superseeds, HRoot = Item_Code
FROM #FakeItems
WHERE Item_Code = 'Abc10'
UNION ALL
SELECT i.Item_Code, i.Superseeds, h.HRoot
FROM #FakeItems i
INNER JOIN ItemHier h ON h.Superseeds = i.Item_Code
)
SELECT h.HRoot, i.Period, Sales = SUM(i.Sales), Qty = SUM(i.Qty)
FROM #FakeInvoice i
INNER JOIN ItemHier h ON i.Item_Code = h.Item_Code
GROUP BY h.HRoot, i.Period
ORDER BY h.HRoot, i.Period
;
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 35 total)
You must be logged in to reply to this topic. Login to reply