July 7, 2012 at 8:31 am
Hi Guys,
Hope you're all doing well!
I have a problem and will provide an abstraction (extremely simplified) for it, in the hopes someone could maybe assist.
Please let me know if I could provide any more code to assist you in helping me.
Consider:
;WITH Abstraction(pYear, pMonth, pValue, Problem) AS
(
SELECT 2000
,1
,RAND() * 1000
,'?'
UNION ALL
SELECT CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END
,CASE WHEN pMonth = 12 THEN 1 ELSE pMonth + 1 END
,RAND(CAST(CAST(NEWID()AS VARBINARY) AS INT)) * 1000
,'?'
FROM Abstraction
WHERE CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END <= 2005
)
SELECT *
FROM Abstraction
In the column labelled "Problem", I would like to be have:
If we look at the output and we are for example in January 2003, the sum of the pValue column for every previous January INCLUSIVE - (January 2000, January 2001, January, 2002, January 2003)
This can be calc'd after the CTE has run, sure - but I require this figure inside the CTE as other columns will be dependent on it.
The inter-dependence is such, that "chained" CTEs will also not work.
I have attempted SUM() OVER (PARTITION BY...) and have tried joining to the CTE in the recursive block - which seems to be strangely illegal...
As I said, this is a simplified abstraction of the problem, so I hope I can answer your questions within the realm of this example.
Kind thanks in advance!
PS.
Not that I am certain that the extended syntax for window functions in SQL Server 2012 would help - I need to have this solved with syntax available to SQL Server 2008
July 7, 2012 at 12:45 pm
Try below..Modify it as per your need.
;WITH Abstraction(pYear, pMonth, pValue, jan,feb,mar,april,may,june,july,aug,sep,oct,nov,decb) AS
(
SELECT 1999
,12
,cast (0 as float)
,cast (0 as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
UNION ALL
SELECT CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END
,CASE WHEN pMonth = 12 THEN 1 ELSE pMonth + 1 END
,pv.pvalue
,case when pmonth = 12 then isnull(jan,0.0) + pv.pvalue else ISNULL(jan,0.0) end as jan --this value is for jan if the pmonth is 12 and simialrly below
,case when pmonth = 1 then isnull(feb,0.0) + pv.pvalue else isnull(feb,0.0) end as feb
,case when pmonth = 2 then isnull(mar,0.0) + pv.pvalue else isnull(mar,0.0) end as mar
,case when pmonth = 3 then isnull(april,0.0) + pv.pvalue else isnull(april,0.0) end as april
,case when pmonth = 4 then isnull(may,0.0) + pv.pvalue else isnull(may,0.0) end as may
,case when pmonth = 5 then isnull(june,0.0) + pv.pvalue else isnull(june,0.0) end as june
,case when pmonth = 6 then isnull(july,0.0) + pv.pvalue else isnull(july,0.0) end as july
,case when pmonth = 7 then isnull(aug,0.0) + pv.pvalue else isnull(aug,0.0) end as aug
,case when pmonth = 8 then isnull(sep,0.0) + pv.pvalue else isnull(sep,0.0) end as sep
,case when pmonth = 9 then isnull(oct,0.0) + pv.pvalue else isnull(oct,0.0) end as oct
,case when pmonth = 10 then isnull(nov,0.0) + pv.pvalue else isnull(nov,0.0) end as nov
,case when pmonth = 11 then isnull(decb,0.0) + pv.pvalue else isnull(decb,0.0) end as decb
FROM Abstraction a
cross apply(select cast (RAND(CAST(CAST(NEWID()AS VARBINARY) AS int)) * 1000 as float) pvalue) pv
WHERE CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END <= 2005
)
SELECT
pYear,pMonth,pValue,
case
when pmonth = 1 then jan
when pmonth = 2 then feb
when pmonth = 3 then mar
when pmonth = 4 then april
when pmonth = 5 then may
when pmonth = 6 then june
when pmonth = 7 then july
when pmonth = 8 then aug
when pmonth = 9 then sep
when pmonth = 10 then oct
when pmonth = 11 then nov
when pmonth = 12 then decb
end as problem
FROM Abstraction
where pYear>= 2000
GulliMeel
Finding top n Worst Performing queries[/url]
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw[/url]
July 7, 2012 at 12:59 pm
Thank you 🙂
Very cool approach!
Unfortunately, time was a worry, so I have re-written my code in to a while loop, and am inserting row-by-row.
While inside the loop, I can query the table I am inserting in to (To stay with the abstraction, I can find the sum of all January values as they are readily query-able in the table I am inserting in to). Not my first choice.
Hope I can find the time to fit your approach to the solution in to my code and measure the performance.
Sadly, the actual problem is a great deal more complex than the abstracted problem.
Thanks again 🙂
July 7, 2012 at 1:02 pm
rCTE is a kind of loop.If this loop works with set of data and very few loops(recursion) results are good.but if it does row by row it wont be so great..In you case it is row by row any way..but just 72 rows or recursion so shouldnt be big issue..But 72 recursion wont be great iwith big table..
Yes..Try both approaches and see which one is better..
GulliMeel
Finding top n Worst Performing queries[/url]
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw[/url]
July 8, 2012 at 6:11 pm
CELKO (7/8/2012)
I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. Let's re-do yo0ur table and assume it is loaded with the right data types, etc.
Since that would require storing temporal data as character based data, I have to ask why would you do that? And, no... not trying to be the smart guy here. I'd really like to know, Joe.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 9, 2012 at 8:28 am
I may be a little presumptuous in saying that I think that the problem stems from trying to build a SQL solution to a very complex problem that is currently solved in Excel.
August 8, 2012 at 7:24 pm
CELKO (7/9/2012)
Since that would require storing temporal data as character based data, I have to ask why would you do that?
I am naming a time period because T-SQL does not have an INTERVAL data type. No temporal math is done with it. It is the key to INTERVALs in the Calendar and report periods table.
This convention happens to have some major advantages:
language free
fixed length & short
easy regular expressions in DDL and DML
sorts with ISO-8601 display format
It is fast in a query.
Think about looking for all the Januaries in English versus "report_month LIKE '____-01-00' instead. Anyone got a better way to name a time period?
Maybe but the real key here is that I wouldn't name a time period, Joe. Why not use the number of months since "Date zero"?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply