August 31, 2011 at 5:16 pm
Greetings, I am attempting to do something that is seemingly simply, though I have run into a few challanages and figured I should consult this knowledgable community for a best practice approach. I have found many posts on this topic but none have helped me determine the best practice approach.
In my query I need to return a Monthly total and a YTD total for each month. The Where clause may include one or many months but the YTD balance should always start from the first month of the year. The actual table contains many dimensions and other columns not listed in the sample below. The data is at the day level but the query should return the results at the month YYYYMM level. Also note that I can reference a datetime dimension instead of the interger value, in case that causes an issue.
I have not included a sample query as I am looking for feedback here.
Thanks in advance for any ideas.
Expected Results:
ACCOUNTPERIODMTHTOTALYTDTOTAL
1000201101150150
1000201102200350
1000201103300650
1000201104100750
1000201105100850
1000201106100950
10002011071001050
10002011081001150
10002011091001250
10002011101001350
10002011111001450
10002011121001550
1050201101500500
10502011025001000
10502011065001500
10502011125002000
110020110110001000
110020111210002000
Sample data
CREATE TABLE [dbo].[Totals_Test](
[account] [varchar](50) NULL,
[period] int NULL,
[value1] [float] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110101', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110102', 50)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110201', 200)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110301', 300)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110401', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110501', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110601', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110701', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110801', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20110901', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20111001', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20111101', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20111201', 100)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1000', N'20111202', 50)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1050', N'20110101', 500)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1050', N'20110201', 500)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1050', N'20110601', 500)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1050', N'20111201', 500)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1100', N'20110101', 1000)
INSERT [dbo].[Totals_Test] ([account], [period], [value1]) VALUES (N'1100', N'20111231', 1000)
Thanks again for any ideas.
Regards,
Mike
August 31, 2011 at 7:40 pm
How will this do?
; WITH ConvertedValues AS
(
SELECT [account]
, DATEADD( MM , DATEDIFF( MM , 0, CONVERT( DATETIME , STUFF( STUFF ( [period] , 5, 0,'/') , 8 ,0,'/') ) ) , 0 ) period
, [value1]
FROM Totals_Test
),
Distinct_Dates AS
(
SELECT [account] , period , SUM( value1 ) AS MTH_TOTAL
FROM ConvertedValues
GROUP BY [account] , period
)
SELECT OuterTable.account , OuterTable.period , OuterTable.MTH_TOTAL , CrsAppOutPut.YTD_TOTAL YTD_TOTAL
FROM Distinct_Dates OuterTable
CROSS APPLY
( SELECT SUM( InnerTable.value1 ) AS YTD_TOTAL
FROM ConvertedValues InnerTable
WHERE 1 = 1
AND InnerTable.period <= OuterTable.period
AND InnerTable.period >= DATEADD( yy , DATEDIFF( YY , 0, OuterTable.period ) ,0)
AND InnerTable.account = OuterTable.account
) CrsAppOutPut
ORDER BY OuterTable.account , OuterTable.period
NOw, Jeff Moden is going to kill me for not using the Quirky update method. I will check if i can post it anytime soon.
And , please please please please store DATES as DATETIME/DATE.. not any other data-type.. its just a PITA 🙁
August 31, 2011 at 8:01 pm
ColdCoffee (8/31/2011)
NOw, Jeff Moden is going to kill me for not using the Quirky update method.
Nope... no killing. I just want to know how you like your Pork Chops done. 😛
You didn't do too bad... You DID correctly preaggregate and only spawned 102 unnecessary rows with the Triangular Join to do a total on 20. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
August 31, 2011 at 8:06 pm
Michael A. (8/31/2011)
Thanks again for any ideas.Regards,
Mike
Mike,
Take a look at the following article to understand what I'd recommend to do this. If you do try the code on your own, make sure you post here so I can take a look at it and make sure you followed the rules... they don't call it the "Quirky Update" for nothing.
It's worth it... the Quirky Update will do just about any kind of running total you can throw at it on a million rows in about 3 seconds on most servers. The next fastest non-CLR method, believe it or not, is a well written forward only, read only, static cursor which takes somewhere in the neighborhood of 2 minutes.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 1, 2011 at 8:33 am
Thanks for the feedback guys!
@ColdCoffee: Great idea! I had a slight goof-up in my example though. Since we need to capture the YTD values, we would need each month to return a result for the YTD column. If Jan = 1000 and Feb = 0, we would expect to see Feb = 1000 as well in the YTD column. In regards to the pain with the integer format for period, I actually do have the DateTime format in the table. However, because this is a denormalized star-schema, we use surrogate keys for all dimensional data and if you follow DW guru’s such as Kimball, using a DatetTime field as a primary key in a fact table is frowned upon. 🙂
@jeff: thanks for the info; I assume you are referring to this http://www.sqlservercentral.com/articles/T-SQL/68467/ . Looks interesting but will take some time to digest and test.
Thanks again for the ideas, I will review and post back what I find works best.
Regards,
Mike
September 1, 2011 at 11:37 am
Sorry I forgot the URL. Yep, that's the one. The code is very similar to the same way you would do a running total in a programming language with a couple of additional rules to follow.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply