March 30, 2015 at 10:37 pm
Could someone please! show me how to return the 3 month rolling average count per username? This means, that if jan = 4, feb = 5, mar = 5, then 3 month rolling average will be 7 in April. And if apr = 6, the May rolling average will be 8.
Columns are four:
username, current_tenure, move_in_date, and count.
DDL (create script generated by SSMS from sample table I created, which is why the move_in_date is in hex form. When run it's converted to date. Total size of table 22 rows, 4 columns.)
CREATE TABLE [dbo].[countHistory](
[username] [varchar](50) NULL,
[current_tenure] [int] NULL,
[move_in_date] [smalldatetime] NULL,
[Cnt_Lead_id] [int] NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA2C50000 AS SmallDateTime), 3)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA2E10000 AS SmallDateTime), 13)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3000000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA31E0000 AS SmallDateTime), 7)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA33D0000 AS SmallDateTime), 12)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA35B0000 AS SmallDateTime), 8)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA37A0000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3990000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3B70000 AS SmallDateTime), 8)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3D60000 AS SmallDateTime), 4)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'debiv', 19, CAST(0xA3F40000 AS SmallDateTime), 4)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA2C50000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA2E10000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3000000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA31E0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA33D0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA35B0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA37A0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3990000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3B70000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3D60000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3F40000 AS SmallDateTime), 6)
March 31, 2015 at 12:22 am
Check out this article by Dwain Camps...
there's an example using 2008 that doesn't use windowing functions.
This is my attempt at applying what he wrote... maybe it's because it's stupid late, but it looks waaaay wrong.
SELECT a.[username]
, a.[move_in_date]
, Value = MAX(CASE WHEN a.[move_in_date] = b.[move_in_date] THEN a.[Cnt_Lead_id] END)
, Rolling3Months = CASE
WHEN ROW_NUMBER() OVER (ORDER BY a.[move_in_date]) < 3
THEN NULL
ELSE AVG(b.[Cnt_Lead_id])
END
FROM countHistory a
JOIN countHistory b ON b.[move_in_date] BETWEEN DATEADD(month,-2,a.[move_in_date]) AND a.[move_in_date]
GROUP BY a.[username], a.[move_in_date]
ORDER BY a.[username], a.[move_in_date];
March 31, 2015 at 12:35 am
Hi pietlinden, for one username it should look like this.
For example April's rolling average is 9+13+3/3. May's is 7+9+13/3, June's is 12+7+9/3
username---move_in_date-----------cnt_lead_id------Rolling 3 month Avg
DebiV--------2014-02-01 00:00:00------------3----------NULL
DebiV--------2014-03-01 00:00:00------------13---------NULL
DebiV--------2014-04-01 00:00:00------------9----------8
DebiV--------2014-05-01 00:00:00------------7----------9
DebiV--------2014-06-01 00:00:00------------12---------9
DebiV--------2014-07-01 00:00:00------------8----------9
DebiV--------2014-08-01 00:00:00------------9----------9
DebiV--------2014-09-01 00:00:00------------11---------9
DebiV--------2014-10-01 00:00:00------------8----------9
DebiV--------2014-11-01 00:00:00------------4----------7
DebiV--------2014-12-01 00:00:00------------4----------5
I rounded down, in the event of a decimal.
I have Itzik Ben-Gan's 2008 querying book , but nothing in index for 'moving', 'rolling', or under 'average' for same.
March 31, 2015 at 12:56 am
took this query from stackoverflow and substituted my values. But, there's no concept of 3 month rolling average.
;WITH cte (rn, username, move_in_date, Cnt_Lead_id) AS (
SELECT
rn = ROW_NUMBER() OVER (ORDER BY move_in_date),
username,
move_in_date,
Cnt_Lead_id
FROM [dbo].[countHistory]
)
SELECT
username,
move_in_date,
Cnt_Lead_id,
movagv = (
SELECT AVG(Cnt_Lead_id)
FROM cte AS inner_ref
WHERE inner_ref.rn BETWEEN outer_ref.rn-3 AND outer_ref.rn
)
FROM cte AS outer_ref
ORDER BY username, move_in_date
If you run this against the sampletable DDL you'll see it doesn't do 3 month rolling.
March 31, 2015 at 1:49 am
I had similar experience with simple talk.
I tried as follows but at the very least, do not know how to group by username , etc.
SELECT username
,a.move_in_date
,a.Cnt_Lead_id
,Rolling12Months=CASE
WHEN ROW_NUMBER() OVER (ORDER BY a.move_in_date) < 3
THEN NULL
ELSE a.Cnt_Lead_id + b.Cnt_Lead_id
END
FROM [countHistory] a
CROSS APPLY
(
SELECT Cnt_Lead_id=SUM(Cnt_Lead_id)
FROM
(
SELECT TOP 2 username, b.move_in_date, Cnt_Lead_id
FROM [countHistory] b
WHERE b.move_in_date < a.move_in_date
ORDER BY b.move_in_date DESC
) b
) b
ORDER BY a.move_in_date;
March 31, 2015 at 2:47 am
Finally dug up Itzik's book on T-SQL 2008...
-- from p.458 "Inside SQL Server 2008 T-SQL Querying" By Itzik Ben-Gan
SELECT 01.empID,
CONVERT(VARCHAR(7), o1.ordMonth, 121) AS toMonth,
o1.qty AS qtyThisMonth,
SUM(o2.qty) AS totalQty,
CAST(AVG(1 * o2.qty) AS NUMERIC(12,2)) AS avgQty
FROM dbo.EmpOrders AS o1
JOIN dbo.EmpOrders o2
ON o2.empID = 01.empID
AND (02.ordmonth > DATEADD(month, -3, o1.ordMonth)
AND o2.ordMonth <= o1.ordMonth)
GROUP BY o1.empID, o1.ordMonth, o1.qty
ORDER BY o1.empID, o1.ordMonth;
Tweaking IBG's solution to your structure...
SELECT x1.username
, x1.move_in_date
, SUM(x2.cnt_lead_id) as totalqty
, CAST(AVG(1 * x2.cnt_lead_id) aS NUMERIC(12,2)) AS avgQty
FROM dbo.countHistory x1
JOIN dbo.countHistory x2
ON x2.username = x1.username
AND x2.move_in_date<=x1.move_in_date
GROUP BY x1.username, x1.move_in_date, x1.cnt_lead_id
ORDER BY x1.username, x1.move_in_date;
I think that's right... but anything involving my brain and T-SQL at 4AM could hardly be called "thinking".
March 31, 2015 at 7:14 am
pietlinden,
Aha, Itzik Ben-Gan calls this 'sliding' and 'cumulative'
Thanks for looking that up. Your query is for 'sliding aggregate'. I turned to pg 453 and copied the following pattern which appears to work for 'cumulative aggregate'.
SELECT o1.username,
o1.move_in_date AS toMonth,
o1.Cnt_Lead_id AS CntThisMonth,
CAST(AVG(1. * o2.Cnt_Lead_id) AS NUMERIC (4,0)) AS '3MonthRollingAvgCnt'
FROM dbo.countHistory o1
JOIN dbo.countHistory o2
ON o2.username = o1.username
AND (o2.move_in_date > DATEADD(month, -3, o1.move_in_date)
AND o2.move_in_date <= o1.move_in_date)
GROUP BY o1.username, o1.move_in_date, o1.Cnt_Lead_id
ORDER BY o1.username, o1.move_in_date;
IBG says "the main difference between the solution for cumulative aggregates and the solution for sliding aggregates is in the join condition (or in the subquery's filter in the case of the alternate solution using subqueries)....."I am trying to understand still, but it does appear I have a query that will now further me along in my goal today. Thanks much.
March 31, 2015 at 8:21 am
If as originally was stated you need preceeding months (jan, feb, march) for april, april not included, then a minor tweak is needed
LEFT JOIN
ON ... >=
... <
March 31, 2015 at 9:05 am
thank you serg for looking over my post. OK, will take a look.
March 31, 2015 at 6:45 pm
pietlinden (3/31/2015)
Check out this article by Dwain Camps...
You beat me to it.
I would have gone with the CROSS APPLY TOP method as this is SQL 2008 instead of the self-JOIN.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 31, 2015 at 7:00 pm
Took me a while to find it... and stop sitting on Itzik's book and actually read it. Got any good references for learning the hardert part of T-SQL queries (I guess just read Itzik's book?)
March 31, 2015 at 7:53 pm
pietlinden (3/31/2015)
Took me a while to find it... and stop sitting on Itzik's book and actually read it. Got any good references for learning the hardert part of T-SQL queries (I guess just read Itzik's book?)
Yeah, one of these days I probably should too. Mostly I've just read his articles.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 1, 2015 at 11:34 am
I vote for Itzhik. He also provides brain exercises at the end of the book.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply