running 3 month average

Question

Post reply

running 3 month average

KoldCoffee

SSCoach

Points: 18971
More actions
March 30, 2015 at 10:37 pm

#302535

Could someone please! show me how to return the 3 month rolling average count per username? This means, that if jan = 4, feb = 5, mar = 5, then 3 month rolling average will be 7 in April. And if apr = 6, the May rolling average will be 8.
Columns are four:
username, current_tenure, move_in_date, and count.
DDL (create script generated by SSMS from sample table I created, which is why the move_in_date is in hex form. When run it's converted to date. Total size of table 22 rows, 4 columns.)
CREATE TABLE [dbo].[countHistory](
[username] [varchar](50) NULL,
[current_tenure] [int] NULL,
[move_in_date] [smalldatetime] NULL,
[Cnt_Lead_id] [int] NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA2C50000 AS SmallDateTime), 3)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA2E10000 AS SmallDateTime), 13)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3000000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA31E0000 AS SmallDateTime), 7)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA33D0000 AS SmallDateTime), 12)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA35B0000 AS SmallDateTime), 8)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA37A0000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3990000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3B70000 AS SmallDateTime), 8)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'DebiV', 19, CAST(0xA3D60000 AS SmallDateTime), 4)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'debiv', 19, CAST(0xA3F40000 AS SmallDateTime), 4)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA2C50000 AS SmallDateTime), 9)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA2E10000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3000000 AS SmallDateTime), 11)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA31E0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA33D0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA35B0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA37A0000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3990000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3B70000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3D60000 AS SmallDateTime), 10)
INSERT [dbo].[countHistory] ([username], [current_tenure], [move_in_date], [Cnt_Lead_id]) VALUES (N'laurae', 23, CAST(0xA3F40000 AS SmallDateTime), 6)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63509 More actions · Answer 1

Check out this article by Dwain Camps...

https://www.simple-talk.com/sql/t-sql-programming/calculating-values-within-a-rolling-window-in-transact-sql/

there's an example using 2008 that doesn't use windowing functions.

This is my attempt at applying what he wrote... maybe it's because it's stupid late, but it looks waaaay wrong.

SELECT a.[username]

, a.[move_in_date]

, Value = MAX(CASE WHEN a.[move_in_date] = b.[move_in_date] THEN a.[Cnt_Lead_id] END)

, Rolling3Months = CASE

WHEN ROW_NUMBER() OVER (ORDER BY a.[move_in_date]) < 3

THEN NULL

ELSE AVG(b.[Cnt_Lead_id])

END

FROM countHistory a

JOIN countHistory b ON b.[move_in_date] BETWEEN DATEADD(month,-2,a.[move_in_date]) AND a.[move_in_date]

GROUP BY a.[username], a.[move_in_date]

ORDER BY a.[username], a.[move_in_date];

KoldCoffee SSCoach Points: 18971 More actions · Answer 2

Hi pietlinden, for one username it should look like this.

For example April's rolling average is 9+13+3/3. May's is 7+9+13/3, June's is 12+7+9/3

username---move_in_date-----------cnt_lead_id------Rolling 3 month Avg

DebiV--------2014-02-01 00:00:00------------3----------NULL

DebiV--------2014-03-01 00:00:00------------13---------NULL

DebiV--------2014-04-01 00:00:00------------9----------8

DebiV--------2014-05-01 00:00:00------------7----------9

DebiV--------2014-06-01 00:00:00------------12---------9

DebiV--------2014-07-01 00:00:00------------8----------9

DebiV--------2014-08-01 00:00:00------------9----------9

DebiV--------2014-09-01 00:00:00------------11---------9

DebiV--------2014-10-01 00:00:00------------8----------9

DebiV--------2014-11-01 00:00:00------------4----------7

DebiV--------2014-12-01 00:00:00------------4----------5

I rounded down, in the event of a decimal.

I have Itzik Ben-Gan's 2008 querying book , but nothing in index for 'moving', 'rolling', or under 'average' for same.

KoldCoffee SSCoach Points: 18971 More actions · Answer 3

took this query from stackoverflow and substituted my values. But, there's no concept of 3 month rolling average.

;WITH cte (rn, username, move_in_date, Cnt_Lead_id) AS (

SELECT

rn = ROW_NUMBER() OVER (ORDER BY move_in_date),

username,

move_in_date,

Cnt_Lead_id

FROM [dbo].[countHistory]

)

SELECT

username,

move_in_date,

Cnt_Lead_id,

movagv = (

SELECT AVG(Cnt_Lead_id)

FROM cte AS inner_ref

WHERE inner_ref.rn BETWEEN outer_ref.rn-3 AND outer_ref.rn

)

FROM cte AS outer_ref

ORDER BY username, move_in_date

If you run this against the sampletable DDL you'll see it doesn't do 3 month rolling.

KoldCoffee SSCoach Points: 18971 More actions · Answer 4

I had similar experience with simple talk.

I tried as follows but at the very least, do not know how to group by username , etc.

SELECT username

,a.move_in_date

,a.Cnt_Lead_id

,Rolling12Months=CASE

WHEN ROW_NUMBER() OVER (ORDER BY a.move_in_date) < 3

THEN NULL

ELSE a.Cnt_Lead_id + b.Cnt_Lead_id

END

FROM [countHistory] a

CROSS APPLY

(

SELECT Cnt_Lead_id=SUM(Cnt_Lead_id)

FROM

(

SELECT TOP 2 username, b.move_in_date, Cnt_Lead_id

FROM [countHistory] b

WHERE b.move_in_date < a.move_in_date

ORDER BY b.move_in_date DESC

) b

ORDER BY a.move_in_date;

pietlinden SSC Guru Points: 63509 More actions · Answer 5

Finally dug up Itzik's book on T-SQL 2008...

-- from p.458 "Inside SQL Server 2008 T-SQL Querying" By Itzik Ben-Gan

SELECT 01.empID,

CONVERT(VARCHAR(7), o1.ordMonth, 121) AS toMonth,

o1.qty AS qtyThisMonth,

SUM(o2.qty) AS totalQty,

CAST(AVG(1 * o2.qty) AS NUMERIC(12,2)) AS avgQty

FROM dbo.EmpOrders AS o1

JOIN dbo.EmpOrders o2

ON o2.empID = 01.empID

AND (02.ordmonth > DATEADD(month, -3, o1.ordMonth)

AND o2.ordMonth <= o1.ordMonth)

GROUP BY o1.empID, o1.ordMonth, o1.qty

ORDER BY o1.empID, o1.ordMonth;

Tweaking IBG's solution to your structure...

SELECT x1.username

, x1.move_in_date

, SUM(x2.cnt_lead_id) as totalqty

, CAST(AVG(1 * x2.cnt_lead_id) aS NUMERIC(12,2)) AS avgQty

FROM dbo.countHistory x1

JOIN dbo.countHistory x2

ON x2.username = x1.username

AND x2.move_in_date<=x1.move_in_date

GROUP BY x1.username, x1.move_in_date, x1.cnt_lead_id

ORDER BY x1.username, x1.move_in_date;

I think that's right... but anything involving my brain and T-SQL at 4AM could hardly be called "thinking".

KoldCoffee SSCoach Points: 18971 More actions · Answer 6

pietlinden,

Aha, Itzik Ben-Gan calls this 'sliding' and 'cumulative'

Thanks for looking that up. Your query is for 'sliding aggregate'. I turned to pg 453 and copied the following pattern which appears to work for 'cumulative aggregate'.

SELECT o1.username,

o1.move_in_date AS toMonth,

o1.Cnt_Lead_id AS CntThisMonth,

CAST(AVG(1. * o2.Cnt_Lead_id) AS NUMERIC (4,0)) AS '3MonthRollingAvgCnt'

FROM dbo.countHistory o1

JOIN dbo.countHistory o2

ON o2.username = o1.username

AND (o2.move_in_date > DATEADD(month, -3, o1.move_in_date)

AND o2.move_in_date <= o1.move_in_date)

GROUP BY o1.username, o1.move_in_date, o1.Cnt_Lead_id

ORDER BY o1.username, o1.move_in_date;

IBG says "the main difference between the solution for cumulative aggregates and the solution for sliding aggregates is in the join condition (or in the subquery's filter in the case of the alternate solution using subqueries)....."I am trying to understand still, but it does appear I have a query that will now further me along in my goal today. Thanks much.

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 7

If as originally was stated you need preceeding months (jan, feb, march) for april, april not included, then a minor tweak is needed

LEFT JOIN

ON ... >=

... <

KoldCoffee SSCoach Points: 18971 More actions · Answer 8

thank you serg for looking over my post. OK, will take a look.

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

pietlinden (3/31/2015)
Check out this article by Dwain Camps...
https://www.simple-talk.com/sql/t-sql-programming/calculating-values-within-a-rolling-window-in-transact-sql/

You beat me to it.

I would have gone with the CROSS APPLY TOP method as this is SQL 2008 instead of the self-JOIN.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

pietlinden SSC Guru Points: 63509 More actions · Answer 10

Took me a while to find it... and stop sitting on Itzik's book and actually read it. Got any good references for learning the hardert part of T-SQL queries (I guess just read Itzik's book?)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 11

pietlinden (3/31/2015)
Took me a while to find it... and stop sitting on Itzik's book and actually read it. Got any good references for learning the hardert part of T-SQL queries (I guess just read Itzik's book?)

Yeah, one of these days I probably should too. Mostly I've just read his articles.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

KoldCoffee SSCoach Points: 18971 More actions · Answer 12

I vote for Itzhik. He also provides brain exercises at the end of the book.