August 2, 2012 at 11:09 am
Hi All,
I'd like to perform a rolling monthly analysis.
Let's say I have a sales table and if a client has sales during a particular month a record gets entered into the table with the month end date indicating sales activity during that month.
I want to determine when there is a new client and when a client is lost.
Logically I would use NOT IN logic to compare the second month against the first and then the first month against the second to determine the second months results (February in my sample data)
Then I would compare the third month(March) against the second( February). And so on...
If a client drops out and then comes back we can still consider this a 'NewClient'
The number of months is indeterminate.
Thanks if you can help.
CREATE TABLE #Sales
(
ClientID int,
SaleDate date
)
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531')
SELECT * FROM #Sales
DROP TABLE #Sales
--Expected Results
ClientAction,MonthYear,ClientID
----------------------
Newclient,February 2012,5
Lostclient,February 2012,3
Lostclient,March 2012,4
Newclient,March 2012,6
Lostclient,April 2012,2
Lostclient,April 2012,5
Newclient,April 2012,7
Lostclient,May 2012,6
Newclient,May 2012,5
August 2, 2012 at 1:33 pm
Hi,
I've solved a problem similar to this before, and I can't for the life of me figure out how I did it.
Maybe the code below will put you (or someone) on the right track...
Good luck!
;WITH Moo AS
(
SELECT ClientID_L= T1.ClientID
,SaleDate_L= T1.SaleDate
,ClientID_R= T2.ClientID
,SaleDate_R= T2.SaleDate
FROM#Sales T1
LEFT JOIN#Sales T2 ON T1.ClientID = T2.ClientID AND DATEDIFF(MONTH, T1.SaleDate, T2.SaleDate) = 1
),
Moo2 AS
(
SELECT ClientID_L
,SaleDate_L
,sStatus =CASE
WHEN ClientID_R IS NOT NULL THEN 'Start' ELSE 'End'
END
FROM Moo
)
SELECT *
FROM Moo2
ORDER BY ClientID_L, SaleDate_L
August 2, 2012 at 1:41 pm
Here using a full join where both sides are null we can get all our new or old clients and depending on where the nulls are can assign whether they were a new client or lost client.
DECLARE @StartMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MIN(SaleDate)) + 1,0) FROM #Sales)
DECLARE @EndMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MAX(SaleDate)),0) FROM #Sales)
SELECT CASE WHEN s2.ClientID IS NULL THEN 'NewClient' ELSE 'LostClient' END AS ClientAction,
RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,
ISNULL(s1.ClientID, s2.ClientID) AS ClientID
FROM #Sales s1
FULL JOIN #Sales s2
ON s1.ClientID = s2.ClientID
AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1
WHERE (s1.ClientID IS NULL
OR s2.ClientID IS NULL)
AND (s2.SaleDate < @EndMonth
OR s1.SaleDate > @StartMonth)
ORDER BY ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), ISNULL(s1.ClientID, s2.ClientID)
EDIT: Fixed code to eliminate the fact that all clients in january were new (no DEC 2011 DATA) and all clients in Jun were lost (no data for JUN 2012).
this should work when you add in more months as i take the max and min from the table.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
August 2, 2012 at 2:20 pm
Something like this maybe?
CREATE TABLE #Sales
(
ClientID int,
SaleDate date
);
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531');
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531');
--SELECT * FROM #Sales;
declare @StartDate date = '20120101',
@EndDate date = '20120630';
WITH SalesActivity as (
SELECT
CASE WHEN s2.ClientID IS NULL
THEN 'NewClient'
ELSE 'LostClient'
END AS ClientAction,
RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,
ISNULL(s1.ClientID, s2.ClientID) AS ClientID,
ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)) SalesDate
FROM
#Sales s1
FULL JOIN #Sales s2
ON s1.ClientID = s2.ClientID
AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1
WHERE
s1.ClientID IS NULL
OR s2.ClientID IS NULL
)
SELECT
ClientAction,
SaleDate,
ClientID
FROM
SalesActivity
WHERE
SalesDate between dateadd(mm,1,@StartDate) and dateadd(mm,datediff(mm,0,@EndDate),-1)
ORDER BY
SalesDate,
ClientID;
GO
DROP TABLE #Sales
go
August 2, 2012 at 2:21 pm
That's what I get for not checking first.
August 2, 2012 at 2:25 pm
Lynn Pettis (8/2/2012)
That's what I get for not checking first.
LOL :w00t::w00t:
i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
August 2, 2012 at 2:28 pm
capn.hector (8/2/2012)
Lynn Pettis (8/2/2012)
That's what I get for not checking first.LOL :w00t::w00t:
i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.
I got interupted as well.
August 2, 2012 at 6:14 pm
I'm fascinated.....I am working with Lynn's code.
Say I wanted certain clients those whose ClientID was < than 3.
Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.
Can I filter on the CTE?
I tried the following but did not get the expected results.
WHERE
(s1.ClientID IS NULL OR s1.ClientID < 3)
OR
(s2.ClientID IS NULL OR s2.ClientID < 3)
I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀
I am also wondering if I should solve each problem (new and lost) separately and union the results...
August 3, 2012 at 7:58 am
Chrissy321 (8/2/2012)
I'm fascinated.....I am working with Lynn's code.Say I wanted certain clients those whose ClientID was < than 3.
Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.
Can I filter on the CTE?
I tried the following but did not get the expected results.
WHERE
(s1.ClientID IS NULL OR s1.ClientID < 3)
OR
(s2.ClientID IS NULL OR s2.ClientID < 3)
I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀
I am also wondering if I should solve each problem (new and lost) separately and union the results...
Lynn took my code and wrapped it so that Jan 2012 would not have 3 "New" clients and Jun 2012 would not have every client "Lost" i was able to fix my code before lynn posted his reply.
i am currently working on a test bed as i do believe my code will be more performant as i only have the single select but as i may get busy at work it may wait till monday.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
August 3, 2012 at 8:02 am
Thanks capn. I am going to look at Mr Celko's approach with the Months table since it may greatly simplify matters.
August 7, 2012 at 10:34 am
I ended up using a Months table, see below. This seems to simplify the analysis greatly.
Question: Is it appropriate to index every column in a calendar table like this that won't ever be updated and will only ever have on index??
I have 500 or so rows in this table.
CREATE TABLE [reference].[Months](
[Month_Name] [varchar](9) NOT NULL,
[Month_Year] [int] NOT NULL,
[Month_FirstDay] [date] NOT NULL,
[Month_LastDay] [date] NOT NULL,
[PreviousMonth_FirstDay] [date] NOT NULL,
[PreviousMonth_LastDay] [date] NOT NULL,
[NextMonth_FirstDay] [date] NOT NULL,
[NextMonth_LastDay] [date] NOT NULL,
[Quarter_Name] [varchar](7) NOT NULL,
CONSTRAINT [PK_Months_1] PRIMARY KEY CLUSTERED
(
[Month_Name] ASC,
[Month_Year] ASC,
[Month_FirstDay] ASC,
[Month_LastDay] ASC,
[PreviousMonth_FirstDay] ASC,
[PreviousMonth_LastDay] ASC,
[NextMonth_FirstDay] ASC,
[NextMonth_LastDay] ASC,
[Quarter_Name] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply