February 29, 2012 at 2:32 pm
Good Day Everyone,
I currently need to analyze a table which keeps tracks of every changes of a status for certain documents. My final goal is to count the number of documents that still remained in a given status at the end of each month.
While I could easily find a way to identify the last known status at the end of a month for a particular document, whenever a change occurred during this very month - I'm having a hard time to 'duplicate' this status for any following months until a new change occurs so that the analysis time-frame is continuous and therefore provides data for every months over this period.
I've included the result I aim at and a sample SQL code that goes with it.
Do you think there is a proper way to create such query in standard SQL?
Thanks for your time and suggestions.
CREATE TABLE [dbo].[DocumentsHistory]
(
[DocumentID] [int] NOT NULL ,
[DocumentUpdateDate] [datetime] NULL ,
[LastDayOfTheMonth] [datetime] NULL ,
[DocumentStatus] [varchar](30) NULL ,
[DocumentHistoryID] [varchar](8) NOT NULL
)
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Mar 12 2010 12:00:00:000AM','Mar 31 2010 12:00:00:000AM','INITIALIZED','00003JDP')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Jun 14 2011 12:00:00:000AM','Jun 30 2011 12:00:00:000AM','CLOSED','00004AQ3')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Mar 31 2011 12:00:00:000AM','Mar 31 2011 12:00:00:000AM','RESPONDED','000046I8')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Oct 8 2010 12:00:00:000AM','Oct 31 2010 12:00:00:000AM','ON-GOING(Rescheduled)','00003WIO')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Feb 26 2011 12:00:00:000AM','Feb 28 2011 12:00:00:000AM','INITIALIZED','000044B8')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Dec 29 2010 12:00:00:000AM','Dec 31 2010 12:00:00:000AM','RESPONDED','000040X5')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Jun 11 2010 12:00:00:000AM','Jun 30 2010 12:00:00:000AM','ON-GOING(Negotiation accepted)','00003P6V')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Apr 29 2011 12:00:00:000AM','Apr 30 2011 12:00:00:000AM','ON-GOING(Rescheduled)','000048D7')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Jan 9 2011 12:00:00:000AM','Jan 31 2011 12:00:00:000AM','CLOSED','000041J3')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'May 10 2010 12:00:00:000AM','May 31 2010 12:00:00:000AM','ON-GOING','00003N8X')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'May 31 2011 12:00:00:000AM','May 31 2011 12:00:00:000AM','RESPONDED','00004A4S')
INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Aug 12 2010 12:00:00:000AM','Aug 31 2010 12:00:00:000AM','ON-GOING(Rescheduled)','00003SZQ')
March 1, 2012 at 5:24 am
OK, here is one of possible solutions:
--1. create calendar table.
--you can create permanent calendar table
--with all required attributes if you don't have one already
-- the following will generate month end dates for 2010 and 2011
SELECT TOP 24 DATEADD(DAY,-1,DATEADD(month, row_number() over (order by (select null)),'20100101')) AS MonthEndDay
INTO #calendar
FROM sys.columns -- any table will do which have at least 24 rows in it :-)
-- I don't know exact requiremebnts for your report, but
-- let say you want report for two years 2010 & 2011,
-- the first appearance of document should be for the month it was first time logged into history
;WITH CTE_frstoccur
AS
(
SELECT DocumentID, MIN(LastDayOfTheMonth) minLastDayOfTheMonth
FROM dbo.DocumentsHistory
--here you may add a date filter depends on what your real requirements
GROUP BY DocumentID
)
, CTE_DocPerMonth --this will generate the list of MonthEndDays for each of document
--strating from the day of first occurance in the history table
AS
(
SELECT d.DocumentID, c.MonthEndDay
FROM CTE_frstoccur AS d
CROSS JOIN #calendar AS c
WHERE c.MonthEndDay >= d.minLastDayOfTheMonth
)
SELECT c.DocumentID
,c.MonthEndDay
,ISNULL(d.DocumentUpdateDate, a.DocumentUpdateDate) AS DocumentUpdateDate
,ISNULL(d.DocumentStatus, a.DocumentStatus) AS DocumentStatus
,ISNULL(d.DocumentHistoryID, a.DocumentHistoryID) AS DocumentHistoryID
FROM CTE_DocPerMonth AS c
LEFT JOIN dbo.DocumentsHistory AS d
ON d.DocumentID =c.DocumentID
AND c.MonthEndDay = d.LastDayOfTheMonth
OUTER APPLY (SELECT TOP 1 d1.DocumentUpdateDate, d1.DocumentStatus, d1.DocumentHistoryID
FROM dbo.DocumentsHistory d1
WHERE d1.DocumentID = c.DocumentID
AND d1.LastDayOfTheMonth < c.MonthEndDay
ORDER BY d1.LastDayOfTheMonth DESC) a
ORDER BY c.DocumentID, c.MonthEndDay
Instead of using OUTER APPLY you may use "quirky update" way (you may achieve much better performance with it, but it will be more coding and prerequisites), search for article by Jeff Moden...
March 1, 2012 at 6:58 am
What you really need is to insert end-of-moth records where they do not exist.
It's not that hard once you have so called "calendar" table with end dates of each month e.g. for last 24 months from now.
March 1, 2012 at 1:32 pm
Wow I'm stunned - I just tweaked your code a little so that 'CLOSED' statuses are not repeated.
CTE are of great interest - I suspected that someone answering me might use this concept. Do you think there might have been a solution not involving any specific T-SQL feature for portability sake ? I am not looking for any code since I will gladly use your solution.
Thanks a million times Eugene,
March 1, 2012 at 1:33 pm
Thank you for you answer Vedran, I will not have the opportunity to 'insert' any data.
Take care!
March 1, 2012 at 7:27 pm
donbuz (3/1/2012)
Wow I'm stunned - I just tweaked your code a little so that 'CLOSED' statuses are not repeated.CTE are of great interest - I suspected that someone answering me might use this concept. Do you think there might have been a solution not involving any specific T-SQL feature for portability sake ? I am not looking for any code since I will gladly use your solution.
Thanks a million times Eugene,
With only 2 exceptions, CTEs are nearly identical to having the subquery in the FROM clause. They're both basically "inline views". The subquery in a FROM clause is usually referred to as a "Derived Table" and, if you don't need recursion or multiple calls to the same subquery, will make for portable code.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply