Filling 'blank' with data from previous month for reporting purposes

  • Good Day Everyone,

    I currently need to analyze a table which keeps tracks of every changes of a status for certain documents. My final goal is to count the number of documents that still remained in a given status at the end of each month.

    While I could easily find a way to identify the last known status at the end of a month for a particular document, whenever a change occurred during this very month - I'm having a hard time to 'duplicate' this status for any following months until a new change occurs so that the analysis time-frame is continuous and therefore provides data for every months over this period.

    I've included the result I aim at and a sample SQL code that goes with it.

    Do you think there is a proper way to create such query in standard SQL?

    Thanks for your time and suggestions.

    CREATE TABLE [dbo].[DocumentsHistory]

    (

    [DocumentID] [int] NOT NULL ,

    [DocumentUpdateDate] [datetime] NULL ,

    [LastDayOfTheMonth] [datetime] NULL ,

    [DocumentStatus] [varchar](30) NULL ,

    [DocumentHistoryID] [varchar](8) NOT NULL

    )

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Mar 12 2010 12:00:00:000AM','Mar 31 2010 12:00:00:000AM','INITIALIZED','00003JDP')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Jun 14 2011 12:00:00:000AM','Jun 30 2011 12:00:00:000AM','CLOSED','00004AQ3')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Mar 31 2011 12:00:00:000AM','Mar 31 2011 12:00:00:000AM','RESPONDED','000046I8')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Oct 8 2010 12:00:00:000AM','Oct 31 2010 12:00:00:000AM','ON-GOING(Rescheduled)','00003WIO')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Feb 26 2011 12:00:00:000AM','Feb 28 2011 12:00:00:000AM','INITIALIZED','000044B8')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Dec 29 2010 12:00:00:000AM','Dec 31 2010 12:00:00:000AM','RESPONDED','000040X5')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Jun 11 2010 12:00:00:000AM','Jun 30 2010 12:00:00:000AM','ON-GOING(Negotiation accepted)','00003P6V')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'Apr 29 2011 12:00:00:000AM','Apr 30 2011 12:00:00:000AM','ON-GOING(Rescheduled)','000048D7')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Jan 9 2011 12:00:00:000AM','Jan 31 2011 12:00:00:000AM','CLOSED','000041J3')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'May 10 2010 12:00:00:000AM','May 31 2010 12:00:00:000AM','ON-GOING','00003N8X')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(85144,'May 31 2011 12:00:00:000AM','May 31 2011 12:00:00:000AM','RESPONDED','00004A4S')

    INSERT INTO [DocumentsHistory] ([DocumentID],[DocumentUpdateDate],[LastDayOfTheMonth],[DocumentStatus],[DocumentHistoryID])VALUES(75596,'Aug 12 2010 12:00:00:000AM','Aug 31 2010 12:00:00:000AM','ON-GOING(Rescheduled)','00003SZQ')

  • OK, here is one of possible solutions:

    --1. create calendar table.

    --you can create permanent calendar table

    --with all required attributes if you don't have one already

    -- the following will generate month end dates for 2010 and 2011

    SELECT TOP 24 DATEADD(DAY,-1,DATEADD(month, row_number() over (order by (select null)),'20100101')) AS MonthEndDay

    INTO #calendar

    FROM sys.columns -- any table will do which have at least 24 rows in it :-)

    -- I don't know exact requiremebnts for your report, but

    -- let say you want report for two years 2010 & 2011,

    -- the first appearance of document should be for the month it was first time logged into history

    ;WITH CTE_frstoccur

    AS

    (

    SELECT DocumentID, MIN(LastDayOfTheMonth) minLastDayOfTheMonth

    FROM dbo.DocumentsHistory

    --here you may add a date filter depends on what your real requirements

    GROUP BY DocumentID

    )

    , CTE_DocPerMonth --this will generate the list of MonthEndDays for each of document

    --strating from the day of first occurance in the history table

    AS

    (

    SELECT d.DocumentID, c.MonthEndDay

    FROM CTE_frstoccur AS d

    CROSS JOIN #calendar AS c

    WHERE c.MonthEndDay >= d.minLastDayOfTheMonth

    )

    SELECT c.DocumentID

    ,c.MonthEndDay

    ,ISNULL(d.DocumentUpdateDate, a.DocumentUpdateDate) AS DocumentUpdateDate

    ,ISNULL(d.DocumentStatus, a.DocumentStatus) AS DocumentStatus

    ,ISNULL(d.DocumentHistoryID, a.DocumentHistoryID) AS DocumentHistoryID

    FROM CTE_DocPerMonth AS c

    LEFT JOIN dbo.DocumentsHistory AS d

    ON d.DocumentID =c.DocumentID

    AND c.MonthEndDay = d.LastDayOfTheMonth

    OUTER APPLY (SELECT TOP 1 d1.DocumentUpdateDate, d1.DocumentStatus, d1.DocumentHistoryID

    FROM dbo.DocumentsHistory d1

    WHERE d1.DocumentID = c.DocumentID

    AND d1.LastDayOfTheMonth < c.MonthEndDay

    ORDER BY d1.LastDayOfTheMonth DESC) a

    ORDER BY c.DocumentID, c.MonthEndDay

    Instead of using OUTER APPLY you may use "quirky update" way (you may achieve much better performance with it, but it will be more coding and prerequisites), search for article by Jeff Moden...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • What you really need is to insert end-of-moth records where they do not exist.

    It's not that hard once you have so called "calendar" table with end dates of each month e.g. for last 24 months from now.

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • Wow I'm stunned - I just tweaked your code a little so that 'CLOSED' statuses are not repeated.

    CTE are of great interest - I suspected that someone answering me might use this concept. Do you think there might have been a solution not involving any specific T-SQL feature for portability sake ? I am not looking for any code since I will gladly use your solution.

    Thanks a million times Eugene,

  • Thank you for you answer Vedran, I will not have the opportunity to 'insert' any data.

    Take care!

  • donbuz (3/1/2012)


    Wow I'm stunned - I just tweaked your code a little so that 'CLOSED' statuses are not repeated.

    CTE are of great interest - I suspected that someone answering me might use this concept. Do you think there might have been a solution not involving any specific T-SQL feature for portability sake ? I am not looking for any code since I will gladly use your solution.

    Thanks a million times Eugene,

    With only 2 exceptions, CTEs are nearly identical to having the subquery in the FROM clause. They're both basically "inline views". The subquery in a FROM clause is usually referred to as a "Derived Table" and, if you don't need recursion or multiple calls to the same subquery, will make for portable code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply