Single Query for getting missing row values based on Date for generating report output

  • Hi,

    I am trying to get an output from a give dataset as part of a reporting requirement.

    I am attaching as sample input and output data with some notes as what I need for my output from the

    given input for my reporting needs. I am trying to create a single query to accomplish this.

    I have tried using self-join on the input data by joining on Code column and comparing the date column from one data set to the other in the self-join, however not getting that far. Any help would be appreciated.

  • I'd recommend using a calendar table as a left outer join part. If you don't use one already you might want to have a look at the Tally table article referenced in my signature.

    You need to define how you'd handle two consecutive dates with missing values.

    Option 1: the third day would have NULL or Zero (since the previous day had no value)

    Option 2: the third day would have the last known value (since the previous day would have been updated with the value from the day before...)

    Side note: If you'd like to get tested solutions rather than verbal descriptions please provide sample data in a ready to use format as described in the first link in my signature. And please include your expected result as well as what you've tried so far.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thank you Imu92 for your response and suggestion.

    Here is a script to create the table, insert some sample data and my starting query to accomplish the task.

    --Creating the table

    CREATE TABLE T1

    (

    CreatedDate datetime not null,

    Code Char(1),

    Value smallint

    )

    --Inserting some data

    insert into T1

    select '1/1/2009','a',1

    union

    select '1/1/2009','b',2

    union

    select '1/1/2009','c',1

    union

    select '1/1/2009','d',2

    union

    select '1/2/2009','b',1

    union

    select '1/2/2009','c',2

    union

    select '1/3/2009','b',2

    union

    select '1/3/2009','d',1

    --This is my starting query

    select

    a.CreatedDate,

    a.Code,

    a.Value

    from T1 as a

    left outer join T1 as b

    on a.Code = b.Code

    and b.CreatedDate <= a.CreatedDate

  • This is the desired output. It is also attached as an excel file to the initial post.

    CreatedDateCodeValue

    1/1/2009 0:00a1

    1/1/2009 0:00b2

    1/1/2009 0:00c1

    1/1/2009 0:00d2

    1/2/2009 0:00a1

    1/2/2009 0:00b1

    1/2/2009 0:00c2

    1/2/2009 0:00d2

    1/2/2009 0:00a0

    1/3/2009 0:00b2

    1/3/2009 0:00c2

    1/3/2009 0:00d1

  • Neither the code nor the execution plan are pretty, but this will get it done.

    No guarantees on scalability, but it's late and I'm off to bed. 😉

    If anyone can come up with something more elegant, please do so.

    Look it over and let's talk about any questions you may have.

    ;WITH -- many CTEs following

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1),--2 rows

    L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows

    L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows

    L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows

    Tally AS (SELECT top 1000000 ROW_NUMBER() OVER(ORDER BY C) AS N FROM L3),

    MinMaxDates AS (select min(createdDate) as MinDate, max(createdDate) as MaxDate from T1),

    xDates AS (select dateadd(dd,N-1,mindate) as xDate

    from tally

    cross join MinMaxDates

    where N <= dateadd(dd,1,dateDiff(dd,MinDate,MaxDate))

    )

    -- ctes end here

    select xDate as createdDate,dt1.code, isnull(t1.value,dt2.value) as value

    from xDates

    cross join (select distinct code from T1) dt1

    left join T1 on CreatedDate = xdate and t1.code = dt1.code

    outer apply (select top (1) T.value

    from T1 T

    where t.code = dt1.code and t.createdDate <= xDate

    order by createdDate desc) dt2

    order by createdDate, code, value

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Using a recursive CTE:

    DECLARE @Sample

    TABLE (

    code CHAR(1) NOT NULL,

    value INTEGER NOT NULL,

    dt DATETIME NOT NULL,

    PRIMARY KEY (dt, code, value)

    );

    INSERT @Sample (code, value, dt) VALUES ('a', 1, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('c', 1, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('d', 2, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('b', 1, {d '2009-01-02'});

    INSERT @Sample (code, value, dt) VALUES ('c', 2, {d '2009-01-02'});

    INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-03'});

    INSERT @Sample (code, value, dt) VALUES ('d', 1, {d '2009-01-03'});

    WITH CTE

    AS (

    SELECT TOP (1) WITH TIES

    S.code,

    S.value,

    S.dt,

    present = 1

    FROM @Sample S

    ORDER BY

    S.dt ASC

    UNION ALL

    SELECT CTE.code,

    COALESCE(CA.value, CTE.present, 0),

    DATEADD(DAY, 1, CTE.dt),

    present = CA.value

    FROM CTE

    OUTER

    APPLY (

    SELECT value

    FROM @Sample S2

    WHERE S2.dt = DATEADD(DAY, 1, CTE.dt)

    AND S2.code = CTE.code

    ) CA (value)

    WHERE EXISTS

    (

    SELECT *

    FROM @Sample S3

    WHERE S3.dt = DATEADD(DAY, 1, CTE.dt)

    )

    )

    SELECT code, value, dt

    FROM CTE

    ORDER BY

    dt, code;

  • Hey Paul, what got you onto a recursive kick?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (3/19/2010)


    Hey Paul, what got you onto a recursive kick?

    Blame Chris Morris 😀

    Actually, that is not quite fair - I do like recursive solutions (as a challenge) but the SQL Server implementation has some serious performance problems in many cases. There are still a number of cases where recursion makes sense, though in this case I used it mostly because it was hard to do 😉

    edit:

    Just remembered the other thing that got me into them again: Nested Sets.

    See http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/swinging-from-tree-to-tree-using-ctes-part-1-adjacency-to-nested-sets.aspx

    and http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/swinging-from-tree-to-tree-using-ctes-part-2-adjacency-to-nested-intervals.aspx

  • Thanks Paul and Dixie for your resolutions.

    I have used CTE's however haven't got the hang of recursive CTE's as yet, but will definitely will look into it as it is pretty powerful. Dixie, i believe your solution did not take care of the scenario to assign 0 to the missing code if it is not present for the previous date, but I will recheck.

    The problem in fact was a little complicated to resolve as I thought it would be.

    I will practice some recursive CTE's.

    Thanks again.

  • bdba (3/19/2010)


    Thanks Paul and Dixie for your resolutions.

    I would be surprised if Bob's solution was too far off. It is a much more natural solution - the recursive CTE was just a fun exercise for me really.

  • Actually, I misunderstood the requirements. Sorry. I shouldn't try to work when I'm tired.

    My solution reaches back to the last valid value for a code, instead of returning a zero if the previous date is null. This was by design, due to my misunderstanding the objective.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (3/19/2010)


    Actually, I misunderstood the requirements. Sorry. I shouldn't try to work when I'm tired. My solution reaches back to the last valid value for a code, instead of returning a zero if the previous date is null. This was by design, due to my misunderstanding the objective.

    Would you mind if I tweaked your code to fix this, or do you want to do that yourself?

  • Always feel free to tweak away, Paul.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Tweaked:

    DECLARE @Sample

    TABLE (

    code CHAR(1) NOT NULL,

    value INTEGER NOT NULL,

    dt DATETIME NOT NULL,

    PRIMARY KEY (code, dt)

    );

    INSERT @Sample (code, value, dt) VALUES ('a', 1, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('c', 1, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('d', 2, {d '2009-01-01'});

    INSERT @Sample (code, value, dt) VALUES ('b', 1, {d '2009-01-02'});

    INSERT @Sample (code, value, dt) VALUES ('c', 2, {d '2009-01-02'});

    INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-03'});

    INSERT @Sample (code, value, dt) VALUES ('d', 1, {d '2009-01-03'});

    WITH Dates

    AS (

    -- All dates

    SELECT DISTINCT dt

    FROM @Sample

    ),

    Codes

    AS (

    -- All codes

    SELECT DISTINCT code

    FROM @Sample

    ),

    DatesAndCodes

    AS (

    -- All combinations of date and code

    SELECT D.dt,

    C.code

    FROM Dates D

    CROSS

    JOIN Codes C

    )

    SELECT DC.dt,

    DC.code,

    value = COALESCE(iTVF.value, 0)

    FROM DatesAndCodes DC

    OUTER

    APPLY (

    -- Find the value for the current date and code

    -- Prefer real values for the date, but search

    -- previous date too

    SELECT TOP (1)

    value

    FROM @Sample S

    WHERE S.dt BETWEEN DATEADD(DAY, -1, DC.dt) AND DC.dt

    AND S.code = DC.code

    ORDER BY

    S.dt DESC

    ) iTVF

    ORDER BY

    DC.dt,

    DC.code;

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply