March 18, 2010 at 5:46 pm
Hi,
I am trying to get an output from a give dataset as part of a reporting requirement.
I am attaching as sample input and output data with some notes as what I need for my output from the
given input for my reporting needs. I am trying to create a single query to accomplish this.
I have tried using self-join on the input data by joining on Code column and comparing the date column from one data set to the other in the self-join, however not getting that far. Any help would be appreciated.
March 18, 2010 at 6:38 pm
I'd recommend using a calendar table as a left outer join part. If you don't use one already you might want to have a look at the Tally table article referenced in my signature.
You need to define how you'd handle two consecutive dates with missing values.
Option 1: the third day would have NULL or Zero (since the previous day had no value)
Option 2: the third day would have the last known value (since the previous day would have been updated with the value from the day before...)
Side note: If you'd like to get tested solutions rather than verbal descriptions please provide sample data in a ready to use format as described in the first link in my signature. And please include your expected result as well as what you've tried so far.
March 18, 2010 at 8:33 pm
Thank you Imu92 for your response and suggestion.
Here is a script to create the table, insert some sample data and my starting query to accomplish the task.
--Creating the table
CREATE TABLE T1
(
CreatedDate datetime not null,
Code Char(1),
Value smallint
)
--Inserting some data
insert into T1
select '1/1/2009','a',1
union
select '1/1/2009','b',2
union
select '1/1/2009','c',1
union
select '1/1/2009','d',2
union
select '1/2/2009','b',1
union
select '1/2/2009','c',2
union
select '1/3/2009','b',2
union
select '1/3/2009','d',1
--This is my starting query
select
a.CreatedDate,
a.Code,
a.Value
from T1 as a
left outer join T1 as b
on a.Code = b.Code
and b.CreatedDate <= a.CreatedDate
March 18, 2010 at 8:36 pm
This is the desired output. It is also attached as an excel file to the initial post.
CreatedDateCodeValue
1/1/2009 0:00a1
1/1/2009 0:00b2
1/1/2009 0:00c1
1/1/2009 0:00d2
1/2/2009 0:00a1
1/2/2009 0:00b1
1/2/2009 0:00c2
1/2/2009 0:00d2
1/2/2009 0:00a0
1/3/2009 0:00b2
1/3/2009 0:00c2
1/3/2009 0:00d1
March 18, 2010 at 10:44 pm
Neither the code nor the execution plan are pretty, but this will get it done.
No guarantees on scalability, but it's late and I'm off to bed. 😉
If anyone can come up with something more elegant, please do so.
Look it over and let's talk about any questions you may have.
;WITH -- many CTEs following
L0 AS (SELECT 1 AS C UNION ALL SELECT 1),--2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
Tally AS (SELECT top 1000000 ROW_NUMBER() OVER(ORDER BY C) AS N FROM L3),
MinMaxDates AS (select min(createdDate) as MinDate, max(createdDate) as MaxDate from T1),
xDates AS (select dateadd(dd,N-1,mindate) as xDate
from tally
cross join MinMaxDates
where N <= dateadd(dd,1,dateDiff(dd,MinDate,MaxDate))
)
-- ctes end here
select xDate as createdDate,dt1.code, isnull(t1.value,dt2.value) as value
from xDates
cross join (select distinct code from T1) dt1
left join T1 on CreatedDate = xdate and t1.code = dt1.code
outer apply (select top (1) T.value
from T1 T
where t.code = dt1.code and t.createdDate <= xDate
order by createdDate desc) dt2
order by createdDate, code, value
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 19, 2010 at 3:26 am
Using a recursive CTE:
DECLARE @Sample
TABLE (
code CHAR(1) NOT NULL,
value INTEGER NOT NULL,
dt DATETIME NOT NULL,
PRIMARY KEY (dt, code, value)
);
INSERT @Sample (code, value, dt) VALUES ('a', 1, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('c', 1, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('d', 2, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('b', 1, {d '2009-01-02'});
INSERT @Sample (code, value, dt) VALUES ('c', 2, {d '2009-01-02'});
INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-03'});
INSERT @Sample (code, value, dt) VALUES ('d', 1, {d '2009-01-03'});
WITH CTE
AS (
SELECT TOP (1) WITH TIES
S.code,
S.value,
S.dt,
present = 1
FROM @Sample S
ORDER BY
S.dt ASC
UNION ALL
SELECT CTE.code,
COALESCE(CA.value, CTE.present, 0),
DATEADD(DAY, 1, CTE.dt),
present = CA.value
FROM CTE
OUTER
APPLY (
SELECT value
FROM @Sample S2
WHERE S2.dt = DATEADD(DAY, 1, CTE.dt)
AND S2.code = CTE.code
) CA (value)
WHERE EXISTS
(
SELECT *
FROM @Sample S3
WHERE S3.dt = DATEADD(DAY, 1, CTE.dt)
)
)
SELECT code, value, dt
FROM CTE
ORDER BY
dt, code;
March 19, 2010 at 5:55 am
Hey Paul, what got you onto a recursive kick?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 19, 2010 at 6:09 am
The Dixie Flatline (3/19/2010)
Hey Paul, what got you onto a recursive kick?
Blame Chris Morris 😀
Actually, that is not quite fair - I do like recursive solutions (as a challenge) but the SQL Server implementation has some serious performance problems in many cases. There are still a number of cases where recursion makes sense, though in this case I used it mostly because it was hard to do 😉
edit:
Just remembered the other thing that got me into them again: Nested Sets.
March 19, 2010 at 10:46 am
Thanks Paul and Dixie for your resolutions.
I have used CTE's however haven't got the hang of recursive CTE's as yet, but will definitely will look into it as it is pretty powerful. Dixie, i believe your solution did not take care of the scenario to assign 0 to the missing code if it is not present for the previous date, but I will recheck.
The problem in fact was a little complicated to resolve as I thought it would be.
I will practice some recursive CTE's.
Thanks again.
March 19, 2010 at 10:50 am
bdba (3/19/2010)
Thanks Paul and Dixie for your resolutions.
I would be surprised if Bob's solution was too far off. It is a much more natural solution - the recursive CTE was just a fun exercise for me really.
March 19, 2010 at 3:08 pm
Actually, I misunderstood the requirements. Sorry. I shouldn't try to work when I'm tired.
My solution reaches back to the last valid value for a code, instead of returning a zero if the previous date is null. This was by design, due to my misunderstanding the objective.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 19, 2010 at 9:34 pm
The Dixie Flatline (3/19/2010)
Actually, I misunderstood the requirements. Sorry. I shouldn't try to work when I'm tired. My solution reaches back to the last valid value for a code, instead of returning a zero if the previous date is null. This was by design, due to my misunderstanding the objective.
Would you mind if I tweaked your code to fix this, or do you want to do that yourself?
March 22, 2010 at 2:23 pm
Always feel free to tweak away, Paul.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 22, 2010 at 7:11 pm
Tweaked:
DECLARE @Sample
TABLE (
code CHAR(1) NOT NULL,
value INTEGER NOT NULL,
dt DATETIME NOT NULL,
PRIMARY KEY (code, dt)
);
INSERT @Sample (code, value, dt) VALUES ('a', 1, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('c', 1, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('d', 2, {d '2009-01-01'});
INSERT @Sample (code, value, dt) VALUES ('b', 1, {d '2009-01-02'});
INSERT @Sample (code, value, dt) VALUES ('c', 2, {d '2009-01-02'});
INSERT @Sample (code, value, dt) VALUES ('b', 2, {d '2009-01-03'});
INSERT @Sample (code, value, dt) VALUES ('d', 1, {d '2009-01-03'});
WITH Dates
AS (
-- All dates
SELECT DISTINCT dt
FROM @Sample
),
Codes
AS (
-- All codes
SELECT DISTINCT code
FROM @Sample
),
DatesAndCodes
AS (
-- All combinations of date and code
SELECT D.dt,
C.code
FROM Dates D
CROSS
JOIN Codes C
)
SELECT DC.dt,
DC.code,
value = COALESCE(iTVF.value, 0)
FROM DatesAndCodes DC
OUTER
APPLY (
-- Find the value for the current date and code
-- Prefer real values for the date, but search
-- previous date too
SELECT TOP (1)
value
FROM @Sample S
WHERE S.dt BETWEEN DATEADD(DAY, -1, DC.dt) AND DC.dt
AND S.code = DC.code
ORDER BY
S.dt DESC
) iTVF
ORDER BY
DC.dt,
DC.code;
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply