October 10, 2013 at 10:34 am
Thanks if you can help.
I have a data set that contains a current record of available units for that day.
DECLARE @AvailableUnits TABLE
(
UnitDate DATE,
UnitID INT
)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ('10/1/2013',1)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/1/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/1/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/2/2013',1)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/2/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/2/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/2/2013',4)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/3/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/3/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/3/2013',4)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/4/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/4/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/4/2013',4)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/5/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/5/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/5/2013',4)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/6/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/6/2013',3)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/6/2013',4)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/6/2013',5)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/7/2013',2)
INSERT @AvailableUnits (UnitDate,UnitID) VALUES ( '10/7/2013',5)
SELECT * FROM @AvailableUnits
I'd like to compare each days data against the previous and identify any units that are newly available or no longer available.
My expected results are:
2013-10-02,4,'Newly Available Unit'
2013-10-03,1,'Unit No Longer Available'
2013-10-06,5,'Newly Available Unit'
2013-10-07,3,'Unit No Longer Available'
2013-10-07,4,'Unit No Longer Available'
Any day in the series could have results with the exception of the first day. If the data is unchanged compared to the previous day I don't want to return results.
October 10, 2013 at 11:20 am
This should get you pretty close.
select
coalesce(a.UnitDate,dateadd(dd,1,b.UnitDate)) as UnitDate
,coalesce(a.UnitID,b.UnitID) as UnitID
,case
when a.UnitID is null then 'Unit No Longer Available'
when b.UnitID is null then 'Newly Available Unit'
end
from @AvailableUnits a
full outer join @AvailableUnits b on a.UnitDate = dateadd(dd,1,b.UnitDate)
and a.UnitID = b.UnitID
where a.UnitIDis null
or b.UnitID is null
and
coalesce(a.UnitDate,dateadd(dd,1,b.UnitDate)) in (select UnitDate from @AvailableUnits)
order by UnitDate, UnitID
October 10, 2013 at 12:51 pm
Hi
Here's another option
select case when u = 1 then dateadd(dd,1,unitDate) else unitDate end unitDate, unitID,
case when u = 1 then 'Unit no longer available' else 'Newly available unit' end
from (
select unitDate, unitID
,row_number() over (partition by unitid order by unitdate desc) u
,row_number() over (partition by unitid order by unitdate asc) a
from @AvailableUnits
) a
where (u = 1 or a = 1) and a.unitDate not in (select max(unitdate) from @AvailableUnits union select min(unitdate) from @AvailableUnits)
order by case when u = 1 then dateadd(dd,1,unitDate) else unitDate end
However this assumes that you are only interested in the first and last availability and any gaps in the date range are acceptable. If you do want to do the gaps as well Jeff Moden has an excellent article here[/url]
October 13, 2013 at 9:08 pm
MickyT was definitely on the right track when he pointed you to Jeff Moden's article on finding islands in contiguous date ranges. However I believe a small modification (add-on) to that approach is required.
WITH DateEndPoints AS
(
SELECT DateStart=DATEADD(day, 1, MIN(UnitDate)), DateEnd=MAX(UnitDate)
FROM @AvailableUnits
)
SELECT UnitID, [Date], [Status]
FROM
(
SELECT UnitID, UnitStart=MIN(UnitDate), UnitEnd=MAX(UnitDate)
FROM
(
SELECT *, rn=DATEADD(day, -ROW_NUMBER() OVER (PARTITION BY UnitID ORDER BY UnitDate), UnitDate)
FROM @AvailableUnits
) a
GROUP BY UnitID, rn
) a
CROSS APPLY
(
VALUES (UnitStart, 'Newly Available Unit'), (DATEADD(day, 1, UnitEnd), 'Unit no longer available')
) b ([Date], [Status])
CROSS APPLY DateEndPoints
WHERE [Date] BETWEEN DateStart AND DateEnd
ORDER BY [Date], UnitID;
The CROSS APPLY VALUES construct is what I normally use to calculate gaps from islands, but I modified it to return not the day before the island but the day that the island started on, in order to use that date as the new product available date.
That method is noted here: The SQL of Gaps and Islands in Sequences[/url] (without the requisite regrouping which is taken care of by the WHERE clause.
This should also run pretty fast across large row sets.
Edit: Fixed a couple of formatting issues.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 15, 2013 at 8:10 am
Thanks for the insights everyone. I will take me a bit of time to review/understand the methods. I'll post back if I have any questions.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply