March 6, 2013 at 6:12 pm
Hey guys,
Can someone show me how to go about solving this problem? I thought it would be pretty straightforward but turns out it's not. I made some strides but have been stuck on the last portion. Here is what I have so far.
I have a table with the following data inside.
Create table #m_TestTable
(
Name varchar(20),
DateRecorded datetime
)
----Insert sample data
Insert into #m_TestTable Values ('John', dateadd(day,1,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,2,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,3,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,6,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,7,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,8,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,12,GetDate()))
Insert into #m_TestTable Values ('John', dateadd(day,13,GetDate()))
The idea is once the difference between the previous day and the current day is greater than or equal to 1, it should be counted as 1. I use the following query to get the difference in date between the current row and the previous row
;With tblDifference as
(
Select Row_Number() OVER (Order by DateRecorded) as RowNumber,Name, DateRecorded,PointValue from #m_TestTable
)
Select
Cur.Name
,Cur.DateRecorded as CurrentDay
,Prv.DateRecorded as PreviousDay
,Datediff(Day, Prv.DateRecorded, Cur.DateRecorded) as DateDifference
Into #temp1
From
tblDifference Cur Left Outer Join tblDifference Prv
On Cur.RowNumber = Prv.RowNumber + 1
Order by Cur.DateRecorded
--select * from #temp1
Here is where I get stuck. Based on the datedifference column, I need to perform a count.
The datedifference column data appears in following order. NULL, 1, 1, 3, 1, 1, 4, 1
So the count for John should be 3 broken down like this. NULL, 1, 1 should count as ONE; 3, 1, 1 should count as ONE and 4, 1 should count as ONE.
--But if the data was like this
Create table #m_TestTable1
(
Name varchar(20),
DateRecorded datetime
)
----Insert sample data
Insert into #m_TestTable1 Values ('John', dateadd(day,1,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,2,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,3,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,6,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,7,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,8,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,9,GetDate()))
Insert into #m_TestTable1 Values ('John', dateadd(day,10,GetDate()))
;With tblDifference as
(
Select Row_Number() OVER (Order by DateRecorded) as RowNumber,Name, DateRecorded from #m_TestTable1
)
Select
Cur.Name
,Cur.DateRecorded as CurrentDay
,Prv.DateRecorded as PreviousDay
,Datediff(Day, Prv.DateRecorded, Cur.DateRecorded) as DateDifference
Into #temp2
From
tblDifference Cur Left Outer Join tblDifference Prv
On Cur.RowNumber = Prv.RowNumber + 1
Order by Cur.DateRecorded
--select * from #temp2
The datedifference column data appears in following order. NULL, 1, 1, 3, 1, 1, 1, 1
Then in this case the count for John would be only 2. NULL, 1, 1 would count as ONE, 3, 1, 1, 1, 1 would count as ONE.
In summary, if there are '1's after a number, they count as ONE. If the number is anything other than 1 and is followed by '1's, it will count as ONE.
If the number is anything other than one and is followed by non 1 numbers, they count as one each. To illustrate this last point, if the data was like this
Create table #m_TestTable2
(
Name varchar(20),
DateRecorded datetime
)
----Insert sample data
Insert into #m_TestTable2 Values ('John', dateadd(day,1,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,4,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,7,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,8,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,9,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,10,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,16,GetDate()))
Insert into #m_TestTable2 Values ('John', dateadd(day,17,GetDate()))
;With tblDifference as
(
Select Row_Number() OVER (Order by DateRecorded) as RowNumber,Name, DateRecorded from #m_TestTable2
)
Select
Cur.Name
,Cur.DateRecorded as CurrentDay
,Prv.DateRecorded as PreviousDay
,Datediff(Day, Prv.DateRecorded, Cur.DateRecorded) as DateDifference
Into #temp3
From
tblDifference Cur Left Outer Join tblDifference Prv
On Cur.RowNumber = Prv.RowNumber + 1
Order by Cur.DateRecorded
--select * from #temp3
The datedifference column data appears in following order. NULL, 3, 3, 1, 1, 1, 6, 1
Then the count for John would be 3. NULL, 3, will count as ONE, 3, 1, 1, 1 will count as ONE and 6, 1 will count as 1.
I hope this makes sense. Thanks for your time. I know this is a strange one.
March 6, 2013 at 7:39 pm
Hi
The third example you gave doesn't appear to conform with the rules use on the other examples. I would have expect 4 based on the previous examples
Example 1
N,1,1,3,1,1,4,1
1---->2---->3->
Example 2
N,1,1,3,1,1,1,1
1---->2------->
Example
N,3,3,1,1,1,6,1
1>2>3------>4->
If my assumption is correct then you could do a count of differences greater than one and add one
SELECT COUNT(*) + 1 FROM #temp1 WHERE DataDifference > 1
If you want to assign a group then you could do something like the following quirky update. I may a some changes to your original query into #temp1 to accommodate it.
;With tblDifference as
(
Select Row_Number() OVER (Order by DateRecorded) as RowNumber,Name, DateRecorded from #m_TestTable
)
Select
Cur.Name
,Cur.DateRecorded as CurrentDay
,Prv.DateRecorded as PreviousDay
,Datediff(Day, Prv.DateRecorded, Cur.DateRecorded) as DateDifference
,Cur.RowNumber
,0 Grp
into #temp1
From
tblDifference Cur Left Outer Join tblDifference Prv
On Cur.RowNumber = Prv.RowNumber + 1
Order by Cur.DateRecorded
ALTER TABLE #temp1 ALTER COLUMN RowNumber int NOT NULL
ALTER TABLE #temp1 ADD CONSTRAINT tmp_pk PRIMARY KEY (RowNumber)
DECLARE @i AS INT = 1
UPDATE #temp1
SET Grp = @i,
@i = @i + isnull(cast(cast(DateDifference - 1 as bit) as integer),0)
SELECT * FROM #temp1
Hope this helps
March 7, 2013 at 12:41 am
Thanks for the prompt response. Yes your assumptions are correct.
Can you clarify this statement. I think it contains some special characters
SELECT COUNT(*) + 1 FROM #temp1 WHERE DataDifference > 1
What is "> 1" ?
How do you mean "assign a group" ?
Thanks.
March 7, 2013 at 1:38 am
This is called "Islands and Gaps". An island is a set of rows where the dates are consecutive, bounded by rows with non-consecutive dates. Jeff Moden has a great Spackle article [/url]covering the subject in some detail. Once you've identified the islands, what do you want to do?
Edit: clarity
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 7, 2013 at 5:49 am
ChrisM@Work (3/7/2013)
This is called "Islands and Gaps". An island is a set of rows where the dates are consecutive, bounded by rows with non-consecutive dates. Jeff Moden has a great Spackle article [/url]covering the subject in some detail. Once you've identified the islands, what do you want to do?Edit: clarity
That's a great article Chris! Oldey but goody.
Makes me think of something like this. π
WITH SampleData AS (
SELECT Name, RecordedDate
FROM (
Values ('John', dateadd(day,1,GetDate()))
,('John', dateadd(day,2,GetDate()))
,('John', dateadd(day,3,GetDate()))
,('John', dateadd(day,6,GetDate()))
,('John', dateadd(day,7,GetDate()))
,('John', dateadd(day,8,GetDate()))
, ('John', dateadd(day,12,GetDate()))
,('John', dateadd(day,13,GetDate())))a (Name, RecordedDate))
SELECT Name, COUNT(*)
FROM (
SELECT Name
FROM (
SELECT Name, RecordedDate
,rn=RecordedDate-ROW_NUMBER() OVER (PARTITION BY Name ORDER BY RecordedDate)
FROM SampleData) a
GROUP BY Name, rn) b
GROUP BY Name
Although I couldn't think of a good name for the second column. [Count of islands] perhaps?
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 7, 2013 at 6:04 am
Nice one geezer π
It makes for a surprisingly succinct query.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 7, 2013 at 7:50 am
ChrisM@Work (3/7/2013)
Nice one geezer πIt makes for a surprisingly succinct query.
Good thing I'm not getting paid by the keystroke. π
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 7, 2013 at 11:05 am
sqlislife (3/7/2013)
Thanks for the prompt response. Yes your assumptions are correct.Can you clarify this statement. I think it contains some special characters
SELECT COUNT(*) + 1 FROM #temp1 WHERE DataDifference > 1
What is "> 1" ?
How do you mean "assign a group" ?
Thanks.
Sorry about that the greater than symbol > appears ok on mine
March 7, 2013 at 11:07 am
ChrisM@Work (3/7/2013)
This is called "Islands and Gaps". An island is a set of rows where the dates are consecutive, bounded by rows with non-consecutive dates. Jeff Moden has a great Spackle article [/url]covering the subject in some detail. Once you've identified the islands, what do you want to do?
Have to agree ... a great article. Was concentrating on the dateDifference column, so totally over looked this.
March 7, 2013 at 11:20 am
Wow, interesting. Nice article by Jeff as always.
Thanks guys. I will modify this to suit my needs.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply