November 19, 2013 at 11:06 pm
Hi ,
I am replicating my problem in otherway .
Say , we are tracking record of bus service i.e. how buses are being used in company
EnddateKey as NULL means , currently the bus is in service
Following is sample set
SysKeyBusNameBusTypeStartDateKeyEndDatekey
1 A AC 20130101 20130112
2 A AC 20130113 20130115
3 A AC 20130116 20130118
4 B NON-AC 20130119 20130121
5 B NON-AC 20130122 20130124
6 A AC 20130125 NULL
Need output as below
SNOBusNameBusTypeStartDateKeyEndDatekey
1 A AC20130101 20130118
2 B NON-AC 20130119 20130124
3 A AC 20130125 NULL
Script to create a sample
Declare @temp table (SysKey int ,BusName varchar(9) ,BusType varchar(6) ,StartDateKey int ,EndDatekey int )
Insert into @Temp (SysKey ,BusName,BusType,StartDateKey,EndDatekey)
Select 1 , 'A' ,'AC',20130101,20130112
union
Select 2 , 'A' ,'AC', 20130113 ,20130115
union
Select 3 , 'A' ,'AC', 20130116 ,20130118
Union
Select 4 , 'B' , 'NON-AC', 20130119 ,20130121
union
Select 5 , 'B' , 'NON-AC', 20130122 ,20130124
Union
Select 6 , 'A' ,'AC', 20130125, NULL
Select * from @temp
Let me know , if i miss any clarity of question
Thanks
Surya Sunil
November 19, 2013 at 11:59 pm
SELECT
Busname
,BusType
,MIN(StartDateKey)
,CASE WHEN MAX(ISNULL(EndDateKey,99991231)) = 99991231
THEN NULL
ELSE MAX(EndDateKey)
END
FROM @temp
GROUP BY Busname, BusType
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 20, 2013 at 12:10 am
Result as not expected
BusnameBusType(No column name)(No column name)
A AC 20130101NULL
B NON-AC 2013011920130124
November 20, 2013 at 12:26 am
sunil.mvs (11/20/2013)
Result as not expectedBusnameBusType(No column name)(No column name)
A AC 20130101NULL
B NON-AC 2013011920130124
Right, missed the third line of the result set.
This is probably the easiest implemented with a cursor.
This might be slower for large data sets, but it is lineary scalable.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 20, 2013 at 1:32 am
solution should be set based only .. .as there are 98 million records that need to be taken care .
November 20, 2013 at 1:34 am
Which version of SQL Server?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 20, 2013 at 1:52 am
SQL server 2008R2
November 20, 2013 at 2:15 am
sunil.mvs (11/20/2013)
SQL server 2008R2
You cannot use windowing functions (SQL Server 2012), and it is possible set-based solutions are not linearly scalable (in the sense that they get slower and slower for larger data sets).
A cursor is slow (and I usually avoid them), but at least you get predictable performance.
If you change the defaults when creating a cursor, it is typically a lot faster then when you create a cursor with the pre-defined defaults.
If you'd really like a set-based solution, take a look at the gaps and islands problem, which is somewhat similar to your problem.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 20, 2013 at 2:46 am
hi,
is this what you wanted..
SELECT t1.Busname ,t1.BusType , MIN(StartDateKey) StartDateKey , Case when t1.EndDatekey is null then null else t2.EndDatekey end EndDatekey
From @temp t1
Left JOIN( SELECT Busname,BusType , max(EndDatekey)EndDatekey From @temp group by Busname,BusType)as t2 on t1.BusName = t2.BusName and t1.BusType = t2.BusType
GROUP BY t1.Busname, t1.BusType , Case when t1.EndDatekey is null then null else t2.EndDatekey end
November 20, 2013 at 2:56 am
Hi
I appreciate the solution but failing to the below scenario .. I have added few more records ..
Declare @temp table (SysKey int ,BusName varchar(9) ,BusType varchar(6) ,StartDateKey int ,EndDatekey int )
Insert into @Temp (SysKey ,BusName,BusType,StartDateKey,EndDatekey)
Select 1 , 'A' ,'AC',20130101,20130110
union
Select 2 , 'B' ,'NON-AC',20130111,20130112
union
Select 3 ,'A' ,'AC', 20130113 ,20130115
union
Select 4 , 'A' ,'AC', 20130116 ,20130118
Union
Select 5 , 'B' , 'NON-AC', 20130119 ,20130121
union
Select 6 , 'B' , 'NON-AC', 20130122 ,20130124
Union
Select 7 , 'A' ,'AC',20130125, 20130128
union
select 8,'A','AC',20130129,20130131
union
Select 9 ,'A','AC',20130201 , 20130204
union
Select 10 , 'C','NON-AC',20130205, 99991231
SELECT t1.Busname ,t1.BusType , MIN(StartDateKey) StartDateKey , Case when t1.EndDatekey is null then null else t2.EndDatekey end EndDatekey
From @temp t1
Left JOIN( SELECT Busname,BusType , max(EndDatekey)EndDatekey From @temp group by Busname,BusType)as t2 on t1.BusName = t2.BusName and t1.BusType = t2.BusType
GROUP BY t1.Busname, t1.BusType , Case when t1.EndDatekey is null then null else t2.EndDatekey end
November 20, 2013 at 3:20 am
With 98 million rows of data something tells me there is more to this than you are telling us, just a gut feeling.
Is there anything else we should know about the source data that could cause potential issues with a solution. Are bus names and dates actually as shown in the limited data set or are they interleaved with multiple bus names and start/end dates? Are there other columns that help identify buses and a sequential service. Is a bus considered in service as long as there is an unbroken sequence in dates. Looking at the following:
1 A AC 20130101 20130112
2 A AC 20130113 20130115
3 A AC 20130116 20130118
4 B NON-AC 20130119 20130121
5 B NON-AC 20130122 20130124
6 A AC 20130125 NULL
Looking at A there is a break between 20130118 and 20130125 making these separate entries for reporting. Had the last A been 20130119 for a start date, it would be included with the first 3 A's even though there are 2 B's between the 3rd and 4th A, correct? Or wouldthe 4th A still be separate because of the interruption of the sequence by B (assuming the 4th A had a date of 20130119, not the 20130125 as shown).
November 20, 2013 at 3:53 am
nice question .. Only one bus is used during a startDateKey and EndDatekey ... No other bus is allowed .
For Example Serial No 7 ,8 and 9 should be consolidated as 'A' ,'AC',20130125, 20130204
Serial no exists to identify ...
1 , 'A' ,'AC',20130101,20130110
2 , 'B' ,'NON-AC',20130111,20130112
3 ,'A' ,'AC', 20130113 ,20130115
4 , 'A' ,'AC', 20130116 ,20130118
5 , 'B' , 'NON-AC', 20130119 ,20130121
6 , 'B' , 'NON-AC', 20130122 ,20130124
7 , 'A' ,'AC',20130125, 20130128
8,'A','AC',20130129,20130131
9 ,'A','AC',20130201 , 20130204
10 , 'C','NON-AC',20130205, 99991231
November 20, 2013 at 5:20 am
what output do you expect from above table
November 20, 2013 at 5:29 am
Expected Output
1, 'A' ,'AC',20130101,20130110
2,'B' ,'NON-AC',20130111,20130112
3, 'A' ,'AC', 20130113 ,20130118
4,'B' , 'NON-AC', 20130119 ,20130124
5,'A' ,'AC',20130125, 20130204
6,'C','NON-AC',20130205, 99991231
November 20, 2013 at 5:46 am
sunil.mvs (11/20/2013)
Expected Output1, 'A' ,'AC',20130101,20130110
2,'B' ,'NON-AC',20130111,20130112
3, 'A' ,'AC', 20130113 ,20130118
4,'B' , 'NON-AC', 20130119 ,20130124
5,'A' ,'AC',20130125, 20130204
6,'C','NON-AC',20130205, 99991231
What if line 2 with B wasn't there?
Would you combine line 1 and 3 into single lines, or would you still keep them seperate as the dates are not adjacent?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply