March 13, 2012 at 6:21 am
Please Please Please - Can anyone help me.
I haven't been using SQL very long and this is driving me nuts!!:unsure:
I have managed to extract a selection of data... this shows a number of duplicate records.
What I would like to do is set up a flag called 'type' which will be populated by either 1 or 2 (based on the below criteria)
SubjectUniqueIDDate1 Date2 Code
1ABC12314/12/201116/12/2011H98
1ABC23414/12/201116/12/2011H98
2ABC34527/09/201106/12/2011C01
2ABC45627/09/201106/12/2011C98
3ABC56717/11/201108/12/2011H98
3ABC67817/11/201115/12/2011H98
-Where a subject has records that are identical - I would like the 1st record to be allocated type 1 and the others records allocated as 2
-Where a subject has records that are identical but the 'Code' field is different - I would like the record with 'C01' to be allocated type 1 and the other records to be allocated as 2
- Where a subject has duplicate records (Date1 is the same but date 2 isn't ) - I would like the 1st date to be allocated as type 1 otherwise 2
This would be my desired outcome
SubjectUniqueIDDate1 Date2 Code Type
1ABC12314/12/201116/12/2011H981
1ABC23414/12/201116/12/2011H982
2ABC34527/09/201106/12/2011C011
2ABC45627/09/201106/12/2011C982
3ABC56717/11/201108/12/2011H981
3ABC67817/11/201115/12/2011H982
3 ABC789 17/11/2011 17/12/2011 H98 2
Please could someone tell me what i have to do... better still provide me with some code
This is urgent; so if anyone can help I'd really apprecaite it!
Many Thanks
March 13, 2012 at 6:27 am
What's the criteria for "first row"?
There isn't such a thing as order in a relational table.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
March 13, 2012 at 7:00 am
The 1st and 2nd line are duplicates (exact match) therefore I would like the 1st record to be allocated 1 and the following record as 2
I have created a temp table ordering the data and its on this table that I would like to build in the rules.
Is this possible to do?
March 13, 2012 at 7:11 am
Question like that should be posted in according with the "forum etiquette" (link at the bottom of my signature). Only 'cause you "first-timer" here:
This how you should post "I have a table like that..." part:
-- table setup DDL
create table #table ([Subject] int
,UniqueID char(6)
,Date1 datetime
,Date2 datetime
,Codechar(3)
)
-- some data sample
set dateformat dmy
insert #table values
(1,'ABC123', '14/12/2011', '16/12/2011', 'H98'),
(1,'ABC234', '14/12/2011', '16/12/2011', 'H98'),
(2,'ABC345', '27/09/2011', '06/12/2011', 'C01'),
(2,'ABC456', '27/09/2011', '06/12/2011', 'C98'),
(3,'ABC567', '17/11/2011', '08/12/2011', 'H98'),
(3,'ABC678', '17/11/2011', '15/12/2011', 'H98')
So, we can just cut-&-paste and execute it to have your case going...
Now, what about if you have more than two duplicated records? Lets add some sample data:
insert #table values
(4,'ABC770', '14/12/2011', '16/12/2011', 'H98'),
(4,'ABC771', '14/12/2011', '16/12/2011', 'H98'),
(4,'ABC772', '14/12/2011', '16/12/2011', 'H98'),
(5,'ABC880', '27/09/2011', '06/12/2011', 'C01'),
(5,'ABC881', '27/09/2011', '06/12/2011', 'C98'),
(5,'ABC882', '27/09/2011', '06/12/2011', 'C99'),
(6,'ABC990', '17/11/2011', '08/12/2011', 'H98'),
(6,'ABC991', '17/11/2011', '15/12/2011', 'H98'),
(6,'ABC992', '17/11/2011', '25/12/2011', 'H98')
To calculate "Type" based on your three rules we can use ROW_NUMBER,
for each rule separately:
;with cte_num
as
(
select [Subject], UniqueID, Date1, Date2, Code
,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1, Date2, Code ORDER BY UniqueID) RN_identical
,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1, Date2 ORDER BY UniqueID, Code) RN_difCode
,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1 ORDER BY UniqueID, Date2, Code) RN_difDate2
from #table
)
select [Subject], UniqueID, Date1, Date2, Code
,case when RN_identical >= RN_difCode and RN_identical >= RN_difDate2 then RN_identical
when RN_difCode >= RN_difDate2 then RN_difCode
else RN_difDate2
end [Type]
from cte_num
order by [Subject], UniqueID
But, if you analyse your three business rules a bit more, you can see that in reality you need only one, the last one! Just state the order of how "type" should be assigned, so you can get away with more simple query, calculating ROW_NUMBER just once:
select [Subject], UniqueID, Date1, Date2, Code
,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1 ORDER BY UniqueID, Date2, Code) [Type]
from #table
order by [Subject], UniqueID
As per previous post, there is no "natural order" in SQL table, therefore I've used UniqueID for order, so in a "duplicate" pair, record with "lowest" UniqueID will be assigned to type 1.
March 14, 2012 at 8:04 am
Thank you so much for your help!
Enjoy the rest of the week 🙂
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply