December 18, 2010 at 9:22 am
Hi all
Could any body help me with a query?
I have a table with a title,forename,surname and postcode field, I need to ensure only 1 record is selected per postcode ensuring that in the case of a duplicate postcode I select the male as preference.
se following code to see what I am faced with:
create table ashley.dbo.sample(
title nvarchar(10),
forename nvarchar(50),
surname nvarchar(50),
postcode nvarchar(10),
dps nvarchar(10),
ref nvarchar(10))
insert into ashley.dbo.sample
values ('Mr','John','Smith','AA1 1AB','AB','1')
insert into ashley.dbo.sample
values ('Mrs','Jane','Smith','AA1 1AB','AB','2')
insert into ashley.dbo.sample
values ('','J','Smith','AA1 1AB','AB','3')
insert into ashley.dbo.sample
values ('Mr','Bob','Smith','AA1 1AB','AB','4')
insert into ashley.dbo.sample
values ('Miss','Vera','Duckworth','AA1 1AC','FC','5')
insert into ashley.dbo.sample
values ('Mr','Jack','Duckworth','AA1 1AC','FC','6')
insert into ashley.dbo.sample
values ('Mrs','J','Duckworth','AA1 1AC','FC','7')
insert into ashley.dbo.sample
values ('Mrs','Jane','Cond','AA1 1AD','AD','8')
insert into ashley.dbo.sample
values ('','J','Cond','AA1 1AD','AD','9')
So titles of Mr are the priority then i no male available any other 1 single record per postcode will do.
Could anybody offer any assistance to this?
Thanks in advance 🙂
December 19, 2010 at 4:27 pm
I imagine there are better ways to do this, but this is what I was able to come up with...
select * from
(
select *,
row_number() over(partition by postcode order by case when title = 'Mr' then 0 else 1 end, ref) as pref
from sample
) z where pref = 1
The magic is all in the row_number function, where it Identifies the rows with 'Mr' and pushes them to the top of the set, partitioned by postcode. You then just select the top item in the set.
December 19, 2010 at 10:34 pm
ROW_NUMBER() is an EXCELLENT way of identifying dups and getting rid of all but one.
Good job.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
January 10, 2011 at 5:46 am
Jason Norsworthy (12/19/2010)
I imagine there are better ways to do this, but this is what I was able to come up with...
select * from
(
select *,
row_number() over(partition by postcode order by case when title = 'Mr' then 0 else 1 end, ref) as pref
from sample
) z where pref = 1
The magic is all in the row_number function, where it Identifies the rows with 'Mr' and pushes them to the top of the set, partitioned by postcode. You then just select the top item in the set.
This works and I have been using it...However, I am inserting results into another table and sometimes I have to run the query a few time to get all the deduped results into the table. DOes anybody know why this is?
January 10, 2011 at 7:42 pm
bicky1980 (1/10/2011)
Jason Norsworthy (12/19/2010)
I imagine there are better ways to do this, but this is what I was able to come up with...
select * from
(
select *,
row_number() over(partition by postcode order by case when title = 'Mr' then 0 else 1 end, ref) as pref
from sample
) z where pref = 1
The magic is all in the row_number function, where it Identifies the rows with 'Mr' and pushes them to the top of the set, partitioned by postcode. You then just select the top item in the set.
This works and I have been using it...However, I am inserting results into another table and sometimes I have to run the query a few time to get all the deduped results into the table. DOes anybody know why this is?
Post the code where it's happening...
--Jeff Moden
Change is inevitable... Change for the better is not.
January 11, 2011 at 11:53 am
If you are using the code shown to insert into another table, you are only going to get one "duplicate" at a time. To insert all but the first row, you would want to select where the column produced by row_number() is greater than 1. Like Jeff said, please post up the code you are actually running.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
January 11, 2011 at 7:29 pm
I have no idea how to use OVER and PARTITION and I hardly use ROW_NUMBER (probably out of ignorance), but just for fun I wrote a script that I think would work as well and should work with just one run.
SELECT title, forename, surname, postcode, dps, ref
from ashley.dbo.sample a
where
(
title = 'Mr.'
or
NOT EXISTS (
SELECT 1
FROM ashley.dbo.sample b
where
a.postcode = b.postcode
and b.title = 'Mr.'
)
)
AND
NOT EXISTS (
SELECT 1
FROM ashley.dbo.sample c
where
c.ref < a.ref
and a.postcode = c.postcode
and
(
c.title = 'Mr.'
or
NOT EXISTS (
SELECT 1 FROM ashley.dbo.sample d where c.postcode = d.postcode and d.title = 'Mr.')
)
)
I didn't build a test table to run this against, but I think it should run. Thanks for the other script it gives me something to learn and blog about in a few days.
January 13, 2011 at 3:29 pm
Just out of curiosity, how do you use ROW_NUMBER() at all without using OVER? I'll encourage you to read up on ROW_NUMBER(), RANK(), and the other windowed functions in SQL2008. They are excellent tools to have in your toolbox.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
January 13, 2011 at 3:38 pm
I will have to study up on that. I should be more specific I suppose 🙂 I have maintained scripts that I have noticed that ROW_NUMBER was used (I probably didn't notice the other two.) I really love working in SQL and am always excited about learning new tricks; so please feel free to critique the mess out of any of my posts. I definitely learn from my mistakes and plan on making a lot.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply