May 24, 2006 at 1:38 am
Okay so we all know the select distinct {column name} from {Table name} query which returns all none duplicates for the column you are searching against, basic stuff, then what you want to do is add some filters, so you append a where {column name} like/= {value}.
Okay so what if you are returned rows where the first few words are duplicated like;
Tom Jones 1
Tom Jones 2
Tom Jones 3
Lucy Lane
Joe Soap
Mercy Moe 1
Mercy Moe 2
By looking at the results, you can see that there are duplicates, how on earth would you draw out results that would only list the exceptions such as in the case of old Tom Jones and Mercy Moe only and leave out the rest?
May 24, 2006 at 2:36 am
Tony
How about creating a view (indexed if necessary) that shows the name column with all non-alphabetic characters stripped from the end? Then you can do a select distinct against the view.
John
May 24, 2006 at 2:44 am
You wouldn't need to use a view. You could just say:
select distinct dbo.uf_stripnonalpha(colname)
from table
--Rob
Rob Farley
LobsterPot Solutions & Adelaide SQL Server User Group
Company: http://www.lobsterpot.com.au
Blog: http://blogs.lobsterpot.com.au
May 24, 2006 at 2:59 am
That would be easier. But bear in mind that any indexes you have on colname would not be used.
John
May 24, 2006 at 3:06 am
True. But if you wanted indexing, you could use a computed column and index (and query) that.
Rob Farley
LobsterPot Solutions & Adelaide SQL Server User Group
Company: http://www.lobsterpot.com.au
Blog: http://blogs.lobsterpot.com.au
May 24, 2006 at 3:24 am
This effort is part of a data normalisation project.
Indexing aside, both queries you have placed return distict lists, this is great, however I need to show the items that match this condition whereby Tom Jones is shown more than once irrispective that he has a numberic suffix. So a distinct select that only show alphanumeric, will not give me the exceptions
May 24, 2006 at 3:32 am
Tony
How about:
select colname where colname <> dbo.uf_stripnonalpha(colname)
from table
John
May 24, 2006 at 5:35 am
Hi all,
Tony - it sounds like you only want the ones that have a count of more than one, so maybe you need something like this...
select dbo.uf_stripnonalpha(colname) from table group by dbo.uf_stripnonalpha(colname) having count(*) > 1
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
May 24, 2006 at 6:35 am
Or:
t1.vc_col, V.stub
tbl t
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
May 24, 2006 at 7:52 am
That may work, I will give it a try. I think that I need to be careful here, as I am expecting duplicates in any event, the duplicates with the data set are addresses (Pysical address) and first I am stripping off the first numerics to get the street name, to validate against the rest of the address. Then I need to make sure that there are no numerics post the street name.
So Data looks like this right now.
1 West Ave
1 West Ave 1
1 West Ave 2
etc, they are grouped well, what I need to do is find these types occurences and mark them as exceptions for manual intervention and validation. I am told that this is normal, as the users on the old system were too lazy to find the address, when they tried to enter new data with that address, they were kicked out because of an illigal address entry, so they enter in the 1, 2, 3 to overcome the issue. I am creating a new system for them where they have to select via other known data and the address is entered in automatically for them.
May 24, 2006 at 8:16 am
The code I've given should find the last occurrence of a space followed by a number, and remove it. Looking at the code again, it's wrong. Try:.
t1.vc_col, V.stub
tbl t
select distinct
substring(t.vc_col,1,len(t.vc_col)-patindex('%[0-9] %',reverse(t.vc_col))) stub
from tbl t) V
substring(t.vc_col,1,len(t.vc_col)-patindex('%[0-9] %',reverse(t.vc_col))) like V.stub + '%'
patindex('%[0-9] %',reverse(t.vc_col)) > 0
this will only affect records where the vc_col has a space followed by a numeral. Obviously you might want to make many other refinements.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
May 24, 2006 at 8:23 am
further mistakes are left as an exercise for the reader.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply