Help please - similar names

Question

Help please - similar names

dipesh187

Grasshopper

Points: 21
More actions
October 31, 2007 at 7:06 am

#199611

I need a T-SQL script that:
looks up a Account.Account field and sees whether the first 10 characters of any two accounts are the same
I done it by...
SELECT T1.Account, T2.NoOfMatches
FROM Account T1
INNER JOIN
(
SELECT SUBSTRING( Account, 1, 10 ) AS Account,
COUNT( * ) NoOfMatches
FROM Account
GROUP BY SUBSTRING( Account, 1, 10 )
HAVING COUNT( * ) > 1
) T2 ON T2.Account = SUBSTRING( T1.Account, 1, 10 )
The resulting table has:
(below is suppose to be a table)
Account----l-----NoOfMatches
Agcert-----l-----3
AgCert----l-----3
agcert-----l-----3
Is there now way of the result just being???
Account---l---NoOfMatches
Agcert----l----3
Dipesh---l-----4
etc.......
therefore grouping the reults?
Can anybody help me.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 1

well - you can convert them all to upper case during the grouping. You're already hurting performance with the function in the group by, so it might not stay the fastest query in the mix.

The better method would probably be to standardize them to either upper or lower case once and for all (i.e. updating the values). If that's not an option, then simply doing:

SELECT DISTINCT upper(T1.Account), T2.NoOfMatches

FROM Account T1

INNER JOIN

(

SELECT upper(SUBSTRING( Account, 1, 10 )) AS Account,

COUNT( * ) NoOfMatches

FROM Account

GROUP BY upper(SUBSTRING( Account, 1, 10 ) )

HAVING COUNT( * ) > 1

) T2 ON T2.Account = SUBSTRING( T1.Account, 1, 10 )

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Johannes Fourie Default port Points: 1470 More actions · Answer 2

Won't you get the desired result if you only run the query of the derived table?

Because you joining that back to Accounts it repeats for each account that starts with the same few characters, if you don't join back to the account and only run the code of the derived table then it should not be repeated?

SELECT upper(SUBSTRING( Account, 1, 10 )) AS Account,

COUNT( * ) NoOfMatches

FROM Account

GROUP BY upper(SUBSTRING( Account, 1, 10 ) )

HAVING COUNT( * ) > 1

Or maybe I'm missing something?

dipesh187 Grasshopper Points: 21 More actions · Answer 3

Thank you Matt,

i think i will be using Johannes's script.

Kenneth.Fisher SSC-Insane Points: 20396 More actions · Answer 4

Here is a different option that doesn't require using upper. I only mention it because I think its really cool 🙂

SELECT SUBSTRING( Account, 1, 10 ) collate SQL_Latin1_General_CP437_CI_AS AS Account,

COUNT( 1 ) NoOfMatches

FROM Account

GROUP BY SUBSTRING( Account, 1, 10 ) collate SQL_Latin1_General_CP437_CI_AS

HAVING COUNT( 1 ) > 1

It sounds like your default collation on your server is case sensative. If you specify a case insensative collation (above is just an example) it will do the grouping as case insensative. Just be careful to use your same collation (its in the properties on the server) and change it from CS to CI. That way you make sure you don't have any other little surprizes in your query.

Also note I changed the count from COUNT( * ) to COUNT( 1 ) its a little bit faster.

Kenneth

Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]