March 11, 2008 at 1:13 pm
so I have a query:
SELECT esarfUser.nameFirst, esarfUser.nameLast, esarfUser.fullName, esarfUser.userID, EsarfVCPICompare$.LastName,
EsarfVCPICompare$.FirstName
FROM esarfUser INNER JOIN
EsarfVCPICompare$ ON esarfUser.nameLast = EsarfVCPICompare$.LastName AND esarfUser.nameFirst = EsarfVCPICompare$.FirstName
that returns nothing. If I drop off the AND esarfUser.nameFirst = EsarfVCPICompare$.FirstName, I get results, but it's way too many.
I have two tables that have first and last names and I'm trying to figure out how many I have in common.
Any ideas?
March 11, 2008 at 1:49 pm
Are you looking for common firstname and lastname between the tables, meaning that you want to know if "John Smith" is in both tables?
Or do you want matching names between the tables, meaning if "Smith" is a last name in both tables you want to know.
March 11, 2008 at 2:06 pm
I'm trying to find common first and last name combinations so if it's
John Smith in table 1 and
Johnathon Smith in table 2
I want to flag it.
I had thought to join them on a substring of the first name or a like but I'm not having any luck.
March 11, 2008 at 3:02 pm
What you want to do is not easily done in T-SQL. Usually something like this would be done in SSIS or another package. What about Jon and John or Jonathan and John or Jon. Could there be just J Smith's? When doing a join with = you need an exact match. You may want to lookup SOUNDEX in BOL.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
March 11, 2008 at 5:20 pm
On top of all that, if you're trying to compare CHAR datatypes to VARCHAR datatypes, you may have to do an RTRIM on the CHAR datatypes to knock off any trailing spaces. If you have case sensitivity and accent sensitivity turned on, you may have to get into collation in the query.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 11, 2008 at 7:35 pm
Jack Corbett (3/11/2008)
What you want to do is not easily done in T-SQL. Usually something like this would be done in SSIS or another package. What about Jon and John or Jonathan and John or Jon. Could there be just J Smith's? When doing a join with = you need an exact match. You may want to lookup SOUNDEX in BOL.
Do you have any idea how to attack it in SSIS?
March 11, 2008 at 7:42 pm
This sounds like a good candidate for a fuzzy lookup or fuzzy grouping transformation. You would set it up for an exact match on Last Name and a fuzzy match on the first name. You will have to play with the similarity score to get the results you want.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
March 11, 2008 at 8:20 pm
With two tables a Data Flow like this should work.
[Table1] --
|--> [Union All] --> [Fuzzy Grouping] --> [Output Table]
[Table2] --
The output will have a grouping of names. Then you update the output table with the name you know is correct.
Have a look at these two articles:
Finding Similar Data Using SQL Server Integration Services[/url]
Using Fuzzy Lookup Transformations in SQL Server Integration Services
March 12, 2008 at 6:13 pm
You can use SOUNDEX() in sql server.
more details can be found at http://msdn2.microsoft.com/en-us/library/ms187384.aspx
It is sometimes very dangerous to use this function. It is entirely dependent on your business requirements.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply