Scalar function too slow

  • I created a function which basicallly takes the following input:

    then performs various checks on the name details to determine a match score for the name part of a record. The outcome of the function can be sure, likely, possible etc.... This is all part of a recod linkage exercise that I'm working on.

    The function performs badly as expected due to the number if IF statements but I can't seee how I can convert the following (see part of function below) to a iTVF? Replace all the IF with CASEes?

    CREATE FUNCTION [dbo].[Individual_Name_Score] (

    @Master_mkNormalisedName NVARCHAR(50)

    ,@Master_mkName1 NVARCHAR(50)

    ,@Master_mkName2 NVARCHAR(50)

    ,@Master_mkName3 NVARCHAR(50)

    ,@Duplicate_mkNormalisedName NVARCHAR(50)

    ,@Duplicate_mkName1 NVARCHAR(50)

    ,@Duplicate_mkName2 NVARCHAR(50)

    ,@Duplicate_mkName3 NVARCHAR(50)

    )

    RETURNS VARCHAR(50)

    AS

    BEGIN

    -- Split the master record

    DECLARE @Master_Surname NVARCHAR(50) = ''

    DECLARE @Master_Forename NVARCHAR(50) = ''

    DECLARE @Master_MiddleName NVARCHAR(50) = ''

    -- Split the duplicate recrod

    DECLARE @Duplicate_Surname NVARCHAR(50) = ''

    DECLARE @Duplicate_Forename NVARCHAR(50) = ''

    DECLARE @Duplicate_MiddleName NVARCHAR(50) = ''

    DECLARE @ScorerType VARCHAR(50) = ''

    --initialise variables used to determine name match score

    -- get the normalised name parts and the phonetic name parts for the master record

    SET @Master_Surname = LEFT(@Master_mkNormalisedName, CHARINDEX(',',@Master_mkNormalisedName)-1)

    SET @Master_Forename = LEFT(SUBSTRING(@Master_mkNormalisedName, CHARINDEX(',',@Master_mkNormalisedName)+1,

    LEN(@Master_mkNormalisedName)), CHARINDEX(',',SUBSTRING(@Master_mkNormalisedName,

    CHARINDEX(',',@Master_mkNormalisedName)+1,LEN(@Master_mkNormalisedName)))-1)

    SET @Master_MiddleName = REVERSE(LEFT(REVERSE(@Master_mkNormalisedName), CHARINDEX(',',REVERSE(@Master_mkNormalisedName))-1))

    -- get the normalised name parts and the phonetic name parts for the duplicate record

    SET @Duplicate_Surname = LEFT(@Duplicate_mkNormalisedName, CHARINDEX(',',@Duplicate_mkNormalisedName)-1)

    SET @Duplicate_Forename = LEFT(SUBSTRING(@Duplicate_mkNormalisedName, CHARINDEX(',',@Duplicate_mkNormalisedName)+1,

    LEN(@Duplicate_mkNormalisedName)), CHARINDEX(',',SUBSTRING(@Duplicate_mkNormalisedName,

    CHARINDEX(',',@Duplicate_mkNormalisedName)+1,LEN(@Duplicate_mkNormalisedName)))-1)

    SET @Duplicate_MiddleName = REVERSE(LEFT(REVERSE(@Duplicate_mkNormalisedName), CHARINDEX(',',REVERSE(@Duplicate_mkNormalisedName))-1))

    -- if both surnames are empty

    IF LEN(@Master_Surname) = 0 AND LEN(@Duplicate_Surname) = 0

    BEGIN

    BEGIN SET @ScorerType = 'bothEmpty' RETURN @ScorerType END

    END

    ----one of them is empty then return

    ELSE IF LEN(@Master_Surname) = 0 OR LEN(@Duplicate_Surname) = 0

    BEGIN

    BEGIN SET @ScorerType = 'oneEmpty' RETURN @ScorerType END

    END

    -- surnames both match

    ELSE IF @Master_Surname = @Duplicate_Surname

    BEGIN

    -- forenames both the same

    IF @Master_Forename = @Duplicate_Forename

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'sure' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'sure' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'sure' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    END

    -- forenames sound the same

    ELSE IF @Master_mkName2 = @Duplicate_mkName2

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'sure' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- forenames both empty

    ELSE IF LEN(@Master_Forename) = 0 AND LEN(@Duplicate_Forename) = 0

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- one of the forenames is empty

    ELSE IF LEN(@Master_Forename) = 0 OR LEN(@Duplicate_Forename) = 0

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- forenames approximately the same

    ELSE IF dbo.LEVENSHTEIN(@Master_Forename, @Duplicate_Forename) < 3

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- forenames approximately sound the same

    ELSE IF dbo.LEVENSHTEIN(@Master_mkName2, @Duplicate_mkName2) < 3

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- forenames possibly the same

    ELSE IF dbo.JaroWinkler(@Master_Forename, @Duplicate_Forename) > 0.6

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    -- forenames are not equal

    ELSE

    BEGIN

    IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END END

    IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    END

    END

    -- if lastname sounds equal:

    --ELSE IF @Master_mkName1 = @Duplicate_mkName1

    --BEGIN

    --IF @Master_Forename = @Duplicate_Forename

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF @Master_mkName2 = @Duplicate_mkName2

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF LEN(@Master_Forename) = 0 AND LEN(@Duplicate_Forename) = 0

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF LEN(@Master_Forename) = 0 OR LEN(@Duplicate_Forename) = 0

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'likely' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF dbo.LEVENSHTEIN(@Master_Forename, @Duplicate_Forename) < 3

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF dbo.LEVENSHTEIN(@Master_mkName2, @Duplicate_mkName2) < 3

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE IF dbo.JaroWinkler(@Master_Forename, @Duplicate_Forename) > 0.6

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'possible' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --ELSE

    --BEGIN

    --IF @Master_MiddleName = @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 AND LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF LEN(@Master_MiddleName) = 0 OR LEN(@Duplicate_MiddleName) = 0 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.LEVENSHTEIN(@Master_MiddleName, @Duplicate_MiddleName) < 3 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF dbo.JaroWinkler(@Master_MiddleName, @Duplicate_MiddleName) > 0.6 BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --IF @Master_MiddleName <> @Duplicate_MiddleName BEGIN SET @ScorerType = 'zero' RETURN @ScorerType END

    --END

    --END

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • So many views and no replies?! Where are the clever guys these days?! :crying:

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (2/5/2013)


    So many views and no replies?! Where are the clever guys these days?! :crying:

    I think the biggest challenge to getting an answer is that this code is really scary!!! It is a scalar function with lots and lots of calls to other scalar functions (LEVENSHTEIN, JaroWinkler). You execute a scalar function, examine the results and then execute another scalar function. It is no surprise the performance is not acceptable.

    The problem is we don't know what this is supposed to do. I know you have a certain format for the data being passed in. I have recently seen your threads splitting your strings into 3 which is obviously used inside here. You will need to rewrite this and very possibly those other 2 functions if you want to have a chance at making this faster.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for your input Sean. I really appreciate it.

    As mentioned, this is part one part of a record linkage exercise I'm working on. The way it works is like this:

    I have two record sets (both contain Firstname, Surname, address1, Town, Postcode) and I want to find duplicate individuals.

    Of course the eaiest thing to do is to join the two sets where all 5 columns match but as you know, data is never as clean as one would like. So another approach is to join the two data sets on phonetic code of the surname and phonetic address1.

    Once that's done, I take each matching set and then compare the other fields and give them scores. The function above is meant to work out the name score.

    So in my effort to make it faster, I pass all the required details to the function I then do checks like:

    If Surname from record Set A = Surname from record Set B

    AND forename = forename

    then the ultimate score will depends on the contents of the middlename

    IF A.Surname = B.surname

    AND Phonetic Forename = phonetic Forename

    then the score will against depends on middle name

    etc...

    But as you can see, there are so many variations like

    IF Surname SOUNDS like Surname

    AND forename = forename

    etc.. etc...

    Does this make sense?!

    Does this make sense?

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply