String Comparison function

  • Hi All,

    I have been trying to squeeze my mind to find a solution for this but couldn't yet.

    What i need is to create a function that compares 2 strings variables and if those 2 variables doesn't have at least 3 different characters then return failure , else return success.

    Best Regards

    Nader

  • I will give an example of what i need so it's not misunderstood.

    Let's assume we have string1 = 'Panadol'

    string 2 = 'Xynadol'

    string 3 =' XYZadol'

    if i compare string 2 against string1 it should return failure while if i compare string 3 against string1 it should return success as there are 3 different characters.

    Please note that those characters could be located anywhere in the string.

    So the following string will return success too.

    string4='adolxyz'

    Regards

    Nader

  • Hi,

    are both compared strings always of the same length? In your examples they always have 7 characters, but is that a rule?

    If not, please explain whether "Pand" and "Panadol" should return success or failure when compared.

  • Thanks for your reply.

    No the lengths could be different and up to 200 character.

    "Pand" and "Panadol" will return success as there are more than 3 different characters.

  • if you step back from the dual-three letter matching criteria, whit is it you a REALLY trying to do?

    identify misspelled drug names? create full text equivalent so that you get match variations like

    Abacavir Sulfate (Ziagen)- FDA

    Abacavir Sulfate (Ziagen)- Multum

    Abacavir Sulfate and Lamivudine Tablets (Epzicom)- FDA

    Abacavir Sulfate and Lamivudine Tablets (Epzicom)- Multum

    Abacavir Sulfate, Lamivudine, and Zidovudine (Trizivir)- FDA

    Abacavir Sulfate, Lamivudine, and Zidovudine (Trizivir)- Multum

    Abacavir, Dolutegravir, and Lamivudine Film-coated Tablets (Triumeq)-

    i'd lean more towards a dictionary and thesaurus of common misspellings for something like that.

    if you identify what you are really trying to do, we can probably offer a better solution.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • The following should do what you're trying to do... (you just need to wrap it in a function)

    DECLARE

    @String1 VARCHAR(200) = 'Panadol',

    @String2 VARCHAR(200) = 'XYZadol';

    WITH n (n) AS (

    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)

    ), Tally (n) AS (

    SELECT TOP (SELECT MAX(LEN(x.String)) FROM (VALUES (@String1), (@String2)) x (String))

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM

    n n1, n n2, n n3

    ), Alphas AS (

    SELECT a.Letter FROM (VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),

    ('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) a (Letter)

    ), Split1 (Letter) AS (

    SELECT DISTINCT

    SUBSTRING(@String1, t.n, 1)

    FROM

    Tally t

    WHERE

    SUBSTRING(@String1, t.n, 1) IN (SELECT a.Letter FROM Alphas a)

    ), split2 (Letter) AS (

    SELECT DISTINCT

    SUBSTRING(@String2, t.n, 1)

    FROM

    Tally t

    WHERE

    SUBSTRING(@String2, t.n, 1) IN (SELECT a.Letter FROM Alphas a)

    )

    SELECT

    CASE WHEN COUNT(*) >= 3 THEN 'True' ELSE 'False' END AS IsMatch

    FROM

    Split1 s1

    JOIN split2 s2

    ON s1.Letter = s2.Letter;

    HTH,

    Jason

  • But if I would want to do such check, I would probably in the first step compare the length (if difference of the lengths is more than 2, no more checking is needed, return OK). In the second step I would parse the strings so that I have 2 tables of 1-character strings and then compare them. It is always easier to compare tables than strings...

    By parsing I mean something like that:

    CREATE TABLE #test(id int identity, value varchar(31))

    INSERT INTO #test(value) SELECT 'Panadol'

    SELECT *, SUBSTRING(t.value,n.Number,1) as letter

    FROM #test t

    JOIN Numbers n ON n.Number>0 AND n.Number <=LEN(t.value)

    Notice the code works with table "Numbers", if you don't have it in your DB, you will find here in the forums or articles how to create and populate such table (also goes under the name "Tally").

  • Hi guys,

    I would like to thank you very much for all the solutions, they are highly appreciated.

    I need to give them a try and understand the logic you are trying to do and i will send feedback.

    Regards

    Nader

  • Hi Guys,

    Please note that the function i need in this post is different from the previous one i described in another post, that's why i made a new post.

    The logic in this new function is to return success if there are 3 or more different characters between the input 2 strings, otherwise return failure.

    Regards

    Nader

  • Just to be sure i am not misunderstood, i will give some examples , please also note this is real live logic that needs to be applied.

    String1=Panadol

    String2=XYZadol

    String3=xyadolz

    String4=xynadol

    string5=lodanap (Same as panadol but in reverse)

    String1 is the string i am checking against.

    I need to check that strings 2,3,4,5 have at least 3 different characters.

    According to logic i need string 2,3,5 will return success but 4 will return failure.

    Regards

    Nader

  • String1=Panadol

    String2=XYZadol

    String3=xyadolz

    String4=xynadol

    string5=lodanap (Same as panadol but in reverse)

    In the above example "String 5"has less the 3 differences compared to "String 1" so it should be shown as failure right?

  • No it will return success as order is different.

  • nadersam (8/11/2015)


    No it will return success as order is different.

    What should the results be for a string like "XaYdZol"

    Here, the matching characters are all there, and in order, but they have non-matching characters intermixed in them.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • It will return success also, there are 3 different characters they don't have to follow each other.

    I was able to write the attached function but it doesn't work if the string is reversed

    Panadol and lodanap (This should be success but my function will return failure).

    I am sorry the logic is so weird 🙂 but it's actually needed to assess naming a new drug product in market and avoid name duplication or similarities, currently the users do it manually but we need to automate it.

    Regards

    Nader

  • Try this...

    DECLARE

    @String1 VARCHAR(200) = 'Panadol',

    @String2 VARCHAR(200) = 'XYZadol';

    WITH n (n) AS (

    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)

    ), Tally (n) AS (

    SELECT TOP (SELECT MAX(LEN(x.String)) FROM (VALUES (@String1), (@String2)) x (String))

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM

    n n1, n n2, n n3

    ), Alphas AS (

    SELECT a.Letter FROM (VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),

    ('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) a (Letter)

    ), Split1 (Letter) AS (

    SELECT DISTINCT

    SUBSTRING(@String1, t.n, 1)

    FROM

    Tally t

    WHERE

    SUBSTRING(@String1, t.n, 1) IN (SELECT a.Letter FROM Alphas a)

    ), split2 (Letter) AS (

    SELECT DISTINCT

    SUBSTRING(@String2, t.n, 1)

    FROM

    Tally t

    WHERE

    SUBSTRING(@String2, t.n, 1) IN (SELECT a.Letter FROM Alphas a)

    )

    SELECT

    * --CASE WHEN COUNT(*) >= 3 THEN 'True' ELSE 'False' END AS IsMatch

    FROM

    Split1 s1

    FULL JOIN split2 s2

    ON s1.Letter = s2.Letter

    WHERE

    s1.Letter IS NULL

    OR s2.Letter IS NULL;

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply