Find second occurence of a word

  • Hi,

    I have a table named ABC. It has a field named Y and Z.

    Y='ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    I want to find the second instance of Word ABCDEFG from the field and then delete everyhing after that

    e.g

    Y='ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    becomes

    Y='ABCDEFG (9105) CDD'

    Thanks,

    PSB

  • PSB (4/23/2012)


    Hi,

    I have a table named ABC. It has a field named Y and Z.

    Y='ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    I want to find the second instance of Word ABCDEFG from the field and then delete everyhing after that

    e.g

    Y='ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    becomes

    Y='ABCDEFG (9105) CDD'

    Thanks,

    PSB

    for specific strings, you can use a function like CHARINDEX2:

    if you don't know which word is repeating, ie the first string in each row, that'll be a little harder.

    UPDATE ABC

    SET Y = CASE

    WHEN dbo.CHARINDEX2('ABCDEFG', Y, 2) > 0

    THEN SUBSTRING(Y,1,dbo.CHARINDEX2('ABCDEFG', Y, 2))

    ELSE Y

    END

    /*

    Example:

    SELECT dbo.CHARINDEX2('a', 'abbabba', 3)

    returns the location of the third occurrence of 'a'

    which is 7

    */

    CREATE FUNCTION CHARINDEX2(

    @TargetStr varchar(8000),

    @SearchedStr varchar(8000),

    @Occurrence int)

    RETURNS int

    AS

    BEGIN

    DECLARE @pos int, @counter int, @ret int

    SET @pos = CHARINDEX(@TargetStr, @SearchedStr)

    SET @counter = 1

    IF @Occurrence = 1

    SET @ret = @pos

    ELSE

    BEGIN

    WHILE (@counter < @Occurrence)

    BEGIN

    SELECT @ret = CHARINDEX(@TargetStr, @SearchedStr, @pos + 1)

    SET @counter = @counter + 1

    SET @pos = @ret

    END

    END

    RETURN(@ret)

    END

    GO

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • CHARINDEX() has an option parameter (start_location) that can be used to skip the first occurence of the specified string. If you want an example, you'll have to provide more information including sample data and expected results per the Forum Etiquette[/url]

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Declare @test-2 varchar(100)

    select @test-2 = 'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    select LEFT(@test,patindex('%ABCDEFG%',substring(@test,patindex('%ABCDEFG%',@test)+1,LEN(@test))))

  • If you don't know what string will appear twice, or if one will at all, try something like this:

    DECLARE @String VARCHAR(100) = 'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING' ;

    ;

    WITH Seeds(Seed)

    AS (SELECT R

    FROM ( VALUES ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1) ) AS V (R)),

    Numbers(Number)

    AS (SELECT ROW_NUMBER() OVER (ORDER BY S1.Seed)

    FROM Seeds AS S1

    CROSS JOIN Seeds AS S2),

    Parser(Parsed, Seq)

    AS (SELECT SUBSTRING(@String + ' ', Number, CHARINDEX(' ', @String + ' ', Number) - Number),

    Number

    FROM Numbers

    WHERE Number <= LEN(@String)

    AND SUBSTRING(' ' + @String, Number, 1) = ' ')

    SELECT (

    SELECT Parsed + ' '

    FROM Parser

    WHERE Seq < ISNULL((SELECT MIN(P2.Seq)

    FROM Parser AS P1

    INNER JOIN Parser AS P2

    ON P1.Parsed = P2.Parsed

    AND P1.Seq < P2.Seq), (SELECT MAX(Seq)

    FROM Parser))

    ORDER BY Parser.Seq

    FOR XML PATH(''),

    TYPE).value('.[1]', 'varchar(100)') ;

    Note that instead of a string variable, you can use a column and set this up to be a Cross/Outer Apply to a table.

    The first two CTEs just define a Numbers table in RAM instead of on disk. It's just a table of integer numbers between 1 and 100 (in this case). By adding more Cross Applies, you can achieve an arbitrary length that will be adequate to your maximum string size.

    There are some slightly faster string parsing methods, but they only matter if the strings are quite long. The best, in that case, is a CLR function. You can Bing/Google "SQL string parser" and find articles on that subject. This one will work if your string values are reasonably short.

    The final piece first selects the first substring that repeats itself, whatever that substring is. If there isn't one, it selects the final substring (MAX(Seq) + 1) and thus includes all pieces, using a simple IsNull construct. You can use Coalesce instead of IsNull if ANSI SQL is important to you. Then it takes those pieces, and uses a FOR XML trick to concatenate them back together. This part is usually very efficient. I specifically include converting the XML value back to varchar in order to make sure it correctly handles things like ampersands in the original string.

    Does that help?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Just to be sure, I tested my version with your original example string, and with:

    DECLARE @String VARCHAR(100) = 'ABCDEFG (9105) CDD NOT REPEATING & (1690) ENGINEERING' ;

    Nothing in that string repeats, so it returns the whole string.

    Also a correction: I tried turning this into a simple Cross Apply without encapsulating it into a UDF first, and no joy. Can't reference itself recursively, of course.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I tried to use

    Declare @test-2 varchar(100)

    select @test-2 = 'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    select LEFT(@test,patindex('%ABCDEFG%',substring(@test,patindex('%ABCDEFG%',@test)+1,LEN(@test))))

    If I have only the

    'ABCDEFG (9105) CDD' then it returns blank .

    'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    returned ''ABCDEFG (9105) CDD' which is correct

    'ABCDEFG (9105) CDD' retunned blank

    Thanks,

    PSB

  • You'll have to use a CASE statement that first tests if the PatIndex value is greater than zero, and if so use that, and if not, just return the string as-is.

    Again, that will only work if ABCDEFG is a constant in the data and is the only repeating pattern you're looking for.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • This may also work for you.

    DECLARE @word VARCHAR(MAX)

    SET @word = 'ABCDEFG'

    DECLARE @t TABLE (words VARCHAR(max))

    INSERT INTO @t

    SELECT 'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    UNION ALL SELECT 'ABCDEQG (9105) CDD ABCDEFG (1690) ENGINEERING'

    SELECT CASE CHARINDEX(@word, SUBSTRING(words, 1+CHARINDEX(@word, words, 1), LEN(words)), 1)

    WHEN 0 THEN words

    ELSE SUBSTRING(words, 1, CHARINDEX(@word, SUBSTRING(words, 1+CHARINDEX(@word, words, 1), LEN(words)), 1)-1)

    END As words

    FROM @t

    Edit: Replaced my silly, earlier solution, even though it also worked.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • This solution is a little more general in that it can remove any first word if it repeats within the string, instead of the one word (@word) set up in my prior suggestion.

    DECLARE @t TABLE (words VARCHAR(max))

    INSERT INTO @t

    SELECT 'ABCDEFG (9105) CDD ABCDEFG (1690) ENGINEERING'

    UNION ALL SELECT 'ABCDEQG (9105) CDD ABCDEFG (1690) ENGINEERING'

    ;WITH cte AS (

    SELECT SUBSTRING(words, 1, CHARINDEX(' ', words)) As word, words

    FROM @t)

    SELECT CASE CHARINDEX(word, SUBSTRING(words, 1+CHARINDEX(word, words, 1), LEN(words)), 1)

    WHEN 0 THEN words

    ELSE SUBSTRING(words, 1, CHARINDEX(word, SUBSTRING(words, 1+CHARINDEX(word, words, 1), LEN(words)), 1)-1)

    END As words

    FROM cte


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply