multiple strings between two different delimiters

  • I'm trying to extract multiple strings that exist between two different delimiters....

    It's almost but not quit HTML, just custom tags that are being used.

    this same functionality would be ideal for searching for a lot of different things.

    in this case, the sample data is from the public resource Gutenberg Dictionary...some words have more than one definition. definitions are between "<def>" tags and there is often more than one.

    without the tally table, for the first definition, i was able to use this SQL:

    [font="Courier New"]

    declare

    @pre varchar(10),

    @post varchar(10),

    @pr int,

    @po int,

    @st int

    set @pre = '<def>'

    set @post = '</def>'

    set @pr = len(@pre)

    set @po = len(@post)

    set @st = 1

    SELECT top 30 substring(DefinitionAsHTML,charindex(@pre,DefinitionAsHTML,@st)+ @pr,charindex(@post,DefinitionAsHTML,@st)- charindex(@pre,DefinitionAsHTML,@st) -@pr ) As Word ,

    *

    FROM #tmp

    WHERE charindex(@pre ,DefinitionAsHTML,@st) > 0

    AND charindex(@post ,DefinitionAsHTML,@st) > 0[/font]

    the only way i could think of processing records with more than one "<def>" tag,was to add a column to the table,a dn update the column with the charindex of the </def> tag, adn then and use the same function above, using the column value for the @st value...

    this is a little repetitive because some words have more than 20 definitions. i was 3 definitions in when i realized there's gotta be a better way.

    can anyone rub my nose in how to adapt this to work witht he Tally table to get all the definitions in a select?

    sample data is too wide to post and is wider than a varchar(8000), but you can copy from this link

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell,

    I just tried the below query and its giving what exactly you require....

    SELECTT1.N, T2.NumDef,

    SUBSTRING( T2.DefinitionAsHTML, ( CASE T1.N WHEN 1 THEN 1 ELSE T1.N + @pr END ),

    ( CASE CHARINDEX( @post, T2.DefinitionAsHTML, T1.N + @pr )

    WHEN 0 THEN LEN( T2.DefinitionAsHTML ) - T1.N + @pr

    ELSE CHARINDEX( @post, T2.DefinitionAsHTML, T1.N + @pr ) - ( T1.N + @pr )

    END ) ) AS String

    FROMdbo.Tally T1

    CROSS APPLY #tmp T2

    WHEREN <= LEN( T2.DefinitionAsHTML )

    AND SUBSTRING( T2.DefinitionAsHTML, T1.N, @pr ) = @pr

    --Ramesh


  • Thanks Ramesh; i'm looking it over now; i must have got a copy paste error, as it returns an erro based on something in the data;

    Conversion failed when converting the varchar value ' B' to data type int.

    I'll keep tweaking it to see if i can get some results (and make some sense out of it in my own mind)

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Ramesh that was spot on... the only thing that was missing from your sample was a single character due to my poorly named variables:

    WHEREN <= LEN( T2.DefinitionAsHTML )AND SUBSTRING( T2.DefinitionAsHTML, T1.N, @pr ) = @pre

    that's the peice i was puzzling over...how was your code finding the starting string.

    This will go a long way to helping me understand the tally concept, as it's a real example to me and not an abtract one with someone elses project.

    Thanks

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (10/29/2007)


    Ramesh that was spot on... the only thing that was missing from your sample was a single character due to my poorly named variables:

    WHEREN <= LEN( T2.DefinitionAsHTML )AND SUBSTRING( T2.DefinitionAsHTML, T1.N, @pr ) = @pre

    that's the peice i was puzzling over...how was your code finding the starting string.

    This will go a long way to helping me understand the tally concept, as it's a real example to me and not an abtract one with someone elses project.

    Thanks

    I didn't knew i missed a char over there;)....I'd copied it from SSMS:)

    I glad it helped solve the problem....:D

    --Ramesh


Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply