Delimiter of string data

  • Guys,

    I have scenario where in a column - col004 I have following data

    col001col004

    ______________________________________________________

    1D,HAINESVILLE CEMETERY,,I,SANDYSTON TOWNSHIP,,

    Each time after every 3rd delimiter it is populated with 'D' or 'I', the subsequent data (for next 2 delimiters)

    should be inserted in another table as seperate row.

    col001col004

    ______________________________________________________

    1D,HAINESVILLE CEMETERY,,

    1I,SANDYSTON TOWNSHIP,,

    For the same scenario I have written a stored procedure which uses cursor and function - which does the logic for the inserts but I am stuck at the loop. To pinpoint I am stuck at while (dbo.gettoken(@col004, ',', ??) = 'D' OR dbo.gettoken(@col004, ',', ??) = 'I'). I do not know what value should be 'token' which should be 1, 4, 7, 10. Each time when we encouter 'D' or 'I', I insert into different table as seperate row until we find next 'D' or 'I'.

    The function when I pass string to be parsed, delimiter and token number it gives me the string next to the token number.

    For example dbo.gettoken ('D,HAINESVILLE CEMETERY,,', ',', 1) gives D

    dbo.gettoken ('D,HAINESVILLE CEMETERY,,', ',', 2) gives HAINESVILLE CEMETERY

    CREATE PROCEDURE P_party aS

    begin

    declare C_partytwo cursor for

    select id, col004 from maps

    order by id

    declare

    @counter int,

    @col004 varchar(400),

    @col004_prev varchar(400),

    @id int,

    @id_prev int,

    @stmt Nvarchar(400),

    @fetchcount int

    set @counter=0

    set @col004=''

    set @id=0

    set @col004_prev=0

    set @id_prev=0

    set @fetchcount = 0

    open C_partytwo

    fetch next from C_partytwo into @id, @col004

    while @@fetch_status<>-1

    BEGIN

    while while (dbo.gettoken(@col004, ',', ??) = 'D' OR dbo.gettoken(@col004, ',', ??) = 'I')

    -- logic for inserting into different table

    INSERT INTO PARTY VALUES (@ID, DBO.GETTOKEN(@COL004, ',', ??))

    set @counter=1

    set @fetchcount=@fetchcount+1

    fetch next from C_partytwo into @id, @col004

    -- print 'fetchcount: ' + cast(@fetchcount as varchar)

    -- PRINT 'SEQ_KEY: ' + cast(@seq_key as varchar)

    end

    close c_partytwo

    deallocate c_partytwo

    end

    Any suggestions and inputs would help.

    Thanks

  • I'm sure others would agree with me but don't use a cursor.

    I would load the rows into a load table and then insert all of the 'D' values from the load into one table and then all the 'I' values into the other table.  If you split the col004 by comma then it becomes a matter of selecting the right column.

     

    Jez

  • Jez,

    I understand it can be done that way only if data if formatted consistently the same way

    For example

    id col004

    1 D,HAINESVILLE CEMETERY,,I,SANDYSTON TOWNSHIP,,

    2 D,RIDGEFIELD COMMONS,,D,HIGHVIEW KNOLL MULBERRY LANE,,D,HARDYSTON

    TOWNSHIP OF,,

    3 D,ASHDOWN FOREST ESTATES,,D,SPARTA TOWNSHIP OF,,I,SPARTA TOWNSHIP OF,,

    Is there any any to split the above data such that type 'D' does in different table and type 'I' goes in different table at the same retain the association with ID column

    id dcolumn icolumn

    1 D,HAINESVILLE CEMETERY,, I,SANDYSTON TOWNSHIP,,

    2 D,RIDGEFIELD COMMONS,,

    2 D,HIGHVIEW KNOLL MULBERRY LANE,,

    2 D,HARDYSTON TOWNSHIP OF,,

    Any suggestions and inputs would help

    Thanks

  • Why are you storing such data in your database?  Can't you get the source info into a little better shape, first?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I didn't realise that you might have multiple entries per row.

    There might be a set based solution but I cannot think of it at present - something may come to me later.

    As you have an unknown number of repeats in each line, I think that you should process the string (@col004) as a loop until each entry has been processed:

    while fetch_status <> -1 begin

    while @col004 <> '' begin

    if dbo.gettoken(@col004, ',', 1)  = 'D' begin

    -- process for D

    end

    else begin

    -- process for I

    end

    -- Now move the string along

    set @col004 = substring(@col004,charindex(dbo.gettoken(@col004,',',2), len(dbo.gettoken(@col004,',',2) + 2), len(@col004)

    end

    end

     

    No guarantees that I have got the coding perfect but the effect is to shorten @col004 each time you process a bit of it through the loop unitl you end up with nothing:

    D,RIDGEFIELD COMMONS,,D,HIGHVIEW KNOLL MULBERRY LANE,,D,HARDYSTON TOWNSHIP OF,,

    D,HIGHVIEW KNOLL MULBERRY LANE,,D,HARDYSTON TOWNSHIP OF,,

    D,HARDYSTON TOWNSHIP OF,,

     

    I hope I have explained this sufficiently for you to get the idea.  Ask if you don't understand anything (I'm sure you would anyway).

     

    Jez

  • Jeff,

    As I much as I would like to change the source data, since it is from vendor it has to be converted into our schema from the same format.

    Any suggestions in solving this scenario would help

    Thanks

  • Nasty solution but....

    SELECT SUBSTRING(col004,n.Number,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,n.Number)+1)+1) - n.Number + 1)

    FROM

    a

    INNER JOIN Numbers n 

    ON n.Number > 0

    AND n.Number < LEN(col004)

    AND SUBSTRING(a.col004,n.Number,1) <> ','

    AND ((LEN(LEFT(a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,n.Number)+1)+1)))

     - LEN(REPLACE(LEFT(a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,n.Number)+1)+1)),',',''))) % 3) = 0

    AND (LEN(LEFT(a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,n.Number)+1)+1)))

     - LEN(REPLACE(LEFT(a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,CHARINDEX(',',a.col004,n.Number)+1)+1)),',',''))) > 0

    * requires the now infamous Numbers table

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Jez,

    Thank you for your reply, i seem to have a problem at

    set @col004 = substring(@col004,charindex(dbo.gettoken(@col004,',',2), len(dbo.gettoken(@col004,',',2) + 2), len(@col004)

    I think the problem is with charindex

    CHARINDEX ( expression1 , expression2 [ , start_location ] )

    when I do same select

    select substring(col004,charindex(dbo.gettoken(col004,',',2), len(dbo.gettoken(col004,',',2) + 2), len(col004)

    from maps

    I get the error incorrect syntax near 'FROM'

    Any inputs would help

    Thanks

  • David,

    What is Numbers table, how is it created??

    Thanks

  • A permanant table containing numbers, I create my like this

    CREATE TABLE dbo.Numbers

    ([Number] int NOT NULL,

    CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number))

    GO

    SET NOCOUNT ON

    DECLARE @n int

    SET @n = 1

    WHILE @n <= 8000

    BEGIN

    INSERT INTO dbo.Numbers VALUES (@n)

    SET @n = @n + 1

    END

    SET NOCOUNT OFF

    Far away is close at hand in the images of elsewhere.
    Anon.

  • If you want to continue with the original idea (I'd missed out some end of brackets):

    set @col004 = substring(@col004, charindex(dbo.gettoken(@col004,',',2), len(dbo.gettoken(@col004,',',2) + 2)), len(@col004))

    The other solution would almost certainly be faster than a cursor based solution but the choice is yours (speed vs maintainability).

     

    Jez

  • OK... Thanks for the feedback and understood...

    I see a couple of folks have already given you a solution but I am curious about the source of data (always looking for a better way )... do you receive this data from the vendor in a text file by any chance?  Depending on how it's actually formatted, there may be a way to do this without any more processing than the import that you would currently be using.  I'm particullarly interested in anything that appears to be a line terminator or special character in the middle of long lines like you posted such as this...

    2 D,RIDGEFIELD COMMONS,,D,HIGHVIEW KNOLL MULBERRY LANE,,D,HARDYSTON TOWNSHIP OF,,

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • quote...I am curious about the source of data...

     killed the cat you know

    quoteDepending on how it's actually formatted, there may be a way to do this...

    Yep lots 

    I always preprocess this sort of data outside of SQL before import

    quote...special character...

    That always helps. I was going to ask if each group of data always has two commas at the end (but I doubt it ) as this would make the split even easier

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Thank Jez and David for the solution both of them work

  • So... no feedback on my questions, am?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply