newbie - need help tweaking this script to make data in 1 db column parsed and extracted to 3 columns

  • Hi, We have a table tblImport

    that we have on our production server to insert rows into or flat files then parse them out .

    Why, cuz we dont have permission on our own groups prod databases to create a table.

    that bs cuz the sql gods vs the sql developers are winning at my corp. SO insted of me importing an excel files to join with other tables into tables on our prod database, we have to do timely work arounds.

    like this.

    Is that level of denied permission normal at all big corps cuz i could of had this done 2 days ago?

    anway the table they graciously gave us on prod to insert into or import a file into them parse it out for insert/select type queries.

    CREATE TABLE [dbo].[tblImport](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [Data] [varchar](1000) NULL

    ) ON [PRIMARY]

    5 sample rows to insert into this table only to be pased out into 3 columns for an insert/select query.

    *Hic numbers have been falsafied for information regs below but reflect the changing lenght of the HICNumber column.

    INSERT [dbo].[tblImport] ([Data]) VALUES ('001015301| 201482354A| 304.11')

    INSERT [dbo].[tblImport] ([Data]) VALUES ('001015301| 509082354A| 416.8')

    INSERT [dbo].[tblImport] ([Data]) VALUES ('001185201| 463000528A| 996.78')

    INSERT [dbo].[tblImport] ([Data]) VALUES ('001188601| 457102613A| 733.13')

    INSERT [dbo].[tblImport] ([Data]) VALUES ('001210801| 462224814D6| 733.13')

    --Now once in tblImport I need to get the 3 piped separated columns extracted as 3 columns into a

    --temp for part of a join.

    Create table #tblParsedData (MemberID varchar(50),HICNumber varchar(50), ICD9 varchar(6))

    --Now below is a script I grabbed as a spring board tha works but not excatly what i need:

    --///////////////////WORKS but stacks the parsed string ////////////////

    declare @string varchar(500)

    set @string = 'ABC,DEF,GHIJK,LMNOPQRS,T,UV,WXY,Z'

    declare @pos int

    declare @piece varchar(500)

    ------– Need to tack a delimiter onto the end of the input string if one doesn’t exist

    if right(rtrim(@string),1) <> ','

    set @string = @string + ','

    set @pos = patindex('%,%' , @string)

    while @pos <> 0

    begin

    set @piece = left(@string, @pos - 1)

    ---– You have a piece of data, so insert it, print it, do whatever you want to with it.

    print cast(@piece as varchar(500))

    set @string = stuff(@string, 1, @pos, '|')

    set @pos = patindex('%,%' , @string)

    end

    --///////////////////////////////////////////////////////////////////////////////////////////

    -- Here is how I have been trying to tweak it to make it work for many rows not just a string

    --and realized is was in uncharted tsql territory and needed help. Also it staks the string as you will see ---when run it and i need it to spread horizontally into 3 columns . You can see my logic but

    -- I know there is a much easier way to do this..so below obviously does not work right

    declare @string varchar(500)

    set @string = 'ABC|DEF|GHIJK|LMNOPQRS|T|UV|WXY|Z'--select data from tblImport

    declare @pos int

    declare @piece varchar(500)

    Declare @delimiter varchar(3)

    Set @delimiter = '|'

    Declare @counter int = 1

    --– Need to tack a delimiter onto the end of the input string if one doesn’t exist

    if right(rtrim(@string),1) <> ltrim(rtrim(@delimiter)) --

    set @string = @string + ltrim(rtrim(@delimiter))

    set @pos = patindex('%' + @delimiter + '%' , @string)

    while @pos <> 0

    begin

    set @piece = left(@string, @pos - 1)

    If @counter = 1

    insert #tblParsedData (MemberID)--,HICNumber , ICD9)

    select @piece

    else if @counter = 2

    insert #tblParsedData (HICNumber)--,HICNumber , ICD9)

    select @piece

    else if @counter = 3

    insert #tblParsedData (ICD9)--,HICNumber , ICD9)

    select @piece

    ---– You have a piece of data, so insert it, print it, do whatever you want to with it.

    print cast(@piece as varchar(500))

    --@piece from

    set @string = stuff(@string, 1, @pos, ' A ')

    set @pos = patindex('%' + @delimiter + '%' , @string)

    end

    --------------------------------------------------

    SELECT top(3)* FROM tblImport]

    SELECT top(3)* FROM #tblParsedData

    THANKS in advance!!!!!!!!!!

  • our devs don't import objects into sqlserver.

    They write their own little apps to do the trick or use SSIS (very few) to parse, check and load the data.

    Especially because tsql isn't so fast for parsing string data.

    Your sqladmins just want to avoid rubbish in their db.

    Be sure to only load valid data for your system.

    Check, check and double check.

    Have a look at some of the split functions you can find at SSC.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Take this on one concept at a time here...

    adam spencer (1/29/2011)


    Hi, We have a table tblImport that we have on our production server to insert rows into or flat files then parse them out . Why, cuz we dont have permission on our own groups prod databases to create a table.

    This would depend on how it was described to me as a DBA about creating a table, and its purpose. However, I'm assuming that English is not your first language? You use a lot of slang text for being this poor grammatically, thus the reason I'm asking.

    that bs cuz the sql gods vs the sql developers are winning at my corp. SO insted of me importing an excel files to join with other tables into tables on our prod database, we have to do timely work arounds. like this.

    Joining an excel file to any table in my database would put up huge bells and whistles unless done via SSIS, the excel data in a datastream, and then lookup processes done afterwards. Openrowset can be an ugly thing.

    Is that level of denied permission normal at all big corps cuz i could of had this done 2 days ago?

    Again, I haven't heard your presentation nor know your data layer, but yes and no. It would depend on what you're trying to do with it. Usually this data would be worked via SSIS, dropped to a staging table that's explictly built for this import, and then that data would be used via T-SQL queries to whatever other processing it needed.

    Now, to the code!

    CREATE TABLE [dbo].[tblImport](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [Data] [varchar](1000) NULL

    ) ON [PRIMARY]

    Okay, first question... if this is a generic import table and you have two imports occuring simultaneously how do you find your data? Is this the full definition, or are there secondary identifier columns?

    --Now once in tblImport I need to get the 3 piped separated columns extracted as 3 columns into a

    --temp for part of a join.

    Create table #tblParsedData (MemberID varchar(50),HICNumber varchar(50), ICD9 varchar(6))

    This makes sense.

    If I might offer a similar conversation to your requirements, this one involved needing to work around quotes as well, but it's a similar vein:

    http://www.sqlservercentral.com/Forums/Topic1052054-391-1.aspx

    Replace comma delimited with pipes and you should be good to go once you translate that out. There's two successful versions in there. Lynn's is faster, but involves a Tally Table and some finer logic. Mine's a simple double loop, but will get you where you need to go. If you can understand what Lynn's does, I recommend you use that version, as it's faster.

    If you need more information on a Tally Table, check the link in my signature.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I am an Iraq combat vet. I speak english as a first language. I was just brain fired and frustrated in my post. I am new to cte, but I want to use the most advanced and profiencent tsql to get things done.

    Lynn, If u read this, I was the one 1.5 years back stuck working in an insurance broker company forced to use cusrsor's for everything. You were trying tohelp me but they gang up on me and had me doing C#, vba, SSIS, SSRS and email support all at once and I could not finish turning the cursor into your code.

  • Is that level of denied permission normal at all big corps cuz i could of had this done 2 days ago?

    What is your position and what do you do? I doubt most places would allow just anyone to arbitrarily create tables and import data on a PRODUCTION server.

    "Is that level of denied permission normal "? denied permission?

    "at all big corps cuz "? cuz?

    "i could of had "? of?

    Kinda makes me not want to help you out. But at least you used the code tags...

    Also, you are confusing the insertion of rows with columns. At least that's what it looks like to me.

    DECLARE @s-2 varchar(max)

    DECLARE @firstDelimiter int

    DECLARE @secondDelimiter int

    SELECT @s-2 = [Data] FROM tblImport where id = 1

    SELECT @firstDelimiter = charindex( '|', @s-2 ), @secondDelimiter = charindex( '|', @s-2, @firstDelimiter + 1)

    insert #tblParsedData (MemberID,HICNumber , ICD9)

    SELECT RTRIM( LTRIM( LEFT( @s-2, @firstDelimiter - 1 ) ) )

    , RTRIM( LTRIM( SUBSTRING( @s-2, @firstDelimiter + 1, @secondDelimiter - @firstDelimiter - 1) ) )

    , RTRIM( LTRIM( RIGHT( @s-2, LEN( @s-2 ) - @secondDelimiter ) ) )

    SELECT * FROM #tblParsedData

    Above is an example of breaking apart one [Data] column into the 3 columns. Now you need to create a loop to apply the above code to each row in tblImport. You could probably iterate over the id column.

    A better approach this problem is to do the parsing on the client/app side. Have a C# app or script split on '|' and trim the resulting strings. Then insert into tblImport, which would hold the "MemberID varchar(50),HICNumber varchar(50), ICD9 varchar(6)" columns instead of the [Data] column.

  • Sorry about my my poor grammar. I was frustrated and brain fried when I wrote that. I wish I would have seen the previous post. But I will copy and test and use that becuase I know having to make 3 temps and join at the end was not clean but it works.

    So here is my working code:

    Use [DataRapRaf]

    drop table #tempMemberID

    select case LTRIM(RTRIM(PATINDEX('%|%', Data)))when 10 then left(data,9) when 9 then left(data,8)end as MemberID, [Counter]

    into #tempMemberID

    from tblImport

    Use [DataRapRaf]

    drop table #tempHICNumber

    drop table #tempHICNumberWithPipe

    select case LTRIM(RTRIM(PATINDEX('%|%', Data)))when 10 then substring(data,12,12)when 9 then substring(data,11,12) end as HICNumber, [Counter]

    into #tempHICNumberWithPipe

    from tblImport

    select REPLACE(HICNumber,'|','')AS HICNumber, [Counter]

    into #tempHICNumber

    from #tempHICNumberWithPipe

    Use [DataRapRaf]

    drop table #tempICD9

    select Right(RTRIM(Data),6) as ICD9, [Counter]

    into #tempICD9

    from tblImport

    drop table #tblImportParsed

    select a.MemberID,b.HicNumber,Replace(c.ICD9,'|','') as ICD9

    into #tblImportParsed

    from #tempMemberID a join #tempHICNumber b

    on a.[counter]= b.[counter]

    join #tempICD9 c

    on b.[counter] = c.[counter]

    Select * From #tblImportParsed

    --data integrity test below. should come up with 001312701 on the left and 304.11 to the right of the HIC, which it does.

    select top(1) * from tblimport

    where data like '%109328621A%'

  • One of the better ways to do this would be to do a BULK INSERT using "|" as the delimiter.

    That, notwithstanding, I agree that a "Tally Table" based splitter function would work well but, since they won't let you do much, we certainly can't create a handy function to do that. The next best thing would be to do it all "inline" in a fashion like the link nadabadan provided.

    Using the test data you provided, the following looks very complex but it's also very, very fast because it doesn't do any string concatenation of delimiters like most splitters do. As a side bar, that extra speed doesn't really show up until you have more than 1000 characters or so in the data you wish to split. Here's the code.

    WITH

    --========= This section creates a "Tally" cte on the fly

    E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1

    ), --10

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000

    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E4),

    --========= This section determines the start and end of each "field"

    -- and numbers each "field".

    cteFindPositions AS

    (

    SELECT ItemNumber = ROW_NUMBER() OVER (PARTITION BY src.ID ORDER BY t.N),

    PosStart = CASE WHEN t.N > 1 THEN t.N+1 ELSE t.N END,

    PosEnd = ISNULL(NULLIF(CHARINDEX('|',src.Data,t.N+1),0),LEN(src.Data)+1),

    ID = src.ID,

    Data = src.Data

    FROM cteTally t

    CROSS JOIN dbo.tblImport src

    WHERE t.N BETWEEN 1 AND LEN(src.Data)

    AND (SUBSTRING(src.Data,t.N,1) = '|' OR t.N = 1)

    ),

    --========= This section does the actual split which causes the data to

    -- "stack up" as previously mentioned. We'll fix that in a minute.

    cteEAV AS

    (

    SELECT ID,

    ItemNumber,

    ItemValue = LTRIM(SUBSTRING(Data,PosStart,PosEnd-PosStart))

    FROM cteFindPositions

    )

    --========= This puts it all back together using a CROSS-TAB which is like a PIVOT, but faster.

    SELECT ID,

    Column1 = MAX(CASE WHEN ItemNumber = 1 THEN CAST(ItemValue AS VARCHAR(10)) END),

    Column2 = MAX(CASE WHEN ItemNumber = 2 THEN CAST(ItemValue AS VARCHAR(10)) END),

    Column3 = MAX(CASE WHEN ItemNumber = 3 THEN CAST(ItemValue AS DECIMAL(9,2)) END)

    FROM cteEAV

    GROUP BY ID

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, I copied and saved your code and saved in a .sql to pull up for next time which will be soon I am sure.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply