T-SQL text manipulation and recordset iteration

  • rjbirkett (6/29/2012)


    Bill, I'm sorry, I was not ignoring you, I was trying to figure out if I could use that function. There are 45 separate fields once the string is split, I would need to edit that to get it to work I think. I appreciate ALL the help from all who take the trouble to reply. I am a real newbie and trying something I probably should have spent more time learning.

    Lynn,

    MsgDate is in the original syslog table along with the CDR detail record. in C# I'm selecting all the records by MsgDate and processing them all at once. Sorry for the confusion. So, I'm goint to select * from dbo.syslogd where MsgDate = today and then run your script to process the data into a newly created sandbox and then process THAT data and insert it into dbo.CDR so wish me luck:-) I'm going to need it. What else am I going to do this weekend? It's July 4th celebrations, I'm in Chicago, and i'm ENGLISH! One of the rare time I keep my mouth shut:-)

    Rob

    Could you post the DDL (CREATE TABLE statement) for this table? Right click on the table in Object Explorer, click on Script Table, then CREATE to, then clipboard. Then all you have to do is paste it in a post on this thread.

  • Lynn,

    How am I going to process each record without looping through? For/Next? Next n? you have to iterate somehow, right?

    Rob

  • Bill Talada (6/29/2012)


    I can save you a ton of trouble. Pass your string to this function and it will return a table of all the tokens in order. Then you can select directly from the tokens table into a typed table.

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = object_id(N'dbo.StringParse') AND xtype IN (N'FN', N'IF', N'TF'))

    DROP FUNCTION dbo.StringParse

    GO

    CREATE FUNCTION dbo.StringParse(

    @STR varchar(max),

    @delimiter varchar(20),

    @emptytonull bit)

    returns @tokens TABLE (rownbr int PRIMARY KEY NOT NULL, token varchar(max) NULL)

    AS

    -- written BY Bill Talada

    BEGIN

    /*

    split a string INTO tokens AND RETURN them IN a TABLE.

    empty string returns one NULL row

    a string WITH only a delimiter returns two NULL rows, one before it AND one after it.

    a delimiter at the beginning means a NULL value appears before it.

    two delimiters IN a row implies a NULL value BETWEEN them.

    a delimiter at the END OF string implies a NULL value after it.

    */

    DECLARE

    @delimiterlength int,

    @currptr int,

    @nextptr int,

    @token varchar(max),

    @rownbr int

    SET @delimiterlength = datalength(@delimiter)

    SET @rownbr = 0

    SET @currptr = 1 - @delimiterlength

    SET @nextptr = charindex(@delimiter, @STR, @currptr+@delimiterlength)

    --SELECT @currptr, @nextptr, @delimiterlength

    WHILE @nextptr > 0

    BEGIN

    SET @rownbr = @rownbr + 1

    SET @token = substring(@str, @currptr+@delimiterlength, @nextptr - (@currptr+@delimiterlength))

    IF len(@token) = 0

    BEGIN

    IF @emptytonull=1

    SET @token = NULL

    ELSE

    SET @token = ''

    END

    INSERT INTO @tokens VALUES (@rownbr, @token)

    SET @currptr=@nextptr

    SET @nextptr=charindex(@delimiter, @STR, @currptr+@delimiterlength)

    END

    -- last row

    SET @rownbr = @rownbr + 1

    SET @token = substring(@str, @currptr + @delimiterlength, datalength(@str) - @currptr + @delimiterlength)

    IF len(@token) = 0

    BEGIN

    IF @emptytonull=1

    SET @token = NULL

    ELSE

    SET @token = ''

    END

    INSERT INTO @tokens VALUES (@rownbr, @token)

    RETURN

    END

    go

    There are a few small issues with this particular split function. One, it uses a while loop. Two, it is a mutli-statement table valued function.

    I would be more than willing to put Jeff's DelimitedSplit8K function up against it in a performance test. I will do that tonight when I can do it on my PC at home.

  • I assume you are looking for this. The table is being updated by the syslog service so the names ar based on what syslog required. Not pretty but it does work.

    Rob

  • rjbirkett (6/29/2012)


    Lynn,

    How am I going to process each record without looping through? For/Next? Next n? you have to iterate somehow, right?

    Rob

    Take a look at the latest code I posted. I added a second record and changed nothing. What I was doing for one record is what I want to do to all the records. Set-based processing, not record-based processing (or Row By Agonizing Row (RBAR)).

    SQL Server is meant to process sets of data.

  • Lynn,

    Coffee not working. Forgot to paste the SQL in:-/

    USE [SYSLOG]

    GO

    /****** Object: Table [dbo].[Syslogd] Script Date: 06/29/2012 14:23:29 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[Syslogd](

    [MsgDate] [date] NULL,

    [MsgTime] [time](7) NULL,

    [MsgPriority] [text] NULL,

    [MsgHostname] [text] NULL,

    [MsgText] [text] NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    GO

  • rjbirkett (6/29/2012)


    I assume you are looking for this. The table is being updated by the syslog service so the names ar based on what syslog required. Not pretty but it does work.

    Rob

    Not seeing what you think I should see.

  • rjbirkett (6/29/2012)


    Lynn,

    Coffee not working. Forgot to paste the SQL in:-/

    USE [SYSLOG]

    GO

    /****** Object: Table [dbo].[Syslogd] Script Date: 06/29/2012 14:23:29 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[Syslogd](

    [MsgDate] [date] NULL,

    [MsgTime] [time](7) NULL,

    [MsgPriority] [text] NULL,

    [MsgHostname] [text] NULL,

    [MsgText] [text] NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    GO

    Could you run this on your table:

    select max(datalength(MsgText)) from dbo.Syslogd;

  • Lynn,

    It came back with 1024 as the max length.

    Rob

  • rjbirkett (6/29/2012)


    Lynn,

    It came back with 1024 as the max length.

    Rob

    Good, then the DelimitedSplit8K will work.

  • rjbirkett (6/29/2012)


    Lynn,

    How am I going to process each record without looping through? For/Next? Next n? you have to iterate somehow, right?

    Rob

    In case Lynn's busy, the nature of the T-SQL script he gave you will process all the rows at once in an internally optimized loop instead of one you build yourself.

    What you're fighting with is the iterative concepts from front end coding (working with a single instance at a time) instead of working with the entire collection in a single pass. Don't worry, it's a huge hurdle. There's an 'internal loop' in SQL that is optimized to its engine that will allow you to work with entire recordsets at once. That's why Lynn keeps trying to get you to provide a set of 5-10 rows from your original table so we can show you how to work with them in bulk instead of one at a time. Don't worry, you're not the first to feel like you're having a 'bang head here' moment on the transition. 😉

    SQL's optimized to work against the collection, though you can force it to work with the internal instances. It's not QUITE the correct parlance, but it'll get you closer to the idea.

    So you understand the earlier RBAR comment, it's a phrase we toss around here that stands for "Row-By-Agonizing-Row". Basically, when you're looping a single record at a time when you don't need to.

    So in the end, no, you don't iterate yourself. Generally when you're working with log-feed data (my own name, but it gets the point) there's a few general steps you do.

    1) Get the data somewhere you can use it. In this case, you're done, you've gotten it to a local table.

    2) Transform the data that you haven't processed yet to a local staging table. This is the first query that Lynn's already shown you, that occurs on all the records and transforms them into columns and datatypes that you can function with inside the engine. This is usually done in a single pass (with caveats) from all the unprocessed data that you've stored so far. You'll mark the records that you 'pushed' to the staging as completed somehow as well. I usually push them to an archive table so they're not even part of the next run's concern.

    3) Load the staging table data into all the necessary 'real' tables that are normalized through your system, such as adding new addresses and ids, new customers, etc, as well as loading the normalized log entry into the real logging table.

    4) Cleanup the staging once complete so the next run has a 'blank slate' to work from.

    What you're doing here falls under what we call ETL (Extract, Transform, Load). The rabbit hole can go deep but we can help you with your specific case, but we're going to end up needing a lot more information from you.

    Like Lynn's signature, my signature contains a similar link. If you can provide us with sample schema/data, the schema of the target databases, and help us out as Lynn and others work with you we can get you end to end on this with pretty tight performance.

    @bill:

    RBAR has its place, but it's when you have cross-row dependencies that previously occurring logic affects further logic. Running Totals is an example of where that can come into play, another is the old mainframe file organization where associated sub-rows are loaded below a primary row and you need to trap the ID for the sub-rows off the leader row in the data. This isn't one of those cases.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Lynn, you have me at a disadvantage. RBAR I understand, a set of data I sort of understand, but I'm old school SQL. I'm assuming that by "set of data" you mean put all the data in a table after the function call, then update all the fields in one shot? Am I getting this straight or still fumbling around in the dark looking for the bra clasp:-)

    Rob

  • Take a close look at the CTE part of this code. You will see that I have written it to select all records for yesterday from dbo.Syslogd.

    with SampleData as (

    select

    MsgText as SampleRec

    from

    dbo.Syslogd

    where

    MsgDate = dateadd(dd, -1, cast(getdate() as date))

    )

    select

    --sd.SampleRec,

    --max(case ds.ItemNumber

    --when 1 then ds.Item else null end),

    max(case ds.ItemNumber

    when 2 then ds.Item else null end) [call_status],

    max(case ds.ItemNumber

    when 3 then ds.Item else null end) [channel_num],

    max(case ds.ItemNumber

    when 4 then ds.Item else null end) [sipcall_ident],

    max(case ds.ItemNumber

    when 5 then ds.Item else null end) [trunk_num],

    max(case ds.ItemNumber

    when 6 then ds.Item else null end) [b_channel],

    max(case ds.ItemNumber

    when 7 then ds.Item else null end) [sipconf_id],

    max(case ds.ItemNumber

    when 8 then ds.Item else null end) [trunk_group_num],

    max(case ds.ItemNumber

    when 9 then ds.Item else null end) [endpoint_type],

    max(case ds.ItemNumber

    when 10 then ds.Item else null end) [call_originator],

    max(case ds.ItemNumber

    when 11 then ds.Item else null end) [source_IP],

    max(case ds.ItemNumber

    when 12 then ds.Item else null end) [destination_IP],

    max(case ds.ItemNumber

    when 13 then ds.Item else null end) [sourcephone_numtype],

    max(case ds.ItemNumber

    when 14 then ds.Item else null end) [sourcephone_numplan],

    max(case ds.ItemNumber

    when 15 then ds.Item else null end) [sourcephone_number],

    max(case ds.ItemNumber

    when 16 then ds.Item else null end) [sourcephone_nummap],

    max(case ds.ItemNumber

    when 17 then ds.Item else null end) [destphone_numtype],

    max(case ds.ItemNumber

    when 18 then ds.Item else null end) [destphone_numplan],

    max(case ds.ItemNumber

    when 19 then ds.Item else null end) [destphone_number],

    max(case ds.ItemNumber

    when 20 then ds.Item else null end) [destphone_nummap],

    max(case ds.ItemNumber

    when 21 then ds.Item else null end) [call_duration],

    max(case ds.ItemNumber

    when 22 then ds.Item else null end) [call_codec],

    max(case ds.ItemNumber

    when 23 then ds.Item else null end) [packet_intervall],

    max(case ds.ItemNumber

    when 24 then ds.Item else null end) [rtp_IP],

    max(case ds.ItemNumber

    when 25 then ds.Item else null end) [rtp_port],

    max(case ds.ItemNumber

    when 26 then ds.Item else null end) [call_release],

    max(case ds.ItemNumber

    when 27 then ds.Item else null end) [termination_reason],

    max(case ds.ItemNumber

    when 28 then ds.Item else null end) [fax_data],

    max(case ds.ItemNumber

    when 29 then ds.Item else null end) [packets_in],

    max(case ds.ItemNumber

    when 30 then ds.Item else null end) [packets_out],

    max(case ds.ItemNumber

    when 31 then ds.Item else null end) [packets_lost],

    max(case ds.ItemNumber

    when 32 then ds.Item else null end) [header_id],

    max(case ds.ItemNumber

    when 33 then ds.Item else null end) [rtp_id],

    max(case ds.ItemNumber

    when 34 then convert(datetime,replace(substring(right(ds.Item, 12),5,3) + left(right(ds.Item, 12),4) + right(right(ds.Item, 12),5) + ' ' + left(ds.Item, 13),'.',':'),113) else null end) [callsetup_datetime],

    max(case ds.ItemNumber

    when 35 then convert(datetime,replace(substring(right(ds.Item, 12),5,3) + left(right(ds.Item, 12),4) + right(right(ds.Item, 12),5) + ' ' + left(ds.Item, 13),'.',':'),113) else null end) [callconnect_datetime],

    max(case ds.ItemNumber

    when 36 then convert(datetime,replace(substring(right(ds.Item, 12),5,3) + left(right(ds.Item, 12),4) + right(right(ds.Item, 12),5) + ' ' + left(ds.Item, 13),'.',':'),113) else null end) [callrelease_datetime],

    max(case ds.ItemNumber

    when 37 then ds.Item else null end) [rtp_delay],

    max(case ds.ItemNumber

    when 38 then ds.Item else null end) [rtp_jitter],

    max(case ds.ItemNumber

    when 39 then ds.Item else null end) [localrtp_source],

    max(case ds.ItemNumber

    when 40 then ds.Item else null end) [remotertp_source],

    max(case ds.ItemNumber

    when 41 then ds.Item else null end) [redirect_reason],

    max(case ds.ItemNumber

    when 42 then ds.Item else null end) [redirect_number],

    max(case ds.ItemNumber

    when 43 then ds.Item else null end) [redirect_numplan]--,

    --max(case ds.ItemNumber

    --when 44 then ds.Item else null end),

    --max(case ds.ItemNumber

    --when 45 then ds.Item else null end)

    from

    SampleData sd

    cross apply dbo.DelimitedSplit8K(SampleRec,'|') ds

    group by

    sd.SampleRec;

  • Yes, it does work. I think I get the picture. Use the script to put the data into dbo.CDR directly, dump all my useless code, and have access pull reports from that database. The data "looks" OK, I'm just wondering if dbo.CDR will accept it as is or if I should create a new table and see what field typoes SQL arbitrarily designates. You should see what it does if you try to move an Access table into SQL. All kinds of crazy s*&^ happens:-)

    Anyway. I think you have done enough already. I'm sure you have better things to do than waste time talking to an idiot. I have really enjoyed this. Odd I know, but learning is always fun for me. Have a great weekend. I will let you know how I get on.

    Robert.

  • Yes, it does work. I think I get the picture. Use the script to put the data into dbo.CDR directly, dump all my useless code, and have access pull reports from that database. The data "looks" OK, I'm just wondering if dbo.CDR will accept it as is or if I should create a new table and see what field typoes SQL arbitrarily designates. You should see what it does if you try to move an Access table into SQL. All kinds of crazy s*&^ happens:-)

    Anyway. I think you have done enough already. I'm sure you have better things to do than waste time talking to an idiot. I have really enjoyed this. Odd I know, but learning is always fun for me. Have a great weekend. I will let you know how I get on.

    Robert.

Viewing 15 posts - 31 through 45 (of 94 total)

You must be logged in to reply to this topic. Login to reply