Add a set number to each group of records in a table

  • My data is in my table like this:

    F1 F2 F3 F4 F5 <-- column names

    LX a b <-- start of set 1

    INS Y

    MEM 1000

    LOC 31

    LX a b <-- start of set 2

    INS Y

    MEM 2000

    LOC 31

    LX a b <-- start of set 3

    INS Y

    MEM 3000

    LOC 31

    I want to populate field F5 for each record in each set with the value in F2 on the record that has "MEM" in F1. So, at the end, my data will look like this:

    F1 F2 F3 F4 F5 <-- column names

    LX a b 1000 <-- start of set 1

    INS Y 1000

    MEM 1000 1000

    LOC 31 1000

    LX a b 2000 <-- start of set 2

    INS Y 2000

    MEM 2000 2000

    LOC 31 2000

    LX a b 3000 <-- start of set 3

    INS Y 3000

    MEM 3000 3000

    LOC 31 3000

    I suspect I need some kind of "over" and "partition" statement, but I don't really grasp those terms yet.

    Any help is appreciated.

    *******************
    What I lack in youth, I make up for in immaturity!

  • Do you have any kind of table structure or are you trying to use SQL Server like an Excel spreadsheet? Based on the table data you provided, it looks like you have duplicate rows, which is terribly problematic at best. I'm hoping it's just a matter of providing significantly too little information about your table structure.

    Let us know and we would be glad to help you.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • It's difficult to understand what your data is. Can you provide the data in the form of create table statements, and populate that table with your sample data in the form of insert statements? See the first link in my signature for how to accomplish this. Also, please provide what the end result should be based on the sample data provided.

    Thanks!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • I tried to simplify the problem in my example, but this is more like what the data actually looks like coming in:

    INS*Y*18*030**A*D**FT

    REF*0F*123456

    REF*1L*FL103

    REF*F6*999999999A

    NM1*IL*1*JONES*WALTER*J

    PER*IP**TE*6665550207

    N3*5400 17TH AVE N

    N4*ST PETERSBURG*FL*337105704**CY*PINEL

    DMG*D8*19351211*M**7

    LUI*LD*ENG

    HD*030**HMO*311-014*IND

    DTP*348*D8*20090101

    LX*1

    NM1*P3*2*JONES*HEMANT*R***SV*14911*25

    N4*ST PETERSBURG*FL*337107127

    COB*U**1

    INS*Y*18*030**A*D**FT

    REF*0F*657933

    REF*1L*FL033

    REF*F6*381343567A

    NM1*IL*1*ADAMS*KAREN*I

    PER*IP**TE*8135555039

    N3*3222 N FLORIDA AVE

    N4*TAMPA*FL*34603**CY*HILLS

    DMG*D8*19330501*F**7

    LUI*LD*ENG

    HD*030**HMO*456-024*IND

    DTP*348*D8*20090101

    LX*1

    NM1*P3*2*HANKS*ASHOK*K***SV*411180*25

    N4*TAMPA*FL*33802

    COB*U**1

    I'm populating a table that looks like this via BULK LOAD:

    CREATE TABLE PEC_FLATTENED

    (Segment VarChar(5) NULL,

    [01] VARCHAR(Max) NULL,

    [02] VARCHAR(Max) NULL,

    [03] VARCHAR(Max) NULL,

    [04] VARCHAR(Max) NULL,

    [05] VARCHAR(Max) NULL,

    [06] VARCHAR(Max) NULL,

    [07] VARCHAR(Max) NULL,

    [08] VARCHAR(Max) NULL,

    [09] VARCHAR(Max) NULL,

    [10] VARCHAR(Max) NULL,

    [11] VARCHAR(Max) NULL,

    [12] VARCHAR(Max) NULL,

    [ParentSegment] VARCHAR(Max) NULL,

    [RecordNumber] VARCHAR(Max) NULL,

    [Submitter] VARCHAR(Max) NULL,

    [Result] VARCHAR(Max) NULL,

    [ErrorMessage] VARCHAR(Max) NULL,

    [RunDate] VARCHAR(Max) NULL,

    [MemberID] VARCHAR(Max) NULL,

    );

    Everything from an INS* to a COB* belong to one another. In order to manipulate the data, I need to add a grouping number, so I can deal with all of the records for a group. (This is EDI stuff).

    *******************
    What I lack in youth, I make up for in immaturity!

  • This looks like a HIPAA data feed... I sure hope you sanitized the data before uploading it here.

    This should handle what you're trying to do:

    -- See how this starts off with a table and data in it?

    -- If you had provided us the data in this format,

    -- it would have made things easier for all of the

    -- volunteers on this site to help you out.

    if object_id('tempdb..#temp') IS NOT NULL DROP TABLE #temp

    CREATE TABLE #temp (ColA varchar(500))

    INSERT INTO #temp (ColA)

    SELECT 'INS*Y*18*030**A*D**FT' UNION ALL

    SELECT 'REF*0F*123456' UNION ALL

    SELECT 'REF*1L*FL103' UNION ALL

    SELECT 'REF*F6*999999999A' UNION ALL

    SELECT 'NM1*IL*1*JONES*WALTER*J' UNION ALL

    SELECT 'PER*IP**TE*6665550207' UNION ALL

    SELECT 'N3*5400 17TH AVE N' UNION ALL

    SELECT 'N4*ST PETERSBURG*FL*337105704**CY*PINEL' UNION ALL

    SELECT 'DMG*D8*19351211*M**7' UNION ALL

    SELECT 'LUI*LD*ENG' UNION ALL

    SELECT 'HD*030**HMO*311-014*IND' UNION ALL

    SELECT 'DTP*348*D8*20090101' UNION ALL

    SELECT 'LX*1' UNION ALL

    SELECT 'NM1*P3*2*JONES*HEMANT*R***SV*14911*25' UNION ALL

    SELECT 'N4*ST PETERSBURG*FL*337107127' UNION ALL

    SELECT 'COB*U**1' UNION ALL

    SELECT 'INS*Y*18*030**A*D**FT' UNION ALL

    SELECT 'REF*0F*657933' UNION ALL

    SELECT 'REF*1L*FL033' UNION ALL

    SELECT 'REF*F6*381343567A' UNION ALL

    SELECT 'NM1*IL*1*ADAMS*KAREN*I' UNION ALL

    SELECT 'PER*IP**TE*8135555039' UNION ALL

    SELECT 'N3*3222 N FLORIDA AVE' UNION ALL

    SELECT 'N4*TAMPA*FL*34603**CY*HILLS' UNION ALL

    SELECT 'DMG*D8*19330501*F**7' UNION ALL

    SELECT 'LUI*LD*ENG' UNION ALL

    SELECT 'HD*030**HMO*456-024*IND' UNION ALL

    SELECT 'DTP*348*D8*20090101' UNION ALL

    SELECT 'LX*1' UNION ALL

    SELECT 'NM1*P3*2*HANKS*ASHOK*K***SV*411180*25' UNION ALL

    SELECT 'N4*TAMPA*FL*33802' UNION ALL

    SELECT 'COB*U**1'

    -- add Group # and identity columns, put CLUSTERED PK on identity column

    ALTER TABLE #temp

    ADD GroupNbr int,

    RowID int IDENTITY CONSTRAINT [PK#temp] PRIMARY KEY CLUSTERED

    -- declare and initialize some variables.

    DECLARE @GroupNbr int, @RowID int

    set @GroupNbr = 0

    -- This form of the UPDATE statement has some particular rules in order to work properly.

    -- See Jeff Moden's article at http://www.sqlservercentral.com/articles/T-SQL/68467/

    -- for a complete discussion of how this works, and all of the rules for utilizing it.

    UPDATE #temp

    SET @RowID = RowID, -- << Anchor column

    @GroupNbr = CASE WHEN LEFT(ColA, CHARINDEX('*', ColA)-1) = 'INS' THEN @GroupNbr + 1 ELSE @GroupNbr END,

    GroupNbr = @GroupNbr

    FROM #temp WITH (TABLOCKX) -- << TABLOCKX not needed for a local temp table, but included so it's not forgotten when it is necessary

    OPTION (MAXDOP 0) -- << prevent parallelism

    -- display the results

    select * from #temp

    Edit: added additional comments

    Edit2: added the anchor column

    Edit3: corrected name misspelling

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • One way to do it:

    On your original insert, add an identity field to your bulk table. That's very important, because that's the only think that will keep the rows in order. Let's call that bulk table nonFlatData. If there are always a total of 16 lines for each record, and they're always in the same order and the same thing, it's really easy.

    SELECT q1.[dataField],

    q2.[dataField],

    q3.[dataField]

    FROM nonFlatData q1

    JOINnonFlatData q2

    ON q2.[identityField] = q1.[identityField] + 4

    JOINnonFlatData q3

    ON q3.[identityField] = q1.[identityField] + 5

    WHERE LEFT(q1.[dataField],3) = 'INS'

    From your above example, you would flatten your original line with the name and phone number. So it would just be a matter of what all you want to include on the same line and doing your character replacements and cleanup. (An additional join would be needed for each extra line that has information you want to include.)

    edit: terrible typing

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • To simplify the problem, really, all I need is a sequential "Set" number. The first set of records from INS through COB would all have "SET000001" in the last column, etc.

    In RBAR, I could keep a counter that increases by one when I hit an INS record, and populated all of the last fields with it until I hit another INS record, then up the counter...

    Is RBAR the way to go for this?

    *******************
    What I lack in youth, I make up for in immaturity!

  • Okay, I think I see... I'll try this.. thanks!

    *******************
    What I lack in youth, I make up for in immaturity!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply