Help with filling missing sequence numbering

  • I need to import data that is in a "report format" into MSSQL 2008 but I'm having problems with the missing SEQ# data.

    How do I set a variable that holds the current value of the previous line's SEQ# number, then use a

    CASE WHEN SEQ# = '' THEN (PREVIOUS SEQ#)

    Here is what I have:

    SEQ # ID# SERIAL

    -------- ---------- ------

    1 1003800533 A113541

    1003814526 F223554

    V33D004

    2 1023091055 C441097

    1801878566 H777111

    5 1073552782 A885680

    1205889870 E997347

    6 1740263771 A105871

    1871523209 C122536

    G133099

    7 1083722409 A142474

    1104816131 H159973

    1871549360 H188369

    8 1043358096 B197473

    1104848761 A207854

    1144218355 A214233

    Here is what I'm trying to get:

    SEQ # ID# SERIAL

    -------- ---------- ------

    1 1003800533 A113541

    1 1003814526 F223554

    1 V33D004

    2 1023091055 C441097

    2 1801878566 H777111

    5 1073552782 A885680

    5 1205889870 E997347

    6 1740263771 A105871

    6 1871523209 C122536

    6 G133099

    7 1083722409 A142474

    7 1104816131 H159973

    7 1871549360 H188369

    8 1043358096 B197473

    8 1104848761 A207854

    8 1144218355 A214233

    I was thinking a RANKING() function might help, but I cant' get my mind around it.

    Any help would be appreciated.

    -Allen

  • This article might help.

    Solving the Running Total and Ordinal Rank Problems[/url]

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Since you need to carry a value from one row to potentially multiple rows, you need to use the method that Stepan linked to.

    Here's the code - comments on how it works are in the code:

    -- See how you start off by actually creating a table

    -- and then inserting the data into it? Doing this really

    -- makes things a lot easier for all the people you want to

    -- help you. So, HELP US HELP YOU by doing this for us! See

    -- http://www.sqlservercentral.com/articles/Best+Practices/61537/

    -- for more details on how to do all of this.

    DECLARE @test-2 TABLE (RowID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    seq INT,

    ID INT,

    serial CHAR(7));

    INSERT INTO @test-2

    SELECT 1,1003800533,'A113541' UNION ALL

    SELECT NULL,1003814526,'F223554' UNION ALL

    SELECT NULL,NULL,'V33D004' UNION ALL

    SELECT 2,1023091055,'C441097' UNION ALL

    SELECT NULL,1801878566,'H777111' UNION ALL

    SELECT 5,1073552782,'A885680' UNION ALL

    SELECT NULL,1205889870,'E997347' UNION ALL

    SELECT 6,1740263771,'A105871' UNION ALL

    SELECT NULL,1871523209,'C122536' UNION ALL

    SELECT NULL,NULL,'G133099' UNION ALL

    SELECT 7,1083722409,'A142474' UNION ALL

    SELECT NULL,1104816131,'H159973' UNION ALL

    SELECT NULL,1871549360,'H188369' UNION ALL

    SELECT 8,1043358096,'B197473' UNION ALL

    SELECT NULL,1104848761,'A207854' UNION ALL

    SELECT NULL,1144218355,'A214233';

    -- declare and initialize variables needed for the update statement.

    DECLARE @Sequence INT, -- for safety check

    @RowID INT, -- for anchor column

    @Seq int ; -- to carry from one row to next

    SET @Sequence = 0;

    /*

    This form of the UPDATE statement has some rules for proper usage.

    See Jeff Moden's article at http://www.sqlservercentral.com/articles/T-SQL/68467/

    for a complete discussion of how this works, and all of the rules for utilizing it.

    >>>>>> If you don't follow ALL the rules, you WILL mess up your data!!! <<<<<<

    Also, see http://www.sqlservercentral.com/Forums/FindPost980118.aspx

    and http://www.sqlservercentral.com/Forums/FindPost981258.aspx for the logic

    behind the safety check process.

    */

    WITH SafeTable AS

    (

    -- generate table with a sequence column in clustered index order

    -- in order to verify that update is happening in the correct order

    SELECT *,

    Sequence = ROW_NUMBER() OVER (ORDER BY RowID)

    FROM @test-2

    )

    UPDATE t

    -- Verify in proper sequence order; if not, throw an error so nothing is updated

    -- Safety check on variable being carried over to the next row is to prevent

    -- rows from being updated in the wrong order.

    SET @Seq = Seq = CASE WHEN Sequence = @Sequence + 1 THEN

    CASE WHEN seq IS NULL THEN @seq

    ELSE seq

    END

    ELSE 1/0 END, -- not in proper sequence order; throw an error

    @Sequence = @Sequence + 1, -- update safety check column

    @RowID = RowId -- anchor column (first column in clustered index)

    FROM SafeTable t WITH (TABLOCKX) -- lock table to prevent changes by others

    OPTION (MAXDOP 1); -- prevent parallelism!

    SELECT seq, ID, serial FROM @test-2;

    results:

    seq ID serial

    ----------- ----------- -------

    1 1003800533 A113541

    1 1003814526 F223554

    1 NULL V33D004

    2 1023091055 C441097

    2 1801878566 H777111

    5 1073552782 A885680

    5 1205889870 E997347

    6 1740263771 A105871

    6 1871523209 C122536

    6 NULL G133099

    7 1083722409 A142474

    7 1104816131 H159973

    7 1871549360 H188369

    8 1043358096 B197473

    8 1104848761 A207854

    8 1144218355 A214233

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • This is awesome!

    I got this to work with my production data!!

    Thank you very much.. I was tearing my hair out trying to get this done (and I have very little hair left to tear out 😀

    The final SQL (slightly edited to remove sensitive information):

    DECLARE @test-2 TABLE

    ( RowID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED

    ,C1 INT, C2 INT, C3 CHAR(9), C4 CHAR(29)

    ,C5 CHAR(7), C6 CHAR(7), C7 CHAR(7));

    INSERT INTO @test-2

    select

    convert(int,COALESCE(LEFT(column1,10),NULL)) as C1

    ,convert(int,COALESCE(RIGHT(LEFT(column1,23),13),NULL)) as C2

    ,RIGHT(LEFT(column1,32),9) as C3

    ,RIGHT(LEFT(column1,61),29) as C4

    ,RIGHT(LEFT(column1,68),7) as C5

    ,RIGHT(LEFT(column1,75),7) as C6

    ,RIGHT(LEFT(column1,77),7) as C7

    from MyRawDataTable

    where

    Column1 not like '' and

    Column1 not like '' and

    Column1 not like '-------- ---------- ------ ------------------------- ------ --- ---------'

    DECLARE @Sequence INT

    ,@RowID INT

    ,@Seq int;

    SET @Sequence = 0;

    WITH SafeTable AS

    (

    SELECT *, Sequence = ROW_NUMBER() OVER (ORDER BY RowID)

    FROM @test-2

    )

    UPDATE t

    SET @Seq = C1 = CASE WHEN Sequence = @Sequence + 1 THEN

    CASE WHEN C1 = 0 THEN @seq

    ELSE C1

    END

    ELSE 1/0 END

    ,@Sequence = @Sequence + 1

    ,@RowID = RowId

    FROM SafeTable t WITH (TABLOCKX)

    OPTION (MAXDOP 1);

    SELECT C1, C2, C3, C4, C5, C6, C7 FROM @test-2;

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply