Subtracting a row from the next row

  • Hello Friends,

    I have more than 40,000 rows in a SQL Server table.

    Almost all of those rows contain numbers.

    For eg:

    ID X Y Z

    A1 10 12 5

    A2 13 17 7

    A3 9 19 4

    so on and so forth.

    Now, using SSIS, I want to know if it is possible to get a table populating the following table:

    For eg:

    ID X Y Z

    A1 10 12 5

    A2 3 5 2

    A3 -4 2 -3

    I am getting this data by subtracting the rows:

    A2 in the new table = A2 - A1 (in the old table)

    A3 in the new table = A3 - A2 (in the old table)

    Please do let me know is there any transformation that will help me achieve this or will I need a temporary excel sheet which will be doing the calculations and then I can just move that data to a SQL table. I would appreciate if someone can show me a way (if there is one) without using an excel sheet!

    Thanks in advance friends.........

  • Here is some test code for you to start with and it does it all in T-SQL, not SSIS:

    -- Create initial test table that will be the source data

    create table dbo.TestTab1 (

    TestID char(2),

    X int,

    Y int,

    Z int);

    -- Create secondary test table where the actual work will be completed

    create table dbo.TestTab2 (

    TestID char(2),

    X int,

    Y int,

    Z int);

    -- Insert test data into source table

    insert into dbo.TestTab1

    select 'A1', 10, 12, 5 union all

    select 'A2', 13, 17, 7 union all

    select 'A3', 9, 19, 4

    ;

    -- create the clustered on the second table

    create clustered index IX_TestTab on dbo.TestTab2 (

    TestID asc

    )

    ;

    -- display data from source test table

    select

    *

    from

    dbo.TestTab1

    ;

    -- insert data from source table to work table

    insert into dbo.TestTab2

    select

    *

    from

    dbo.TestTab1

    ;

    declare @X int,

    @Y int,

    @Z int,

    @X1 int,

    @Y1 int,

    @Z1 int

    ;

    -- Display data in work table prior to updates

    select

    *

    from

    dbo.TestTab2

    ;

    select

    @X1 = 0,

    @Y1 = 0,

    @Z1 = 0,

    @X = 0,

    @Y = 0,

    @Z = 0

    ;

    update dbo.TestTab2 set

    @X1 = X = X - @X,

    @Y1 = Y = Y - @Y,

    @Z1 = Z = Z - @Z,

    @X = @X1 + @X,

    @Y = @Y1 + @Y,

    @Z = @Z1 + @Z

    from

    dbo.TestTab2 WITH (INDEX = 1, TABLOCKX)

    ;

    -- Display data in work table after the updates

    select

    *

    from

    dbo.TestTab2

    ;

    drop table dbo.TestTab1; -- drop my test table, don't drop your production table(s)

    drop table dbo.TestTab2; -- drop my test table, don't drop your production table(s)

  • Hi,

    Thank you for your reply.

    But practically it will not be possible for me to add so many insert statements as I have more than 40,000 rows in this table.

    Is there any other way around?

  • You don't need to use the INSERT statements. Those were there to load the test table (dbo.TestTab1) with data, which was then used to load dbo.TestTab2 with the same data (per your requirements) where the actual work was then completed. You would use your own table and the code that I provided to complete the actual work.

  • Excellent. I tried it and its working fine. I tried with just two columns for now.

    I will verify the data minutely and then go to the next colums also.

    Thank you so much for your help.

    You are a real master.

    Thanks once again.

  • I figured out one more question.

    What if I have similar ID's for two rows.

    For some instances, I have same ID's.

    Two rows having one ID. I wish to group by such same id's and add two records of this same id and then subtract the result from the next one, as it is done for the other records.

    Hope I am not confusing you here.

    If I am, then let me know, I shall try to be more specific and clear!

  • notes4we (3/24/2009)


    I figured out one more question.

    What if I have similar ID's for two rows.

    For some instances, I have same ID's.

    Two rows having one ID. I wish to group by such same id's and add two records of this same id and then subtract the result from the next one, as it is done for the other records.

    Hope I am not confusing you here.

    If I am, then let me know, I shall try to be more specific and clear!

    Please provide sample data and expected results. Based on what you have articulated, I'm not sure what you are looking for. I really need to see what you want to get a full grasp of the question.

  • I have it like this;

    ID X Y Z

    A1 10 12 5

    A2 13 17 7

    A2 9 19 4

    A3 0 11 15

    So, I have two A2 (ID's same) here.

    I want to add this two A2 (similar id's) and then get the results.

    It should populate this:

    ID X Y Z

    A1 10 12 5

    A2 12 24 6

    A3 -22 -25 4

    Which means,

    A3 = A3 - (A2 + A2)

    A2 = (A2 + A2) - A1

    Summing the values of similar ID's.

  • That one is easy, you make a slight change to the INSERT to the second table like this, based on my sample code earlier:

    insert into dbo.TestTab2

    select

    TestID,

    sum(X),

    sum(Y),

    sum(Z)

    from

    dbo.TestTab1

    group by

    TestID

    ;

    Nothing else changes. You end up with a single row of data for each TestID.

  • No. This does not work. It gives me multiple records for a single ID.

    I am getting

    A1

    A1

    A1

    A2

    A2

    A2

    and one more thing is that I have some two datetime columns also in this table.

    So, the problem is with those datetime columns also.

    Trying to figure out now.

    Thanks buddy!

  • You aren't providing the whole story if you have left out vital information. Based SOLELY off the info you provided, the code I provided will work. How about all the info if you want a bette answer to your problem.

  • Lynn. Your code worked fine. I was a bit confused with few columns as the table is huge.

    Anyways, I have figured out the reason why it did not work. It is the data that I have.

    I will be checking the data with my superior tomorrow and then I will try the code again.

    So far it looks good.

    Once again. Thank you for all your help.

    Hard to believe, there are so helpful people like you too in this world!

    Thanks for all the time that you have given to solve my big issue.

    May be I might have some more questions tomorrow, when I still dig deeper into it and try to verify the results.

    Thank you.......:)

  • notes4we (3/24/2009)


    Lynn. Your code worked fine. I was a bit confused with few columns as the table is huge.

    Anyways, I have figured out the reason why it did not work. It is the data that I have.

    I will be checking the data with my superior tomorrow and then I will try the code again.

    So far it looks good.

    Once again. Thank you for all your help.

    Hard to believe, there are so helpful people like you too in this world!

    Thanks for all the time that you have given to solve my big issue.

    May be I might have some more questions tomorrow, when I still dig deeper into it and try to verify the results.

    Thank you.......:)

    Here's another tip that will not only save us some time, but will also get someone working on your problem quicker... take a look at the article at the link in my signature for how to post sample data. A lot of us will work on posts first where the OP has made it the easiest for us to set up a test and a lot of us don't like to post unless we can test the code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lynn Pettis (3/23/2009)


    Here is some test code for you to start with and it does it all in T-SQL, not SSIS:

    -- Create initial test table that will be the source data

    create table dbo.TestTab1 (

    TestID char(2),

    X int,

    Y int,

    Z int);

    -- Create secondary test table where the actual work will be completed

    create table dbo.TestTab2 (

    TestID char(2),

    X int,

    Y int,

    Z int);

    -- Insert test data into source table

    insert into dbo.TestTab1

    select 'A1', 10, 12, 5 union all

    select 'A2', 13, 17, 7 union all

    select 'A3', 9, 19, 4

    ;

    -- create the clustered on the second table

    create clustered index IX_TestTab on dbo.TestTab2 (

    TestID asc

    )

    ;

    -- display data from source test table

    select

    *

    from

    dbo.TestTab1

    ;

    -- insert data from source table to work table

    insert into dbo.TestTab2

    select

    *

    from

    dbo.TestTab1

    ;

    declare @X int,

    @Y int,

    @Z int,

    @X1 int,

    @Y1 int,

    @Z1 int

    ;

    -- Display data in work table prior to updates

    select

    *

    from

    dbo.TestTab2

    ;

    select

    @X1 = 0,

    @Y1 = 0,

    @Z1 = 0,

    @X = 0,

    @Y = 0,

    @Z = 0

    ;

    update dbo.TestTab2 set

    @X1 = X = X - @X,

    @Y1 = Y = Y - @Y,

    @Z1 = Z = Z - @Z,

    @X = @X1 + @X,

    @Y = @Y1 + @Y,

    @Z = @Z1 + @Z

    from

    dbo.TestTab2 WITH (INDEX = 1, TABLOCKX)

    ;

    -- Display data in work table after the updates

    select

    *

    from

    dbo.TestTab2

    ;

    drop table dbo.TestTab1; -- drop my test table, don't drop your production table(s)

    drop table dbo.TestTab2; -- drop my test table, don't drop your production table(s)

    Everything that is done above is ok, but there is one more column added named deptid (small int).

    Now, when I have

    DeptID X Y z

    A1 1 10 12 5

    A2 1 13 17 7

    A2 2 15 12 8

    A3 1 9 19 4

    A4 1 15 14 8

    A5 1 18 10 3

    A5 2 11 13 5

    Now, here, DeptID's are also to be taken care of.

    Which means when I look at the new table, I should get the following:

    The table should be arranged in a descending order and all the calculations are to be done in a descending order, which means, A5 (biggest) - A4 (lower than A5).

    DeptID X Y Z

    A5 1 3 = (8-15) -4 = (10-14) -5 = (3-8)

    A5 2 -4 = (11 - 15) 1 = (13 - 12) -3 = (5-8)

    A4 1 6 -5 4

    A3 1 -4 2 -3

    A2 1 3 5 2

    A2 2 15 12 8

    A1 1 10 12 5

    The numbers in paranthesis in the above table is just to show the calculations.

    A5 where DeptID = 1, A5(1) - A4(1)

    A5 where DeptID = 2, A5(2) - A2(2)

    And in the last, A2(2) and A1(1), where there are no numbers to subtract from, the numbers should populate as they are.

    So, in short, not only A1, A2, ........ should be matching, but also ID's should be considered while subtracting. Similar ID's can be subtracted.

    I would appreciate if anyone of you could help me achieve this via SSIS or SQL code.

    Thank you.

  • Changing methods a bit here, so here is some test code for you to play around with:

    --DeptID X Y z

    --A1 1 10 12 5

    --A2 1 13 17 7

    --A2 2 15 12 8

    --A3 1 9 19 4

    --A4 1 15 14 8

    --A5 1 18 10 3

    --A5 2 11 13 5

    create table #TestTab1 (

    Dept char(2),

    ID int,

    X int,

    Y int,

    Z int

    )

    ;

    insert into #TestTab1

    select 'A1',1,10,12,5 union all

    select 'A2',1,13,17,7 union all

    select 'A2',2,15,12,8 union all

    select 'A3',1,9,19,4 union all

    select 'A4',1,15,14,8 union all

    select 'A5',1,18,10,3 union all

    select 'A5',2,11,13,5

    ;

    create table #TestTab2 (

    Dept char(2),

    ID int,

    X int,

    Y int,

    Z int

    )

    ;

    with DeptID as

    (

    select

    row_number() over (partition by ID order by Dept desc) as RowNum,

    Dept,

    ID,

    X,

    Y,

    Z

    from

    #TestTab1

    )

    insert into #TestTab2

    select

    di1.Dept,

    di1.ID,

    di1.X - isnull(di2.X,0) as X,

    di1.Y - isnull(di2.Y,0) as y,

    di1.Z - isnull(di2.Z,0) as z

    from

    DeptID di1

    left outer join DeptID di2

    on (di1.ID = di2.ID

    and di1.RowNum = di2.RowNum - 1)

    ;

    select

    *

    from

    #TestTab2

    order by

    Dept desc,

    ID asc

    ;

    drop table #TestTab1;

    drop table #TestTab2;

    By the way, you notice how I have provided the DDL for the tables and sample data for the tables? That is how you should be providing us with your code/sample data instead of relying on us to do it for you. Comes down to helping us help you.

    Read the first article I have referenced below in my signature block.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply