Add incremental count to group of rows

  • Hi, I need to write a T-SQL(2000) extract based on existing tables to add a sequence number to each row in a group of like transactions. The table has a PK on the ID column only. After that there is a transaction_code column which will have a number which can be duplicated if there were more than one items sold within that transaction. ie

    id trans_code

    dfg1 5554

    dfg2 5554

    dfg3 5554

    dfg4 5553

    dfg5 5553

    So in my select statement I need to specify a seq code for each row in a group of transactions, so I'd get 3 rows back from trans 5554 with a seq 1, 2, 3 and 2 rows back for code 5553 with a seq of 1,2 and so on. I cant change the existing tables, this is just a query which will show a seq number for the rows retrieved.

    thank-you

  • In SQL2000 you will need to use a #temp table:

    http://www.sqlservercentral.com/Forums/Topic397675-8-1.aspx

  • Hi thanks for that but I cant see that the result of that query is the same as I need. Have you an actual example of what I need please.

    Thanks

  • Simon - could you post some more specifics as to the DDL of that table (i.e. column names and data types). Ken's correct and the example gives you what you'd need, but it will help if the example looked like your table. If you post your DDL we can try to adapt the logic to your table.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • 1. Create a #temp table with an IDENTITY column

    2. Insert your data into the #temp table

    3. Join the #temp table to a derived table containing the MINIMUM(ID) for each tans_code so that seq (IDNo) can be calculated.

    Among outher things, this is what the link shows!

  • Thanks guys, here is some code. As you'll see its part of an overall extract not just a one off query in itself..

    SELECT a.trans_code

    ,asize.adjustmentsize_id

    ,'LG'

    ,s.movement_type

    ,s.store_code

    , -- Here is where I also need to display the increment value per trans_code

    ...................other DDL

    FROMadjustment a,

    adjustmentsize asize,

    itemcolour ic,

    itemcoloursize ics,

    SAP_GM_Codes s

    WHEREa.trans_code = asize.trans_code

    ANDics.itemcolour_id = ic.itemcolour_id

    ANDasize.itemcoloursize_id = ics.itemcoloursize_id

    ANDic.itemcolour_ref not in ('20816','20817','62700','64543','22432','22433','22434','22435','01618','01002','01005','01000')

    ANDs.movement_type = 'J'

    ANDs.movement_code = a.trans_code

    ANDasize.itemcoloursize_id IS NOT NULL

    So for the trans_code, instead of just seeing

    trans_code

    5554

    5554

    5554

    5553

    5553

    I'd like

    trans_code

    5554 1

    5554 2

    5554 3

    5553 1

    5553 2

    Thanks v much

  • Here's an example using somethig approximating your data. You weren't really clear on the table layout, etc... so it's going to be tough to know.

    Anyway - see if this helps

    --Set up test data

    if object_ID('adjustment','u')>0

    drop table adjustment

    create table adjustment (iD int identity(1,1) primary key clustered, transcode varchar(20), dateval datetime)

    go

    insert adjustment(transcode, dateval)

    select 'aaa1', getdate()-rand()*20 UNION ALL

    select 'bbb1', getdate()-rand()*20 UNION ALL

    select 'ccc1', getdate()-rand()*20 UNION ALL

    select 'ddd1', getdate()-rand()*20 UNION ALL

    select 'eee1', getdate()-rand()*20 UNION ALL

    select 'fff1', getdate()-rand()*20 UNION ALL

    select 'hhh1', getdate()-rand()*20 UNION ALL

    select 'mmm1', getdate()-rand()*20 UNION ALL

    select 'aaa1', getdate()-rand()*20 UNION ALL

    select 'bbb1', getdate()-rand()*20 UNION ALL

    select 'ccc1', getdate()-rand()*20 UNION ALL

    select 'ddd1', getdate()-rand()*20 UNION ALL

    select 'eee1', getdate()-rand()*20 UNION ALL

    select 'fff1', getdate()-rand()*20 UNION ALL

    select 'hhh1', getdate()-rand()*20 UNION ALL

    select 'mmm1', getdate()-rand()*20 UNION ALL

    select 'aaa1', getdate()-rand()*20 UNION ALL

    select 'bbb1', getdate()-rand()*20 UNION ALL

    select 'ccc1', getdate()-rand()*20 UNION ALL

    select 'ddd1', getdate()-rand()*20 UNION ALL

    select 'eee1', getdate()-rand()*20 UNION ALL

    select 'fff1', getdate()-rand()*20 UNION ALL

    select 'hhh1', getdate()-rand()*20 UNION ALL

    select 'mmm1', getdate()-rand()*20 UNION ALL

    select 'aaa1', getdate()-rand()*20 UNION ALL

    select 'bbb1', getdate()-rand()*20 UNION ALL

    select 'ccc1', getdate()-rand()*20 UNION ALL

    select 'ddd1', getdate()-rand()*20 UNION ALL

    select 'eee1', getdate()-rand()*20 UNION ALL

    select 'fff1', getdate()-rand()*20 UNION ALL

    select 'hhh1', getdate()-rand()*20 UNION ALL

    select 'mmm1', getdate()-rand()*20

    --go 30

    --done with test data setup

    --Now - for the seqno part

    declare @prevtrans varchar(20)

    declare @dummy int

    declare @i int

    drop table #tmpTransWithSeq

    --create the temp table

    --needs to include all fields to return plus the original PK for ordering plus SEQNO

    create table #tmpTransWithSeq

    (

    TransCode varchar(20),

    ID int,

    seqno int

    )

    --create a clustered index to make the SEQNO work

    --needs to be created with transcode on the LEFT, since that's the grouping

    create index UC_tmpTransWithSeq on #tmpTransWithSeq(TRANSCODE,ID)

    --now, insert the results you need

    insert #tmpTransWithSeq

    (

    TransCode,

    ID)

    select TransCode, ID from adjustment

    --now run the update to create the sequence numbers

    --but first - set up the variables

    select @prevtrans='',@i=0, @dummy=0

    update #tmpTransWithSeq

    set

    seqno=@i+1,

    @i=CASE when @prevtrans=transcode then @i+1 else 0 end,

    @dummy=@i,

    @prevtrans=transcode

    from #tmpTransWithSeq

    WITH (INDEX(UC_tmpTransWithSeq))--this indexing hint set up the order

    OPTION (MAXDOP 1) --use just one processor, so that the order is respected

    --now show the result

    select * from #tmpTransWithSeq order by transcode,seqno

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks for that effort, it looks very complicated for what I want though.

    To make it easier ...lets say that I just have a table with just one column, trans_code and it has entires such as

    5554

    5554

    5554

    5553

    5553

    For every like entry i want to run a SELECT extract to display the row and a sequnce number within the trans_code

    5554 1

    5554 2

    5554 3

    5553 1

    5553 2

    However this will be running on an existing table as part of an overal select statement so I cant use a new table with identity or use an insert statement. the data (trans_code) is already there I simply want to display a sequence number, not add it to the table.

    thanks

  • The only other way to do that without a temporary table would be a correlated sub-query, which is going to kill your performance. If your adjustments table is big - this is going to peg your server big-time.

    If you're talking about a "normal size" table - don't be surprised if performance on the following sucks...

    select

    t1.*,

    seqno=(

    select count(*)

    from adjustment t2

    where t1.transcode=t2.transcode and

    t1.id>=t2.id

    )

    from adjustment t1

    order by transcode,seqno

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Just in case you want to know why Matt says the performance will suck...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for all your help guys. Matt, your script did the job fine!!!

    However we are currently testing against a limited set of rows so I will insist we do further tests against a full copy of the live DB so I can assess performance.

    thanks again

  • simon.letts (1/10/2008)


    Thanks for all your help guys. Matt, your script did the job fine!!!

    However we are currently testing against a limited set of rows so I will insist we do further tests against a full copy of the live DB so I can assess performance.

    thanks again

    Just many rows not suppose to be a problem.

    You should expect performance issues when you have many rows per group.

    And, of course, indexing must be done well.

    _____________
    Code for TallyGenerator

  • Nobody mentioned that you could use UPDATE. Add in integer column and try this:

    Declare @PrevGroupingColumn VarChar(256),@i Int

    Select @PrevGroupingColumn='' -- Any value that would not occur in the grouping column.

    Update dbo.YourTable Set

    @i=Case when GroupingColumn=@PrevGroupingColumn then @i+1 else 1 End,

    SequenceColumn=@i,@PrevGroupingColumn=GroupingColumn

    Adjust the above to match your column names and data types and consider performance if YourTable is large...



    PeteK
    I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.

  • Careful... that's only guaranteed to work correctly if one of two things happen...

    1. You have a rock solid clustered index on the correct columns and you have an "anchor" column (thanks, Matt Miller) in the update AND you force the index with a WITH(INDEX()). Otherwise, you could end up with what Gail Shaw (GilaMonster) refers to as a "Merry-Go-Round" index that will give an incorrect answer.

    2. You have a TOP 100 PERCENT/ORDER BY as a "control" sub-query in the FROM clause of the update. Of course, that will greatly reduce the performance compared to method 1 above although it's still likely to be faster than a Cursor or While Loop.

    Both methods above, as well as the Cursor or While Loop method will be faster than using a correlated sub-query with a "Triangular Join"... please refer to the following article for what the impact on performance is for "Triangular Joins"...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    I will say that, due to caching, if the groups of rows to be "ranked" or counted are very small, the correlated sub-query method won't make a huge impact on performance an may probably suffice for most not concerned with scalability of the data.

    There is a very high performance alternate method to doing this type of grouped ranking which uses a Tally table... I'll see if I can dig it up...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks again everyone for your valued input. Fortunately although the table is large the actual groups to be ranked are always less than 5.

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply