Add incremental count to group of rows

Question

Add incremental count to group of rows

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply

Peter E. Kierstead SSCarpal Tunnel Points: 4114 More actions · Answer 1

Ya, I agree, the order of the data is important. It generally works for my jobs as the natural order of the data is correct for this process. However, you could do this (with a little cleanup):

Declare @i Int,@Prev VarChar(256)

Set @prev='' -- or some value this is does not occur in the grouping column.

Update OpenRowset( ,'Select * from YourTable order by GroupingColumns')

Set @i=SeqColumn=Case GroupingColumn when @prev then @1+1 else End,@Prev=GroupingColumn

Just trying to provide some alternatives... 🙂

PeteK
I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 2

Jeff Moden (1/11/2008)
Careful... that's only guaranteed to work correctly if one of two things happen...
1. You have a rock solid clustered index on the correct columns and you have an "anchor" column (thanks, Matt Miller) in the update AND you force the index with a WITH(INDEX()). Otherwise, you could end up with what Gail Shaw (GilaMonster) refers to as a "Merry-Go-Round" index that will give an incorrect answer.

You might also care to force this to stay in one thread (OPTION (MAXDOP 1)), so as to not run into another type of merry go round.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Peter E. Kierstead SSCarpal Tunnel Points: 4114 More actions · Answer 3

And yet again using Select ... Into ...

if object_Id('tempdb.dbo.#t1') is not null drop table dbo.#t1

if object_Id('tempdb.dbo.#t2') is not null drop table dbo.#t2

select 'Smith'[LName],'Randy'[FName] into dbo.#t1

union all select 'Smith','John'

union all select 'Jones','Carry'

union all select 'Wilson','Charles'

union all select 'Fix','Ron'

union all select 'Jones','Able'

union all select 'Fix','Donald'

union all select 'Smith','Steve'

select 'Initial Table'[Initial Table],* from dbo.#t1

select

LName,FName,

(select count(*) from dbo.#t1 t2 where t1.LName=t2.LName and t1.FName>=t2.FName)[Seq]

into dbo.#t2

from dbo.#t1 t1

select 'Sequenced Table'[Sequenced Table],* from dbo.#t2 order by LName,FName

PeteK
I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.

Jeff Moden SSC Guru Points: 1004499 More actions · Answer 4

simon.letts (1/9/2008)
Thanks for that effort, it looks very complicated for what I want though.
To make it easier ...lets say that I just have a table with just one column, trans_code and it has entires such as
5554
5554
5554
5553
5553
For every like entry i want to run a SELECT extract to display the row and a sequnce number within the trans_code
5554 1
5554 2
5554 3
5553 1
5553 2
However this will be running on an existing table as part of an overal select statement so I cant use a new table with identity or use an insert statement. the data (trans_code) is already there I simply want to display a sequence number, not add it to the table.
thanks

Alright... this is the fastest "pure Select" to do "grouped running counts" that I know of... it does NOT require a temp table nor the update of a table. It does, however, require the use of a "Tally" table. In this case, I've used Master.dbo.spt_Values as the source of the required numbers instead of a real Tally table...

--===== Create some test data... this is NOT part of the solution

DECLARE @yourtable TABLE (Trans_Code INT)

INSERT INTO @yourtable (Trans_Code)

SELECT 5554 UNION ALL

SELECT 5551 UNION ALL

SELECT 5552 UNION ALL

SELECT 5567 UNION ALL

SELECT 5554 UNION ALL

SELECT 5552 UNION ALL

SELECT 5551 UNION ALL

SELECT 5555 UNION ALL

SELECT 5552 UNION ALL

SELECT 5551 UNION ALL

SELECT 5553 UNION ALL

SELECT 5554

--===== Produce a running count (sequence/ordinal rank) for

-- each Trans_Code.

SELECT c.Trans_Code, t.Number AS RunningSequence

FROM (--==== Find how many of each Trans_Code there are

SELECT Trans_Code, COUNT(*) AS TransCodeCount

FROM @yourtable

GROUP BY Trans_Code

)c

INNER JOIN Master.dbo.spt_Values t

ON t.Number <= c.TransCodeCount

WHERE t.Type = 'P'

AND t.Number > 0

ORDER BY c.Trans_Code, t.Number

... and here's the results ...

Trans_Code RunningSequence

----------- ---------------

5551 1

5551 2

5551 3

5552 1

5552 2

5552 3

5553 1

5554 1

5554 2

5554 3

5554 4

5554 5

5554 6

5555 1

5567 1

5567 2

You have to be a bit leary about using spt_Values instead of a real Tally table because it only has the numbers 0 to 255 in SQL Server 2000. If you have more than 255 of the same Trans_Code, the code will fault without generating an error.

Tally tables have a huge number of uses and you should have a permanent copy of one in your database. Here's how to make one...

--===== Create and populate the Tally table on the fly

SELECT TOP 11000 --equates to more than 30 years of dates

IDENTITY(INT,1,1) AS N

INTO dbo.Tally

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

--===== Add a Primary Key to maximize performance

ALTER TABLE dbo.Tally

ADD CONSTRAINT PK_Tally_N

PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

--===== Allow the general public to use it

GRANT SELECT ON dbo.Tally TO PUBLIC

Once that's done... the code for your solution becomes even simpler...

--===== Produce a running count (sequence/ordinal rank) for

-- each Trans_Code.

SELECT c.Trans_Code, t.N AS RunningSequence

FROM (--==== Find how many of each Trans_Code there are

SELECT Trans_Code, COUNT(*) AS TransCodeCount

FROM @yourtable

GROUP BY Trans_Code

)c

INNER JOIN dbo.Tally t

ON t.N <= c.TransCodeCount

ORDER BY c.Trans_Code, t.N

Of course, you must change @yourtable to your actual table name in all of the examples... 😉

And, the big problem is, there's no easy way to link it back to the original rows... but it does do what you asked 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 5

Oh me, oh my... Is that a Triangular Join (or as Microsoft calls it - a comparison non-Equijoin) I see? From Jeff?

hehe.....:P:hehe::crazy:

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Jeff Moden SSC Guru Points: 1004499 More actions · Answer 6

Heh... yeah... it is... 😛

Difference is that all rows it touches are generated in the output... nothing wasted. Like I said in the article, they're not all bad...

... I almost didn't post this code because it cannot be used to make reference back to the original row. Creation of a temp table and doing the running update is a much better solution.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004499 More actions · Answer 7

On second thought, it's not a triangular join... for a count of 3, a triangular join would generate...

1

1 2

1 2 3

... to get the count... 6 rows total. This code does no such thing... it produces...

1

2

3

... without generating the other rows... it's linear which means it's not a triangular join as I defined it in the article.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 8

Well - technically it IS a triangular join, but it's one against the summary. So - you summarize first, then triangular join against it (essentially unrolling the summary).

That being said - it IS linear against the table (the original). So - you're right - it is one of those times when a range non-equijoin is appropriate (I was just busting your chops earlier - you know that right?)

It's a neat trick though. I haven't had a chance to try it but I'm sure it will do well assuming you have decent indexing in place.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Jeff Moden SSC Guru Points: 1004499 More actions · Answer 9

I was just busting your chops earlier - you know that right?

Oh yeah... just got me thinking...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004499 More actions · Answer 10

And, I gotta disagree... this isn't a triangular join... internally generated rows to not form the classic triangle. No table has more rows spawned than either of the original two like a triangular join would. Is it an inequality... yep. But not a triangular join in this case.

Sure does look like one, though, until you look at the row counts in the execution plan.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Frances L SSCarpal Tunnel Points: 4727 More actions · Answer 11

Matt:

select count(*) from adjustment t2 where t1.transcode=t2.transcode and _t1.id>=t2.id

do not understand what _t1.id>=t2.id for here ?

Thx.

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 12

Frances L (1/14/2008)
Matt:
select count(*) from adjustment t2 where t1.transcode=t2.transcode and _t1.id>=t2.id
do not understand what _t1.id>=t2.id for here ?
Thx.

We're Essentially doing a sub-query so that we can perform a count of records. Remember this was a sub-query, so in this case T1 is the outer query, and T2 is the inner query.

For each row in the outer query, we are trying to count all of the rows in the table that have the same transaction code and are less than or equal to the current ID.

Let me give you a really small data exampl. If in your data you had a group that looks like:

Transcode ID

aaa 1

aaa 2

ddd 15

bbb 27

ddd 42

bbb 66

ddd 72

Now assuming we focus ONLY on the "ddd" transaction codes. For each of those records you'd be counting and end up as follows:

Transcode ID Count

ddd 15 1 --because it counted itself

ddd 42 2 --because it counted itself AND ID number 15

ddd 72 3 --because it counted itself and ID number 15 and 42

It's essentially a triangular join in disguise again, since we had to not use the temp table option (a linear option).

It's essentially doing math on this resultset. Run this and see if the picture gets to be clearer.

Select * from adjustment t1

inner join adjustment t2 on t1.transcode=t2.transcode and t1.id>=t2.id

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?