Running Count

  • I'm trying to get a running count of the rows in this table so that rows that have the same id also have the same rnk

    create table #tester (id int, rnk int)

    insert into #tester

    select 165935,1

    UNION

    select 165935,2

    UNION

    select 100266,3

    UNION

    select 100114,4

    UNION

    select 100641,5

    UNION

    select 105524,6

    UNION

    select 100481,7

    UNION

    select 100306,8

    UNION

    select 100160,9

    UNION

    select 100703,10

    UNION

    select 100158,11

    UNION

    select 100761,12

    UNION

    select 100072,13

    UNION

    select 100577,14

    UNION

    select 100550,15

    UNION

    select 100086,16

    UNION

    select 100513,17

    UNION

    select 100556,18

    UNION

    select 100676,19

    UNION

    select 100092,20

    UNION

    select 100092,21

    UNION

    select 100578,22

    UNION

    select 107756,23

    UNION

    select 105891,24

    UNION

    select 100394,25

    UNION

    select 114872,26

    select *

    from #tester

    order by rnk

    I'm trying to get the results to be

    ID rnk

    165935 1

    165935 1

    100266 2

    100114 3

    100641 4

    105524 5

    100481 6

    ...............

    100092 19

    100092 19

    100578 20

    I haven't had much success figuring this out hopefully someone can point me in the right direction.

    Thanks

  • This may help in getting you started in the right direction:

    select *, DENSE_RANK() over(order by id) DRank

    from #tester

    The rnk column is not even needed. In can be generated at any time.

    HTH.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • I want to keep the existing rnk order, so that both rows 165935 are first and 100266 is second and so on. Your query doesn't do that.

    1659351

    1659351

    1002662

    .................

  • There should be a way to do this without these self-joins but I can't figure it out. That said, this will do the trick:

    WITH distinct_ids AS

    (

    select DISTINCT t1.id, MAX(t2.rnk) OVER (PARTITION BY t1.id) AS rnk

    from #tester t1

    JOIN #tester t2 ON t1.id=t2.id

    )

    SELECT x.id, DENSE_RANK() OVER (ORDER BY x.rnk) AS d_rank

    FROM distinct_ids x

    JOIN #tester t ON x.id=t.id

    ORDER BY d_rank;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • You could do this

    select id, rnk, DENSE_RANK() over(order by mrnk, id) DRank

    from (select *, min(rnk) over (partition by id) mRank from #tester) a

    order by rnk;

  • A different approach:

    WITH DistinctIds AS(

    select id, MIN( rnk) rnk

    from #tester

    group by id

    ),

    Counts AS(

    select id, ROW_NUMBER() OVER( ORDER BY rnk) newrank

    FROM DistinctIds

    )

    SELECT t.*, c.newrank

    FROM #tester t

    JOIN Counts c ON t.id = c.id

    ORDER BY newrank;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • A couple more alternate approaches:

    -- #1: Using a Tally table

    WITH Tally (n) AS

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)

    )

    SELECT id, rnk=rn

    FROM

    (

    SELECT id, c=COUNT(*), rnk=MIN(rnk)

    ,rn=ROW_NUMBER() OVER (ORDER BY MIN(rnk))

    from #tester

    GROUP BY id

    ) a

    CROSS APPLY

    (

    SELECT n

    FROM Tally

    WHERE n <= c

    ) b

    ORDER BY rnk;

    -- #2: Kinda like MickyT's

    SELECT id, rnk=DENSE_RANK() OVER (ORDER BY rnk)

    FROM

    (

    SELECT id

    ,rnk=MIN(rnk) OVER(PARTITION BY id)

    from #tester

    ) a

    ORDER BY rnk;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • And so far Dwains Tally solution looks like the winner based on the IO stats:-D

  • mickyT (10/29/2013)


    And so far Dwains Tally solution looks like the winner based on the IO stats:-D

    So much for my guess. I was betting on #2 and a tie with yours!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • BTW. You can probably remove this from the Tally solution:

    , rnk=MIN(rnk)

    From the SELECT list of the inner query. I was using it for testing. It might drop the IO count a little.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (10/29/2013)


    BTW. You can probably remove this from the Tally solution:

    , rnk=MIN(rnk)

    From the SELECT list of the inner query. I was using it for testing. It might drop the IO count a little.

    Not sure if you can better what you had:-)

    Table '#tester_000000000026'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

  • mickyT (10/29/2013)


    dwain.c (10/29/2013)


    BTW. You can probably remove this from the Tally solution:

    , rnk=MIN(rnk)

    From the SELECT list of the inner query. I was using it for testing. It might drop the IO count a little.

    Not sure if you can better what you had:-)

    Table '#tester_000000000026'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Perhaps now I can read what you posted and I see what you mean.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • mickyT (10/29/2013)


    And so far Dwains Tally solution looks like the winner based on the IO stats:-D

    As far as speed is concerned, mine is still in the top 5!

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thank you all for your solutions. It's greatly appreciated.

  • AronB (10/30/2013)


    Thank you all for your solutions. It's greatly appreciated.

    The question now would be, do you understand how and why they work?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply