item counting

  • Hi All...
    
    I'm trying to write a query where the return is a list of thingie and thingiecount... 
    where thingiecount is returned in an incremental list instead of a sum.  
    
    As an example, I have the following rows, where thingie is:
    
    thingie
    =======
    foo
    foo
    foo
    foox
    foox
    fooy
    fooy
    fooy
    fooy
    
    I'd like my query to return:
    
    thingie thingiecount
    ======= ============
    foo     1
    foo     2
    foo     3
    foox    1
    foox    2
    fooy    1
    fooy    2
    fooy    3
    fooy    4
    
    Can someone show me how to do this?  Thanks a bunch in advance!
    
    Sam
    
  • Try a reverse-aggregate using a numbers or tally table.  This example creates the tally table on the fly, but you could create a static numbers/tally table for less overhead.

     

    declare @table table (thingie varchar(10))

    insert into @table

    select 'foo' union all

    select 'foo' union all

    select 'foo' union all

    select 'foox' union all

    select 'foox' union all

    select 'fooy' union all

    select 'fooy' union all

    select 'fooy' union all

    select 'fooy'

    DECLARE @Numbers table (Num int identity(1,1) primary key clustered, x bit)

    INSERT INTO @Numbers

    SELECT TOP 1000 NULL

    FROM dbo.SysColumns

    select thingie,

        num

    from (

            select thingie, COUNT(*) as Count

            from @table

            group by thingie

          ) t

        inner join @numbers

        on num <= t.count

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Thanks John! Exactly what I needed... Works great 🙂

  • Perhaps a little easier (code wise & in some cases execution wise)

    SELECT thingie, IDENTITY(INT, 1, 1 ) as sid, 0 as thingie_count

      INTO #tempthingie

    FROM thingie

    UPDATE t

       SET thingie_count = (t.sid - w.minsid) + 1

      FROM #tempthingie t

           JOIN (

                 SELECT thingie, min( sid ) as minsid

                   FROM #tempthingie

                  GROUP BY thingie

                 ) w on w.thingie = t.thingie

    then

    select thingie, thingie_count from #tempthingie

    Note that step 2 could be done as part of the result set instead of having a separate update process.

    EDIT: dont forget to sort the temptable during the insert with orderby

  • "in some cases execution wise"

    If the numbers table remains temporary/memory table, your solution may outperform.  If a static numbers/tally table is used, it will out-perform the temp-table solution. 

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • declare @t table (

    item varchar(40)

    )

    insert into @t values ('foo')

    insert into @t values ('foo')

    insert into @t values ('foo')

    insert into @t values ('foox')

    insert into @t values ('foox')

    insert into @t values ('fooy')

    insert into @t values ('fooy')

    insert into @t values ('fooy')

    insert into @t values ('fooy')

    select item, row_number() over (partition by item order by item)

    from @t

  • Nagabhushanam,

    That works great in SQL Server 2005 but has no chance of working in SQL Server 2000.  Good to see how this works, though.   Thank you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • One measurement is worth a thousand words... also I tried to make some unique table names, be careful folks... this test does delete some tables... didn't want to run anyone's server out of memory so did not use temp tables on this 4 million row test...

    --=============================================================================

    -- If the demo tables exist... drop them

    -- PLEASE CHECK THE NAMING OF THESE TABLES TO ENSURE THAT THEY DO NOT ALREADY

    -- EXIST IN YOUR DATABASE!!!  I TAKE NO RESPONSIBILITY IF YOU DROP ONE OF YOUR

    -- VALUED TABLES WITH THIS CODE!!!

    --=============================================================================

    --===== If the tally table exists, drop it

         IF OBJECT_ID('jbmTally') IS NOT NULL

            DROP TABLE jbmTally

    --===== If the data table exists, drop it

         IF OBJECT_ID('jbmData') IS NOT NULL

            DROP TABLE jbmData

    --===== If one of the run tables exists, drop it

         IF OBJECT_ID('jbmThingie') IS NOT NULL

            DROP TABLE jbmThingie

    --=============================================================================

    -- Declare some local variables and set some conditions

    --=============================================================================

    DECLARE @StartTime DATETIME --GETDATE() at start of run.

        SET NOCOUNT ON          --Supress autodisplay of rowcounts for appearance

                                --and speed.

    --=============================================================================

    -- Create a tally table

    --=============================================================================

    --===== Create and populate the Tally table on the fly

     SELECT TOP 20000

            IDENTITY(INT,1,1) AS N

       INTO jbmTally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE jbmTally

            ADD CONSTRAINT PK_jbmTally_N PRIMARY KEY CLUSTERED (N)

    --=============================================================================

    -- Create a data table with random data

    --=============================================================================

    --===== Create and populate the data table on the fly

     SELECT TOP 4000000 --<<<<<<LOOK!!! CHANGE THIS NUMBER TO CHANGE TEST SIZE!!!!

            IDENTITY(INT,1,1) AS RowNum,

            'Foo'

          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS Thingie

       INTO jbmData

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE jbmData

            ADD CONSTRAINT PK_jbmData_RowNum PRIMARY KEY CLUSTERED (Rownum)

    --=============================================================================

    -- Clear cache and run John Rowan's solution

    --=============================================================================

    --===== If one of the run tables exists, drop it

         IF OBJECT_ID('jbmThingie') IS NOT NULL

            DROP TABLE jbmThingie

    --===== Clear cached data

       DBCC DROPCLEANBUFFERS 

       DBCC FREEPROCCACHE

    --===== Start the duration timer

        SET @StartTime = GETDATE()

    --===== Run the solution

     SELECT IDENTITY(INT, 1, 1 ) AS SID,

            d.Thingie,

            t.N AS Thingie_Count

       INTO jbmThingie

       FROM (

            SELECT Thingie, COUNT(*) AS Count

            FROM jbmData

            GROUP BY Thingie

            ) d

      INNER JOIN jbmTally t

            ON t.N <= d.Count

      ORDER BY d.Thingie

     SELECT Thingie,

            Thingie_Count

       FROM jbmThingie

      ORDER BY SID

    --===== Display the duration

      PRINT 'John Rowan''s solution: '

          + STR(DATEDIFF(ms,@StartTime,GETDATE())/1000.0,10,3)

          + ' Seconds'

      PRINT REPLICATE('-',95)

    --=============================================================================

    -- Clear cache and run the "other" John's solution

    --=============================================================================

    --===== If one of the run tables exists, drop it

         IF OBJECT_ID('jbmThingie') IS NOT NULL

            DROP TABLE jbmThingie

    --===== Clear cached data

       DBCC DROPCLEANBUFFERS 

       DBCC FREEPROCCACHE

    --===== Start the duration timer

        SET @StartTime = GETDATE()

    --===== Run the solution

     SELECT IDENTITY(INT, 1, 1 ) AS SID,

            Thingie,

            0 AS Thingie_Count

       INTO jbmThingie

       FROM jbmData

      ORDER BY Thingie

     UPDATE t

        SET Thingie_Count = (t.SID - w.MinSid) + 1

       FROM jbmThingie t

       JOIN (

            SELECT Thingie, MIN(SID) AS MinSid

              FROM jbmThingie

             GROUP BY Thingie

            ) w

         ON w.Thingie = t.Thingie

     SELECT Thingie,

            Thingie_Count

       FROM jbmThingie

      ORDER BY SID

    --===== Display the duration

      PRINT '"Other" Johns solution:'

          + STR(DATEDIFF(ms,@StartTime,GETDATE())/1000.0,10,3)

          + ' Seconds'

      PRINT REPLICATE('-',95)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You all are awesome. Thanks for the great posts! I already had a numbers table, and was playing with test queries, but just couldn't get the right one written. This is the query I'm using based on John's suggestion, and it works great:

    select PartNumber, num
    from (
            select PartNumber, COUNT(*) as Count
            from Product
            group by PartNumber
          ) t
    inner join numbers
    on num <= t.count
    order by PartNumber
    

    PS: I'm using this to help in creating unique SKUs for auto parts where the same partnumber fits many different vehicles.

  • "PS: I'm using this to help in creating unique SKUs for auto parts where the same partnumber fits many different vehicles. "


    Couldn't figure out how get the quote to work... but, Hmmm.

    Is the intent to track multiple inventories of a SKU based upon the target vehicle? As in, part XYZ fits seven models... do you really want XYZ1..XYZ7 with unique stocked quantities based on each SKU? And do you really want to purchase XYZ1..XYZ7 based on projected production / appearance / sell of a given vehicle model and part combination?

    --SJT--

  • Hi All... John answered my query question. No further help needed. Thanks! Sam

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply