Finding Most Common/Duplicated Value MAX(COUNT())

  • I have a need to find a value shared/duplicated across a number of records. There are 3 tables involved:

    Table_1 (ID int)

    Table_2 (Table_1_ID int, Table_3_ID int, CreatedOn datetime)

    Table_3 (ID int)

    Pretty straight forward -- Table 2 resolves the many-to-many relationship between tables 1 and 3. So my problem was finding which Table_1_ID occurred in Table_2 on a CreatedOn day.

    So what initially popped in my head was something like

    SELECT MAX(COUNT(Table_1_ID))

    FROM Table_2

    WHERE CreatedOn BETWEEN someDay AND someOtherDay

    Of course, this doesn't work because the COUNT aggregate function can't appear inside the MAX aggregate function. So I had to first get a table's worth of COUNT data using:

    Declare @a table (Table_1_ID int, countTotal int)

    INSERT INTO @a

    SELECT Table_1_ID, COUNT(Table_1_ID)

    FROM Table_2

    WHERE CreatedOn BETWEEN someDay AND someOtherDay

    And then do this to find the Table_1_ID that is being used the most:

    SELECT Table_1_ID

    FROM Table_2

    WHERE CreatedOn BETWEEN someDay AND someOtherDay

    AND COUNT(Table_1_ID) = (SELECT MAX(countTotal) FROM @a)

    While this works to eventually find which Table_1_ID is used the most in the date range, it seems like a roundabout way of getting there. Is there a shortcut to this logic that can fit inside a single query without using a temp table?

  • OK, slightly confused. It would help to provide sample data.

    Do you want the most common value on a day, or all duplicates? Meaning, if I have this

    CREATE TABLE t1 ( id int)

    GO

    CREATE TABLE t2

    ( t1_id int

    , createdon datetime

    )

    GO

    INSERT t1 SELECT 1

    INSERT t1 SELECT 2

    INSERT t1 SELECT 3

    INSERT t2 SELECT 1, '12/1/2011'

    INSERT t2 SELECT 2, '12/1/2011'

    INSERT t2 SELECT 2, '12/1/2011'

    INSERT t2 SELECT 3, '12/1/2011'

    INSERT t2 SELECT 3, '12/1/2011'

    INSERT t2 SELECT 3, '12/1/2011'

    INSERT t2 SELECT 2, '12/2/2011'

    INSERT t2 SELECT 2, '12/2/2011'

    INSERT t2 SELECT 3, '12/2/2011'

    INSERT t2 SELECT 3, '12/2/2011'

    INSERT t2 SELECT 3, '12/2/2011'

    GO

    SELECT * FROM dbo.t1

    SELECT * FROM dbo.t2

    go

    SELECT t1.ID

    , COUNT(t1.id) 'Count of IDs'

    FROM dbo.t1

    INNER JOIN dbo.t2

    ON dbo.t1.id = dbo.t2.t1_id

    WHERE t2.createdon = '12/1/2011'

    GROUP BY t1.id

    go

    SELECT t1.ID

    , COUNT(t1.id) 'Count of IDs'

    FROM dbo.t1

    INNER JOIN dbo.t2

    ON dbo.t1.id = dbo.t2.t1_id

    WHERE t2.createdon = '12/1/2011'

    GROUP BY t1.id

    HAVING COUNT(t1.id) > 1

    go

    ; WITH MyCTE (MyID, Cnt)

    AS

    ( SELECT t1.ID

    , COUNT(t1.id) 'Count of IDs'

    FROM dbo.t1

    INNER JOIN dbo.t2

    ON dbo.t1.id = dbo.t2.t1_id

    WHERE t2.createdon = '12/1/2011'

    GROUP BY t1.id

    )

    SELECT TOP 1

    MyID, CNT

    FROM MyCTE

    ORDER BY CNT DESC

    go

    DROP TABLE dbo.t1

    DROP TABLE dbo.t2

    GO

    My first query returns the count of each ID value for a particular day. If I only want duplicates, I can add a HAVING clause to filter those counts < 1.

    If I want the most common, I use a CTE that essentially rolls up my IDs and counts, and then grabs only the top 1.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply