Aggrigate Query Question

  • I have the following query:

    -----------------------------

     SELECT p.dept as Dept, f.value as OPID, count(*) as [Count]

     FROM policy as p

     INNER JOIN FLDS as f

      ON f.[value] = p.csr and f.fld = '1'

     WHERE p.Dept is not null

     GROUP BY p.dept, f.value

     ORDER BY OPID, [Count] DESC

    --------------------------------

    It basically returns reults like:

    deptopidcount
    42J12
    39J8
    2A01924
    3A010
    4A01
    2A11020
    4A1987
    5A125
    3A11
    4A2106
    1A21
    2A48
    4A5129
    2A72479
    3A7157
    4A75
    1A72

    What I want is just the [dept] and [opid] with the largetst count.  The result set I would like is:

    deptopidcount
    42J12
    39J8
    2A01924
    2A11020
    4A2106
    2A48
    4A5129
    2A72479

    Is this even possible?

     

  • Use derived table and "not exists(...)".

     SELECT Dept, OPID, [Count]

     FROM (

      SELECT p.dept as Dept, f.value as OPID, count(*) as [Count]

      FROM policy as p

      INNER JOIN FLDS as f

       ON f.[value] = p.csr and f.fld = '1'

      WHERE p.Dept is not null

      GROUP BY p.dept, f.value

       ) x

     WHERE not exist(

      SELECT *

      FROM policy as p

      INNER JOIN FLDS as f

       ON f.[value] = p.csr and f.fld = '1'

      WHERE p.Dept = x.Dept and f.value = x.OPID

      GROUP BY p.dept, f.value

      HAVING count(*) > x.[Count]

       )

     ORDER BY OPID, [Count] DESC 

    Looks inefficient. DDL and INSERTs of sample data may help get a better solution.

  • Okay.  I have pared down the query to:

    SELECT Dept, OPID, [Count]

    FROM (

     SELECT p.dept as Dept, p.csr as OPID, count(*) as [Count]

     FROM policy as p

     WHERE p.Dept is not null and p.csr is not null

     GROUP BY p.dept, p.csr

    &nbsp x

    WHERE not exists

    (

     SELECT *

     FROM policy as p

     WHERE p.Dept = x.Dept and p.csr = x.OPID

     GROUP BY p.dept, p.csr

     HAVING count(*) > x.[Count]

    )

    ORDER BY OPID, [Count] DESC

    but I am not getting only the largest count row.  I am getting all.  Any ideas?

  • The first step is to have an *exact* idea of what you want. You said

    > What I want is just the [dept] and [opid] with the largetst count.

    which as far as I can see isn't quite accurate. Judging by your desired output example, what you actually want is

    > For each opid, the dept with the largest count, and the value of that count

    This gives the clue that the basis of the query is going to be a list of OPIDs, plus some additional columns. So we start with

    SELECT f.value as OPID
    FROM FLDS AS f
    WHERE f.fld = '1'

    Now, what information do we want from policy, for a given OPID? If we were doing this by hand, we would say: "OK, for OPID A0, count up all the policies with that OPID, grouping them by dept:

    dept count
    2    1924 
    3    10 
    4    1 

    ... and then I will pick the line with highest count."

    Now to translate that into T-SQL. We use a correlated subquery to get each OPID in turn into the query over policy:

    SELECT dept, count(*) FROM policy p WHERE f.value=p.csr GROUP BY dept

    'Pick the line with highest count' translates straightforwardly into

    SELECT TOP 1 ... GROUP BY ... ORDER BY count(*) DESC

    The only little quirk is that because we want two columns from the subquery, and T-SQL doesn't have tuple handling, we have to have two subqueries that each refer the same query but return different columns:

    SELECT TOP 1 dept FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC
    SELECT TOP 1 count(*) FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC

    Putting it all together:

    SELECT f.value AS OPID,
    (SELECT TOP 1 dept FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC), 
    (SELECT TOP 1 count(*) [count] FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC)
    FROM FLDS AS f
    WHERE f.fld = '1'

    Let me know if it works

    One final thing: This query will return a row for every OPID, including those with no policies at all (the row would be OPID / null / 0). To exclude OPIDs with no policies, just tack

    AND EXISTS(SELECT * FROM policy p WHERE f.value=p.csr)

    on the end.

  • I apologize for my original post.  It was vague, incomplete and yes; the capitalization sucks (trying to keep with the capitalization that the database vendor had).  Thanks everybody for “interpreting” what I meant.  AKM you solution works perfectly.  Thank you very much I have been struggling with this for a while now and it will be nice to move on.

     

    Joe:

    With your solution I get:

    Column 'F1.FLD' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.

    So I changed the query to:

    SELECT P1.dept, F1.value AS opid

    FROM policy AS P1,

                Flds AS F1

    WHERE F1.value = P1.csr

    AND F1.fld = '1'

    AND P1.dept IS NOT NULL

    GROUP BY P1.dept, F1.value

    HAVING COUNT(*)

    >= (SELECT COUNT(*)

                FROM policy AS P2,

                Flds AS F2

                WHERE F2.fld = !!!!!‘1’!!!!!

                AND P2.dept = P1.dept)

    This returns no rows.  I am interested in your solution if you can figure out what the problem is.

    Thanks again for all of your help everyone.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply