October 14, 2004 at 9:29 am
I have the following query:
-----------------------------
SELECT p.dept as Dept, f.value as OPID, count(*) as [Count]
FROM policy as p
INNER JOIN FLDS as f
ON f.[value] = p.csr and f.fld = '1'
WHERE p.Dept is not null
GROUP BY p.dept, f.value
ORDER BY OPID, [Count] DESC
--------------------------------
It basically returns reults like:
dept | opid | count |
4 | 2J | 12 |
3 | 9J | 8 |
2 | A0 | 1924 |
3 | A0 | 10 |
4 | A0 | 1 |
2 | A1 | 1020 |
4 | A1 | 987 |
5 | A1 | 25 |
3 | A1 | 1 |
4 | A2 | 106 |
1 | A2 | 1 |
2 | A4 | 8 |
4 | A5 | 129 |
2 | A7 | 2479 |
3 | A7 | 157 |
4 | A7 | 5 |
1 | A7 | 2 |
What I want is just the [dept] and [opid] with the largetst count. The result set I would like is:
dept | opid | count |
4 | 2J | 12 |
3 | 9J | 8 |
2 | A0 | 1924 |
2 | A1 | 1020 |
4 | A2 | 106 |
2 | A4 | 8 |
4 | A5 | 129 |
2 | A7 | 2479 |
Is this even possible?
October 14, 2004 at 10:14 am
Use derived table and "not exists(...)".
SELECT Dept, OPID, [Count]
FROM (
SELECT p.dept as Dept, f.value as OPID, count(*) as [Count]
FROM policy as p
INNER JOIN FLDS as f
ON f.[value] = p.csr and f.fld = '1'
WHERE p.Dept is not null
GROUP BY p.dept, f.value
) x
WHERE not exist(
SELECT *
FROM policy as p
INNER JOIN FLDS as f
ON f.[value] = p.csr and f.fld = '1'
WHERE p.Dept = x.Dept and f.value = x.OPID
GROUP BY p.dept, f.value
HAVING count(*) > x.[Count]
)
ORDER BY OPID, [Count] DESC
Looks inefficient. DDL and INSERTs of sample data may help get a better solution.
October 14, 2004 at 11:41 am
Okay. I have pared down the query to:
SELECT Dept, OPID, [Count]
FROM (
SELECT p.dept as Dept, p.csr as OPID, count(*) as [Count]
FROM policy as p
WHERE p.Dept is not null and p.csr is not null
GROUP BY p.dept, p.csr
  x
WHERE not exists
(
SELECT *
FROM policy as p
WHERE p.Dept = x.Dept and p.csr = x.OPID
GROUP BY p.dept, p.csr
HAVING count(*) > x.[Count]
)
ORDER BY OPID, [Count] DESC
but I am not getting only the largest count row. I am getting all. Any ideas?
October 15, 2004 at 4:19 am
The first step is to have an *exact* idea of what you want. You said
> What I want is just the [dept] and [opid] with the largetst count.
which as far as I can see isn't quite accurate. Judging by your desired output example, what you actually want is
> For each opid, the dept with the largest count, and the value of that count
This gives the clue that the basis of the query is going to be a list of OPIDs, plus some additional columns. So we start with
SELECT f.value as OPID FROM FLDS AS f WHERE f.fld = '1'
Now, what information do we want from policy, for a given OPID? If we were doing this by hand, we would say: "OK, for OPID A0, count up all the policies with that OPID, grouping them by dept:
dept count 2 1924 3 10 4 1
... and then I will pick the line with highest count."
Now to translate that into T-SQL. We use a correlated subquery to get each OPID in turn into the query over policy:
SELECT dept, count(*) FROM policy p WHERE f.value=p.csr GROUP BY dept
'Pick the line with highest count' translates straightforwardly into
SELECT TOP 1 ... GROUP BY ... ORDER BY count(*) DESC
The only little quirk is that because we want two columns from the subquery, and T-SQL doesn't have tuple handling, we have to have two subqueries that each refer the same query but return different columns:
SELECT TOP 1 dept FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC SELECT TOP 1 count(*) FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC
Putting it all together:
SELECT f.value AS OPID, (SELECT TOP 1 dept FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC), (SELECT TOP 1 count(*) [count] FROM policy p WHERE f.value=p.csr GROUP BY dept ORDER BY count(*) DESC) FROM FLDS AS f WHERE f.fld = '1'
Let me know if it works
One final thing: This query will return a row for every OPID, including those with no policies at all (the row would be OPID / null / 0). To exclude OPIDs with no policies, just tack
AND EXISTS(SELECT * FROM policy p WHERE f.value=p.csr)
on the end.
October 18, 2004 at 9:37 am
I apologize for my original post. It was vague, incomplete and yes; the capitalization sucks (trying to keep with the capitalization that the database vendor had). Thanks everybody for “interpreting” what I meant. AKM you solution works perfectly. Thank you very much I have been struggling with this for a while now and it will be nice to move on.
Joe:
With your solution I get:
Column 'F1.FLD' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
So I changed the query to:
SELECT P1.dept, F1.value AS opid
FROM policy AS P1,
Flds AS F1
WHERE F1.value = P1.csr
AND F1.fld = '1'
AND P1.dept IS NOT NULL
GROUP BY P1.dept, F1.value
HAVING COUNT(*)
>= (SELECT COUNT(*)
FROM policy AS P2,
Flds AS F2
WHERE F2.fld = !!!!!‘1’!!!!!
AND P2.dept = P1.dept)
This returns no rows. I am interested in your solution if you can figure out what the problem is.
Thanks again for all of your help everyone.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply