Count with Group By clause

Question

Count with Group By clause

hkflight

SSCommitted

Points: 1949
More actions
September 21, 2011 at 2:52 pm

#248567

I need to pull the number of donors and the organization ID's of those donors (I need the orgID's as inputs to an outer query).
select orgID, count(distinct indID)
from transactions
group by orgID
The problem is, my count is off. Some employees retired during the year and so they have transactions when they were part of 2 organizations (regular and retired). My aggregates are counting these individuals twice. I get the correct counts if I do:
select count(distinct indID)
from transactions
But this does not give me the orgID's that I need. Any help would be appreciated!!!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

bkubicek SSChampion Points: 10948 More actions · Answer 1

This is just a shot in the dark, but if you are able to exlude the retired transactions it should take care of the problem. It seems like you should be able to join to some table that would give you that information. Just a guess.

Ben

hkflight SSCommitted Points: 1949 More actions · Answer 2

Yes, I could try to do this which would make my full query pretty complex.

I was hoping there would be a more simple elegant solution using the OVER clause or something. I simply need to count the number of unique individuals without any influence from the group by clause.

Perhaps a rank solution may work, but I have not used these functions before.

Hopefully someone else can provide some guidance.

ghanshyam.kundu Hall of Fame Points: 3337 More actions · Answer 3

I think u can do it though a rank funtion, rank it via as per your indid,

if u wanna to get help post your ddl and some sample data properly.

if u wanna to know about rank function follow below link:

http://blog.sqlauthority.com/2007/10/09/sql-server-2005-sample-example-of-ranking-functions-row_number-rank-dense_rank-ntile/

http://www.sqlservercurry.com/2009/04/rank-vs-denserank-with-example-using.html

hkflight SSCommitted Points: 1949 More actions · Answer 4

select orgID, count(distinct indID)

from transactions

group by orgID

The problem is, my count is off. Some employees retired during the year and so they have transactions when they were part of 2 organizations (regular and retired). My aggregates are counting these individuals twice. I get the correct counts if I do:

select count(distinct indID)

from transactions

Here is my schema, it is very simple:

TRANSACTIONS(TranID PK, OrgID FK, IndID FK)

INDIVIDUALS(IndID PK)

ORGANIZATIONS(OrgID PK)

Sample Data:

Individuals

1

2

3

Organizations

11

12

13

Transactions

20 | 11 | 1

21 | 11 | 1

22 | 12 | 2

23 | 12 | 1

Running the first query gives me:

11 | 1

12 | 2

When summed, this gives me 3 unique donors, but there are only 2. If I run the 2nd query, I get the correct # of donors, 2.

Thank you!

drew.allen SSC Guru Points: 76998 More actions · Answer 5

This should give you what you need. It's not tested.

SELECT OrgID, Count(IndID)

FROM Individuals AS i

CROSS APPLY(

SELECT TOP (1) OrgID

FROM Transactions AS t

WHERE t.IndID = i.IndID

ORDER BY TranID

) AS t

GROUP BY OrgID

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

NotoriousDBA SSChasing Mays Points: 658 More actions · Answer 6

If you're looking for the count of distinct donors per org, then your results are correct. If you're looking for the count of distinct donors, that's something else entirely. I doesn't seem like the problem is so much that your SQL isn't returning the right results so much as that you haven't clearly defined what you're after. If you only want to count a donor under one org, regardless of how many orgs they're associated with, that's fine, but you should define how to select the org you want to count; e.g. maybe only the org associated with a donor's most recent transaction.

Something like this:

SELECT y.OrgID, COUNT(DISTINCT y.IndID)

FROM (SELECT x.OrgID, x.IndID, RANK() OVER (PARTITION BY x.IndID ORDER BY x.OrgID) OrgRank FROM Transactions x) y

WHERE y.OrgRank = 1

GROUP BY y.OrgID;

Of course you'd change the "ORDER BY" clause of the RANK function to something that put the org you want in the first rank.