April 2, 2009 at 11:25 am
Hello
I have the following query, altered to protect the table names.
SELECT '0-10' Age, COUNT(CASE WHEN Age BETWEEN 0 AND 10 THEN Age ELSE NULL END) Total
,COUNT(CASE WHEN Age BETWEEN 0 AND 10 THEN Age ELSE NULL END)*1.0/COUNT(Age) Percentage
from dbo.PERSON ppp
union
select
'11-20', COUNT(CASE WHEN Age BETWEEN 11 AND 20 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 11 AND 20 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'21-30', COUNT(CASE WHEN Age BETWEEN 21 AND 30 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 21 AND 30 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'31-40', COUNT(CASE WHEN Age BETWEEN 31 AND 40 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 31 AND 40 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'41-50', COUNT(CASE WHEN Age BETWEEN 41 AND 50 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 41 AND 50 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'51-60', COUNT(CASE WHEN Age BETWEEN 51 AND 60 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 51 AND 60 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'61-70', COUNT(CASE WHEN Age BETWEEN 61 AND 70 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 61 AND 70 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'71-80', COUNT(CASE WHEN Age BETWEEN 71 AND 80 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 71 AND 80 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'81-90', COUNT(CASE WHEN Age BETWEEN 81 AND 90 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 81 AND 90 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'91-100', COUNT(CASE WHEN Age BETWEEN 91 AND 100 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age BETWEEN 91 AND 100 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'100+', COUNT(CASE WHEN Age > 100 THEN Age ELSE NULL END)
,COUNT(CASE WHEN Age > 100 THEN Age ELSE NULL END)*1.0/COUNT(Age)
from dbo.PERSON ppp union
select
'Not Entered', COUNT(CASE WHEN isnull(Age, 9999) = 9999 THEN 1 ELSE NULL END)
,COUNT(CASE WHEN isnull(Age, 999) = 999 THEN 1 ELSE NULL END)*1.0/COUNT(*)
from dbo.PERSON ppp
Which returns
AgeTotalPercentage
0-103550.045489492567
11-206090.078036904151
21-307000.089697590978
31-406930.088800615069
41-507360.094310609943
51-606920.088672475653
61-709040.115838031778
71-8013970.179010763710
81-9013700.175550999487
91-1003400.043567401332
100+80.001025115325
Not Entered92660.542823667252
Excellent, but wait, why the large number of exceptions? That is, records with no age entered.
It turns out the table has duplicate rows for each record. Here's a quick test setup. One row has the age, the other related record is null.
create table #PERSON(personId int not null identity(1,1) primary key,
Age int null);
insert #Person (Age) values(7)
insert #Person (Age) values(17)
insert #Person (Age) values(27)
insert #Person (Age) values(37)
insert #Person (Age) values(47)
insert #Person (Age) values(57)
insert #Person (Age) values(67)
insert #Person (Age) values(77)
insert #Person (Age) values(87)
insert #Person (Age) values(97)
insert #Person (Age) values(107)
insert #Person (Age) values(8)
insert #Person (Age) values(18)
insert #Person (Age) values(28)
insert #Person (Age) values(38)
insert #Person (Age) values(48)
insert #Person (Age) values(58)
insert #Person (Age) values(68)
insert #Person (Age) values(78)
insert #Person (Age) values(88)
insert #Person (Age) values(98)
insert #Person (Age) values(108)
insert #Person (Age) values(3)
insert #Person (Age) values(13)
insert #Person (Age) values(23)
insert #Person (Age) values(33)
insert #Person (Age) values(43)
create table PERSON(pkey int not null identity (1,1),
personId int not null,
Age int null);
Insert Person (PersonId, Age)
select * from #PERSON p
Insert Person (PersonId)
select personId from #PERSON p
delete person where age > 90
SELECT * from PERSON p order by personId, age
-- drop table #PERSON
-- drop table PERSON
If you run the first query on this test setup, you get
AgeTotalPercentage
0-1030.130434782608
11-2030.130434782608
21-3030.130434782608
31-4030.130434782608
41-5030.130434782608
51-6020.086956521739
61-7020.086956521739
71-8020.086956521739
81-9020.086956521739
91-10000.000000000000
100+00.000000000000
Not Entered270.540000000000
When really only the people over 90 have not had their age entered into the system. Not entered should be 4. How do I ignore null rows for ones that have a valid age, but not others?
Dave J
April 2, 2009 at 12:00 pm
I would do the whole thing this way:
declare @Total float;
select @Total =
(select count(*)
from #Person);
select
'0-10' as Age,
count(*) as Total,
count(*)/@Total as Percentage
from #Person
where age between 0 and 10
union all
select
'Not Entered',
count(*),
count(*)/@Total
from #Person
where age is null
and personid not in
(select personid
from #Person
where age is not null);
It'll be much more efficient than what you've got so far, and it'll give you an accurate count on the final query. Just copy and paste the firsts bit and plug in each of your ranges.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 2, 2009 at 12:09 pm
Here's another way
WITH AgeRanges(AgeDesc,AgeMin,AgeMax) AS(
SELECT '0-10',0,10
UNION ALL
SELECT '11-20',11,20
UNION ALL
SELECT '21-30',21,30
UNION ALL
SELECT '31-40',31,40
UNION ALL
SELECT '41-50',41,50
UNION ALL
SELECT '51-60',51,60
UNION ALL
SELECT '61-70',61,70
UNION ALL
SELECT '71-80',71,80
UNION ALL
SELECT '81-90',81,90
UNION ALL
SELECT '91-100',91,100
UNION ALL
SELECT '100+',101,9999
UNION ALL
SELECT 'Not Entered',NULL,NULL),
PersonAges AS (
SELECT personId,MAX(Age) AS Age FROM dbo.Person GROUP BY personId
)
SELECT AgeDesc,
COUNT(personId) AS Total,
COUNT(personId)*1.0 / (SELECT COUNT(Age) FROM PersonAges) AS Percentage
FROM AgeRanges
LEFT OUTER JOIN PersonAges ON (Age BETWEEN AgeMin AND AgeMax) OR (Age IS NULL AND AgeMin IS NULL)
GROUP BY AgeDesc
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537April 3, 2009 at 4:03 am
Thanks to both of you. Marks solution is interesting as it uses a 2005/08 feature I've seen but never used. This is my first position where SQL > 2000! :w00t:
My original stab at it returns this
AgeTotalPercentage
0-109080.049942247401
11-2014000.077003465155
21-3016640.091524118585
31-4015870.087288928001
41-5017260.094934272042
51-6016040.088223970078
61-7021420.117815301688
71-8031440.172927781750
81-9032120.176667950057
91-1007710.042406908310
100+230.001265056927
Not Entered220010.547533721566
I get this from GSquared query, altered to point to the real table
AgeTotalPercentage
0-109080.022588750404259
Not Entered24350.0605766599497475
So the Not entered looks good to me, but the 0-10 range percentage differs. Marks query returns
AgeDescTotalPercentage
0-10 9080.049934007919
11-2014000.076990761108
21-3016640.091509018917
31-4015870.087274527056
41-5017270.094973603167
51-6016040.088209414870
61-7021430.117850857897
71-8031450.172954245490
81-9032120.176638803343
91-1007710.042399912010
100+230.001264848218
Not Entered20740.114056313242
Which agrees with my 0-10 count, but differs from GSquared's Not Entered total. As I'm in the UK I suspect I have some time to figure it out, and I'll be mighty proud if I get my head round Marks code. I'll post back any findings.
Once again, many thanks to you both
Dave J
April 3, 2009 at 5:27 am
Cracked it, thanks to GSquared and Marks help. I noticed that Mark results differed from mine by one on 3 age groups, 41-50, 61-70 and 71-80, and frankly as I don't know this 'with' stuff works I concentrated on GSquared's solution. To cut a long story short I added this to the top of the script:
declare @notEntered float, @notEnteredPercentage float
select @total = COUNT(distinct personid) from person
select @notEntered = count(*)
from person
where age is null
and personid not in
(select personid
from person
where age is not null);
select @notEnteredPercentage = @notEntered/@total
and replaced the last select in the union with this
select
'Not Entered', @notEntered, round(@notEnteredPercentage , 3)
works a charm.
Apologies for saying GSquareds percentage was off earlier, my bad. I changed count(*) to count(age) and forgot I had done so. :blush:
Edit: I think I needed to on reflection.
Thanks guys.
Dave J
April 3, 2009 at 7:45 am
Glad we could help.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply