August 24, 2012 at 2:55 pm
Is there a more straightforward way to accomplish the following:
For each value in the column, I need to calculate a percentage of times this value occurs both relative to the total count of recrods and a subset of records
For example, let's say I have a table of car registrations in the U.S. states. I want to know the percentage of registrations by brand for California and for all states including California.
create table #Cars (CarModel varchar(50), RegState char(2), OtherStuff varchar(10))
insert into #Cars
values ('Ford', 'CA', 'xxx1'),
('Ford', 'WY', 'xxx2'),
('Ford', 'NY', 'xxx'),
('Honda', 'NY', 'xxx3'),
('Honda', 'CA', 'xxx3'),
('Honda', 'CA', 'xxx4')
So given this table:
CarModel RegState OtherStuff
FordCAxxx1
FordWYxxx2
FordNYxxx
HondaNYxxx3
HondaCAxxx3
HondaCAxxx4
I need to get
Brand CA US
Ford 33% 50%
Honda 66% 50%
I came up with this, which works
select CarModel, CA, US
from
(
SELECT area = 'CA',
CarModel,
Prcnt = 100.0 * COUNT(*)/(select COUNT(*) from #Cars where RegState = 'CA')
FROM #cars
where RegState = 'CA'
group by CarModel
union all
SELECT area = 'US',
CarModel,
Prcnt = 100.0 * COUNT(*)/(select COUNT(*) from #Cars)
FROM #cars
group by CarModel
) rt
pivot (sum(Prcnt) for area in (CA, US)) as pvt
order by CarModel
But I have a sneaking suspicion that there is a simpler way of doing this. Is there?
Thanks!
August 24, 2012 at 3:12 pm
I'm leaving the office now, but here's another option:
DECLARE @Total float,
@Total2 float
SELECT @Total = COUNT( *),
@Total2 = COUNT( CASE WHEN regstate = 'CA' THEN regstate END )
FROM #Cars
SELECT c.CarModel,
(COUNT( CASE WHEN regstate = 'CA' THEN regstate END ) / @Total2) * 100,
(COUNT( regstate) / @Total) * 100
FROM #Cars c
GROUP BY c.CarModel
August 24, 2012 at 3:16 pm
Well, sheesh, way to make it simple! I was having such fun with my pivots! 😀
Thanks, Luis.
August 24, 2012 at 8:13 pm
With slightly more complicated code, you can limit to one table scan, eliminate the need for variables, and make it a bit easier to do all of the States as well as the U.S. column.
WITH
ctePreAgg AS
(
SELECT Brand = CarModel,
RegState,
BrandStateCount = COUNT(*)*100.0,
StateCount = SUM(COUNT(*)) OVER (PARTITION BY RegState),
USCount = SUM(COUNT(*)) OVER (PARTITION BY (SELECT NULL))
FROM #Cars
GROUP BY CarModel, RegState
)
SELECT Brand,
CA = MAX(CASE WHEN RegState = 'CA' THEN BrandStateCount/StateCount ELSE 0 END),
NY = MAX(CASE WHEN RegState = 'NY' THEN BrandStateCount/StateCount ELSE 0 END),
WY = MAX(CASE WHEN RegState = 'WY' THEN BrandStateCount/StateCount ELSE 0 END),
US = SUM(BrandStateCount)/MAX(USCount)
FROM ctePreAgg
GROUP BY Brand
;
In fact, now it could be turned into dynamic SQL so that if you added States or car models, it would automatically "heal" itself and return the correct information.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply