July 24, 2013 at 3:31 pm
Hi SSC,
I've got a table which is wasting a lot of space in it's current narrow format and I'd like to pivot it out. The table contains about 44 million rows. The example below is a scaled down version of what I'm dealing with. The idea being to pivot out the [Measure] values into their own columns. In the real data there are about 12 different measures which will become their own columns.
My question is, is there a way to index the underlying table, or perhaps an analogous method to Pivot which would get index seeks or better performance over a data set of this size?
--Setup raw table
if object_id('tempdb.dbo.#Raw') is not null drop table #Raw
create table #Raw
(
Id varchar(25),
TradeDate int,
Measure varchar(100),
Value float
primary key clustered (Id, TradeDate, Measure)
)
--Populate sample data
;with nums as --numbers table
(
select num = row_number() over (order by [object_id])
from sys.objects with (nolock)
), measures as --measures
(
select Measure = 'Alpha' union all
select 'Beta' union all
select 'Gamma' union all
select 'Delta' union all
select 'Epsilon'
)
insert into #Raw
(
Id,
TradeDate,
Measure,
Value
)
select
Id = ids.num,
TradeDate = 40000 + dates.num,
Measure = m.Measure,
Value = checksum(newid()) % 10000
from nums ids
cross join nums dates
cross join measures m
where ids.Num between 1 and 100
/*********************
** Actual Statement **
*********************/
select Id, TradeDate, Alpha = max(Alpha), Beta = max(Beta), Gamma = max(Gamma), Delta = max(Delta), Epsilon = max(Epsilon)
from (select Id, TradeDate, Measure, Value
from #Raw) src
pivot (max(Value) for Measure in (Alpha, Beta, Gamma, Delta, Epsilon)) pvt
group by Id, TradeDate
July 24, 2013 at 3:55 pm
Maybe this index will help
CREATE NONCLUSTERED INDEX [IndexName] ON [dbo].[TableNAme]
(
[Id] ASC,
[TradeDate] ASC
)
INCLUDE ( [Value]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF)
But what should work even better is to use a cross-tab query instead of pivot.
select Id,
TradeDate,
Alpha = max(CASE WHEN Measure = 'Alpha' THEN Value END),
Beta = max(CASE WHEN Measure = 'Beta' THEN Value END),
Gamma = max(CASE WHEN Measure = 'Gamma' THEN Value END),
Delta = max(CASE WHEN Measure = 'Delta' THEN Value END),
Epsilon = max(CASE WHEN Measure = 'Epsilon' THEN Value END)
from (select Id, TradeDate, Measure, MAX(Value)
from #Raw
GROUP BY Id, TradeDate, Measure) src
GROUP BY Id,
TradeDate
The only way to be sure is to test and test again.
You can find a detailed explanation on this at the following articles
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply