Indexing for Pivot Performance?

  • Hi SSC,

    I've got a table which is wasting a lot of space in it's current narrow format and I'd like to pivot it out. The table contains about 44 million rows. The example below is a scaled down version of what I'm dealing with. The idea being to pivot out the [Measure] values into their own columns. In the real data there are about 12 different measures which will become their own columns.

    My question is, is there a way to index the underlying table, or perhaps an analogous method to Pivot which would get index seeks or better performance over a data set of this size?

    --Setup raw table

    if object_id('tempdb.dbo.#Raw') is not null drop table #Raw

    create table #Raw

    (

    Id varchar(25),

    TradeDate int,

    Measure varchar(100),

    Value float

    primary key clustered (Id, TradeDate, Measure)

    )

    --Populate sample data

    ;with nums as --numbers table

    (

    select num = row_number() over (order by [object_id])

    from sys.objects with (nolock)

    ), measures as --measures

    (

    select Measure = 'Alpha' union all

    select 'Beta' union all

    select 'Gamma' union all

    select 'Delta' union all

    select 'Epsilon'

    )

    insert into #Raw

    (

    Id,

    TradeDate,

    Measure,

    Value

    )

    select

    Id = ids.num,

    TradeDate = 40000 + dates.num,

    Measure = m.Measure,

    Value = checksum(newid()) % 10000

    from nums ids

    cross join nums dates

    cross join measures m

    where ids.Num between 1 and 100

    /*********************

    ** Actual Statement **

    *********************/

    select Id, TradeDate, Alpha = max(Alpha), Beta = max(Beta), Gamma = max(Gamma), Delta = max(Delta), Epsilon = max(Epsilon)

    from (select Id, TradeDate, Measure, Value

    from #Raw) src

    pivot (max(Value) for Measure in (Alpha, Beta, Gamma, Delta, Epsilon)) pvt

    group by Id, TradeDate

    Executive Junior Cowboy Developer, Esq.[/url]

  • Maybe this index will help

    CREATE NONCLUSTERED INDEX [IndexName] ON [dbo].[TableNAme]

    (

    [Id] ASC,

    [TradeDate] ASC

    )

    INCLUDE ( [Value]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF)

    But what should work even better is to use a cross-tab query instead of pivot.

    select Id,

    TradeDate,

    Alpha = max(CASE WHEN Measure = 'Alpha' THEN Value END),

    Beta = max(CASE WHEN Measure = 'Beta' THEN Value END),

    Gamma = max(CASE WHEN Measure = 'Gamma' THEN Value END),

    Delta = max(CASE WHEN Measure = 'Delta' THEN Value END),

    Epsilon = max(CASE WHEN Measure = 'Epsilon' THEN Value END)

    from (select Id, TradeDate, Measure, MAX(Value)

    from #Raw

    GROUP BY Id, TradeDate, Measure) src

    GROUP BY Id,

    TradeDate

    The only way to be sure is to test and test again.

    You can find a detailed explanation on this at the following articles

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply