Pivot Table Help Needed

Question

Post reply

Pivot Table Help Needed

Viewing 15 posts - 16 through 30 (of 46 total)

You must be logged in to reply to this topic. Login to reply

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 1

I also lack the oppurtunity to call the SP twice in a row.

I get an error "index already exists".

N 56°04'39.16"
E 12°55'05.25"

mrpolecat SSCertifiable Points: 7014 More actions · Answer 2

Sandy (10/23/2007)
Hey Jeff :),
Great Standard of Code,
I really feel its the way need to be done.
but i have a small issue here,
when i am trying to copy your code to my SQL editor its
mess up all the code,
How can i copy the your code in systematic way?
I mean the way it present here.
Cheers!
Sandy.

Not sure if it works everywhere but for Jeff's posts you can paste them into MS Word and then copy from word to QA and most of the formatting follows though.

mrpolecat SSCertifiable Points: 7014 More actions · Answer 3

Peter Larsson (10/23/2007)
I also lack the oppurtunity to call the SP twice in a row.
I get an error "index already exists".

I thnk it would make sense to move the create index out of the sp and into the setup where the table is created.

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 4

Peter Larsson (10/23/2007)
Jeff, excellent code but I think you should start using QUOTENAME function for those having brackets, spaces or even single quotes in their table/column names.

Yep... that would work, too.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 5

Sandy (10/23/2007)
Hey Jeff :),
Great Standard of Code,
I really feel its the way need to be done.
but i have a small issue here,
when i am trying to copy your code to my SQL editor its
mess up all the code,
How can i copy the your code in systematic way?
I mean the way it present here.
Sandy.

Heh... it's this bloody new forum code they're using.

If you copy'n'paste fromt the code window into MS Word and replace ^l (cirumflex with lower case "L") with ^p, at least the lines will end correctly instead of appearing as a single line of code. Unfortunately, leading spaces are NOT preserved either. I gotta write to Tony and let him know that's a pretty big problem for us.

And, thank you for the compliment, Sandy.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 6

Peter Larsson (10/23/2007)
I also lack the oppurtunity to call the SP twice in a row.
I get an error "index already exists".

I'll check it out, Peter... thanks for the feedback.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 7

mrpolecat (10/23/2007)
Peter Larsson (10/23/2007)
I also lack the oppurtunity to call the SP twice in a row.
I get an error "index already exists".
I thnk it would make sense to move the create index out of the sp and into the setup where the table is created.

Ah, bugger... thanks for the heads up... I'll go back and fix that, as well... I gotta stop posting stuff at 2 in the morning 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

mrpolecat SSCertifiable Points: 7014 More actions · Answer 8

I copy your code straight from word without any replacing and it works great. I don't lose any leading spaces.

mrpolecat SSCertifiable Points: 7014 More actions · Answer 9

Just for S&G I removed your index completely and it didn't change performance time. Still 20 seconds. Here is the table I am using to test with if anybody is interested.

SELECT TOP 1000000

RowNum = IDENTITY(INT,1,1),

SomeInt = CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT),

SomeString = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)),

SomeDate = dateadd(dd,convert(int,(365 * rand(CAST(NEWID() AS VARBINARY)))),'1/1/2007')

INTO dbo.PVTest

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN

ALTER TABLE dbo.PVTest

ADD PRIMARY KEY CLUSTERED (RowNum)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 10

rog pike (10/23/2007)
RAC will never be amongst the fastest solutions. It is a simple tradoff, performance for functionality. RAC does not build a single SELECT statement to execute against a summary table. It would be virtually impossible to design a system (in t-sql) that could do everything RAC can do in a single framework and with a single pass thru the data /summary table where the end result is encapsulated in a single SELECT. But if that 'could' be done then it would be considerably faster. As a developer you understand the delicate balance between performance, functionality and stability. And you know there are no free lunches 🙂
best,

No doubt... RAC does everything...

... slower...

I already have enough performance impinged tools... but thanks anyway. You still haven't answered my question, though... how long does the million row example take?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 11

mrpolecat (10/23/2007)
Just for S&G I removed your index completely and it didn't change performance time. Still 20 seconds. Here is the table I am using to test with if anybody is interested.

I moved the index in my code example... thanks again for the heads up.

I applogize for not doing the thorough "runs first time every time" testing that I normally do... it was pretty late that day. Not a good reason, though.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

mrpolecat SSCertifiable Points: 7014 More actions · Answer 12

I think that is how it is supposed to work. You do all the hard stuff 😀 and then the rest of us come along and QA it. A few tweaks and we go to production. I'll be selling your code soon by spamming all the other threads.:w00t:

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 13

Sandy (10/23/2007)
Hey Jason Tontz,
In SQL 2005, you can use this Query.

Hmmm... I wonder how Pivot in 2k5 would do against these simple 3 value pivots on a million rows? Too bad I don't have 2k5, yet, so I could find out.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 14

Jeff Moden (10/23/2007)
Sandy (10/23/2007)
Hey Jason Tontz,
In SQL 2005, you can use this Query.
Hmmm... I wonder how Pivot in 2k5 would do against these simple 3 value pivots on a million rows? Too bad I don't have 2k5, yet, so I could find out.

TESTING TESTING TESTING....did someone mention testing? hehe

So...I saw this laying out there...and I just couldn't help myself. I also had a hunch which I decided to test out.

First things first - I ran the "little" test you asked about, and after fixing the PIVOT syntax (I wasn't getting the correct results from Sandy's syntax), and....the CASE statement won...by 178 ms. The raw results were 672ms vs 860 ms.

Now - we've been down this path before so I KNOW that several people (you KNOW who you are) are just itching to take those numbers and run with it (PIVOT is 28% slower than CASE, etc...). But - I wasn't satisfied with the result, too easy, not enough effort for it.

In the words of Tim "the tool man" Taylor, the test "lacked power....so I rewired it"....hehe.

screw 3 results, let's go with 20 results. Screw 1M, go for 10M. and lest someone say it's reusing something - let's free the processor cache after each. Oh - and let's see if we can find an indexing scheme that helps.

Here's the testing scheme I used:

use test

go

--set this up

drop table #testpivot

go

create table #testpivot (rid int identity(1,1) not null, coID int not

null,prodID int not null, orderamount money not null)

--alter table #testpivot

--add primary key (rid) with fillfactor=100

go

insert #testpivot(coid,prodid,orderamount)

select top 5000000

cast(rand(cast(newid() as varbinary)) *20 as integer)+1,

cast(rand(cast(newid() as varbinary)) *50 as integer)+1,

cast(rand(cast(newid() as varbinary)) *35000 as money)+1

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

go 2

create clustered index pivot_pk on #testpivot(coid,prodid,rid)

--testing - good ol' fashioned case statements - multiple varieties

select '10M records, 20 companies over 50 products'

--first with "no helper indexes" - rely solely on clustered index

--variety #1

declare @g datetime

select @g=getdate()

select prodid,

sum(case when coid=1 then orderamount else 0 end) sum1,

sum(case when coid=2 then orderamount else 0 end) sum2,

sum(case when coid=3 then orderamount else 0 end) sum3,

sum(case when coid=4 then orderamount else 0 end) sum4,

sum(case when coid=5 then orderamount else 0 end) sum5,

sum(case when coid=6 then orderamount else 0 end) sum6,

sum(case when coid=7 then orderamount else 0 end) sum7,

sum(case when coid=8 then orderamount else 0 end) sum8,

sum(case when coid=9 then orderamount else 0 end) sum9,

sum(case when coid=10 then orderamount else 0 end) sum10,

sum(case when coid=11 then orderamount else 0 end) sum1,

sum(case when coid=12 then orderamount else 0 end) sum2,

sum(case when coid=13 then orderamount else 0 end) sum3,

sum(case when coid=14 then orderamount else 0 end) sum4,

sum(case when coid=15 then orderamount else 0 end) sum5,

sum(case when coid=16 then orderamount else 0 end) sum6,

sum(case when coid=17 then orderamount else 0 end) sum7,

sum(case when coid=18 then orderamount else 0 end) sum8,

sum(case when coid=19 then orderamount else 0 end) sum9,

sum(case when coid=20 then orderamount else 0 end) sum10

from #testpivot

group by prodid

select 'case method',datediff(ms,@g,getdate()),'no index'

--pivot

go

DBCC FREEPROCCACHE

declare @g datetime

select @g=getdate()

SELECT

[1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

[11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

FROM (select coid,prodid,orderamount from #testpivot) t

PIVOT (sum(orderamount) FOR coid IN ([1], [2],

[3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

[13],[14],[15],[16],[17],[18],[19],[20])

) AS OrdProductPivot

select 'pivot method',datediff(ms,@g,getdate()),'no index'

go

DBCC FREEPROCCACHE

declare @g datetime

--variety #2

select @g=getdate()

create index t_pivot4 on #testpivot(prodID) include (coid,orderamount)

select @g=getdate()

select prodid,

sum(case when coid=1 then orderamount else 0 end) sum1,

sum(case when coid=2 then orderamount else 0 end) sum2,

sum(case when coid=3 then orderamount else 0 end) sum3,

sum(case when coid=4 then orderamount else 0 end) sum4,

sum(case when coid=5 then orderamount else 0 end) sum5,

sum(case when coid=6 then orderamount else 0 end) sum6,

sum(case when coid=7 then orderamount else 0 end) sum7,

sum(case when coid=8 then orderamount else 0 end) sum8,

sum(case when coid=9 then orderamount else 0 end) sum9,

sum(case when coid=10 then orderamount else 0 end) sum10,

sum(case when coid=11 then orderamount else 0 end) sum1,

sum(case when coid=12 then orderamount else 0 end) sum2,

sum(case when coid=13 then orderamount else 0 end) sum3,

sum(case when coid=14 then orderamount else 0 end) sum4,

sum(case when coid=15 then orderamount else 0 end) sum5,

sum(case when coid=16 then orderamount else 0 end) sum6,

sum(case when coid=17 then orderamount else 0 end) sum7,

sum(case when coid=18 then orderamount else 0 end) sum8,

sum(case when coid=19 then orderamount else 0 end) sum9,

sum(case when coid=20 then orderamount else 0 end) sum10

from #testpivot

group by prodid

select 'case method',datediff(ms,@g,getdate()),'prodid','coid+amt'

--pivot

go

DBCC FREEPROCCACHE

declare @g datetime

select @g=getdate()

SELECT

[1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

[11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

FROM (select coid,prodid,orderamount from #testpivot) t

PIVOT (sum(orderamount) FOR coid IN ([1], [2],

[3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

[13],[14],[15],[16],[17],[18],[19],[20])

) AS OrdProductPivot

select 'pivot method',datediff(ms,@g,getdate()),'prodid','coid+amt'

drop index #testpivot.t_pivot4

go

declare @g datetime

DBCC FREEPROCCACHE

--variety #3

create index t_pivot3 on #testpivot(prodID,coid) include (orderamount)

select @g=getdate()

select prodid,

sum(case when coid=1 then orderamount else 0 end) sum1,

sum(case when coid=2 then orderamount else 0 end) sum2,

sum(case when coid=3 then orderamount else 0 end) sum3,

sum(case when coid=4 then orderamount else 0 end) sum4,

sum(case when coid=5 then orderamount else 0 end) sum5,

sum(case when coid=6 then orderamount else 0 end) sum6,

sum(case when coid=7 then orderamount else 0 end) sum7,

sum(case when coid=8 then orderamount else 0 end) sum8,

sum(case when coid=9 then orderamount else 0 end) sum9,

sum(case when coid=10 then orderamount else 0 end) sum10,

sum(case when coid=11 then orderamount else 0 end) sum1,

sum(case when coid=12 then orderamount else 0 end) sum2,

sum(case when coid=13 then orderamount else 0 end) sum3,

sum(case when coid=14 then orderamount else 0 end) sum4,

sum(case when coid=15 then orderamount else 0 end) sum5,

sum(case when coid=16 then orderamount else 0 end) sum6,

sum(case when coid=17 then orderamount else 0 end) sum7,

sum(case when coid=18 then orderamount else 0 end) sum8,

sum(case when coid=19 then orderamount else 0 end) sum9,

sum(case when coid=20 then orderamount else 0 end) sum10

from #testpivot

group by prodid

select 'case method',datediff(ms,@g,getdate()),'prodid+coid','amt'

--pivot

go

DBCC FREEPROCCACHE

declare @g datetime

select @g=getdate()

SELECT

[1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

[11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

FROM (select coid,prodid,orderamount from #testpivot) t

PIVOT (sum(orderamount) FOR coid IN ([1], [2],

[3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

[13],[14],[15],[16],[17],[18],[19],[20])

) AS OrdProductPivot

select 'pivot method',datediff(ms,@g,getdate()),'prodid+coid','amt'

drop index #testpivot.t_pivot3

go

declare @g datetime

DBCC FREEPROCCACHE

--variety #4

create index t_pivot2 on #testpivot(coID,prodid) include (orderamount)

select @g=getdate()

select prodid,

sum(case when coid=1 then orderamount else 0 end) sum1,

sum(case when coid=2 then orderamount else 0 end) sum2,

sum(case when coid=3 then orderamount else 0 end) sum3,

sum(case when coid=4 then orderamount else 0 end) sum4,

sum(case when coid=5 then orderamount else 0 end) sum5,

sum(case when coid=6 then orderamount else 0 end) sum6,

sum(case when coid=7 then orderamount else 0 end) sum7,

sum(case when coid=8 then orderamount else 0 end) sum8,

sum(case when coid=9 then orderamount else 0 end) sum9,

sum(case when coid=10 then orderamount else 0 end) sum10,

sum(case when coid=11 then orderamount else 0 end) sum1,

sum(case when coid=12 then orderamount else 0 end) sum2,

sum(case when coid=13 then orderamount else 0 end) sum3,

sum(case when coid=14 then orderamount else 0 end) sum4,

sum(case when coid=15 then orderamount else 0 end) sum5,

sum(case when coid=16 then orderamount else 0 end) sum6,

sum(case when coid=17 then orderamount else 0 end) sum7,

sum(case when coid=18 then orderamount else 0 end) sum8,

sum(case when coid=19 then orderamount else 0 end) sum9,

sum(case when coid=20 then orderamount else 0 end) sum10

from #testpivot

group by prodid

select 'case method',datediff(ms,@g,getdate()),'coID+prodid','amt'

--pivot

go

DBCC FREEPROCCACHE

declare @g datetime

select @g=getdate()

SELECT

[1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

[11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

FROM (select coid,prodid,orderamount from #testpivot) t

PIVOT (sum(orderamount) FOR coid IN ([1], [2],

[3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

[13],[14],[15],[16],[17],[18],[19],[20])

) AS OrdProductPivot

select 'pivot method',datediff(ms,@g,getdate()),'coID+prodid','amt'

drop index #testpivot.t_pivot2

go

declare @g datetime

DBCC FREEPROCCACHE

--variety #5

create index t_pivot on #testpivot(coID) include (prodid,orderamount)

select @g=getdate()

select prodid,

sum(case when coid=1 then orderamount else 0 end) sum1,

sum(case when coid=2 then orderamount else 0 end) sum2,

sum(case when coid=3 then orderamount else 0 end) sum3,

sum(case when coid=4 then orderamount else 0 end) sum4,

sum(case when coid=5 then orderamount else 0 end) sum5,

sum(case when coid=6 then orderamount else 0 end) sum6,

sum(case when coid=7 then orderamount else 0 end) sum7,

sum(case when coid=8 then orderamount else 0 end) sum8,

sum(case when coid=9 then orderamount else 0 end) sum9,

sum(case when coid=10 then orderamount else 0 end) sum10,

sum(case when coid=11 then orderamount else 0 end) sum1,

sum(case when coid=12 then orderamount else 0 end) sum2,

sum(case when coid=13 then orderamount else 0 end) sum3,

sum(case when coid=14 then orderamount else 0 end) sum4,

sum(case when coid=15 then orderamount else 0 end) sum5,

sum(case when coid=16 then orderamount else 0 end) sum6,

sum(case when coid=17 then orderamount else 0 end) sum7,

sum(case when coid=18 then orderamount else 0 end) sum8,

sum(case when coid=19 then orderamount else 0 end) sum9,

sum(case when coid=20 then orderamount else 0 end) sum10

from #testpivot

group by prodid

select 'case method',datediff(ms,@g,getdate()),'coID','prodid+amt'

--pivot

go

DBCC FREEPROCCACHE

declare @g datetime

select @g=getdate()

SELECT

[1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

[11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

FROM (select coid,prodid,orderamount from #testpivot) t

PIVOT (sum(orderamount) FOR coid IN ([1], [2],

[3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

[13],[14],[15],[16],[17],[18],[19],[20])

) AS OrdProductPivot

select 'pivot method',datediff(ms,@g,getdate()),'coID','prodid+amt'

drop index #testpivot.t_pivot

go

Now...for the results:

10M records, 20 companies, 50 products

06 case method38513no index

01 pivot method36500no index

07 case method39110prodidcoid+amt

10 pivot method41123prodidcoid+amt

05 case method37750prodid+coidamt

09 pivot method40766prodid+coidamt

03 case method37283coID+prodidamt

04 pivot method37393coID+prodidamt

08 case method39479coIDprodid+amt

02 pivot method36596coIDprodid+amt

So - long story to say - they both BASICALLY run the same (statistically a dead heat I'd say), with the two best times being returned by the NEW method. The right clustered index helps, but most "helper" indexes don't help - they actually make performance worse.

For the record - I'm also running them now without a clustered index that's helpful...we shall see what we get.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 15

Without a "useful" clustered index (i.e. using rowID as the clustered primary key), here are the results:

05 case method38563no index

08 pivot method41030no index

06 case method39186prodidcoid+amt

10 pivot method41420prodidcoid+amt

01 case method36216prodid+coidamt

09 pivot method41110prodid+coidamt

03 case method38186coID+prodidamt

02 pivot method36580coID+prodidamt

03 case method38186coIDprodid+amt

07 pivot method40673coIDprodid+amt

So - again - the indexing doesn't seem to do all that much and the results for both method are essentially tied.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?