Pivot Table Help Needed

  • >I tested my cursor version against this on a million rows

    >TESTING TESTING TESTING....did someone mention testing?

    (This is not directed at anyone in particular. At most it

    rises to the level of a shot across the bow of a ship

    whose course needs dire correction).

    Exactly what was tested here. What specific question(s) has been answered?

    There is a pervasive idea in sql communities that sql operates in a different environment than the rest of computer science. That basic concepts of computer science do not apply to sql databases. That somehow the 'sql standard' immunizes sql users from concepts and standards established in the rest of computer science. There exists a double standard if you will. This of course is not only bizarre but stupid and has significant consequences. The idea of an sql programmer is taken as synonymous with a developer much like an sql database system is taken as synonymous with a relational database. That knowing everything in bol equates to understanding the database environment. That knowing how to write complex queries implies a deep understanding of relational ideas. All these things are utter nonsense. It leads to the sorry spectacle of brilliant programmers, who when asked to explain what they have really done, proving that they have no idea of what they are talking about. It leads to the assumption that sql vendors are the real gatekeepers of how things ought to be. Sadly this means the database industry is based on faith and little else. This means that the confusion that exists within sql is perpetually passed on to each newbie. This leads to the sorry spectacle of users who are unable to even frame an intelligent question being only outdone by the ridiculous explanations they are given. Unfortunately in the ever increasingly isolated world of sql 'really' reading something like bol and testing software remain elusive targets.

    best,

    www.beyondsql.blogspot.com

  • And, thank you for the compliment, Sandy.

    --Jeff Moden

    Hey Jeff :),

    You deserve the compliment for this type of approach to SQL Script.

    its really good way what i feel.

    Cheers!

    Sandy.

    --

  • >No doubt... RAC does everything...

    >

    >... slower...

    >

    >I already have enough performance impinged tools... but thanks

    >anyway. You still haven't answered my question, though... how

    >long does the million row example take?

    Jeff,

    The short answer.

    That's like asking the fox to watch over the henhouse isnt' it?

    Your placing your blind trust in a answer by the vendor? So

    you want me to parrot the same nonsense offered by TPC. Allow

    me to save you from yourself. If your that interested in seeing

    a number from a million rows why in the world would you ask someone

    else to do it for you much less the vendor!? And it would please

    me even more if in addition you decided to test it.

    A perspective

    The time to go from A to B is one thing. To be able to get from

    A to B is another.

    It is one thing to recognize as significant elapsed time, it is

    quite another to worship it. Time is tangible, concepts are

    intangible. It is easier to sell time. You want to rank faster

    use a ranking function. It's fast as long as the rank and its

    order are the same. You just paid the price of fast. And how do

    you measure the elapsed time of not being able to use it? You

    want fast triggers ok you have them. But you also have undefined

    triggers, triggers that can't distinguish between one row and

    another. You want faster queries on joins. Ok you have them

    since sql doesn't concern itself with the key of a resultset.

    You want even faster queries on joins with any kind of

    predicate(s). You have them since there is no restrictions on

    the cardinality of a join. You want faster inserts well just

    limit the kinds of constraints defined on columns. You want

    as fast a response as possible on any operation well just

    remove the entire process of checking types. In fact remove

    any vestige of type altogether. How fast is lack of integrity?

    Sql still really hasn't learned how to tell time. But, like

    an old dog, it still can learn how to live with a few more

    ticks.

    best,

    www.beyondsql.blogspot.com

  • rog pike (10/23/2007)


    (This is not directed at anyone in particular. At most it

    rises to the level of a shot across the bow of a ship

    whose course needs dire correction).

    Exactly what was tested here. What specific question(s) has been answered?

    There is a pervasive idea in sql communities that sql operates in a different environment than the rest of computer science. That basic concepts of computer science do not apply to sql databases. That somehow the 'sql standard' immunizes sql users from concepts and standards established in the rest of computer science. There exists a double standard if you will. This of course is not only bizarre but stupid and has significant consequences. The idea of an sql programmer is taken as synonymous with a developer much like an sql database system is taken as synonymous with a relational database. That knowing everything in bol equates to understanding the database environment. That knowing how to write complex queries implies a deep understanding of relational ideas. All these things are utter nonsense. It leads to the sorry spectacle of brilliant programmers, who when asked to explain what they have really done, proving that they have no idea of what they are talking about. It leads to the assumption that sql vendors are the real gatekeepers of how things ought to be. Sadly this means the database industry is based on faith and little else. This means that the confusion that exists within sql is perpetually passed on to each newbie. This leads to the sorry spectacle of users who are unable to even frame an intelligent question being only outdone by the ridiculous explanations they are given. Unfortunately in the ever increasingly isolated world of sql 'really' reading something like bol and testing software remain elusive targets.

    So, you think the whole SQL Community is stupid... nice way to entice people to buy your product, Rog... and if you think this has anything to do with knowing everything in BOL, you're sadly mistaken.

    You still haven't answered the question of how long it takes RAC to do the same thing... take the test data these guys have provided and run it against RAC... report the results for duration. Think of it as me asking you to spam us with how good your product really is, just this once 😛 This is your opportunity to shine. And, if it happens to run a bit slower, shower us with your knowledge of why that's a good thing. Show us how your product uses fewer resources in terms of CPU usage, RAM usage, and Disk I/O than anything in T-SQL... show us this wonderful decrease in development time and how 20 or 30 batch processes that use your product won't simply bring the server to it's knees as so many other 3rd party product have been demonstrated to do. C'mon... you can do it! :w00t:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sometimes it is necessary to rotate results so that columns are presented horizontally and rows are presented vertically. This is known as creating a PivotTable®, creating a cross-tab report, or rotating data.

    Assume there is a table Pivot that has one row per quarter. A SELECT of Pivot reports the quarters vertically:

    Year Quarter Amount

    ---- ------- ------

    1990 1 1.1

    1990 2 1.2

    1990 3 1.3

    1990 4 1.4

    1991 1 2.1

    1991 2 2.2

    1991 3 2.3

    1991 4 2.4

    A report must be produced with a table that contains one row for each year, with the values for each quarter appearing in a separate column, such as:

    Year

    Q1

    Q2

    Q3

    Q4

    1990

    1.1

    1.2

    1.3

    1.4

    1991

    2.1

    2.2

    2.3

    2.4

    These are the statements used to create the Pivot table and populate it with the data from the first table:

    USE Northwind

    GO

    CREATE TABLE Pivot

    ( Year SMALLINT,

    Quarter TINYINT,

    Amount DECIMAL(2,1) )

    GO

    INSERT INTO Pivot VALUES (1990, 1, 1.1)

    INSERT INTO Pivot VALUES (1990, 2, 1.2)

    INSERT INTO Pivot VALUES (1990, 3, 1.3)

    INSERT INTO Pivot VALUES (1990, 4, 1.4)

    INSERT INTO Pivot VALUES (1991, 1, 2.1)

    INSERT INTO Pivot VALUES (1991, 2, 2.2)

    INSERT INTO Pivot VALUES (1991, 3, 2.3)

    INSERT INTO Pivot VALUES (1991, 4, 2.4)

    GO

    This is the SELECT statement used to create the rotated results:

    SELECT Year,

    SUM(CASE Quarter WHEN 1 THEN Amount ELSE 0 END) AS Q1,

    SUM(CASE Quarter WHEN 2 THEN Amount ELSE 0 END) AS Q2,

    SUM(CASE Quarter WHEN 3 THEN Amount ELSE 0 END) AS Q3,

    SUM(CASE Quarter WHEN 4 THEN Amount ELSE 0 END) AS Q4

    FROM Northwind.dbo.Pivot

    GROUP BY Year

    GO

    This SELECT statement also handles a table in which there are multiple rows for each quarter. The GROUP BY combines all rows in Pivot for a given year into a single row in the output. When the grouping operation is being performed, the CASE functions in the SUM aggregates are applied in such a way that the Amount values for each quarter are added into the proper column in the result set and 0 is added to the result set columns for the other quarters.

    If the results of this SELECT statement are used as input to a spreadsheet, it is easy for the spreadsheet to calculate a total for each year. When the SELECT is used from an application it may be easier to enhance the SELECT statement to calculate the yearly total. For example:

    SELECT P1.*, (P1.Q1 + P1.Q2 + P1.Q3 + P1.Q4) AS YearTotal

    FROM (SELECT Year,

    SUM(CASE P.Quarter WHEN 1 THEN P.Amount ELSE 0 END) AS Q1,

    SUM(CASE P.Quarter WHEN 2 THEN P.Amount ELSE 0 END) AS Q2,

    SUM(CASE P.Quarter WHEN 3 THEN P.Amount ELSE 0 END) AS Q3,

    SUM(CASE P.Quarter WHEN 4 THEN P.Amount ELSE 0 END) AS Q4

    FROM Pivot AS P

    GROUP BY P.Year) AS P1

    GO

    Both GROUP BY with CUBE and GROUP BY with ROLLUP compute the same sort of information as shown in the example, but in a slightly different format.

  • Matt, excellent testing.

    BUT... I am missing data from SQL Profiler regarding CPU/DURATION/READS and WRITES for the different methods used.

    When I started out with SQL Server 2005 and PIVOT operator in Beta1-3, I found it very slow and resource-hogging.

    Now with SP2 it seems to be fixed.

    For a month ago I helped Scott Klein with a problem that involved PIVOT.

    I found that now PIVOT takes fewer READS than CASE/GROUP BY statement.

    Do you think you can run the test again and add the missing SQL Profiler data?

    And yes, CASE/GROUP BY approach is still more versatile because of able to have multiple aggregations per record. PIVOT still is only able to handle one aggregation.


    N 56°04'39.16"
    E 12°55'05.25"

  • VAIYDEYANATHAN.V.S (10/23/2007)


    Sometimes it is necessary to rotate results so that columns are presented horizontally and rows are presented vertically. This is known as creating a PivotTable®, creating a cross-tab report, or rotating data.

    Assume there is a table Pivot that has one row per quarter. A SELECT of Pivot reports the quarters vertically:....

    Nice copy of BOL... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Peter Larsson (10/24/2007)


    Matt, excellent testing.

    BUT... I am missing data from SQL Profiler regarding CPU/DURATION/READS and WRITES for the different methods used.

    When I started out with SQL Server 2005 and PIVOT operator in Beta1-3, I found it very slow and resource-hogging.

    Now with SP2 it seems to be fixed.

    For a month ago I helped Scott Klein with a problem that involved PIVOT.

    I found that now PIVOT takes fewer READS than CASE/GROUP BY statement.

    Do you think you can run the test again and add the missing SQL Profiler data?

    And yes, CASE/GROUP BY approach is still more versatile because of able to have multiple aggregations per record. PIVOT still is only able to handle one aggregation.

    I gotta agree with the excellent testing, there. Nicely done.

    Just to be sure my eyes haven't gotten too bad, though... test results say 10M... but I'm pretty sure that the test pivot table only has 5M rows in it... true or do I really need a new set of glasses? :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/24/2007)


    I gotta agree with the excellent testing, there. Nicely done.

    Just to be sure my eyes haven't gotten too bad, though... test results say 10M... but I'm pretty sure that the test pivot table only has 5M rows in it... true or do I really need a new set of glasses? :hehe:

    Well - close but no cigar:) You missed the second little piece:

    Insert blah....

    Select top 50000000......

    Go 2

    It runs the insert twice.

    And - the group by is more versatile, no doubt.

    I'll get the profiler data later on.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Heh... good... not my eyes... just time of day. Thanks, Matt.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have the profiler results. They're attached, but I summarized them a bit.

    I ran it twice, once as is, and once after adding a CHAR(300) field (so that the process stops thinking a table scan is the best way to approach this).

    Anyway here's a summary of the results:

    TextData CPU Reads Writes Duration

    --CASE syntax - no helper index 26,719 37,963 - 0:00:14.412 14,412,168

    --CASE syntax - ProdID include (coid,orderamount) 28,030 31,257 - 0:00:14.446 14,445,524

    --CASE syntax - (ProdID,coid) include (orderamount) 27,892 31,285 - 0:00:14.408 14,407,634

    --CASE syntax - (CoID,ProdID) include (orderamount) 26,142 28,554 - 0:00:13.448 13,447,878

    --CASE syntax - (CoID) include (ProdID,orderamount) 25,610 28,540 - 0:00:13.188 13,188,389

    --PIVOT syntax - no helper index 32,874 40,096 2 0:00:18.319 18,318,680

    --PIVOT syntax - ProdID include (coid,orderamount) 34,390 31,289 - 0:00:18.009 18,009,332

    --PIVOT syntax - (ProdID,coid) include (orderamount) 34,703 31,261 - 0:00:17.800 17,799,724

    --PIVOT syntax - (CoID,ProdID) include (orderamount) 29,515 28,561 - 0:00:15.368 15,368,222

    --PIVOT syntax - (CoID) include (ProdID,orderamount) 29,297 28,548 - 0:00:29.993 29,992,939

    Pivot range 29,297 28,548 0:00:15.368 15,368,222

    34,703 40,096 0:00:29.993 29,992,939

    CASE range 25,610 28,540 0:00:13.188 13,188,389

    28,030 37,963 0:00:14.446 14,445,524

    Case 2 - same as 1, BUT add a CHAR(300) field to simulate all of the other data a prod table might have

    --CASE syntax - no helper index 58,608 313,642 1 0:02:31.113 151,113,171

    --CASE syntax - ProdID include (coid,orderamount) 72,062 31,203 - 0:00:40.085 40,084,973

    --CASE syntax - (ProdID,coid) include (orderamount) 71,750 31,231 - 0:00:36.794 36,793,597

    --CASE syntax - (CoID,ProdID) include (orderamount) 74,625 31,229 - 0:00:38.376 38,375,597

    --CASE syntax - (CoID) include (ProdID,orderamount) 74,642 31,201 - 0:00:37.861 37,860,628

    --PIVOT syntax - no helper index 57,407 313,436 1 0:02:40.110 160,110,450

    --PIVOT syntax - ProdID include (coid,orderamount) 79,672 31,207 - 0:00:41.119 41,119,222

    --PIVOT syntax - (ProdID,coid) include (orderamount) 87,063 31,235 - 0:00:44.353 44,352,686

    --PIVOT syntax - (CoID,ProdID) include (orderamount) 75,250 31,235 - 0:00:38.147 38,147,400

    --PIVOT syntax - (CoID) include (ProdID,orderamount) 80,062 31,272 - 0:00:40.515 40,514,660

    Pivot range 57,407 31,207 0:00:38.147 38,147,400

    87,063 313,436 0:02:40.110 160,110,450

    CASE range 58,608 31,201 0:00:36.794 36,793,597

    74,642 313,642 0:02:31.113 151,113,171

    This is no doubt going to get "whacked" formatting-wise, so I'll throw the spreadsheet in as well.

    Bottom line though -

    - PIVOT and CASE seem to both perform fairly close to on par, with a slight edge to CASE.

    - They don't respond the same to indexing, since their "optimal" indexing scheme seems to be different.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Jeff,

    absolutely brilliant!

    I have been using a different method for some time to do this - but yours is simpler and blistering fast!

    Well done and thank you.

    JJ

  • Thanks for the kudo, Jeremy. :blush: I really appreciate it. And thanks for helping me find one of my "lost posts". I've been looking for this one but got thousands of hits when searching for things like the word "Pivot".

    As a side bar, I wrote a couple of articles on Crosstabs and Pivots about a year after the code in this thread. I may have to revisit the code in this thread to see what I can do to improve it because the CrossTab method almost always beat the Pivot method in the articles.

    Here're the links for the articles.

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have code that creates a pivot table from a view and then appends a summary row with a UNION ALL.

    I need to take the above output and put it into a table that can be displayed on a website and downloaded if necessary.

    Will the published procedure do that?

  • tnpich (9/1/2015)


    I have code that creates a pivot table from a view and then appends a summary row with a UNION ALL.

    I need to take the above output and put it into a table that can be displayed on a website and downloaded if necessary.

    Will the published procedure do that?

    Which one? A lot of people have made entries on this thread.

    Also, your method above hits the table twice. Once for the pivot and once for the total. It's probably not necessary if you were to do the pivot with a CROSSTAB/GROUP and use WITH ROLLUP or similar.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 31 through 45 (of 46 total)

You must be logged in to reply to this topic. Login to reply