Cross Tabs and Pivots, Part 1 – Converting Rows to Columns

  • Ravi (5/24/2009)


    Hi, Gud morning

    i have used the same code for testing with 10 million records, as code itself is inserting records, so i have nothing to do with the data.

    the method i am using is "Pre-aggregated" Cross Tab with CTE" and when i am executing this , i am getting the following error msg:

    =======================================================

    ======= "Pre-aggregated" Cross Tab with CTE =====

    Msg 8115, Level 16, State 2, Line 5

    Arithmetic overflow error converting expression to data type int.

    SQL Server Execution Times:

    CPU time = 13202 ms, elapsed time = 4838 ms.

    ========================================================

    please note that its working great with less number of records, and i too believe that there is something wrong with data insertion part. I am also trying to find out what is causing error.

    Regards,

    Ravi

    Doesn't look like it's the data insertion part. It looks like a total violated the max value of an INT. Try changing it to BIGINT.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • yes jeff , you are right...following code will work on n Number of records. have a look!

    SELECT TOP 10000000 --<<! Change this number for testing different size tables

    RowNum = IDENTITY(BIGINT,1,1), Changed

    Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%2+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%2+65),

    Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),

    Quantity =CAST(ABS(CHECKSUM(NEWID()))%50000+1 AS BIGINT), Changed

    Date = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    Year = CAST(NULL AS SMALLINT),

    Quarter = CAST(NULL AS TINYINT)

    INTO SomeTable3

    FROM Master.sys.SysColumns t1

    CROSS JOIN

    Master.sys.SysColumns t2

    --===== Fill in the Year and Quarter columns from the Date column

    UPDATE SomeTable3

    SET Year = DATEPART(yy,Date),

    Quarter = DATEPART(qq,Date)

    --===== assign a primary key

    -- Takes about 1 second to execute.

    ALTER TABLE SomeTable3

    ADD PRIMARY KEY CLUSTERED (RowNum)

    CREATE NONCLUSTERED INDEX IX_SomeTable3_Cover1

    ON dbo.SomeTable3 (Company, Year)

    INCLUDE (Amount, Quantity, Quarter)

    GO

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    GO

    :hehe:

    ---------------------------------------------------
    The Greatest pleasure in life is doing what people say you can't do! 🙂

    MS-ACCESS DBA!! :hehe:

  • Cool... although I don't believe you need to change the datatype on the IDENTITY column. You're only working on 10 million rows. INT goes up to 2 billion. It's the Quantity that you needed to (and did) change.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the wonderful code Jeff. Please keep posting this kind of code which will help others to write complex queries in simplest way.:-)

    ---------------------------------------------------
    The Greatest pleasure in life is doing what people say you can't do! 🙂

    MS-ACCESS DBA!! :hehe:

  • Thanks for the awesome compliment, Ravi. I'll try.

    BTW, I really like your signature line. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks very much Jeff!:-)

    ---------------------------------------------------
    The Greatest pleasure in life is doing what people say you can't do! 🙂

    MS-ACCESS DBA!! :hehe:

  • Absolutly amazing article, thanks for the contribution!

  • A few days ago, I had to do what seemed to me to be a fairly complicated cross tab report. I remembered, "Hey, there was some article that Jeff did some time ago that had some good hints." So, I did a search and found your article and sure enough, it helped me out--if nothing else, giving me lots of confidence that I was doing the right thing.

    Thanks!

  • Thanks Bradley and JJ. Just in case you're interested, "Part 2" can be found at the following URL:

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    It covers how to make/automate dynamic crosstabs for reporting purposes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Great post Jeff!!! 🙂

    One thing to note on performance. The PIVOT operator is actually processing a cross tab query under the hood, so the performance is identical for both, on small scale samples. Meaning the optimizer uses the same case expression to evaluate a pivot as you did in your cross tab query. however, in the latter example you had to use two pivot operators. This is where the pivot operator is lacking. The cross tab method allows you to "customize" the case expression, which allows you to save IO and increase performance, while you cannot do this via the pivot operator. I just thought this is worth noting.

    Great job.

    -Adam Haines

  • Thanks for the feedback and the excellent observation, Adam. That's also why I prefer the Cross Tab method over the Pivot method. If they're both the same for a single aggregation, then I'll use the one that's simpler (in my eyes) to write/read which also happens to be better for performance on a multiple aggregation and also works for all versions of SQL. Working on all versions is a good thing for me especially... most of my customers still have at least one instance SQL Server 2000 on site.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Adam Haines (10/2/2009)


    The cross tab method allows you to "customize" the case expression, which allows you to save IO and increase performance, while you cannot do this via the pivot operator.

    It is possible to nest PIVOT to achieve similar results. See http://www.sqlservercentral.com/Forums/FindPost695425.aspx in this thread. With enough effort, it is usually possible to code something as fast as even Jeff's pre-aggregated routines using PIVOT. That's just for the record, I hardly ever bother...the explicit CASE solution is a natural and good choice in most circumstances.

    Paul

  • Hi Jeff,

    Thanks for the article! Good Stuff. I was just confused by one aspect.

    (I am not a DBA so if my question shows ignorance, be patient/gentle! 😉 )

    The statistics you presented for the PreAggregate only show the final query reads and time. Does the creation of a Temp table and subsequent query & writes make a notable impact in this exercise?

    Carla

  • Carla Tillman (10/5/2009)


    Hi Jeff,

    Thanks for the article! Good Stuff. I was just confused by one aspect.

    (I am not a DBA so if my question shows ignorance, be patient/gentle! 😉 )

    The statistics you presented for the PreAggregate only show the final query reads and time. Does the creation of a Temp table and subsequent query & writes make a notable impact in this exercise?

    Carla

    Understood and no problem. Welcome aboard.

    The temp tables are just demonstration tables for the code. In real life, they'd be real tables and their creation would not be necessary because they'd already exist. That's the reason the times weren't included for the temp tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • the content is very useful.

    one question-

    is it "somewhat" less complex to produce multi-aggregate crosstabs and pivots using ms-access than currently in sql server?

Viewing 15 posts - 166 through 180 (of 243 total)

You must be logged in to reply to this topic. Login to reply