Cross Tabs and Pivots, Part 1 – Converting Rows to Columns

  • Jeff Moden (10/14/2008)


    Heh... if you guys keeps it up, I won't have to write the article. 🙂

    Ooooh, No, we're not gonna let you off that easily! :laugh:

    And let's set some limits here: Tolkien only took about 10 years between volumes of the Trilogy... 😀

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Heh, no rest for the weary, eh? 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Nope. Back to the salt mines, dude. Publish or perish.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Richard Fryar (10/14/2008)


    Taking the basic idea posted by Chris Morris (page 2 of this thread) and modifying the string build part gets the following. If there are duplicate properties for a person this displays the maximum (alphabetically).

    DECLARE @sql VARCHAR(4000)

    SELECT @sql = 'SELECT p.PersonName, ' + STUFF(sep, LEN(sep), 1, '') +

    ' FROM #People p JOIN #PeopleProps pp ON p.PersonID = pp.PersonID GROUP BY p.PersonName'

    FROM (

    SELECT 'MAX(CASE PropertyName WHEN ''' + PropertyName + ''' THEN PropertyValue ELSE '''' END) AS [' + PropertyName + '],'

    FROM (SELECT PropertyName FROM #PeopleProps GROUP BY PropertyName) x

    FOR XML PATH('')

    ) AS z(sep)

    EXEC (@sql)

    Richard Fryar, you're not by any chance our Richard Fryar are you?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Wow...I know you guys take for granted how easy it is to spit those things out, but I can't imagine what people in my position would do if we didn't have resources like this site at our disposal.

    Thank you very much for taking the time to help me me out!

    Jon

  • I was just looking at my resultant code and it begged the question: How can I turn this procedure into a view? The challenge as I see it is that the column list is dynamic.

    For my edification, how would this better be accomplished instead of using Entity-Attribute-Value tables?

    Thanks.

    Jon

  • Well, dynamic SQL (like Richard Fryar's solution) cannot be made into a View or even a Table-Valued Function. If, however, you can get away with a static solution, then Jeff's solution should be pretty easy to make into a View.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for an excellent topic and a VERY well written article!!!

  • Thanks for taking the time to say so! Keeps folks like me going. Thanks CM...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have a similar issue except I want to display an ID and and HonorDesigination as follows

    ID HonorDesignation

    111 M

    111 H

    There is no summing or arithmetic here. I need the display to look like this

    ID HonorDesignation1 HonorDesignation2

    111 M H

    The table name is gifts

    Field names are ID and HonorDesignation

    How would I accomplish this in sql2005

  • Same thing, though... use MAX instead of SUM...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Revised post:

    Is it save to assume that both Pivot and Cross Tabs preform better than tabulating the data from a view that inner joins the table on itself for each quarter?

    I'm working on a survey database, the questions needs to be fairly dynamic. The survey question set is up to 500 questions and at least 20% of which will change. I was looking at making a participant table, a question table and an answer table.

    The answer table would be an EAV table with a participant id, question id, and then an answer.

    The initial survey count will be about 300, however they are talking about the program increasing to something like 1000-2000 and with dreams of hitting 20,000.

    At 300 surveys that would yield 150,000 rows.

    At 1500 surveys that would yield 750,000 rows

    At 20,000 surveys would yield 10,000,000 rows

    For the most part(95% of questions) the data needs to be extracted to insert, edit, and view a single survey. However for the reports they would want to view a survey_id along side all 500 columns(for each question) for each survey ids. (so they could put it in a spreadsheet and analyze it)

    In your article, the queries seemed to preform rather well on my machine, but it is only tabulated to 12 columns wide for 80 rows, do you think it would still preform as well if you went to 50 columns wide or even 500?

  • Kris (12/5/2008)


    I was reading somewhere that instead of using cross tabs they used inner joins on same table, Is it save to assume that both Pivot and Cross Tabs preform better than... say tabulating the data from an inner joined view?

    Next on another thread that discusses EAV tables

    http://www.sqlservercentral.com/Forums/Topic473346-1235-3.aspx

    The second post on page 3 the user talks about a client/survey_question/answer where the survey questions have a high possibility of changing.

    I'm dealing a very similar problem, my survey question set that is 500 or so questions and at least 20% of which will change every 6 months. Personally I would like to devote a table to questions so that THEY are populating the question text and possible answers and so on.

    The initial survey count will be about 300, however they are talking about the program increasing to something like 1000-2000 and with dreams of hitting 20,000.

    At 300 that would yield 150,000 rows.

    At 1500 that would yield 750,000 rows

    At 20,000 would yield 10,000,000 rows

    For the most part(95% of questions) the data needs to be extracted to insert, edit, and view a single survey. However for the reports they would want to view a survey_id along side all 500 columns(for each question) for each survey ids. (so they could put it in a spreadsheet and analyze it)

    In this example it seemed to preform rather well on my machine, but it is only tabulated to 12 columns wide for 80 rows, what type of performance hit would happen if you went to 50 columns wide or even 500?

    Heh... why not ask that same question on their thread? 😉

    I can't find the post you're taling about because my page settings are quite a bit different than yours. Please identify the post you're talking about by the post number in the "column" located to the left of each post near the bottom.

    As a side bar, I don't see why there would be 10 million of anything in anytable in a properly formed survey database unless 10 million questions were answered.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff:

    ...because this article is what I reading and seems to be the most viable option for working with EAV/NVP...

    I edited my previous post so it would not need to reference the other article.

    And yes, 10 million questions would had to have been answered at that point.

  • Kris (12/5/2008)


    Revised post:

    Is it save to assume that both Pivot and Cross Tabs preform better than tabulating the data from a view that inner joins the table on itself for each quarter?

    It's never safe to assume anything of this nature. Only testing with the appropriate amount of data will prove things.

    I'm working on a survey database, the questions needs to be fairly dynamic. The survey question set is up to 500 questions and at least 20% of which will change. I was looking at making a participant table, a question table and an answer table.

    The answer table would be an EAV table with a participant id, question id, and then an answer.

    The initial survey count will be about 300, however they are talking about the program increasing to something like 1000-2000 and with dreams of hitting 20,000.

    At 300 surveys that would yield 150,000 rows.

    At 1500 surveys that would yield 750,000 rows

    At 20,000 surveys would yield 10,000,000 rows

    For the most part(95% of questions) the data needs to be extracted to insert, edit, and view a single survey. However for the reports they would want to view a survey_id along side all 500 columns(for each question) for each survey ids. (so they could put it in a spreadsheet and analyze it)

    In your article, the queries seemed to preform rather well on my machine, but it is only tabulated to 12 columns wide for 80 rows, do you think it would still preform as well if you went to 50 columns wide or even 500?

    Again... I don't believe the survey database is correctly designed if you need to drag this into a spreadsheet. You should be able to get the survey ids for each answered question without loading it into a spreadsheet and you should be able to do all sorts of analysis without loading it into a spreadsheet.

    If people insist on loading this data into a spreadsheet, perhaps Part 2 of this series would be better suited...

    [font="Arial Black"](click here) Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs[/font][/url]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 106 through 120 (of 243 total)

You must be logged in to reply to this topic. Login to reply