Cross Tabs and Pivots, Part 1 – Converting Rows to Columns

  • Jeff writes

    >Ok... show us the code that you would do it in RAC with.

    Here yago.

    Minimal Rac execute using standard defaults.

    Exec Rac

    -- Similar to Access transform, the desired aggregates.

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year', -- Essentially a group by minus pivot column.

    @pvtcol='Quarter', -- Pivot column.

    @from='#SomeTable2', -- ANY valid FROM you can use in sql server.

    @shell='n' -- Method of building result.

    -- This method does not use any external resources.

    Result is virtual, it could be saved to a #, ## or permanent table.

    Company Year Funct Totals 1 2 3 4

    ------- ---- ----- ------ ---- ---- ---- ----

    ABC 2006 Amt 5.0 1.1 1.2 1.3 1.4

    Qty 10.1 2.2 2.4 1.3 4.2

    2007 Amt 9.0 2.1 2.2 2.3 2.4

    Qty 9.0 2.3 3.1 2.1 1.5

    2008 Amt 5.7 1.5 2.3 1.9

    Qty 12.6 5.1 3.3 4.2

    XYZ 2006 Amt 10.0 2.1 2.2 3.3 2.4

    Qty 11.7 3.6 1.8 2.6 3.7

    2007 Amt 9.0 3.1 1.2 3.3 1.4

    Qty 11.3 1.9 1.2 4.2 4.0

    2008 Amt 11.2 2.5 3.5 1.3 3.9

    Qty 13.3 3.9 2.1 3.9 3.4

    Totals Amt 49.9 12.4 10.3 13.8 13.4

    Qty 68.0 19.0 10.6 17.4 21.0

    Now it's a question of using Rac framework (options) to suit desired result.

    Exec Rac

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year',

    @pvtcol='Quarter',

    @from='#SomeTable2',

    @emptycell='0.0', -- Default value for empty cell.

    @rowbreak='n', -- Turns off all row breaking options (like Access reports).

    @grand_totals='n', -- Suppresses grand totals.

    @rotate='nest', -- Rotates/transposes multiple aggregates for a row combination to a

    -- single row. Nest method of rotate interleaves aggregates. Pivot

    -- col names are derived by appending pivot col to aggregate name.

    @rowtotalsposition='end',-- Changes default position of row totals from begin to end (of pivots).

    @bulkme='y', -- Method of building results using bulk insert. Good for large result.

    @racheck='y' -- Internal check of result.

    -- @convert='decimal(3,1)' -- Default datatype of pivot result is character strings. Convert to

    -- a different datatype (ie. decimal, integer etc).

    Company Year Amt_1 Qty_1 Amt_2 Qty_2 Amt_3 Qty_3 Amt_4 Qty_4 Amt_Totals Qty_Totals

    ------- ---- ----- ----- ----- ----- ----- ----- ----- ----- ---------- ----------

    ABC 2006 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1

    ABC 2007 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0

    ABC 2008 1.5 5.1 0.0 0.0 2.3 3.3 1.9 4.2 5.7 12.6

    XYZ 2006 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7

    XYZ 2007 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3

    XYZ 2008 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3

    Dynamically transform pivot column values and default rotated pivot names.

    Exec Rac

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year',

    @pvtcol='Quarter',

    @from='#SomeTable2',

    @emptycell='0.0',@rowbreak='n',@grand_totals='n',

    @rotate='nest' ,@rowtotalsposition='end',@bulkme='y',@racheck='y',

    @exceptions='Amt_Totals & Qty_Totals', -- Omit these columns from pivot column collection (_pvtcols_).

    -- Apply a transformation to each pivot column ([value]) and each pivot column name ([name]).

    -- Any valid expression in a SELECT statement (including subqueries) can be used

    -- for a transformation acting on [value] and [name].

    @replacepvtcols=

    '{cast([value] as decimal(3,1)) for [select ~Q~+right(~[name]~,1)+left(~[name]~,3)]}',

    -- A feature packed SELECT parameter that is usually used as a select statement to

    -- act on the result from Rac up to this point. Here the rows (Company/Year) as _rows_

    -- and the pivot columns (minus the exception pivot columns and the @replacepvtcols

    -- transformation applied) as _pvtcols_ are selected. The row totals are transformed

    -- by value and name individually in the SELECT. The rd column is Rac generated and

    -- is equivalent to a row_number() based on the ordering of rows (Company/Year) or

    -- a custom ordering based on the @rowsort parameter (not used in this example).

    -- (@rowsort and Rac running sums can easily simulate the sql-99 window,

    -- ie. the full implementation of AGGREGATE OVER(PARTITION..ORDER BY). It will solve

    -- the same type of problems easily. Rac is much more than a xtab generator:) ).

    @select='select _rows_,_pvtcols_,cast(Amt_Totals as decimal(3,1)) as TotalAmt,

    cast(Qty_Totals as decimal(3,1)) as TotalQty

    from rac

    order by rd'

    Company Year Q1Amt Q1Qty Q2Amt Q2Qty Q3Amt Q3Qty Q4Amt Q4Qty TotalAmt TotalQty

    ------- ---- ----- ----- ----- ----- ----- ----- ----- ----- -------- --------

    ABC 2006 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1

    ABC 2007 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0

    ABC 2008 1.5 5.1 0.0 0.0 2.3 3.3 1.9 4.2 5.7 12.6

    XYZ 2006 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7

    XYZ 2007 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3

    XYZ 2008 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3

    There's so many options and permutations of them I don't even know all of them 🙂

    Hope this gives some idea of what Rac is about.

    www.rac4sql.net

  • dphillips writes:

    >Without the TRANSFORM counterpart, I see absolutely no reason to

    >use it in it's current form. I would gladly hear anyone having an

    >alternate opinion.

    Nicole Bowman writes:

    >I too miss the easy cross tabs in MS Access. Dynamic cross tabs in

    >SQL are heavy going at times.

    Rac is the Access crosstab...but on steroids:) If MS extended its functionality they would look the same. Make a wish for Access and you'll find it in Rac:) Do you really care how these things work internally or are you concerned with getting your work done quickly and easily? Rac is written in t-sql, so what:) It's simply at a high level of abstraction which is what users like about Access. If you know a little SAS and some sql-99 olap you'll be even further along. If you still can't get your head around it just pretend it comes from Red-Gate:)

    www.rac4sql.net

  • Jeff Moden (8/20/2008)

    Nicely done, Richard and thanks for the "cover".

    Just to continue the thought, there are other ways to pull off such concatenations and some have a pitfall or two... take a look at the following link if you get the chance...

    http://www.sqlservercentral.com/articles/Test+Data/61572/

    sunjiulu (8/19/2008)

    --------------------------------------------------------------------------------

    hi Richard,

    that xml code works like a charm, thanks for sharing.

    jiulu

    Richard Fryar (8/19/2008)

    --------------------------------------------------------------------------------

    In response to sunjiulu's query, you don't need a cross-tab.

    A bit of magic with XML is all that's required:

    SELECT d1.tablename,

    (SELECT STUFF(sep, LEN(sep), 1, '')

    FROM (

    SELECT columnname + ',' AS [data()]

    FROM d as d2

    WHERE d1.tablename = d2.tablename

    FOR XML PATH('')

    ) AS z (sep)) AS columnlist

    FROM d as d1

    GROUP BY d1.tablename

    --------------------------------------------------------------------------------

    sunjiulu (8/19/2008)

    the pivot on aggregation function works on numbers, what I want is a pivot on literal/character. for example, instead of having sum(amount) to be pivoted, I'd like to see a varchar column to be concatenated.

    create table d(tablename varchar(9), columnmane varchar(9))

    insert into d values ('tableA','col1')

    insert into d values ('tableA','col2')

    insert into d values ('tableB','col1')

    insert into d values ('tableB','col2')

    I want to see a result like

    tablename , columnlist

    ---------, ----------

    tablea col1, col2

    tableb col1, col2

    I can only use cursor to program it for now, but like to see a solution with a single pivot sql statement if anyone know how to do it.

    Here is another way:

    SELECT tablename,

    CASE ISNULL([col1],',') WHEN ',' THEN '' ELSE [col1]+', ' END

    +

    CASE ISNULL([col2],',') WHEN ',' THEN '' ELSE [col2] END columnlist

    FROM

    (SELECT tablename, columnname

    FROM d) P

    PIVOT (max(columnname)

    FOR columnname IN ([col1],[col2])) AS PVT

    tablename columnlist

    tableA col1, col2

    tableB col1, col2

    This query will not allow the values col1 & col2 to be repeated. This is for you, sunjiulu to decide how it suits your requirement.

    In case you need to make the columnlist dynamic, you need to modify as descibed in the below article

    http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx

    Maz

    -----------------------------------------------------------[font=Arial Black]Time Is Money[/font][font=Arial Narrow]Calculating the Number of Business Hours Passed since a Point of Time[/url][/font][font=Arial Narrow]Calculating the Number of Business Hours Passed Between Two Points of Time[/font]

  • dphillips (8/22/2008)


    I wanted to retract with Jeff Moden before posting this publicly, but his mailbox here is full.

    You can always send me an email. My PM on SSC is full of things I want to keep and was looking for an easy way to download all the PM's.

    One thing I am highly curious about: what was it that may have intrigued you about my post Jeff, that you held off replying? Double checking Pivot info? Something about the cartesian join? Maybe the nukeing the nulls before they bubble up to the top layer? Curve Modelling? No time? Or just giving me time to dwell...

    I await your next article...

    My apologies to the community for my ignorance... I shall endeavor to be more careful.

    Nope... I got a new job and it's 112 miles round trip... plus, been putting in some extra time... haven't been able to spend much time on the home computer.

    I'm trying to catch up... reading these in reverse order...

    Edit: Oh, NOW I get it... no, I didn't mean to post with no response... must've been a late night read...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • dphillips (8/20/2008)


    When one wants no NULLs in the output, is it not faster to put that in the base FROM "derived table", view, or query... the same as mentioned when converting dates to months or quarters? Why NULL check at the cross tab or PIVOT level? I do not know about speed, but it sure is easier to read not having ISNULL or COALESCE everywhere. I tend to lean on the idea that data cleansing should not be happening at the report, cube, or group level, and that these levels should merely roll up and present the data.

    Also, trying to compare "cross tab" to PIVOT seems arbitrary to me: like apples to oranges. PIVOT is highly useful where the result PIVOT column names are dynamic, whereas a "cross tab" to do the same would be... complicated... I mean, that is the whole point of having a PIVOT. But it sounds like you may probably cover that in a later section. The only reason in the examples given so far to specifically SELECT the fields in a PIVOT(s) is to demonstrate 1) COALESCE the output - see first paragraph, and 2) Alias the fields - if the column names are instead derived from a lookup table in the "derived table" FROM clause, or if PIVOTing GROUPed data to be used as your column names, hard aliasing goes away. For results that require all possible PIVOT columns in the output, even when column values have no data, a cartesian join against all the output columns solves that. For example, say I wanted to show a 120-day spread curve model on certain datapoint, but data was not logged or received every single day of the 120 days, I use a cartesian join against a temp (or fixed) dataset that numbers from 1 to 120, and then gather the PIVOT results to the days they land on (and handling the NULL values in the process). Bring on some graphical tool to display the data and the Execs go, "WOOOOOH... AHHHHH!" Do this for all datapoints at once and suddenly the CEO knows your first name... but I digress.

    Finally, the metion by another poster about MS ACCESS since 1992 is a fine one. As a user since MSA v2.0 and MS SQL Server since version 6... well, it is a fine mention, and completely off topic. 🙂

    You bring up some good questions about where the null conversions come in and I'll have to test. My point was to show that you don't need the conversions at all with a cross tab.

    Heh... obviously, I don't believe that comparing cross-tabs and pivots is like comparing apples and oranges. The Pivot command is supposed to be a replacement for cross tabs. Heck, because of the Pivot command, 2k5 doesn't even have an index listing for cross tabs, anymore.

    I agree... outer joining the data with a calendar table or some other source of contiguous dates can be made to do some magic with missing information depending on what you're trying to do.

    So far as any type of formatting goes, yeah... I thought I stressed that final formatting should probably be done at the GUI level... maybe I didn't stress it enough. Being more of a data troll and ETL critter than I care to admit, I've not played much with Reporting Services, yet... anyone know if SSRS has a built in cross tab or pivot function making these T-SQL antics obsolete?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Matt Frowe (8/21/2008)


    Hi,

    Just wanted to say thanks for the great article - I myself was actually one of the noobies who posted in the forums about this sort of thing. Looking forward to the dynamic cross-tab article(s)!

    Matt

    Thanks for the feedback, Matt. I really appreciate it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DAvid (8/20/2008)


    Wanted to see what the difference aggregating by scalar functions was to see how it would perform based on a native datetime so I ran

    PRINT '=============== "Normal" Cross Tab ==============='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT Company,

    Year,

    SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,

    SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,

    SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,

    SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,

    SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,

    SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,

    SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,

    SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,

    SUM(Amount) AS TotalAmt,

    SUM(Quantity) AS TotalQty

    FROM #SomeTable3

    GROUP BY Company, Year

    ORDER BY Company, Year

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT '=============== "UGLY inline datefunction" Cross Tab ==============='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT Company,

    Year(date),

    SUM(CASE WHEN month(date) between 1 and 3 THEN Amount ELSE 0 END) AS Q1Amt,

    SUM(CASE WHEN month(date) between 1 and 3 THEN Quantity ELSE 0 END) AS Q1Qty,

    SUM(CASE WHEN month(date) between 4 and 6 THEN Amount ELSE 0 END) AS Q2Amt,

    SUM(CASE WHEN month(date) between 4 and 6 THEN Quantity ELSE 0 END) AS Q2Qty,

    SUM(CASE WHEN month(date) between 7 and 9 THEN Amount ELSE 0 END) AS Q3Amt,

    SUM(CASE WHEN month(date) between 7 and 9 THEN Quantity ELSE 0 END) AS Q3Qty,

    SUM(CASE WHEN month(date) between 10 and 12 THEN Amount ELSE 0 END) AS Q4Amt,

    SUM(CASE WHEN month(date) between 10 and 12 THEN Quantity ELSE 0 END) AS Q4Qty,

    SUM(Amount) AS TotalAmt,

    SUM(Quantity) AS TotalQty

    FROM #SomeTable3

    GROUP BY Company, Year(date)

    ORDER BY Company, Year(date)

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    Which gave

    =============== "Normal" Cross Tab ===============

    CPU time = 2656 ms, elapsed time = 1506 ms.

    =============== "UGLY inline datefunction" Cross Tab ===============

    CPU time = 5016 ms, elapsed time = 2828 ms.

    Thanks DAvid... I haven't had the time to try it here, yet, but I think pre-aggregation may help the performance here...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rog pike (8/22/2008)


    Jeff writes

    >Ok... show us the code that you would do it in RAC with.

    Here yago.

    Minimal Rac execute using standard defaults.

    Nicely done... thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/23/2008)


    So far as any type of formatting goes, yeah... I thought I stressed that final formatting should probably be done at the GUI level... maybe I didn't stress it enough. Being more of a data troll and ETL critter than I care to admit, I've not played much with Reporting Services, yet... anyone know if SSRS has a built in cross tab or pivot function making these T-SQL antics obsolete?

    SSRS has what they call a matrix report that is supposed to do this. I have not worked with it yet so I really cannot say how well it works.

    Even with the ability to create cross-tab reports (in Access, SSRS, etc...) I have had to resort to pivoting the data in SQL more often than not because the cross-tab\pivot reports (or controls) were just too limited in what they provide.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks, Jeff...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeffrey Williams (8/23/2008)


    Jeff Moden (8/23/2008)


    So far as any type of formatting goes, yeah... I thought I stressed that final formatting should probably be done at the GUI level... maybe I didn't stress it enough. Being more of a data troll and ETL critter than I care to admit, I've not played much with Reporting Services, yet... anyone know if SSRS has a built in cross tab or pivot function making these T-SQL antics obsolete?

    SSRS has what they call a matrix report that is supposed to do this. I have not worked with it yet so I really cannot say how well it works.

    Even with the ability to create cross-tab reports (in Access, SSRS, etc...) I have had to resort to pivoting the data in

    SQL more often than not because the cross-tab\pivot reports (or controls) were just too limited in what they provide.

    The matrix in SSIS works okay, but when you don't have a fixed number of columns in the matrix you quickly lose the it's usefuleness because it overflows the page. It does not, to my knowledge, to keep a certain number of columns on the page.

  • Thanks, Jack...

    That's really the rub of it all when you're trying to print somerthing... fixed amount of real estate limited by the fat that the print still has to be large enough to be able to read it without a magnifying glass.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • dphillips (8/20/2008)


    When one wants no NULLs in the output, is it not faster to put that in the base FROM "derived table", view, or query... the same as mentioned when converting dates to months or quarters? Why NULL check at the cross tab or PIVOT level? I do not know about speed, but it sure is easier to read not having ISNULL or COALESCE everywhere. I tend to lean on the idea that data cleansing should not be happening at the report, cube, or group level, and that these levels should merely roll up and present the data.

    Finally... some time to spend on this...

    If you want to suppress NULL's caused by missing data in a Pivot, there's no other way to do it than in the final Select. That's because you don't have Null's in the data until the final Select. Here's the proof... data is the same as before but has been "wounded" so it's missing all the data for a given Year, Company, and Quarter. Cross Tab takes it in stride no problem. Pivot, even though ALL data sources that can take COALESCE, still produces NULL's in the output because it's a position of data missing...

    drop table #SomeTable3

    go

    --===== Create and populate a 1,000,000 row test table.

    -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

    -- Column "Company" has a range of "AAA" to "BBB" non-unique 3 character strings

    -- Column "Amount has a range of 0.0000 to 9999.9900 non-unique numbers

    -- Column "Quantity" has a range of 1 to 50,000 non-unique numbers

    -- Column "Date" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Columns Year and Quarter are the similarly named components of Date

    -- Jeff Moden

    SELECT TOP 1000000 --<<Look! Change this number for testing different size tables

    RowNum = IDENTITY(INT,1,1),

    Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%2+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%2+65),

    Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),

    Quantity = ABS(CHECKSUM(NEWID()))%50000+1,

    Date = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    Year = CAST(NULL AS SMALLINT),

    Quarter = CAST(NULL AS TINYINT)

    INTO #SomeTable3

    FROM Master.sys.SysColumns t1

    CROSS JOIN

    Master.sys.SysColumns t2

    --===== Fill in the Year and Quarter columns from the Date column

    UPDATE #SomeTable3

    SET Year = DATEPART(yy,Date),

    Quarter = DATEPART(qq,Date)

    --===== A table is not properly formed unless a Primary Key has been assigned

    -- Takes about 1 second to execute.

    ALTER TABLE #SomeTable3

    ADD PRIMARY KEY CLUSTERED (RowNum)

    CREATE NONCLUSTERED INDEX IX_#SomeTable3_Cover1

    ON dbo.#SomeTable3 (Company, Year)

    INCLUDE (Amount, Quantity, Quarter)

    --===== "Wound" the table to produce some null data at the final select level

    DELETE #SomeTable3 WHERE Year = 2002 AND Company = 'AAA' AND Quarter = 1

    GO

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    GO

    ---------------------------------------------------------------------------------------------------

    --===== "Pre-aggregated" Cross Tab with CTE

    PRINT REPLICATE('=',100)

    PRINT '=============== "Pre-aggregated" Cross Tab with CTE ==============='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    ;WITH

    ctePreAgg AS

    (SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity

    FROM #SomeTable3

    GROUP BY Company,Year,Quarter

    )

    SELECT Company,

    Year,

    SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,

    SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,

    SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,

    SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,

    SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,

    SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,

    SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,

    SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,

    SUM(Amount) AS TotalAmt,

    SUM(Quantity) AS TotalQty

    FROM ctePreAgg

    GROUP BY Company, Year

    ORDER BY Company, Year

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    ---------------------------------------------------------------------------------------------------

    --===== "Pre-aggregated" Pivot with CTE

    PRINT REPLICATE('=',100)

    PRINT '=============== "Pre-aggregated" Pivot with CTE ==============='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    ;WITH

    ctePreAgg AS

    (SELECT Company,Year,Quarter,COALESCE(SUM(Amount),0) AS Amount,COALESCE(SUM(Quantity),0) AS Quantity

    FROM #SomeTable3

    GROUP BY Company,Year,Quarter

    )

    SELECT amt.Company,

    amt.Year,

    amt.[1] AS Q1Amt,

    qty.[1] AS Q1Qty,

    amt.[2] AS Q2Amt,

    qty.[2] AS Q2Qty,

    amt.[3] AS Q3Amt,

    qty.[3] AS Q3Qty,

    amt.[4] AS Q4Amt,

    qty.[4] AS Q5Qty,

    amt.[1]+amt.[2]+amt.[3]+amt.[4] AS TotalAmt,

    qty.[1]+ qty.[2]+qty.[3]+qty.[4] AS TotalQty

    FROM (SELECT Company, Year, Quarter, COALESCE(Amount,0) AS Amount FROM ctePreAgg) AS t1

    PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt

    INNER JOIN

    (SELECT Company, Year, Quarter, COALESCE(Quantity,0) AS Quantity FROM ctePreAgg) AS t2

    PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty

    ON qty.Company = amt.Company

    AND qty.Year = amt.Year

    ORDER BY amt.Company, amt.Year

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    So far as how "complicated" dynamic cross tab/pivot code is... especially with "intelligent aliasing", guess you'll have to wait for part 2.:P

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/23/2008)


    If you want to suppress NULL's caused by missing data in a Pivot, there's no other way to do it than in the final Select.

    Kudos for showing the difference between using case and S2005 PIVOT. As for PIVOT l'll give MS the benefit of the doubt and guess they had something different in mind and then realized they couldn't do it or time ran short. Something like OVER, only a partial implementation. As it stands it's rather childish and of little help to developers.

    As for 'nulls' wherever they come up it's nothing but trouble. To make matters worse many users are using null for two different scenarios. A null value for a column involved in a row, pivot or transform (aggregated) column and the situation you illustrated, unavailable data for a particular cell of the xtab. I won't go into nulls for group by/pivot column data but many companies do not want the entire row of a data eliminated from the xtab if one of these values is null. But some companies want the 'entire row' of a xtab eliminated if any cells are unavailable (your illustration). For example, using the article data, eliminate the Company/Year combination with less than 4 Quarters. Company 'ABC' for Year 2008 is missing 2nd quarter data so you want to eliminate this entire row from xtab. Something like this.

    select Company,year,count(*) as test,

    SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,

    SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,

    SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,

    SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,

    SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,

    SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,

    SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,

    SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,

    SUM(Amount) AS TotalAmt,

    SUM(Quantity) AS TotalQty

    from #sometable2

    group by company,year

    having count(*)=(select Count(Quarter)

    from (select Quarter

    from #sometable2

    group by Quarter) as T1) /* or just 4 here */

    order by company,year

    Company Year Q1Amt Q1Qty Q2Amt Q2Qty Q3Amt Q3Qty Q4Amt Q4Qty TotalAmt TotalQty

    ------- ---- ----- ----- ----- ----- ----- ----- ----- ----- -------- --------

    ABC 2006 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1

    ABC 2007 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0

    XYZ 2006 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7

    XYZ 2007 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3

    XYZ 2008 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3

    If MS incorporated the idea to put constraints on the xtab how should they go about it. Where should the logic go? Should data be eliminated prior to the construction of the xtab, during construction or after? How do you design it to make it easy for users with 'minimum' sql skills?

    For example, here's one way to handle this in Rac.

    Exec Rac

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year',

    @pvtcol='Quarter',

    @from='#SomeTable2',

    @emptycell='0.0',@rowbreak='n',@grand_totals='n',

    @rotate='nest' ,@rowtotalsposition='end',@bulkme='y',@racheck='y',

    @rowfunctions='Count(Amt)', -- Apply any aggregates to the pivoted values for each row. Here

    -- Count(Amt) is count of pivoted Quarters for each Company for

    -- which data is available. We know that the max count must be 4.

    @rowfunctionstrip='y', -- Returns Count(Amt) as a number (without aggregate label).

    @rowfunctionslabel='CntQters', -- Rename column for rowfunctions from default (Rowfunct) to user defined.

    -- Omit row totals and rowfunctions (CntQters) from pivot column collection.

    @exceptions='Amt_CntQters & Qty_CntQters & Amt_Totals & Qty_Totals', @pformat='_pvtcols_',

    @replacepvtcols=

    '{[value] for [select ~Q~+right(~[name]~,1)+left(~[name]~,3)]}',

    @select='select _rows_,Amt_CntQters as CntQters,_pvtcols_,Amt_Totals as TotalAmt,Qty_Totals as TotalQty

    from rac

    order by rd'

    Company Year CntQters Q1Amt Q1Qty Q2Amt Q2Qty Q3Amt Q3Qty Q4Amt Q4Qty TotalAmt TotalQty

    ------- ---- -------- ----- ----- ----- ----- ----- ----- ----- ----- -------- --------

    ABC 2006 4 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1

    ABC 2007 4 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0

    ABC 2008 3 1.5 5.1 0.0 0.0 2.3 3.3 1.9 4.2 5.7 12.6

    XYZ 2006 4 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7

    XYZ 2007 4 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3

    XYZ 2008 4 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3

    Use the rowfunction to eliminate the row with less than 4 quarters in the @select parameter.

    Exec Rac

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year',

    @pvtcol='Quarter',

    @from='#SomeTable2',

    @emptycell='0.0',@rowbreak='n',@grand_totals='n',

    @rotate='nest' ,@rowtotalsposition='end',@bulkme='y',@racheck='y',

    @rowfunctions='count(Amt)',@rowfunctionstrip='y',@rowfunctionslabel='CntQters',

    -- Omit all rowfunctions columns from rotated pivot collection.

    @exceptions='Amt_CntQters & Qty_CntQters & Amt_Totals & Qty_Totals',@pformat='_pvtcols_',

    @replacepvtcols=

    '{[value] for [select ~Q~+right(~[name]~,1)+left(~[name]~,3)]}',

    -- Omit any row of crosstab that doesnt have all (4) quarters. Simply use rotated

    -- column Amt_CntQters in a WHERE statement. The rowfunctions are omitted from xtab.

    @select='select _rows_,_pvtcols_,Amt_Totals as TotalAmt,Qty_Totals as TotalQty

    from rac

    where Amt_CntQters=4

    order by rd'

    Company Year Q1Amt Q1Qty Q2Amt Q2Qty Q3Amt Q3Qty Q4Amt Q4Qty TotalAmt TotalQty

    ------- ---- ----- ----- ----- ----- ----- ----- ----- ----- -------- --------

    ABC 2006 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1

    ABC 2007 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0

    XYZ 2006 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7

    XYZ 2007 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3

    XYZ 2008 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3

    Dynamically omit missing rows (no prior knowledge of distinct quarters).

    Exec Rac

    @transform='Sum(Amount) as Amt & Sum(Quantity) as Qty',

    @rows='Company & Year',

    @pvtcol='Quarter',

    @from='#SomeTable2',

    @emptycell='0.0',@rowbreak='n',@grand_totals='n',

    @rotate='nest' ,@rowtotalsposition='end',@bulkme='y',@racheck='y',

    @rowfunctions='count(Amt)',@rowfunctionstrip='y',@rowfunctionslabel='CntQters',

    @exceptions='Amt_CntQters & Qty_CntQters & Amt_Totals & Qty_Totals',

    @replacepvtcols=

    '{[value] for [select ~Q~+right(~[name]~,1)+left(~[name]~,3)]}',

    -- Get the Max CntQters over the xtab and use it in WHERE.

    @select='declare @MaxQtCnt int

    set @MaxQtCnt=(select Max(1*Amt_CntQters) from rac as R1)

    select _rows_,_pvtcols_,Amt_Totals as TotalAmt,Qty_Totals as TotalQty

    from rac as R2

    where 1*Amt_CntQters=@MaxQtCnt

    order by rd'

    You better believe that even a simple subquery is beyond many users. I hope you understand where I'm coming from. How to make this stuff more understandable and easier.

    www.rac4sql.net

    www.beyondsql.blogspot.com

  • Very cool feedback, Rog. Thanks.

    You better believe that even a simple subquery is beyond many users. I hope you understand where I'm coming from. How to make this stuff more understandable and easier.

    Absolutely agreed. But, even when you spoon feed some folks with something like RAC, I can still see the forum entries because some still won't take the time to RTFM and figure anything out on their own. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 61 through 75 (of 243 total)

You must be logged in to reply to this topic. Login to reply