Pivot Query Question

  • joshtheflame (11/14/2010)


    Sachin great work 🙂 just need four things and it will be perfect cause i am trying to change but it has error...not good with sql though.

    1. I've lost my field Aliases (eg. instead of 2010/10/01 i need 1-Oct, 2-Oct ....31-Oct)

    2. Order by Task_ID

    3. Total At the end instead of begining..

    4. Instead of showing NULL the field value should show "0" in column values

    rest is pefect ..

    What do you want to show for Task_ID's that have no entry for a given month? For example, Task_ID 10 has entries in the table but not for July 2010, do you want Task_ID 10 to show up at all if you request a report for July 2010?

    Also, you just want to show a total for the month... not a total for the year, right?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... here we go. First, since you're new to T-SQL, you might want to read up on the following articles so you can do this on your own in the future. They're kind of long but well worth it and they'll turn you into a reporting Ninja virtually overnight...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    In the code that follows, you'll find the need for a Tally table. I'm intentionally not giving you the code to build one here because I do want you to read the Tally Table article.

    We also need some test data... and we need LOTS of test data to make sure the code is running correctly. There are a few lessons to be had if you analyze the following test code generator. As usually with me, some details are in the comments but you'll learn some really cool stuff if you sit down and take the time to "play" with the code. Here's the code to build a million rows of test data and to apply what I believe are probably the correct indexes:

    --=====================================================================================================================

    -- Create a test table and load it with data. This is NOT a part of the solution

    -- Including building the two indexes, this should take less than 19 seconds which is how long it takes on my

    -- 8 year old, single CPU desktop box.

    --=====================================================================================================================

    --===== Conditionally drop the test table to make reruns easier.

    IF OBJECT_ID('tempdb..#YourTable','U') IS NOT NULL

    DROP TABLE #YourTable

    ;

    --===== Create and populate the test table on the fly.

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    Task_ID = ABS(CHECKSUM(NEWID()))%1000+1,

    Task_Date = DATEADD(dd,ABS(CHECKSUM(NEWID()))%DATEDIFF(dd,'2010','2020'),'2010'),

    Task_Count = ABS(CHECKSUM(NEWID()))%1000+1

    INTO #YourTable

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    ALTER TABLE #YourTable

    ADD PRIMARY KEY CLUSTERED (RowNum)

    ;

    CREATE INDEX IX_YourTable_Composite01

    ON #YourTable (Task_Date,Task_ID) INCLUDE (Task_Count)

    ;

    Here's the code to create your report. It can easily be turned into a stored procedure but I didn't want to take away all the fun. 😛 It takes about 2 seconds to run on my ol' single CPU desktop.

    --=====================================================================================================================

    -- Parameters and presets

    --=====================================================================================================================

    --===== Conditionally drop the temp tables to make reruns easier.

    IF OBJECT_ID('tempdb..#DesiredDates','U') IS NOT NULL

    DROP TABLE #DesiredDates

    ;

    IF OBJECT_ID('tempdb..#PreAggregation','U') IS NOT NULL

    DROP TABLE #PreAggregation

    ;

    --===== This would be the parameter for a stored procedure

    DECLARE @pDesiredMonth DATETIME

    ;

    SELECT @pDesiredMonth = 'July 2015'

    ;

    --===== These are some working variables. Their names tell what they do.

    DECLARE @MonthStart DATETIME,

    @NextMonthStart DATETIME,

    @Days INT,

    @sql VARCHAR(MAX)

    ;

    SELECT @MonthStart = DATEADD(mm,DATEDIFF(mm,'1753',@pDesiredMonth),'1753'),

    @NextMonthStart = DATEADD(mm,1,@MonthStart),

    @Days = DATEDIFF(dd,@MonthStart,@NextMonthStart)

    ;

    --=====================================================================================================================

    -- Produce the report.

    -- We could try to do this all in a single query using a CTE but using the "Divide'n'Conquer" methods I used do a

    -- couple of things...

    -- 1. It makes it a hell of a lot easier to troubleshoot especially since dynamic SQL is involved.

    -- 2. The "pre-aggregation" (as Peter Larsson calls it) speeds things up... a LOT! The code below returns

    -- more than 31000 "cells" of aggregated data for 1000 Task_ID's spread across all the dates of a month.

    -- 3. It also produces a "Total" line to total up the task count for each day and a "Grand Total". I know that's

    -- something you didn't ask for but your boss will someday in the future.

    --=====================================================================================================================

    --===== Create all of the dates we'll need for the month.

    -- We'll join the pre-aggregated data to this temp table to make sure that we list zero's for any given day where

    -- all tasks for the month are missing on that particular day.

    -- This is where you'll need a Tally Table. Read the article I posted on it. It WILL change your life.

    SELECT DesiredDay = N,

    DesiredDate = DATEADD(dd,t.N-1,@MonthStart)

    INTO #DesiredDates

    FROM dbo.Tally t

    WHERE N BETWEEN 1 AND @Days

    ;

    --===== This pre-aggregates the data and, yeah, it uses an index seek to do so.

    -- It's just easier than joining the dates above with a million row table and the "reflection" in a CTE

    -- will still cause that join to happen. Instead, we used "Divide'n'Conquer" on this.

    SELECT Task_ID,

    TheDay = DAY(Task_Date),

    Total_Count = SUM(Task_Count)

    INTO #PreAggregation

    FROM #YourTable

    WHERE Task_Date >= @MonthStart

    AND Task_Date < @NextMonthStart

    GROUP BY Task_ID, Task_Date

    ;

    --===== This creates the "Task_ID" portion of the report query.

    -- READ about "GROUP BY WITH CUBE" to learn what "GROUPING" does for WITH ROLLUP and WITH CUBE.

    SELECT @sql = 'SELECT Task_ID = CASE WHEN GROUPING(Task_ID) = 1 THEN SPACE(5)+''Total'' ELSE RIGHT(SPACE(10)+CAST(Task_ID AS VARCHAR(10)),10) END,'

    ;

    --===== This creates the section of the report query that creates the columns for each day of the selected month.

    SELECT @sql = @sql + CHAR(10)

    + 'SUM(CASE WHEN date.DesiredDay = ' + CAST(t.DesiredDay AS VARCHAR(2))

    + ' THEN preagg.Total_Count ELSE 0 END) AS ' + QUOTENAME(CONVERT(CHAR(6),t.DesiredDate,13))+','

    FROM #DesiredDates t

    ;

    --===== This creates the total for each Task_ID and finishes up the query with criteria, grouping, etc, etc.

    SELECT @sql = @sql + '

    Total = SUM(Total_Count)

    FROM #DesiredDates date

    LEFT JOIN #PreAggregation preagg

    ON date.DesiredDay = preagg.TheDay

    WHERE preagg.Task_ID > 0

    GROUP BY preagg.Task_ID WITH ROLLUP --READ ABOUT GROUP BY, WITH ROLLUP, and WITH CUBE (for "GROUPING")

    ORDER BY Task_ID

    ;

    '

    ;

    --===== This just prints the query out as a sanity check and may be eliminated from the production version of the code.

    PRINT @sql

    ;

    --===== This executes the dynamic SQL to actually produce the report

    EXEC (@SQL)

    ;

    Again, take a day or two to read and play with the articles I posted. You won't be sorry.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff thank you for your reply. Well all the tasks will appear with 0 if the data is NULL for any day. Yeah I want to display the TOTAL of each Task_ID of all days in a month not year. One more thing which is very important ...I want my results to appear like the following please follow my question.

    Task_ID....Task_Name......1-Oct...2-Oct...3-Oct....4-Oct....5-Oct...6-Oct...7-Oct.........31-Oct....Total

    Need a query using Pivot or whatever but for the given month data only and total of given month infront of the particular task..if there is no data in any task_id the data will be shown 0 in all days and total will also be 0.

  • joshtheflame (11/14/2010)


    Sachin great work 🙂 just need four things and it will be perfect cause i am trying to change but it has error...not good with sql though.

    1. I've lost my field Aliases (eg. instead of 2010/10/01 i need 1-Oct, 2-Oct ....31-Oct)

    2. Order by Task_ID

    3. Total At the end instead of begining..

    4. Instead of showing NULL the field value should show "0" in column values

    rest is pefect ..

    Hi,

    I dont think that point 3 and 4 is possible with dynamic Pivot but maybe I am wrong.Below is the code for 1 and 2

    create table #tbl (Task_id int,Task_Date datetime,Task_count int)

    insert into #tbl

    select 1,'2010-10-01 00:00:00', 10 union

    select 1,'2010-10-02 00:00:00', 20 union

    select 2,'2010-10-01 00:00:00', 1 union

    select 1,'2010-11-01 00:00:00', 50 union

    select 1,'2010-11-02 00:00:00', 40 union

    select 2,'2010-11-01 00:00:00', 60 union

    select 2,'2010-11-02 00:00:00', 40 union

    select 2,'2010-10-02 00:00:00', 2 union

    select 1,'2010-10-03 00:00:00', 450 union

    select 1,'2010-10-04 00:00:00', 3600

    select * from #tbl order by Task_id,Task_Date

    declare @columnnames varchar(MAX)='';

    declare @StartOfMonth datetime = '10/1/2010';

    declare @counter datetime = @StartOfMonth;

    declare @sql varchar(MAX) = '';

    declare @columnfilter varchar(MAX);

    declare @fieldname varchar(12);

    declare @fieldalias varchar(MAX);

    --First, create a string of dynamic columns, one for each day of the month.

    WHILE (MONTH(@counter) = MONTH(@StartOfMonth))

    BEGIN

    SET @fieldname = '[' + CONVERT(varchar(10), @counter, 101) + ']';

    SET @fieldalias = '[' + CONVERT(VARCHAR(20), datepart(day, @counter), 101) + '-'+ left(CONVERT(VARCHAR(20), datename(MONTH, @counter), 101),3) + ']' ;

    --Wrap the columns in ISNULL(@,0) to avoid having null values for days without tasks.

    SET @columnnames = ISNULL(@columnnames + ',', '') + '' + @fieldalias --+ + @fieldalias;

    --Also create a dynamic list of the Task_Date values to include in the pivot.

    SET @columnfilter = ISNULL(@columnfilter + ',', '') + @fieldname;

    SET @counter = DATEADD(DAY,1,@counter);

    END

    set @sql = 'SELECT * FROM (';

    set @sql = @sql + 'SELECT Task_id,CONVERT(VARCHAR(20), datepart(day, Task_Date), 101) + ''-''+ left(CONVERT(VARCHAR(20), datename(MONTH, Task_Date), 101),3) MM,Task_Count,isnull((select Sum(Task_Count) from #tbl t where datename(mm,Task_Date)=' + '''' + DATEname(mm,@StartOfMonth) +'''' + ' and t.Task_id=M.Task_id ),0)TaskTotal '

    set @sql = @sql + 'FROM #tbl M) as SourceTable ';

    set @sql = @sql + 'PIVOT (SUM(Task_Count) FOR MM IN (' + stuff(@columnnames,1,1,'') + ')) AS PivotTable';

    exec (@sql + ' Order by Task_id')

    drop table #tbl

    --------------------------------------------------------------------------------------------------
    I am just an another naive wannabe DBA trying to learn SQL Server

  • Sachin,

    Thanks alot for the solution....its working like a charm. I think NULL values problem I can overcome by inserting 0's while inserting data in table and total in the begining i can solve on report level as its going to be a stored procedure so I will set this column at the end..I guess this would solve the problem. isnt it?

    Once again I appreciate your efforts 🙂

    God bless you.

  • joshtheflame (11/14/2010)


    Jeff thank you for your reply. Well all the tasks will appear with 0 if the data is NULL for any day. Yeah I want to display the TOTAL of each Task_ID of all days in a month not year. One more thing which is very important ...I want my results to appear like the following please follow my question.

    Task_ID....Task_Name......1-Oct...2-Oct...3-Oct....4-Oct....5-Oct...6-Oct...7-Oct.........31-Oct....Total

    Need a query using Pivot or whatever but for the given month data only and total of given month infront of the particular task..if there is no data in any task_id the data will be shown 0 in all days and total will also be 0.

    So just add the Task_Name to the code I wrote and you'll be done.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff first of all let me salute you for the great procedure..It was way too hectic so I didn't get the chance to experiment something new but you showed the way which was great. I just got time to sit back peacefully and read/implement your solution...gosh my eyes got wide open when I saw the result...simply amazing and darn quick. 🙂

    What if I have TASK_MASTER table with the columns TASK_ID and TASK_NAME and i want to pick task_name from TASK_MASTER and want to show it in your query. Jeff where to add Task_Name in the following query. this is some sort of INNER join WITH TASK_MASTER.

    Actually its a master detail report n only shows the task exists in mater table so want to pick name from master

    --=====================================================================================================================

    -- Parameters and presets

    --=====================================================================================================================

    --===== Conditionally drop the temp tables to make reruns easier.

    IF OBJECT_ID('tempdb..#DesiredDates','U') IS NOT NULL

    DROP TABLE #DesiredDates

    ;

    IF OBJECT_ID('tempdb..#PreAggregation','U') IS NOT NULL

    DROP TABLE #PreAggregation

    ;

    --===== This would be the parameter for a stored procedure

    DECLARE @pDesiredMonth DATETIME

    ;

    SELECT @pDesiredMonth = 'October 2010'

    ;

    --===== These are some working variables. Their names tell what they do.

    DECLARE @MonthStart DATETIME,

    @NextMonthStart DATETIME,

    @Days INT,

    @sql VARCHAR(MAX)

    ;

    SELECT @MonthStart = DATEADD(mm,DATEDIFF(mm,'1753',@pDesiredMonth),'1753'),

    @NextMonthStart = DATEADD(mm,1,@MonthStart),

    @Days = DATEDIFF(dd,@MonthStart,@NextMonthStart)

    ;

    --=====================================================================================================================

    -- Produce the report.

    -- We could try to do this all in a single query using a CTE but using the "Divide'n'Conquer" methods I used do a

    -- couple of things...

    -- 1. It makes it a hell of a lot easier to troubleshoot especially since dynamic SQL is involved.

    -- 2. The "pre-aggregation" (as Peter Larsson calls it) speeds things up... a LOT! The code below returns

    -- more than 31000 "cells" of aggregated data for 1000 Task_ID's spread across all the dates of a month.

    -- 3. It also produces a "Total" line to total up the task count for each day and a "Grand Total". I know that's

    -- something you didn't ask for but your boss will someday in the future.

    --=====================================================================================================================

    --===== Create all of the dates we'll need for the month.

    -- We'll join the pre-aggregated data to this temp table to make sure that we list zero's for any given day where

    -- all tasks for the month are missing on that particular day.

    -- This is where you'll need a Tally Table. Read the article I posted on it. It WILL change your life.

    SELECT DesiredDay = N,

    DesiredDate = DATEADD(dd,t.N-1,@MonthStart)

    INTO #DesiredDates

    FROM dbo.Tally t

    WHERE N BETWEEN 1 AND @Days

    ;

    --===== This pre-aggregates the data and, yeah, it uses an index seek to do so.

    -- It's just easier than joining the dates above with a million row table and the "reflection" in a CTE

    -- will still cause that join to happen. Instead, we used "Divide'n'Conquer" on this.

    SELECT Task_ID,

    TheDay = DAY(Task_Date),

    Total_Count = SUM(Task_Count)

    INTO #PreAggregation

    FROM #TaskDetails

    WHERE Task_Date >= @MonthStart

    AND Task_Date < @NextMonthStart

    GROUP BY Task_ID, Task_Date

    ;

    --===== This creates the "Task_ID" portion of the report query.

    -- READ about "GROUP BY WITH CUBE" to learn what "GROUPING" does for WITH ROLLUP and WITH CUBE.

    SELECT @sql = 'SELECT Task_ID = CASE WHEN GROUPING(Task_ID) = 1 THEN SPACE(5)+''Total'' ELSE RIGHT(SPACE(10)+CAST(Task_ID AS VARCHAR(10)),10) END,'

    ;

    --===== This creates the section of the report query that creates the columns for each day of the selected month.

    SELECT @sql = @sql + CHAR(10)

    + 'SUM(CASE WHEN date.DesiredDay = ' + CAST(t.DesiredDay AS VARCHAR(2))

    + ' THEN preagg.Total_Count ELSE 0 END) AS ' + QUOTENAME(CONVERT(CHAR(6),t.DesiredDate,13))+','

    FROM #DesiredDates t

    ;

    --===== This creates the total for each Task_ID and finishes up the query with criteria, grouping, etc, etc.

    SELECT @sql = @sql + '

    Total = SUM(Total_Count)

    FROM #DesiredDates date

    LEFT JOIN #PreAggregation preagg

    ON date.DesiredDay = preagg.TheDay

    WHERE preagg.Task_ID > 0

    GROUP BY preagg.Task_ID WITH ROLLUP --READ ABOUT GROUP BY, WITH ROLLUP, and WITH CUBE (for "GROUPING")

    ORDER BY Task_ID

    ;

    '

    ;

    --===== This just prints the query out as a sanity check and may be eliminated from the production version of the code.

    PRINT @sql

    ;

    --===== This executes the dynamic SQL to actually produce the report

    EXEC (@SQL)

    ;

  • The "hard part" has been done. Please try adding the additional column on your own. As you say, it's an INNER JOIN.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply