December 11, 2015 at 7:26 pm
Larry.Hennig (12/11/2015)
Columnar output that is row-ordered is difficult to read,
I absolutely agree with that but only if such order is confined to individual pages. Just for grins, try the same thing with a million rows broken by 60 line page.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 11, 2015 at 8:58 pm
Ok... I realize that this is an older thread but people are still posting PIVOTs to do this and that's going to be slower than a good CROSSTAB. The problem is that the CROSSTABs that have been posted haven't been so hot and the best that's happened with all the code posted so far (well, except for the one that Larry.Hennig was kind enough to just make a posting for) is that the various CROSSTAB methods have mostly only tied the better PIVOT methods so far. As Scott Pletcher points out, part of the problem is that almost everyone has been using more than one ROW_NUMBER() and that's just not necessary.
But, even with using just one ROW_NUMBER(), even Scott's good CROSSTAB code was only neck and neck with the fastest (they take turns winning) PIVOT code written by Clif.Johnson and pcanino70 .
You can see all of this by running the following code, which creates a million row table similar to the original in the article (except I use GUIDs converted to VARCHAR to give it a little more bulk for testing and then runs almost all (saving one for later) of the various methods that actually accomplished the task cited in the original article for this discussion. It's all self-contained and runs in TempDB so we don't take a chance on dropping a real table just before building it.
RAISERROR('
--==================================================================================================
Building the test data...
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== Do this in a nice, safe place that everyone has.
USE tempdb
;
--===== If the test table already exists, drop it to make reruns in SSMS easier.
IF OBJECT_ID('tempdb.dbo.LotNo','U') IS NOT NULL
DROP TABLE dbo.LotNo
;
--===== Create and populate the table on-the-fly.
SELECT TOP 1000000
IDNo = IDENTITY(INT,1,7) --Increment by 7 to simulate gaps in IDs
, LotNo = CAST(NEWID() AS VARCHAR(50))
INTO dbo.LotNo
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
RAISERROR('
--==================================================================================================
Baseline Test with just the straight data...
This also "primes the pump" so that everyone has an equal chance.
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT Column1 = LotNo
, Column2 = LotNo
, Column3 = LotNo
INTO #Target1
FROM dbo.LotNo
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Code from the article...
(CROSSTAB - Multiple ROW_NUMBER())
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
with MstrTable ( LotNo, Remainder, ROWID )
as ( select
LotNo
, ( row_number() over ( order by LotNo ) + 2 ) % 3 as Remainder
, case when ( row_number() over ( order by LotNo ) + 2 ) % 3 = 0
then ( row_number() over ( order by LotNo ) + 2 )
else case when ( row_number() over ( order by LotNo )
+ 2 ) % 3 = 1
then ( row_number() over ( order by LotNo )
+ 1 )
else ( row_number() over ( order by LotNo ) )
end
end as RowID
from
dbo.LotNo
)
select
max(case when remainder = 0 then LotNo
else ''
end) as Col1
, max(case when remainder = 1 then LotNo
else ''
end) as Col2
, max(case when remainder = 2 then LotNo
else ''
end) as Col3
INTO #Target1 --ADDED THIS
from
mstrtable
group by
ROWID
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Kev Riley
(CROSSTAB - 2 ROW_NUMBER() in FROM Clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
select --ADDED COLUMN NAMES
Column1 = max(case when remainder = 0 then LotNo else '' end),
Column2 = max(case when remainder = 1 then LotNo else '' end),
Column3 = max(case when remainder = 2 then LotNo else '' end)
INTO #Target1 --ADDED THIS
from
(
select
LotNo
, ( row_number() over ( order by LotNo ) +2 ) % 3 as Remainder
, row_number() over ( order by LotNo ) - ( row_number() over ( order by LotNo ) +2 ) % 3 as ROWID
from
dbo.LotNo
)a
group by ROWID
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Clif.Johnson-989960
(PIVOT - 2 ROW_NUMBER() in FROM Clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT [0] Col1,[1] Col2,[2] Col3
INTO #Target1 --ADDED THIS
FROM (
SELECT LotNo
, ((ROW_NUMBER() OVER (ORDER BY LotNo)-1) % 3) col
, ((ROW_NUMBER() OVER (ORDER BY LotNo)-1) / 3) roww
FROM dbo.LotNo
) x
PIVOT (MIN(LotNo) FOR col IN ([0],[1],[2])) p
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Luis Cazares (Option 1)
(CROSSTAB - 2 ROW_NUMBER() in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH MstrTable AS(
SELECT LotNo
, ( ROW_NUMBER() OVER ( ORDER BY LotNo ) + 2 ) % 3 as Remainder
, ( ROW_NUMBER() OVER ( ORDER BY LotNo ) + 2 ) / 3 as RowID
FROM dbo.LotNo
)
SELECT
MAX ( CASE WHEN remainder = 0 THEN LotNo ELSE '' END ) as Col1
,MAX ( CASE WHEN remainder = 1 THEN LotNo ELSE '' END ) as Col2
,MAX ( CASE WHEN remainder = 2 THEN LotNo ELSE '' END ) as Col3
INTO #Target1 --ADDED THIS
FROM mstrtable
GROUP BY ROWID;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Luis Cazares (Option 2)
(CROSSTAB with NTILE() and ROW_NUMBER())
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH Groups AS(
SELECT LotNo
, NTILE( (SELECT COUNT(*) FROM dbo.LotNo) / 3 + 1) OVER(ORDER BY LotNo) as RowID
FROM dbo.LotNo
),
mstrtable AS(
SELECT LotNo,
RowID,
ROW_NUMBER() OVER( PARTITION BY RowID ORDER BY LotNo) AS remainder
FROM Groups
)
SELECT
MAX ( CASE WHEN remainder = 1 THEN LotNo ELSE '' END ) as Col1
,MAX ( CASE WHEN remainder = 2 THEN LotNo ELSE '' END ) as Col2
,MAX ( CASE WHEN remainder = 3 THEN LotNo ELSE '' END ) as Col3
INTO #Target1 --ADDED THIS
FROM mstrtable
GROUP BY ROWID;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
ScottPletcher
(CROSSTAB with single ROW_NUMBER() in FROM clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT
MAX(CASE WHEN row_num % 3 = 0 THEN LotNo ELSE '' END) AS Lot1,
MAX(CASE WHEN row_num % 3 = 1 THEN LotNo ELSE '' END) AS Lot2,
MAX(CASE WHEN row_num % 3 = 2 THEN LotNo ELSE '' END) AS Lot3
INTO #Target1 --ADDED THIS
FROM (
SELECT LotNo, ROW_NUMBER() OVER(ORDER BY LotNo) + 2 AS row_num
FROM dbo.LotNo
) AS derived
GROUP BY row_num / 3
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
zbychbor
(CROSSTAB - Multiple ROW_NUMBER() and 3 calls to the CTE using self joins)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
;with lot(lotNo) as (
select lotNo
FROM dbo.LotNo
)
, cte02 as (
select rn = row_number() over (order by lotNo)
,rn2 = 1 + (-1 + row_number() over (order by lotNo)) % 3
,rn3 = 1 + (-1 + row_number() over (order by lotNo)) / 3
,*
from lot
)
, cte03 as (
select rn = C1.rn
, L1 = C1.lotNo
, L2 = c2.lotNo
, L3 = c3.lotNo
from cte02 as C1
left join cte02 as C2 on C1.rn3 = C2.rn3 and c2.rn2 = 2
left join cte02 as C3 on C1.rn3 = C3.rn3 and c3.rn2 = 3
where c1.rn2 = 1
)
select L1
,L2 = isnull(L2, '')
,L3 = isnull(L3, '')
INTO #Target1 --ADDED THIS
from cte03
order by rn
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
pcanino70
(PIVOT - 2 ROW_NUMBER() in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
with cte
as
(
select LotNo
, ( row_number() over ( order by LotNo ) + 2 ) % 3 as Remainder
, ( row_number() over ( order by LotNo ) - 1) / 3 as Category
from dbo.LotNo
)
select Category, [0] as column1, [1] as column2, [2] as column3
INTO #Target1 -- ADDED THIS
from cte
pivot
(
min(LotNo) for Remainder in ([0],[1],[2])
) as piv
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Wilson.Mead
(PIVOT)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
IF OBJECT_ID('tempdb..#ColumnNames','U') IS NOT NULL
DROP TABLE #ColumnNames
;
--===== Conduct the test
SET STATISTICS TIME ON
;
CREATE TABLE #ColumnNames (ColumnKey int, Columnname VARCHAR(10))
DELETE #ColumnNames
INSERT INTO #ColumnNames (ColumnKey, ColumnName)
Values (0, 'Column1')
, (1, 'Column2')
, (2, 'Column3')
SELECT max(column1) Lot1, max(column2) Lot2, max(column3) Lot3
INTO #Target1 --ADDED THIS
FROM (
SELECT Source.LotNo, source.ColumnKey, Source.RowKey, map.Columnname FROM (
SELECT LotNo
, (Row_Number() over (order by LotNo) + 2) / 3 as RowKey
, (Row_Number() over (order by LotNo) - 1) % 3 as ColumnKey
FROM dbo.LotNo
) Source
INNER JOIN #ColumnNames Map on Map.ColumnKey = Source.ColumnKey %3
) AS ResultSet
PIVOT (
MAX (LotNo) FOR ColumnName in (Column1, Column2, Column3)
) as it
group by RowKey
SET STATISTICS TIME OFF
;
GO
Here are the run results that I was talking about. They don't show any particular advantage of CROSSTAB over pivot (yet).
--==================================================================================================
Building the test data...
--==================================================================================================
(1000000 row(s) affected)
--==================================================================================================
Baseline Test with just the straight data...
This also "primes the pump" so that everyone has an equal chance.
--==================================================================================================
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 985 ms, elapsed time = 987 ms.
(1000000 row(s) affected)
--==================================================================================================
Code from the article...
(CROSSTAB - Multiple ROW_NUMBER())
--==================================================================================================
SQL Server Execution Times:
CPU time = 15125 ms, elapsed time = 5330 ms.
(333334 row(s) affected)
--==================================================================================================
Kev Riley
(CROSSTAB - 2 ROW_NUMBER() in FROM Clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 14435 ms, elapsed time = 4950 ms.
(333334 row(s) affected)
--==================================================================================================
Clif.Johnson-989960
(PIVOT - 2 ROW_NUMBER() in FROM Clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13203 ms, elapsed time = 4625 ms.
(333334 row(s) affected)
--==================================================================================================
Luis Cazares (Option 1)
(CROSSTAB - 2 ROW_NUMBER() in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 14313 ms, elapsed time = 4758 ms.
(333334 row(s) affected)
--==================================================================================================
Luis Cazares (Option 2)
(CROSSTAB with NTILE() and ROW_NUMBER())
--==================================================================================================
SQL Server Execution Times:
CPU time = 16218 ms, elapsed time = 10088 ms.
(333334 row(s) affected)
--==================================================================================================
ScottPletcher
(CROSSTAB with single ROW_NUMBER() in FROM clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13501 ms, elapsed time = 4411 ms.
(333334 row(s) affected)
--==================================================================================================
zbychbor
(CROSSTAB - Multiple ROW_NUMBER() and 3 calls to the CTE using self joins)
--==================================================================================================
SQL Server Execution Times:
CPU time = 18828 ms, elapsed time = 19000 ms.
(333334 row(s) affected)
--==================================================================================================
pcanino70
(PIVOT - 2 ROW_NUMBER() in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13578 ms, elapsed time = 4698 ms.
(333334 row(s) affected)
--==================================================================================================
Wilson.Mead
(PIVOT)
--==================================================================================================
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 1 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(0 row(s) affected)
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(3 row(s) affected)
SQL Server parse and compile time:
CPU time = 16 ms, elapsed time = 16 ms.
SQL Server Execution Times:
CPU time = 18062 ms, elapsed time = 5708 ms.
Warning: Null value is eliminated by an aggregate or other SET operation.
(333334 row(s) affected)
[font="Arial Black"]A Faster CROSSTAB[/font]
Like I said, the results don't show any particular advantage of CROSSTAB over PIVOT and yet I'm still going to insist that CROSSTAB is about 38% faster and uses about 36% less CPU time than PIVOTs. Run the following code and see the results there.
RAISERROR('
--==================================================================================================
Jeff Moden
(CROSSTAB - 1 ROW_NUMBER in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH
cteEnumerate AS
(
SELECT RowNum = ROW_NUMBER() OVER (ORDER BY IDNo)-1
,LotNo
FROM dbo.LotNo
)
SELECT Column1 = MAX(CASE WHEN RowNum%3 = 0 THEN LotNo ELSE '' END)
, Column2 = MAX(CASE WHEN RowNum%3 = 1 THEN LotNo ELSE '' END)
, Column3 = MAX(CASE WHEN RowNum%3 = 2 THEN LotNo ELSE '' END)
INTO #Target1
FROM cteEnumerate
GROUP BY RowNum/3
;
SET STATISTICS TIME OFF
;
GO
Here are the results from the above.
--==================================================================================================
Jeff Moden
(CROSSTAB - 1 ROW_NUMBER in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 9657 ms, elapsed time = 3345 ms.
(333334 row(s) affected)
If you check the Actual Execution Plans between Scott's CROSSTAB with the ROW_NUMBER() in the FROM clause and mine with the ROW_NUMBER() in the CTE, you'll only see minor differences between them. Someone like Grant Fritchey could explain it much better but what I saw was that the optimizer over estimated the size of the rows for Scott's code. I've not done a deep enough dive to explain better than that but the CTE version also always came out the fastest in the article I posted earlier in this thread.
Can we make the PIVOT run a bit faster by limiting it to only a single ROW_NUMBER()? I'm thinking "NOT" but I welcome someone to try and report the test results compared to the CTE CROSSTAB that I wrote above.
[font="Arial Black"]And now... the rest of the story.[/font]
Now, I told you that story to tell you this one. A long time ago when 2005 first came out, a good friend of mine who goes by the handle of "Sergiy" on this forum and I took on all comers for similar testing except we beat the tar out of other folks methods that did use ROW_NUMBER() by using much older tricks that most have forgotten by now. Luis Cazares didn't forget the method when someone asked how to solve this problem in SQL Server 2000 where ROW_NUMBER() didn't yet exist but I'm not sure that he realizes that it's actually the fastest method. So let's run his code and see what happens.
RAISERROR('
--==================================================================================================
Luis Cazares
(CROSSTAB for SQL Server 2000 - uses Temp Table for numbering)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#LotNo','U') IS NOT NULL
DROP TABLE #LotNo
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT IDENTITY( int, 3, 1) AS RowNo, --Note the Identity definition starts at 3
LotNo
INTO #LotNo
FROM dbo.LotNo -- CHANGED THIS TO A TEMP TABLE
SELECT MAX( CASE WHEN RowNo % 3 = 0 THEN LotNo ELSE '' END) AS Column1,
MAX( CASE WHEN RowNo % 3 = 1 THEN LotNo ELSE '' END) AS Column2,
MAX( CASE WHEN RowNo % 3 = 2 THEN LotNo ELSE '' END) AS Column3
INTO #Target1 --ADDED THIS
FROM #LotNo
GROUP BY RowNo / 3
SET STATISTICS TIME OFF
;
GO
... and here are the results from his code.
--==================================================================================================
Luis Cazares
(CROSSTAB for SQL Server 2000 - uses Temp Table for numbering)
--==================================================================================================
(1000000 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 1734 ms, elapsed time = 1732 ms.
SQL Server parse and compile time:
CPU time = 382 ms, elapsed time = 382 ms.
(333334 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 3766 ms, elapsed time = 1187 ms.
If we add up all of the CPU and Elapsed Times including the recompile the creation and population of the Temp Table force, here's what we end up with...
CPU time = 5882 ms, elapsed time = 3301 ms.
... and just like Sergiy and I demonstrated years ago, that blows the doors off of any of the supposed modern methods of pivoting data using T-SQL.
So, not only have we proven that PIVOT is challenged for performance when compared to a properly written "modern" CTE-based CROSSTAB but, if you set bit 137 of DBCC TIMEWARP and meet Mr. Peabody and Sherman in the "Wayback Machine" sometime prior to 2005, you'll find that the older CROSSTABs beat the tar out of everything.
If you're still interested in other performance tests and how to make use of the CTE-based CROSSTABs even faster (for a more generic set of problems), please see the article that I originally posted. Here's the URL for that and the companion article on how to easily write dynamic CROSSTABs that are driven by the data.
http://www.sqlservercentral.com/articles/T-SQL/63681/
http://www.sqlservercentral.com/articles/Crosstab/65048/
--Jeff Moden
Change is inevitable... Change for the better is not.
December 11, 2015 at 9:07 pm
Lol! That would be more of a challenge! But still doable - you would have to calculate page numbers. A high-volume scenario would be another special case. I would probably look to a report formatting engine or automate word with a multiple-column layout and let that mature, optimized code to the work.
December 11, 2015 at 9:28 pm
Larry.Hennig (12/11/2015)
Lol! That would be more of a challenge! But still doable - you would have to calculate page numbers. A high-volume scenario would be another special case. I would probably look to a report formatting engine or automate word with a multiple-column layout and let that mature, optimized code to the work.
Heh... + 1000. Spot on with that! Still, it could be done if push came to shove. Thankfully, there's no pushing or shoving today! ๐
--Jeff Moden
Change is inevitable... Change for the better is not.
December 13, 2015 at 10:48 am
x
December 13, 2015 at 10:51 am
Jeff Moden (12/11/2015)
Ok... I realize that this is an older thread but people are still posting PIVOTs to do this and that's going to be slower than a good CROSSTAB. The problem is that the CROSSTABs that have been posted haven't been so hot and the best that's happened with all the code posted so far (well, except for the one that Larry.Hennig was kind enough to just make a posting for) is that the various CROSSTAB methods have mostly only tied the better PIVOT methods so far. As Scott Pletcher points out, part of the problem is that almost everyone has been using more than one ROW_NUMBER() and that's just not necessary.But, even with using just one ROW_NUMBER(), even Scott's good CROSSTAB code was only neck and neck with the fastest (they take turns winning) PIVOT code written by Clif.Johnson and pcanino70 .
You can see all of this by running the following code, which creates a million row table similar to the original in the article (except I use GUIDs converted to VARCHAR to give it a little more bulk for testing and then runs almost all (saving one for later) of the various methods that actually accomplished the task cited in the original article for this discussion. It's all self-contained and runs in TempDB so we don't take a chance on dropping a real table just before building it.
RAISERROR('
--==================================================================================================
Building the test data...
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== Do this in a nice, safe place that everyone has.
USE tempdb
;
--===== If the test table already exists, drop it to make reruns in SSMS easier.
IF OBJECT_ID('tempdb.dbo.LotNo','U') IS NOT NULL
DROP TABLE dbo.LotNo
;
--===== Create and populate the table on-the-fly.
SELECT TOP 1000000
IDNo = IDENTITY(INT,1,7) --Increment by 7 to simulate gaps in IDs
, LotNo = CAST(NEWID() AS VARCHAR(50))
INTO dbo.LotNo
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
RAISERROR('
--==================================================================================================
Baseline Test with just the straight data...
This also "primes the pump" so that everyone has an equal chance.
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT Column1 = LotNo
, Column2 = LotNo
, Column3 = LotNo
INTO #Target1
FROM dbo.LotNo
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Code from the article...
(CROSSTAB - Multiple ROW_NUMBER())
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
with MstrTable ( LotNo, Remainder, ROWID )
as ( select
LotNo
, ( row_number() over ( order by LotNo ) + 2 ) % 3 as Remainder
, case when ( row_number() over ( order by LotNo ) + 2 ) % 3 = 0
then ( row_number() over ( order by LotNo ) + 2 )
else case when ( row_number() over ( order by LotNo )
+ 2 ) % 3 = 1
then ( row_number() over ( order by LotNo )
+ 1 )
else ( row_number() over ( order by LotNo ) )
end
end as RowID
from
dbo.LotNo
)
select
max(case when remainder = 0 then LotNo
else ''
end) as Col1
, max(case when remainder = 1 then LotNo
else ''
end) as Col2
, max(case when remainder = 2 then LotNo
else ''
end) as Col3
INTO #Target1 --ADDED THIS
from
mstrtable
group by
ROWID
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Kev Riley
(CROSSTAB - 2 ROW_NUMBER() in FROM Clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
select --ADDED COLUMN NAMES
Column1 = max(case when remainder = 0 then LotNo else '' end),
Column2 = max(case when remainder = 1 then LotNo else '' end),
Column3 = max(case when remainder = 2 then LotNo else '' end)
INTO #Target1 --ADDED THIS
from
(
select
LotNo
, ( row_number() over ( order by LotNo ) +2 ) % 3 as Remainder
, row_number() over ( order by LotNo ) - ( row_number() over ( order by LotNo ) +2 ) % 3 as ROWID
from
dbo.LotNo
)a
group by ROWID
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Clif.Johnson-989960
(PIVOT - 2 ROW_NUMBER() in FROM Clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT [0] Col1,[1] Col2,[2] Col3
INTO #Target1 --ADDED THIS
FROM (
SELECT LotNo
, ((ROW_NUMBER() OVER (ORDER BY LotNo)-1) % 3) col
, ((ROW_NUMBER() OVER (ORDER BY LotNo)-1) / 3) roww
FROM dbo.LotNo
) x
PIVOT (MIN(LotNo) FOR col IN ([0],[1],[2])) p
;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Luis Cazares (Option 1)
(CROSSTAB - 2 ROW_NUMBER() in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH MstrTable AS(
SELECT LotNo
, ( ROW_NUMBER() OVER ( ORDER BY LotNo ) + 2 ) % 3 as Remainder
, ( ROW_NUMBER() OVER ( ORDER BY LotNo ) + 2 ) / 3 as RowID
FROM dbo.LotNo
)
SELECT
MAX ( CASE WHEN remainder = 0 THEN LotNo ELSE '' END ) as Col1
,MAX ( CASE WHEN remainder = 1 THEN LotNo ELSE '' END ) as Col2
,MAX ( CASE WHEN remainder = 2 THEN LotNo ELSE '' END ) as Col3
INTO #Target1 --ADDED THIS
FROM mstrtable
GROUP BY ROWID;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Luis Cazares (Option 2)
(CROSSTAB with NTILE() and ROW_NUMBER())
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH Groups AS(
SELECT LotNo
, NTILE( (SELECT COUNT(*) FROM dbo.LotNo) / 3 + 1) OVER(ORDER BY LotNo) as RowID
FROM dbo.LotNo
),
mstrtable AS(
SELECT LotNo,
RowID,
ROW_NUMBER() OVER( PARTITION BY RowID ORDER BY LotNo) AS remainder
FROM Groups
)
SELECT
MAX ( CASE WHEN remainder = 1 THEN LotNo ELSE '' END ) as Col1
,MAX ( CASE WHEN remainder = 2 THEN LotNo ELSE '' END ) as Col2
,MAX ( CASE WHEN remainder = 3 THEN LotNo ELSE '' END ) as Col3
INTO #Target1 --ADDED THIS
FROM mstrtable
GROUP BY ROWID;
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
ScottPletcher
(CROSSTAB with single ROW_NUMBER() in FROM clause)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT
MAX(CASE WHEN row_num % 3 = 0 THEN LotNo ELSE '' END) AS Lot1,
MAX(CASE WHEN row_num % 3 = 1 THEN LotNo ELSE '' END) AS Lot2,
MAX(CASE WHEN row_num % 3 = 2 THEN LotNo ELSE '' END) AS Lot3
INTO #Target1 --ADDED THIS
FROM (
SELECT LotNo, ROW_NUMBER() OVER(ORDER BY LotNo) + 2 AS row_num
FROM dbo.LotNo
) AS derived
GROUP BY row_num / 3
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
zbychbor
(CROSSTAB - Multiple ROW_NUMBER() and 3 calls to the CTE using self joins)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
;with lot(lotNo) as (
select lotNo
FROM dbo.LotNo
)
, cte02 as (
select rn = row_number() over (order by lotNo)
,rn2 = 1 + (-1 + row_number() over (order by lotNo)) % 3
,rn3 = 1 + (-1 + row_number() over (order by lotNo)) / 3
,*
from lot
)
, cte03 as (
select rn = C1.rn
, L1 = C1.lotNo
, L2 = c2.lotNo
, L3 = c3.lotNo
from cte02 as C1
left join cte02 as C2 on C1.rn3 = C2.rn3 and c2.rn2 = 2
left join cte02 as C3 on C1.rn3 = C3.rn3 and c3.rn2 = 3
where c1.rn2 = 1
)
select L1
,L2 = isnull(L2, '')
,L3 = isnull(L3, '')
INTO #Target1 --ADDED THIS
from cte03
order by rn
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
pcanino70
(PIVOT - 2 ROW_NUMBER() in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
with cte
as
(
select LotNo
, ( row_number() over ( order by LotNo ) + 2 ) % 3 as Remainder
, ( row_number() over ( order by LotNo ) - 1) / 3 as Category
from dbo.LotNo
)
select Category, [0] as column1, [1] as column2, [2] as column3
INTO #Target1 -- ADDED THIS
from cte
pivot
(
min(LotNo) for Remainder in ([0],[1],[2])
) as piv
SET STATISTICS TIME OFF
;
GO
RAISERROR('
--==================================================================================================
Wilson.Mead
(PIVOT)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
IF OBJECT_ID('tempdb..#ColumnNames','U') IS NOT NULL
DROP TABLE #ColumnNames
;
--===== Conduct the test
SET STATISTICS TIME ON
;
CREATE TABLE #ColumnNames (ColumnKey int, Columnname VARCHAR(10))
DELETE #ColumnNames
INSERT INTO #ColumnNames (ColumnKey, ColumnName)
Values (0, 'Column1')
, (1, 'Column2')
, (2, 'Column3')
SELECT max(column1) Lot1, max(column2) Lot2, max(column3) Lot3
INTO #Target1 --ADDED THIS
FROM (
SELECT Source.LotNo, source.ColumnKey, Source.RowKey, map.Columnname FROM (
SELECT LotNo
, (Row_Number() over (order by LotNo) + 2) / 3 as RowKey
, (Row_Number() over (order by LotNo) - 1) % 3 as ColumnKey
FROM dbo.LotNo
) Source
INNER JOIN #ColumnNames Map on Map.ColumnKey = Source.ColumnKey %3
) AS ResultSet
PIVOT (
MAX (LotNo) FOR ColumnName in (Column1, Column2, Column3)
) as it
group by RowKey
SET STATISTICS TIME OFF
;
GO
Here are the run results that I was talking about. They don't show any particular advantage of CROSSTAB over pivot (yet).
--==================================================================================================
Building the test data...
--==================================================================================================
(1000000 row(s) affected)
--==================================================================================================
Baseline Test with just the straight data...
This also "primes the pump" so that everyone has an equal chance.
--==================================================================================================
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 985 ms, elapsed time = 987 ms.
(1000000 row(s) affected)
--==================================================================================================
Code from the article...
(CROSSTAB - Multiple ROW_NUMBER())
--==================================================================================================
SQL Server Execution Times:
CPU time = 15125 ms, elapsed time = 5330 ms.
(333334 row(s) affected)
--==================================================================================================
Kev Riley
(CROSSTAB - 2 ROW_NUMBER() in FROM Clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 14435 ms, elapsed time = 4950 ms.
(333334 row(s) affected)
--==================================================================================================
Clif.Johnson-989960
(PIVOT - 2 ROW_NUMBER() in FROM Clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13203 ms, elapsed time = 4625 ms.
(333334 row(s) affected)
--==================================================================================================
Luis Cazares (Option 1)
(CROSSTAB - 2 ROW_NUMBER() in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 14313 ms, elapsed time = 4758 ms.
(333334 row(s) affected)
--==================================================================================================
Luis Cazares (Option 2)
(CROSSTAB with NTILE() and ROW_NUMBER())
--==================================================================================================
SQL Server Execution Times:
CPU time = 16218 ms, elapsed time = 10088 ms.
(333334 row(s) affected)
--==================================================================================================
ScottPletcher
(CROSSTAB with single ROW_NUMBER() in FROM clause)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13501 ms, elapsed time = 4411 ms.
(333334 row(s) affected)
--==================================================================================================
zbychbor
(CROSSTAB - Multiple ROW_NUMBER() and 3 calls to the CTE using self joins)
--==================================================================================================
SQL Server Execution Times:
CPU time = 18828 ms, elapsed time = 19000 ms.
(333334 row(s) affected)
--==================================================================================================
pcanino70
(PIVOT - 2 ROW_NUMBER() in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 13578 ms, elapsed time = 4698 ms.
(333334 row(s) affected)
--==================================================================================================
Wilson.Mead
(PIVOT)
--==================================================================================================
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 1 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(0 row(s) affected)
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(3 row(s) affected)
SQL Server parse and compile time:
CPU time = 16 ms, elapsed time = 16 ms.
SQL Server Execution Times:
CPU time = 18062 ms, elapsed time = 5708 ms.
Warning: Null value is eliminated by an aggregate or other SET operation.
(333334 row(s) affected)
[font="Arial Black"]A Faster CROSSTAB[/font]
Like I said, the results don't show any particular advantage of CROSSTAB over PIVOT and yet I'm still going to insist that CROSSTAB is about 38% faster and uses about 36% less CPU time than PIVOTs. Run the following code and see the results there.
RAISERROR('
--==================================================================================================
Jeff Moden
(CROSSTAB - 1 ROW_NUMBER in CTE)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== Conduct the test
SET STATISTICS TIME ON
;
WITH
cteEnumerate AS
(
SELECT RowNum = ROW_NUMBER() OVER (ORDER BY IDNo)-1
,LotNo
FROM dbo.LotNo
)
SELECT Column1 = MAX(CASE WHEN RowNum%3 = 0 THEN LotNo ELSE '' END)
, Column2 = MAX(CASE WHEN RowNum%3 = 1 THEN LotNo ELSE '' END)
, Column3 = MAX(CASE WHEN RowNum%3 = 2 THEN LotNo ELSE '' END)
INTO #Target1
FROM cteEnumerate
GROUP BY RowNum/3
;
SET STATISTICS TIME OFF
;
GO
Here are the results from the above.
--==================================================================================================
Jeff Moden
(CROSSTAB - 1 ROW_NUMBER in CTE)
--==================================================================================================
SQL Server Execution Times:
CPU time = 9657 ms, elapsed time = 3345 ms.
(333334 row(s) affected)
If you check the Actual Execution Plans between Scott's CROSSTAB with the ROW_NUMBER() in the FROM clause and mine with the ROW_NUMBER() in the CTE, you'll only see minor differences between them. Someone like Grant Fritchey could explain it much better but what I saw was that the optimizer over estimated the size of the rows for Scott's code. I've not done a deep enough dive to explain better than that but the CTE version also always came out the fastest in the article I posted earlier in this thread.
Can we make the PIVOT run a bit faster by limiting it to only a single ROW_NUMBER()? I'm thinking "NOT" but I welcome someone to try and report the test results compared to the CTE CROSSTAB that I wrote above.
[font="Arial Black"]And now... the rest of the story.[/font]
Now, I told you that story to tell you this one. A long time ago when 2005 first came out, a good friend of mine who goes by the handle of "Sergiy" on this forum and I took on all comers for similar testing except we beat the tar out of other folks methods that did use ROW_NUMBER() by using much older tricks that most have forgotten by now. Luis Cazares didn't forget the method when someone asked how to solve this problem in SQL Server 2000 where ROW_NUMBER() didn't yet exist but I'm not sure that he realizes that it's actually the fastest method. So let's run his code and see what happens.
RAISERROR('
--==================================================================================================
Luis Cazares
(CROSSTAB for SQL Server 2000 - uses Temp Table for numbering)
--=================================================================================================='
,0,0) WITH NOWAIT
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Target1','U') IS NOT NULL
DROP TABLE #Target1
;
--===== If the Temp Tables already exist, drop them to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#LotNo','U') IS NOT NULL
DROP TABLE #LotNo
;
--===== Conduct the test
SET STATISTICS TIME ON
;
SELECT IDENTITY( int, 3, 1) AS RowNo, --Note the Identity definition starts at 3
LotNo
INTO #LotNo
FROM dbo.LotNo -- CHANGED THIS TO A TEMP TABLE
SELECT MAX( CASE WHEN RowNo % 3 = 0 THEN LotNo ELSE '' END) AS Column1,
MAX( CASE WHEN RowNo % 3 = 1 THEN LotNo ELSE '' END) AS Column2,
MAX( CASE WHEN RowNo % 3 = 2 THEN LotNo ELSE '' END) AS Column3
INTO #Target1 --ADDED THIS
FROM #LotNo
GROUP BY RowNo / 3
SET STATISTICS TIME OFF
;
GO
... and here are the results from his code.
--==================================================================================================
Luis Cazares
(CROSSTAB for SQL Server 2000 - uses Temp Table for numbering)
--==================================================================================================
(1000000 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 1734 ms, elapsed time = 1732 ms.
SQL Server parse and compile time:
CPU time = 382 ms, elapsed time = 382 ms.
(333334 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 3766 ms, elapsed time = 1187 ms.
If we add up all of the CPU and Elapsed Times including the recompile the creation and population of the Temp Table force, here's what we end up with...
CPU time = 5882 ms, elapsed time = 3301 ms.
... and just like Sergiy and I demonstrated years ago, that blows the doors off of any of the supposed modern methods of pivoting data using T-SQL.
So, not only have we proven that PIVOT is challenged for performance when compared to a properly written "modern" CTE-based CROSSTAB but, if you set bit 137 of DBCC TIMEWARP and meet Mr. Peabody and Sherman in the "Wayback Machine" sometime prior to 2005, you'll find that the older CROSSTABs beat the tar out of everything.
If you're still interested in other performance tests and how to make use of the CTE-based CROSSTABs even faster (for a more generic set of problems), please see the article that I originally posted. Here's the URL for that and the companion article on how to easily write dynamic CROSSTABs that are driven by the data.
Well done! Like the results. The one thing I like about the pivot is for me it allows easy expansion if you need to fatten this out more.
December 13, 2015 at 7:25 pm
pcanino70 (12/13/2015)
Well done! Like the results. The one thing I like about the pivot is for me it allows easy expansion if you need to fatten this out more.
Thanks for the feedback but let me ask you a question... why do you think doing that same thing with a CROSSTAB is difficult?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 13, 2015 at 10:09 pm
Then I suggest, Jeff, you present your solution for an unknown ahead of time number of columns, which may be different between records. The simplest way I could find to do it was with dynamic SQL temp table, scaning the table once to determine the columns, once to determine the rows, and a final update to fill in the values,
Ted Seeber
I find your secrets in your data.
http://www.informaitonr.us
December 14, 2015 at 5:48 am
Jeff Moden (12/13/2015)
pcanino70 (12/13/2015)
Well done! Like the results. The one thing I like about the pivot is for me it allows easy expansion if you need to fatten this out more.Thanks for the feedback but let me ask you a question... why do you think doing that same thing with a CROSSTAB is difficult?
For this example I think you are right on the mark and the cross tab is defiantly the correct approach and Iโm rethinking some of my queries, my concern was on some more complex queries that I havenโt had a chance to evaluate to see if there is a benefit of pivot over cross tabs. Excellent post, thank you very much for your insight.
December 14, 2015 at 6:29 am
Thanks for the article. Quite a number of good alternatives in the comments too.
December 14, 2015 at 7:01 am
I did some further analysis on Jeff Moden's example and I'm convinced his cross tab method is the proper way to go, I just converted a query that is almost 40% faster to execute with80% less CPU time. Well done sir, you have changed my approach going forward...
December 16, 2015 at 6:42 am
Here is simple way of getting the output.
declare @LotNo table (id int identity(1,1),LotNo varchar(50) Null)
insert into @LotNo
select 'A'
union all
select 'B'
union all
Select 'C'
union all
select 'D'
union all
Select 'E'
union all
select 'F'
union all
Select 'G'
union all
select 'H'
union all
Select 'I'
union all
select 'J'
union all
Select 'K'
union all
select 'L'
union all
Select 'M'
union all
select 'N'
union all
Select 'O'
union all
select 'P'
union all
Select 'Q'
union all
select 'R'
union all
Select 'S'
union all
select 'T'
union all
Select 'U'
union all
select 'V'
union all
Select 'W'
union all
select 'X'
union all
Select 'Y'
union all
select 'Z'
select col1,col2,col3 from
(
select Id,LotNo col1,
lead(LotNo,1,'') over (order by LotNo) col2,
lead(LotNo,2,'') over (order by LotNo) col3
from @LotNo
)a
where (id%3)=1
Regards
Siva ๐
December 16, 2015 at 7:04 am
sivaj2k (12/16/2015)
Here is simple way of getting the output....
Regards
Siva ๐
Only if you can be sure that there won't be any gaps. In the real world, that's unlikely to happen.
December 16, 2015 at 2:21 pm
pcanino70 (12/14/2015)
I did some further analysis on Jeff Moden's example and I'm convinced his cross tab method is the proper way to go, I just converted a query that is almost 40% faster to execute with80% less CPU time. Well done sir, you have changed my approach going forward...
Well done to you! A lot of people won't go back to things that are working to try to make them better. And, thank you very much for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 14 posts - 46 through 58 (of 58 total)
You must be logged in to reply to this topic. Login to reply