April 15, 2011 at 11:38 am
hi all,
I want to unpivot multiple columns in my table and i came across this article
http://mangalpardeshi.blogspot.com/2009/04/unpivot-multiple-columns.html
This blog is very nice ,but the problem i have is i don't have an ID column in my table , i am not getting good results with out the where condition
WHERE RIGHT(Suppliers,1) = RIGHT(Cities,1) in the blog .
Is there any way i can unpivot multiple columns for the example in the blog with out a ID(unique) column?
April 16, 2011 at 8:16 pm
srilu_bannu (4/15/2011)
hi all,I want to unpivot multiple columns in my table and i came across this article
http://mangalpardeshi.blogspot.com/2009/04/unpivot-multiple-columns.html
This blog is very nice ,but the problem i have is i don't have an ID column in my table , i am not getting good results with out the where condition
WHERE RIGHT(Suppliers,1) = RIGHT(Cities,1) in the blog .
Is there any way i can unpivot multiple columns for the example in the blog with out a ID(unique) column?
There's a huge performance problem associated with the code in the blog. It has an "Accidental CUBE Join" (Also known as a Double Cross Join) it it. If you use the test data and the code provided in that article and run it with the Actual Execution Plan turned on, you'll find an arrow which shows that 27 rows (3 sets * 3 rows * 3 unpivots) have been spawned interally. With the addition of a 4th row, 36 internal rows are generated. With 1000 rows, 9000 internal rows would be generated. In other words, the code will take 9 times longer and use 9 times the number of resources than it needs to.
Further, as you've already found out, the code also relies on joins that you may not have available.
Try something like this, instead... it only creates the same number of internal rows as what will appear in the final output and it was done without the ID column that you say you don't have... and the code is a whole lot less complex, as well. This code has been tested against the code provided in the blog article you provided the link for.
SELECT s.Product, d.Supplier, d.City
FROM dbo.Suppliers s
CROSS APPLY (
SELECT Supplier1, City1 UNION ALL
SELECT Supplier2, City2 UNION ALL
SELECT Supplier3, City3
) d (Supplier, City)
GO
Hat's off to Paul White for first showing me this method.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 2, 2013 at 9:08 am
Just what I needed !!!
Hats off for sharing this very nice solution!
November 13, 2014 at 1:06 pm
Amazing!
November 14, 2014 at 11:18 am
Thank you both for the kind feedback.
For those looking to modernize (even though a bit less obvious to read, IMHO), we can change the multiple SELECT/UNION ALLs to VALUES, thusly.
SELECT s.Product, d.Supplier, d.City
FROM dbo.Suppliers s
CROSS APPLY (VALUES
(Supplier1, City1)
,(Supplier2, City2)
,(Supplier3, City3)
) d (Supplier, City)
;
Please not that the change does NOT make the code any faster so, when it comes to readability, use your own preference or follow the standards that may be enforced in your shop.
Also notice that the code above does NOT work in version 2005. The original works in all versions from 2005 and up.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 15, 2014 at 4:37 am
I cannot resist pitching in another method using a Tally type cross-tab, which performs at least as good as the method Jeff posted, although it doesn't result in a parallel plan. The difference isn't great on smaller and simpler sets, somewhat greater on larger and more complex sets.
😎
Simple sample set and the two methods
USE tempdb;
GO
SET NOCOUNT ON;
IF OBJECT_ID('dbo.TBL_SAMPLE_DATA') IS NOT NULL DROP TABLE dbo.TBL_SAMPLE_DATA;
CREATE TABLE dbo.TBL_SAMPLE_DATA
(
ROW_ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED NOT NULL
,Airline VARCHAR(50) NOT NULL
,Aircraft1 VARCHAR(50) NOT NULL
,Aircraft2 VARCHAR(50) NOT NULL
,Aircraft3 VARCHAR(50) NOT NULL
,City1 VARCHAR(50) NOT NULL
,City2 VARCHAR(50) NOT NULL
,City3 VARCHAR(50) NOT NULL
)
INSERT INTO dbo.TBL_SAMPLE_DATA
(
Airline
,Aircraft1
,Aircraft2
,Aircraft3
,City1
,City2
,City3
)
VALUES
('AZ','A319','A320','A330','FCO','LIN','MXP')
,('BA','A380','B747','A320','LHR','LGW','LCY')
,('DL','B777','A330','B757','DTW','BOS','LAX')
,('AA','B767','B777','A319','DFW','JFK','ORD')
,('NW','A320','B757','MD90','MSP','SLC','SEA')
,('KL','B747','A380','B737','AMS','CDG','TPE')
;
RAISERROR('CROSS APPLY UNION',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
SELECT s.Airline, d.Aircraft, d.City
FROM dbo.TBL_SAMPLE_DATA s
CROSS APPLY (
SELECT Aircraft1, City1 UNION ALL
SELECT Aircraft2, City2 UNION ALL
SELECT Aircraft3, City3
) d (Aircraft, City);
SET STATISTICS TIME, IO OFF;
RAISERROR('TALLY CASE',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
;WITH NUMA(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
SELECT
SD.Airline
,CASE
WHEN NA.N = 1 THEN SD.Aircraft1
WHEN NA.N = 2 THEN SD.Aircraft2
WHEN NA.N = 3 THEN SD.Aircraft3
END AS Aircraft
,CASE
WHEN NA.N = 1 THEN SD.City1
WHEN NA.N = 2 THEN SD.City2
WHEN NA.N = 3 THEN SD.City3
END AS City
FROM dbo.TBL_SAMPLE_DATA SD
CROSS JOIN NUMA NA
SET STATISTICS TIME, IO OFF;
Output
CROSS APPLY UNION
Msg 50000, Level 1, State 1
SQL Server parse and compile time:
CPU time = 15 ms, elapsed time = 21 ms.
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
TALLY CASE
Msg 50000, Level 1, State 1
SQL Server parse and compile time:
CPU time = 4 ms, elapsed time = 4 ms.
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
A test set generator
USE tempdb;
GO
SET NOCOUNT ON;
IF OBJECT_ID('dbo.TBL_SAMPLE_DATA') IS NOT NULL DROP TABLE dbo.TBL_SAMPLE_DATA;
CREATE TABLE dbo.TBL_SAMPLE_DATA
(
ROW_ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED NOT NULL
,Airline VARCHAR(10) NOT NULL
,Aircraft1 VARCHAR(10) NOT NULL
,Aircraft2 VARCHAR(10) NOT NULL
,Aircraft3 VARCHAR(10) NOT NULL
,City1 VARCHAR(10) NOT NULL
,City2 VARCHAR(10) NOT NULL
,City3 VARCHAR(10) NOT NULL
)
DECLARE @SAMPLE_SIZE INT = 10000000;
;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
,BASE_SAMPLE AS
(
SELECT
STUFF(REPLICATE('0',LEN(@SAMPLE_SIZE) + 1),(LEN(@SAMPLE_SIZE) + 2) - LEN(NM.N),LEN(NM.N),CAST(NM.N AS VARCHAR(10))) AS KEY_COLUMN
,REPLACE(NEWID(),'-','X') AS SAMPLE_STRING
FROM NUMS NM
)
INSERT INTO dbo.TBL_SAMPLE_DATA
(
Airline
,Aircraft1
,Aircraft2
,Aircraft3
,City1
,City2
,City3
)
SELECT
BS.KEY_COLUMN
,SUBSTRING(BS.SAMPLE_STRING, 1, 3)
,SUBSTRING(BS.SAMPLE_STRING, 4, 3)
,SUBSTRING(BS.SAMPLE_STRING, 7, 3)
,SUBSTRING(BS.SAMPLE_STRING,10, 3)
,SUBSTRING(BS.SAMPLE_STRING,13, 3)
,SUBSTRING(BS.SAMPLE_STRING,16, 3)
FROM BASE_SAMPLE BS;
November 15, 2014 at 8:38 am
Very clever.
As you know, I'm one of those people that absolutely love Tally-Table-like constructs but, as I've also said, it's not a panacea of performance. In this case, it requires what I believe to be some unnecessary complexity and, although it performs in roughly the same amount of time, does require 3 times the number of reads on the million row example.
My suggestion would be to go with the simplest code in this case especially because it uses only a third of the IO even if it is logical reads.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2014 at 4:59 am
Jeff Moden (11/15/2014)
Very clever.
😀
As you know, I'm one of those people that absolutely love Tally-Table-like constructs but, as I've also said, it's not a panacea of performance. In this case, it requires what I believe to be some unnecessary complexity and, although it performs in roughly the same amount of time, does require 3 times the number of reads on the million row example.
My suggestion would be to go with the simplest code in this case especially because it uses only a third of the IO even if it is logical reads.
I fully agree that the simpler method is better when working with a simple transposition, in those cases, the Tally CrossTab is an overkill, but it doesn't take much increase in complexity to make it more attractive. As an example, in the previous sample the transposition can be described as A(B_n,C_n) where n is the number of column pairs, each key(A) will produce n number of lines. By expanding the product to A(BC), that is all combinations of Aircraft and City, the CROSS APPLY UNION method requires either a mapping of the Cartesian join in the subquery, where the code quickly becomes somewhat illegible or an introduction of another CROSS APPLY, which seriously hurts the performance.
😎
Quick code sample for the previous data sample
NOTE: this is a comparison of two unpivoting/transposing methods, this code will NOT produce the correct results for the posted problem!
USE tempdb;
GO
SET NOCOUNT ON;
DECLARE @CHAR_BUCKET_01 VARCHAR(10) = '';
DECLARE @CHAR_BUCKET_02 VARCHAR(10) = '';
DECLARE @CHAR_BUCKET_03 VARCHAR(10) = '';
RAISERROR('CROSS APPLY UNION',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
SELECT
@CHAR_BUCKET_01 = Airline
,@CHAR_BUCKET_02 = d.Aircraft
,@CHAR_BUCKET_03 = d.City
FROM dbo.TBL_SAMPLE_DATA s
CROSS APPLY (
SELECT Aircraft1, City1 UNION ALL
SELECT Aircraft1, City2 UNION ALL
SELECT Aircraft1, City3 UNION ALL
SELECT Aircraft2, City1 UNION ALL
SELECT Aircraft2, City2 UNION ALL
SELECT Aircraft2, City3 UNION ALL
SELECT Aircraft3, City1 UNION ALL
SELECT Aircraft3, City2 UNION ALL
SELECT Aircraft3, City3
) d (Aircraft, City);
SET STATISTICS TIME, IO OFF;
RAISERROR('CROSS APPLY UNION 2',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
SELECT
@CHAR_BUCKET_01 = s.Airline
,@CHAR_BUCKET_02 = d.Aircraft
,@CHAR_BUCKET_03 = C.City
FROM dbo.TBL_SAMPLE_DATA s
CROSS APPLY (
SELECT Aircraft1 UNION ALL
SELECT Aircraft2 UNION ALL
SELECT Aircraft3
) d (Aircraft)
CROSS APPLY (
SELECT City1 UNION ALL
SELECT City2 UNION ALL
SELECT City3
) c (City)
SET STATISTICS TIME, IO OFF;
RAISERROR('TALLY CASE',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
;WITH NUMA(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
, NUMB(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
SELECT
@CHAR_BUCKET_01 = SD.Airline
,@CHAR_BUCKET_02 = CASE
WHEN NA.N = 1 THEN SD.Aircraft1
WHEN NA.N = 2 THEN SD.Aircraft2
WHEN NA.N = 3 THEN SD.Aircraft3
END --AS Aircraft
,@CHAR_BUCKET_03 = CASE
WHEN NB.N = 1 THEN SD.City1
WHEN NB.N = 2 THEN SD.City2
WHEN NB.N = 3 THEN SD.City3
END --AS City
FROM dbo.TBL_SAMPLE_DATA SD
OUTER APPLY NUMA NA
OUTER APPLY NUMB NB
SET STATISTICS TIME, IO OFF;
RAISERROR('TALLY CASE 2',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
;WITH NUMA(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
SELECT
@CHAR_BUCKET_01 = SD.Airline
,@CHAR_BUCKET_02 = CASE
WHEN NA.N = 1 THEN SD.Aircraft1
WHEN NA.N = 2 THEN SD.Aircraft2
WHEN NA.N = 3 THEN SD.Aircraft3
END --AS Aircraft
,@CHAR_BUCKET_03 = CASE
WHEN NB.N = 1 THEN SD.City1
WHEN NB.N = 2 THEN SD.City2
WHEN NB.N = 3 THEN SD.City3
END --AS City
FROM dbo.TBL_SAMPLE_DATA SD
OUTER APPLY NUMA NA
OUTER APPLY NUMA NB
SET STATISTICS TIME, IO OFF;
Output (10^6 rows)
CROSS APPLY UNION
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 5, logical reads 6907, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4463 ms, elapsed time = 1822 ms.
CROSS APPLY UNION 2
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 5, logical reads 6907, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5568 ms, elapsed time = 1917 ms.
TALLY CASE
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 6829, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1670 ms, elapsed time = 1685 ms.
TALLY CASE 2
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 6829, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1653 ms, elapsed time = 1662 ms.
Edit: added a NOTE
November 16, 2014 at 4:10 pm
Eirikur Eiriksson (11/16/2014)
Jeff Moden (11/15/2014)
Very clever.😀
As you know, I'm one of those people that absolutely love Tally-Table-like constructs but, as I've also said, it's not a panacea of performance. In this case, it requires what I believe to be some unnecessary complexity and, although it performs in roughly the same amount of time, does require 3 times the number of reads on the million row example.
My suggestion would be to go with the simplest code in this case especially because it uses only a third of the IO even if it is logical reads.
I fully agree that the simpler method is better when working with a simple transposition, in those cases, the Tally CrossTab is an overkill, but it doesn't take much increase in complexity to make it more attractive. As an example, in the previous sample the transposition can be described as A(B_n,C_n) where n is the number of column pairs, each key(A) will produce n number of lines. By expanding the product to A(BC), that is all combinations of Aircraft and City, the CROSS APPLY UNION method requires either a mapping of the Cartesian join in the subquery, where the code quickly becomes somewhat illegible or an introduction of another CROSS APPLY, which seriously hurts the performance.
😎
Quick code sample for the previous data sample
USE tempdb;
GO
SET NOCOUNT ON;
DECLARE @CHAR_BUCKET_01 VARCHAR(10) = '';
DECLARE @CHAR_BUCKET_02 VARCHAR(10) = '';
DECLARE @CHAR_BUCKET_03 VARCHAR(10) = '';
RAISERROR('CROSS APPLY UNION',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
SELECT
@CHAR_BUCKET_01 = Airline
,@CHAR_BUCKET_02 = d.Aircraft
,@CHAR_BUCKET_03 = d.City
FROM dbo.TBL_SAMPLE_DATA s
CROSS APPLY (
SELECT Aircraft1, City1 UNION ALL
SELECT Aircraft1, City2 UNION ALL
SELECT Aircraft1, City3 UNION ALL
SELECT Aircraft2, City1 UNION ALL
SELECT Aircraft2, City2 UNION ALL
SELECT Aircraft2, City3 UNION ALL
SELECT Aircraft3, City1 UNION ALL
SELECT Aircraft3, City2 UNION ALL
SELECT Aircraft3, City3
) d (Aircraft, City);
SET STATISTICS TIME, IO OFF;
RAISERROR('CROSS APPLY UNION 2',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
SELECT
@CHAR_BUCKET_01 = s.Airline
,@CHAR_BUCKET_02 = d.Aircraft
,@CHAR_BUCKET_03 = C.City
FROM dbo.TBL_SAMPLE_DATA s
CROSS APPLY (
SELECT Aircraft1 UNION ALL
SELECT Aircraft2 UNION ALL
SELECT Aircraft3
) d (Aircraft)
CROSS APPLY (
SELECT City1 UNION ALL
SELECT City2 UNION ALL
SELECT City3
) c (City)
SET STATISTICS TIME, IO OFF;
RAISERROR('TALLY CASE',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
;WITH NUMA(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
, NUMB(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
SELECT
@CHAR_BUCKET_01 = SD.Airline
,@CHAR_BUCKET_02 = CASE
WHEN NA.N = 1 THEN SD.Aircraft1
WHEN NA.N = 2 THEN SD.Aircraft2
WHEN NA.N = 3 THEN SD.Aircraft3
END --AS Aircraft
,@CHAR_BUCKET_03 = CASE
WHEN NB.N = 1 THEN SD.City1
WHEN NB.N = 2 THEN SD.City2
WHEN NB.N = 3 THEN SD.City3
END --AS City
FROM dbo.TBL_SAMPLE_DATA SD
OUTER APPLY NUMA NA
OUTER APPLY NUMB NB
SET STATISTICS TIME, IO OFF;
RAISERROR('TALLY CASE 2',1,1) WITH NOWAIT;
SET STATISTICS IO, TIME ON;
;WITH NUMA(N) AS (SELECT N FROM (VALUES (1),(2),(3))AS X(N))
SELECT
@CHAR_BUCKET_01 = SD.Airline
,@CHAR_BUCKET_02 = CASE
WHEN NA.N = 1 THEN SD.Aircraft1
WHEN NA.N = 2 THEN SD.Aircraft2
WHEN NA.N = 3 THEN SD.Aircraft3
END --AS Aircraft
,@CHAR_BUCKET_03 = CASE
WHEN NB.N = 1 THEN SD.City1
WHEN NB.N = 2 THEN SD.City2
WHEN NB.N = 3 THEN SD.City3
END --AS City
FROM dbo.TBL_SAMPLE_DATA SD
OUTER APPLY NUMA NA
OUTER APPLY NUMA NB
SET STATISTICS TIME, IO OFF;
Output (10^6 rows)
CROSS APPLY UNION
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 5, logical reads 6907, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4463 ms, elapsed time = 1822 ms.
CROSS APPLY UNION 2
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 5, logical reads 6907, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5568 ms, elapsed time = 1917 ms.
TALLY CASE
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 6829, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1670 ms, elapsed time = 1685 ms.
TALLY CASE 2
Msg 50000, Level 1, State 1
Table 'TBL_SAMPLE_DATA'. Scan count 1, logical reads 6829, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1653 ms, elapsed time = 1662 ms.
Careful now... If there are 3 column pairs for AirCraft and City, each pair of which is supposed to return a single row (i.e. 3 un-pivoted rows for each original row), then none of the queries above are correct because, in the presence of 1*10^6 original rows, the result set returns 9*10^6 rows instead of 3*10^6 rows.
While that may be what you intended, it's not the solution for the given problem.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2014 at 10:55 pm
Jeff Moden (11/16/2014)
Careful now... If there are 3 column pairs for AirCraft and City, each pair of which is supposed to return a single row (i.e. 3 un-pivoted rows for each original row), then none of the queries above are correct because, in the presence of 1*10^6 original rows, the result set returns 9*10^6 rows instead of 3*10^6 rows.While that may be what you intended, it's not the solution for the given problem.
Absolutely right, the last code example is not a correct solution to the problem, it's a comparison of two unpivoting/transposing methods.
😎
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply