October 27, 2009 at 3:58 pm
I've got a table that is populated from the OUTPUT from an update statement:
declare @Update TABLE (
RowID INT,
Column1_old DECIMAL(2,0),
Column1_new DECIMAL(2,0),
Column2_old DATETIME,
Column2_new DATETIME,
Column3_old BIT,
Column3_new BIT)
insert into @Update
select 1, 25, 52, '20090101', '20091001', 1, 1 UNION
select 2, 82, 56, '20090101', '20090101', 1, 0 UNION
select 3, 25, 25, '20091001', '20091001', 1, 1
-- what the incoming data looks like
select * from @Update
I'd like to unpivot the different columns so that it looks like this:
-- desired results:
declare @Updates TABLE (RowID INT, Column_nm varchar(30), old_value varchar(30), new_value varchar(30))
insert into @Updates
select 1, 'Column1', 25, 52 UNION
select 1, 'Column2', '20090101','20091001' UNION
select 1, 'Column3', 1, 1 UNION
select 2, 'Column1', 82, 56 UNION
select 2, 'Column2', '20090101','20090101' UNION
select 2, 'Column3', 1, 0 UNION
select 3, 'Column1', 25, 25 UNION
select 3, 'Column2', '20090101','20091001' UNION
select 3, 'Column3', 1, 1
select * from @Updates order by RowID, Column_nm
The only solution that I've been able to come up with uses a select for each pair of columns, all unioned together:
declare @Updates2 TABLE (RowID INT, Column_nm varchar(30), old_value varchar(30), new_value varchar(30))
insert into @Updates2
select RowID, 'Column1', convert(varchar(30),column1_old), convert(varchar(30), column1_new) from @Update UNION
select RowID, 'Column2', convert(char(8), column2_old, 112), convert(char(8), column2_new, 112) from @Update UNION
select RowID, 'Column3', convert(char(1),column3_old), convert(char(1),column3_new) from @Update
select * from @Updates2 order by RowID, Column_nm
But, I'd like to avoid the repetitive hits against the table (especially when the actual output is about 30 pairs of columns).
Do you have any idea of if this can be accomplished, and if so, how?
Thanks!!!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
October 27, 2009 at 5:12 pm
To my amazement, this can be done. I sort of spaced out while playing with this and when I regained consciousness here it was. The query plan shows only a single scan of the source table (@update).
Let me know if you have any questions, Wayne.
-- cte is necessary to make all columns compatible datatypes.
-- if date format isn't what you want to see, add CONVERT()
;with cte (rowid,column1_old,column1_new,column2_old,column2_new,column3_old,column3_new) as
(select rowId, cast(column1_old as varchar),cast(column1_new as varchar)
,cast(column2_old as varchar),cast(column2_new as varchar),cast(column3_old as varchar)
,cast(column3_new as varchar)
from @update)
/*
UNPIVOT allows us to write out three rows for 1, but is limited to combining only a single
set of columns into one column. BUT, combined with old-school cross-tabbing using a
CASE expression, it gets the job done.
In the query below, we UNPIVOT to get the "New" column" but crosstab to get the "Old" column.
*/
select rowID,left(c1,7) as column_nm
,case when left(c1,7) = 'column1' then column1_old
when left(c1,7) = 'column2' then column2_old
when left(c1,7) = 'column3' then column3_old
end as old
,p1 as new
FROM
(SELECT *
FROM cte) AS p
UNPIVOT
(p1 FOR c1 IN
(column1_new,column2_new,column3_new)
)AS unpvt
order by rowID,left(c1,7)
I think this is worth filing away for future use when you want to convert a single row into multiple rows, and UNPIVOT alone isn't enough. The code might look clunky, but it's the best execution plan I've seen for that purpose
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
October 28, 2009 at 3:22 am
I have seen how fast is this technique for the first time during the "Phil Factor SQL Speed Phreak Competition: No 1".
It was Peso that came up with the UNPIVOT solution and, with my greatest surprise, it did a single scan on the table, with a very good execution time.
I would have expected UNPIVOT to perform something like a join or union internally, but it doesn't seem to need any of that.
-- Gianluca Sartori
October 28, 2009 at 4:36 am
A nice and surprising solution by Bob (it took me some time to see where c1 in left(c1, 7) was coming from). All I could come up with was:
select
RowID,
case
when column_no = 1 then
'Column1'
when column_no = 2 then
'Column2'
when column_no = 3 then
'Column3'
end Column_nm,
case
when column_no = 1 then
cast(Column1_old as sql_variant)
when column_no = 2 then
cast(Column2_old as sql_variant)
when column_no = 3 then
cast(Column3_old as sql_variant)
end old_value,
case
when column_no = 1 then
cast(Column1_new as sql_variant)
when column_no = 2 then
cast(Column2_new as sql_variant)
when column_no = 3 then
cast(Column3_new as sql_variant)
end new_value
from
@Update
cross join
(
select 1 column_no union all select 2 union all select 3
) n
order by
1, 2
To my surprise both solutions produce nearly the same query plan. But if you have 30 pairs of columns, as the OP mentioned I should definitely stay with Bob's solution.
Peter
October 28, 2009 at 6:32 am
DOH!!
Peter, I've been using Cross-Joins to expand tables forever and it never occurred to me last night. I think that your solution has serious merit and that Wayne should test both with his 30 column table to compare the performance with real world volumes. UNPIVOT may involve some overhead that your solution avoids. Frankly, I find yours easier to read and understand at a glance. It doesn't blend concepts. Nice work!!
Gianluca, I've seen Phil's competitions posted, but haven't had the time to study the winning solutions. I should probably make time to do that because I'm sure they are producing some truly ninja techniques.:-P
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
October 28, 2009 at 7:05 am
Hey guys, this looks awesome. I had pushed the UNPIVOT out of my mind 'cause I didn't think it would work with multiple sets of columns.
I'll post some results as soon as I can wrap my head around all of this, get it coded, and see what the test looks like. However, if you're talking about a single scan of the table, that sure beats 30 scans, once for each pair. Which is what I was trying to avoid.
Thanks for the help!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
October 28, 2009 at 7:10 am
You're welcome, Wayne. Please let us know which you finally go with and why. If one is significantly faster than the other, I'd like to know. I'm actually hoping that Peter's cross join wins, just because I'm sentimental about old-school techniques getting the job done. Worst case, we have a good technique for SQL2000 and a good technique for SQL25K and beyond.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
October 28, 2009 at 7:18 am
Thanks Bob. I just expected the pivot solution to require far less coding in the case of 30 column pairs. But even that doesn't seem to be the case, now I take closer look at it.
Peter
October 28, 2009 at 3:42 pm
Is it too late to add this to the mix...?
;WITH
TypeConvert AS
(
-- Convert to sql_variant for the upcoming UNPIVOT
SELECT rowid,
column1_old = CONVERT(SQL_VARIANT, column1_old),
column2_old = CONVERT(SQL_VARIANT, column2_old),
column3_old = CONVERT(SQL_VARIANT, column3_old),
column1_new = CONVERT(SQL_VARIANT, column1_new),
column2_new = CONVERT(SQL_VARIANT, column2_new),
column3_new = CONVERT(SQL_VARIANT, column3_new)
FROM @Update
),
Unpivoted AS
(
-- Break the column names up into key-value pairs
SELECT rowid,
column_nm = LEFT(column_name, 7),
version_nm = RIGHT(column_name,3) + '_value',
value
FROM TypeConvert
UNPIVOT (
value
FOR column_name IN
(column1_old, column2_old, column3_old, column1_new, column2_new, column3_new)
) PointlessAlias
),
Pivoted AS
(
-- Un-unpivot in the way we want stuff displayed
SELECT rowid,
column_nm,
old_value,
new_value
FROM Unpivoted
PIVOT (
MAX(value)
FOR version_nm IN (old_value, new_value)
) PointlessAlias
)
-- The output, already
SELECT rowid,
column_nm,
old_value,
new_value
FROM Pivoted
ORDER BY
rowid, column_nm;
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
October 28, 2009 at 5:35 pm
Your input is always welcome, Emperor Paulpatine. Your solution is cool. I even like the use of SQL_VARIANT. The execution plan also looks sweet, although I am mystified as to how it gets there from here.
I have one more variation I want to write up that will use a CTE to pair up the old/new sets, a second CTE to UNPIVOT, and a third CTE to split the pairs. When this is all done, I am going to try and write a little article. This seems like a niche problem, but the variety of solutions is proving to be a lot of fun. 🙂
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
October 28, 2009 at 5:59 pm
Cheers Bob - and yes I think an article is a great idea. The solution I posted owes much to your ideas and Peter's, but just does things slightly differently overall. The optimizer does a great job with it, I agree 🙂
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
October 28, 2009 at 6:09 pm
Bob Hovious 24601 (10/28/2009)
You're welcome, Wayne. Please let us know which you finally go with and why. If one is significantly faster than the other, I'd like to know. I'm actually hoping that Peter's cross join wins, just because I'm sentimental about old-school techniques getting the job done. Worst case, we have a good technique for SQL2000 and a good technique for SQL25K and beyond.
Okay, I'm still working on getting some performance specs to compare all three methods. However, I'm leaning strongly towards Peter's method. The reason... the flexibility to put into the "column_nm" column user-friendly text, like "First Name", "Birth Date", etc.
I'm pretty impressed that all three examples are showing single table scans. The method I had posted was going to do a scan for each column pair, and avoiding that (especially with nearly 30 column pairs) was the reason for this post in the first place.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
October 28, 2009 at 6:14 pm
This has been an interesting problem to follow.
I really like the PointlessAlias as the Table Alias Emperator PaulPatine.
I agree with Wayne that it is interesting that all three solutions create just about the same execution plan and statistics.
Good luck on the Article Bob.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
October 29, 2009 at 1:57 am
Matt Whitfield has written a .NET test harness that maybe could be helpful in choosing the fastest solution.
You could call him in the thread and ask for some info.
-- Gianluca Sartori
October 29, 2009 at 2:10 am
Thanks, Gianluca. I'll ask him about it.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 15 posts - 1 through 15 (of 50 total)
You must be logged in to reply to this topic. Login to reply