July 7, 2011 at 6:20 am
Both are brilliant!
The string-based solution rocks...how simple!
I'll throw one more approach out there using XML just because it seemed interesting when I thought of it. There is a chance it could be improved but I am no XML guru...the way it stands now it does not compete with either of the others.
Here are all three solutions with slight mods so they all use the same temp tables.
SET STATISTICS TIME OFF
SET NOCOUNT ON ;
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
build temp tables'
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby1')
AND type IN ( N'U' ) )
DROP TABLE hobby1 ;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby2')
AND type IN ( N'U' ) )
DROP TABLE hobby2 ;
CREATE TABLE hobby1
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
CREATE TABLE hobby2
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
INSERT INTO dbo.hobby1
( id, hobby1, hobby2 )
VALUES ( 1, 'basketball', 'tennis' ),
( 2, 'basketball', NULL ),
( 3, NULL, NULL ),
( 4, NULL, 'basketball' ),
( 5, 'tennis', NULL ) ;
INSERT INTO dbo.hobby2
( id, hobby1, hobby2 )
VALUES ( 1, 'football', 'rugby' ),
( 2, 'football', NULL ),
( 3, NULL, NULL ),
( 4, NULL, 'football' ),
( 5, 'rugby', NULL ) ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
get data as xml'
SET STATISTICS TIME ON ;
DECLARE @xml XML = (
SELECT h1.id AS id,
h1.hobby1 AS 'hobbies/hobby1',
h1.hobby2 AS 'hobbies/hobby2',
h2.hobby1 AS 'hobbies/hobby3',
h2.hobby2 AS 'hobbies/hobby4'
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
FOR
XML PATH('person'),
TYPE,
ROOT('people')
) ;
--SELECT @xml
SET STATISTICS TIME OFF ;
PRINT 'xml parsing'
SET STATISTICS TIME ON ;
SELECT pref.value('(id/text())[1]', 'int') AS id,
pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,
pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,
pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,
pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4
FROM @xml.nodes('/people/person') AS People ( pref )
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
APPLY method'
SET STATISTICS TIME ON ;
-- now transform it by moving all nulls to the end without RBAR
SELECT t.id,
c1toc4 AS newc1,
newc2,
newc3,
newc4
FROM (
SELECT h1.id AS id,
h1.hobby1 AS c1,
h1.hobby2 AS c2,
h2.hobby1 AS c3,
h2.hobby2 AS c4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements
CROSS APPLY (
SELECT c1toc4 = COALESCE(c1, c2, c3, c4),
c2toc4 = COALESCE(c2, c3, c4),
c3toc4 = COALESCE(c3, c4)
) cx
-- second CROSS APPLY figures out what goes in the second column
-- we know the 1st column is always c1toc4 - i.e. the first non-null
-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4
-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)
CROSS APPLY (
SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))
) cx2
-- and this is for the third column
CROSS APPLY (
SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))
) cx3
-- and this is for the fourth column
CROSS APPLY (
SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)
) cx4 ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
String manipulation'
SET STATISTICS TIME ON ;
SELECT t.id,
RTRIM(SUBSTRING(t.s, 1, 20)),
RTRIM(SUBSTRING(t.s, 21, 20)),
RTRIM(SUBSTRING(t.s, 41, 20)),
RTRIM(SUBSTRING(t.s, 61, 20))
FROM (
SELECT h1.id,
ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +
ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +
ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +
ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t ;
SET STATISTICS TIME OFF ;
GO
On a quiet system these are the results I was seeing consistently:
---------------------------------------------------------------------------
build temp tables
---------------------------------------------------------------------------
get data as xml
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
xml parsing
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.
---------------------------------------------------------------------------
APPLY method
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
---------------------------------------------------------------------------
String manipulation
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 7, 2011 at 11:02 am
The string manipulation is probably the fastest.
I wanted to try the UNPIVOT/PIVOT just for the heck of it.
Using Opc.Three's test setup, here is what I did first:
;WITH OneTableCTE AS
(
SELECTa.id,
a.hobby1 AS 'h1',
a.hobby2 AS 'h2',
b.hobby1 AS 'h3',
b.hobby2 AS 'h4'
FROM
hobby1 a
INNER JOIN hobby2 b
ON a.id = b.id
), UnPivotCTE AS
(
SELECTid,
hobby,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder
FROM OneTableCTE
UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4)) AS Normalized
)
SELECTid,
[1],
[2],
[3],
[4]
FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone
PIVOT
(
min(hobby)
FOR fixedorder IN ([1],[2],[3],[4])
) AS PivotTable;
It works pretty well and is pretty fast about it. My machine was running a bit too quick so I didn't get a good comparison to the string method speed (might try later at home on a slower system).
The problem with this is that it doesn't handle the "all hobbies are null" case. It just excludes that ID on the unpivot. To get around this, I added in an outer join back to the original tables.
The string way is certainly more elegant, but this was a lot of fun.
Here is the finished code for all 4 methods:
SET STATISTICS TIME OFF
SET NOCOUNT ON ;
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
build temp tables'
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby1')
AND type IN ( N'U' ) )
DROP TABLE hobby1 ;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby2')
AND type IN ( N'U' ) )
DROP TABLE hobby2 ;
CREATE TABLE hobby1
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
CREATE TABLE hobby2
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
INSERT INTO dbo.hobby1
( id, hobby1, hobby2 )
VALUES ( 1, 'basketball', 'tennis' ),
( 2, 'basketball', NULL ),
( 3, NULL, NULL ),
( 4, NULL, 'basketball' ),
( 5, 'tennis', NULL ) ;
INSERT INTO dbo.hobby2
( id, hobby1, hobby2 )
VALUES ( 1, 'football', 'rugby' ),
( 2, 'football', NULL ),
( 3, NULL, NULL ),
( 4, NULL, 'football' ),
( 5, 'rugby', NULL ) ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
get data as xml'
SET STATISTICS TIME ON ;
DECLARE @xml XML = (
SELECT h1.id AS id,
h1.hobby1 AS 'hobbies/hobby1',
h1.hobby2 AS 'hobbies/hobby2',
h2.hobby1 AS 'hobbies/hobby3',
h2.hobby2 AS 'hobbies/hobby4'
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
FOR
XML PATH('person'),
TYPE,
ROOT('people')
) ;
--SELECT @xml
SET STATISTICS TIME OFF ;
PRINT 'xml parsing'
SET STATISTICS TIME ON ;
SELECT pref.value('(id/text())[1]', 'int') AS id,
pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,
pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,
pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,
pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4
FROM @xml.nodes('/people/person') AS People ( pref )
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
APPLY method'
SET STATISTICS TIME ON ;
-- now transform it by moving all nulls to the end without RBAR
SELECT t.id,
c1toc4 AS newc1,
newc2,
newc3,
newc4
FROM (
SELECT h1.id AS id,
h1.hobby1 AS c1,
h1.hobby2 AS c2,
h2.hobby1 AS c3,
h2.hobby2 AS c4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements
CROSS APPLY (
SELECT c1toc4 = COALESCE(c1, c2, c3, c4),
c2toc4 = COALESCE(c2, c3, c4),
c3toc4 = COALESCE(c3, c4)
) cx
-- second CROSS APPLY figures out what goes in the second column
-- we know the 1st column is always c1toc4 - i.e. the first non-null
-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4
-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)
CROSS APPLY (
SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))
) cx2
-- and this is for the third column
CROSS APPLY (
SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))
) cx3
-- and this is for the fourth column
CROSS APPLY (
SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)
) cx4 ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
String manipulation'
SET STATISTICS TIME ON ;
SELECT t.id,
RTRIM(SUBSTRING(t.s, 1, 20)),
RTRIM(SUBSTRING(t.s, 21, 20)),
RTRIM(SUBSTRING(t.s, 41, 20)),
RTRIM(SUBSTRING(t.s, 61, 20))
FROM (
SELECT h1.id,
ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +
ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +
ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +
ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-',75) + '
Unpivot/Pivot'
SET STATISTICS TIME ON ;
;WITH OneTableCTE AS
(
SELECTa.id,
a.hobby1 AS 'h1',
a.hobby2 AS 'h2',
b.hobby1 AS 'h3',
b.hobby2 AS 'h4'
FROM
hobby1 a
INNER JOIN hobby2 b
ON a.id = b.id
), UnPivotCTE AS
(
SELECTid,
hobby,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder
FROM OneTableCTE
UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4) ) AS Normalized
)
SELECTOneTableCTE.id,
[1],
[2],
[3],
[4]
FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone
PIVOT
(
min(hobby)
FOR fixedorder IN ([1],[2],[3],[4])
) AS PivotTable
RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id
SET STATISTICS TIME OFF ;
GO
July 7, 2011 at 12:15 pm
I'd love to hear from the OP again 🙂 Are you out there mikes84?
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 7, 2011 at 12:25 pm
Yeah, I'm not sure if he thought he'd get 6 or 7 different working answers.
July 7, 2011 at 12:57 pm
Yeah, I sure wasn't expecting so many answers. I started to revisit the problem again. What do you think of this (also not that pretty):
SELECT
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 1)), '') AS Hobby1,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 2)), '') AS Hobby2,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 3)), '') AS Hobby3,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 4)), '') AS Hobby4
FROM <tables> etc.
Mike Scalise, PMP
https://www.michaelscalise.com
July 7, 2011 at 1:04 pm
Actually that won't work. I think some derivative of that would though.
Mike Scalise, PMP
https://www.michaelscalise.com
July 7, 2011 at 1:22 pm
OK, how about this?
;WITH data(hobby1,hobby2,hobby3,hobby4)
AS
(
SELECT 'basketball',NULL,NULL,NULL UNION ALL
SELECT NULL,'basketball',NULL,NULL UNION ALL
SELECT NULL,NULL,'basketball',NULL UNION ALL
SELECT NULL,NULL,NULL,'basketball' UNION ALL
SELECT 'basketball','tennis',NULL,NULL UNION ALL
SELECT 'basketball',NULL,'tennis',NULL UNION ALL
SELECT 'basketball',NULL,NULL,'tennis' UNION ALL
SELECT NULL,'basketball','tennis',NULL UNION ALL
SELECT NULL,'basketball',NULL,'tennis' UNION ALL
SELECT NULL,NULL,'basketball','tennis' UNION ALL
SELECT 'basketball','tennis','cricket',NULL UNION ALL
SELECT 'basketball','tennis',NULL,'cricket' UNION ALL
SELECT 'basketball',NULL,'tennis','cricket' UNION ALL
SELECT NULL,'basketball','tennis','cricket' UNION ALL
SELECT 'basketball','tennis','cricket','curling'
)
SELECT *
INTO #temp
FROM data d1
SELECT REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +
CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +
CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +
CASE WHEN hobby4 IS NULL
THEN '' ELSE hobby4 END), ' ', '.')), 1)) AS Hobby1,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +
CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +
CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +
CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 2)) AS Hobby2,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +
CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +
CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +
CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 3)) AS Hobby3,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +
CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +
CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +
CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 4)) AS Hobby4
FROM #temp
Mike Scalise, PMP
https://www.michaelscalise.com
July 7, 2011 at 1:51 pm
What nerve did these nulls hit, that sop many people are spending so much time to develop and perfect so many solutions?
July 7, 2011 at 2:03 pm
keep in mind any string parsing relies on finding a pattern that isn't used. I personally wouldn't be banking on space as a delimiter.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
July 7, 2011 at 2:15 pm
I upped the ante to see how each technique would really do when presented with more than 5 rows of data in each table and in the process I broke the APPLY method (see the NULL in h3 and h4 now) AND the PARSENAME method (nothing coming back at all in any column). Further to that the Union All/Cross-tab method is taking more CPU but less time than the elegant String Manipulation method :unsure:
Please someone tell me what I did wrong:
SET NOCOUNT ON;
GO
SET STATISTICS TIME OFF ;
PRINT REPLICATE('-', 75) + '
build temp tables'
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby1')
AND type IN (N'U') )
DROP TABLE hobby1 ;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby2')
AND type IN (N'U') )
DROP TABLE hobby2 ;
CREATE TABLE hobby1
(
id INT IDENTITY(1, 1),
hobby1 VARCHAR(36),
hobby2 VARCHAR(36)
) ;
CREATE TABLE hobby2
(
id INT,
hobby1 VARCHAR(36),
hobby2 VARCHAR(36)
) ;
INSERT INTO dbo.hobby1
(
hobby1,
hobby2
)
SELECT DISTINCT TOP (1000)
CAST(NEWID() AS VARCHAR(36)),
CAST(NEWID() AS VARCHAR(36))
FROM master.sys.columns c1
CROSS JOIN master.sys.columns c2 ;
INSERT INTO dbo.hobby2
(
id,
hobby1,
hobby2
)
SELECT id,
hobby2,
hobby1
FROM dbo.hobby1 ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-', 75) + '
APPLY method'
SELECT REPLICATE('-', 75) + '
APPLY method'
SET STATISTICS TIME ON ;
-- now transform it by moving all nulls to the end without RBAR
SELECT t.id,
c1toc4 AS newc1,
newc2,
newc3,
newc4
FROM (
SELECT h1.id AS id,
h1.hobby1 AS c1,
h1.hobby2 AS c2,
h2.hobby1 AS c3,
h2.hobby2 AS c4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements
CROSS APPLY (
SELECT c1toc4 = COALESCE(c1, c2, c3, c4),
c2toc4 = COALESCE(c2, c3, c4),
c3toc4 = COALESCE(c3, c4)
) cx
-- second CROSS APPLY figures out what goes in the second column
-- we know the 1st column is always c1toc4 - i.e. the first non-null
-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4
-- if it matches either or those we need to either take column 4 or
-- null (if col4 matches c1toc4 we have already used it for new column 1)
CROSS APPLY (
SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,
c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))
) cx2
-- and this is for the third column
CROSS APPLY (
SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),
NULLIF(NULLIF(c4, newc2), c1toc4))
) cx3
-- and this is for the fourth column
CROSS APPLY (
SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)
) cx4 ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-', 75) + '
String manipulation'
SELECT REPLICATE('-', 75) + '
String manipulation'
SET STATISTICS TIME ON ;
SELECT t.id,
RTRIM(SUBSTRING(t.s, 1, 36)),
RTRIM(SUBSTRING(t.s, 37, 36)),
RTRIM(SUBSTRING(t.s, 73, 36)),
RTRIM(SUBSTRING(t.s, 109, 36))
FROM (
SELECT h1.id,
ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +
ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36
- LEN(h2.hobby1)),'') + ISNULL(h2.hobby2
+ SPACE(36- LEN(h2.hobby2)),'') s
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
) t ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-', 75) + '
Unpivot/Pivot'
SELECT REPLICATE('-', 75) + '
Unpivot/Pivot'
SET STATISTICS TIME ON ;
;
WITH OneTableCTE
AS (
SELECT a.id,
a.hobby1 AS 'h1',
a.hobby2 AS 'h2',
b.hobby1 AS 'h3',
b.hobby2 AS 'h4'
FROM hobby1 a
INNER JOIN hobby2 b ON a.id = b.id
),
UnPivotCTE
AS (
SELECT id,
hobby,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder
FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN (h1, h2, h3, h4) )
AS Normalized
)
SELECT OneTableCTE.id,
[1],
[2],
[3],
[4]
FROM (
SELECT id,
hobby,
fixedorder
FROM UnPivotCTE
) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ([1], [2], [3], [4]) ) AS PivotTable
RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-', 75) + '
Union All/Cross tab'
SELECT REPLICATE('-', 75) + '
Union All/Cross tab'
SET STATISTICS TIME ON ;
WITH cte(id, hobby, hobby_num)
AS (
SELECT h.id,
h.hobby1,
1
FROM dbo.hobby1 h
WHERE EXISTS ( SELECT *
FROM dbo.hobby2
WHERE id = h.id )
UNION ALL
SELECT h.id,
h.hobby1,
2
FROM dbo.hobby1 h
WHERE EXISTS ( SELECT *
FROM dbo.hobby2
WHERE id = h.id )
UNION ALL
SELECT h.id,
h.hobby1,
3
FROM dbo.hobby2 h
WHERE EXISTS ( SELECT *
FROM dbo.hobby1
WHERE id = h.id )
UNION ALL
SELECT h.id,
h.hobby2,
4
FROM dbo.hobby2 h
WHERE EXISTS ( SELECT *
FROM dbo.hobby2
WHERE id = h.id )
),
cte2(id, hobby, row_num)
AS (
SELECT id,
hobby,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)
FROM cte
WHERE hobby IS NOT NULL
)
SELECT id,
MAX(CASE WHEN row_num = 1 THEN hobby
END) AS hobby1,
MAX(CASE WHEN row_num = 2 THEN hobby
END) AS hobby2,
MAX(CASE WHEN row_num = 3 THEN hobby
END) AS hobby3,
MAX(CASE WHEN row_num = 4 THEN hobby
END) AS hobby4
FROM cte2
GROUP BY id ;
SET STATISTICS TIME OFF ;
GO
------------------------------------------------------------------------------------
PRINT REPLICATE('-', 75) + '
Parsename'
SELECT REPLICATE('-', 75) + '
Parsename'
SET STATISTICS TIME ON ;
SELECT h1.id,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +
ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),
1)), '') AS Hobby1,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +
ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),
2)), '') AS Hobby2,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +
ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),
3)), '') AS Hobby3,
ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +
ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),
4)), '') AS Hobby4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id ;
SET STATISTICS TIME OFF ;
GO
Results on my machine:
---------------------------------------------------------------------------
build temp tables
---------------------------------------------------------------------------
APPLY method
SQL Server Execution Times:
CPU time = 63 ms, elapsed time = 232 ms.
---------------------------------------------------------------------------
String manipulation
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 115 ms.
---------------------------------------------------------------------------
Unpivot/Pivot
SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 270 ms.
---------------------------------------------------------------------------
Union All/Cross tab
Warning: Null value is eliminated by an aggregate or other SET operation.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 79 ms.
---------------------------------------------------------------------------
Parsename
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 170 ms.
EDIT: add some line breaks in code window
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 7, 2011 at 2:16 pm
mikes84, could you edit your post so the code has some line breaks in it? It's making the page reallllly wide.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 7, 2011 at 2:36 pm
opc, what about this for the PARSENAME method?
SELECT h1.id,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id ;
Mike Scalise, PMP
https://www.michaelscalise.com
July 7, 2011 at 2:43 pm
mikes84 (7/7/2011)
opc, what about this for the PARSENAME method?
SELECT h1.id,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id ;
The result set looks much fuller now 🙂
Perf results:
Parsename
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 884 ms.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 7, 2011 at 5:10 pm
mikes84 (7/6/2011)
I have a situation where I'm querying two tables, each that has up to two hobbies listed. The first table has hobby1 and hobby2, and the second table has hobby3 and hobby4. It's possible for there to be values in any of the fields. So, it could look like:hobby1, hobby2, hobby3, hobby4
basketball, tennis, NULL, NULL
or
hobby1, hobby2, hobby3, hobby4
NULL, NULL, basketball, tennis
However, it could also look like:
hobby1, hobby2, hobby3, hobby4
basketball, NULL, tennis, NULL
Ideally, I want it to look like the first example, where the values are brought to the left. I could use COALESCE in the first and second examples and have it work, but it wouldn't work on the third example. Any ideas how I could get something like the third example to look like the first one?
Mike
Hi Orlando (and other contributors),
I don't understand the test data you are using... perhaps it's me but the question was about shuffling values into the leftmost columns where there were nulls between them.
The test data you generate has no NULLS and duplicate entries.
As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...
Or am I completely missing the point?
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
July 7, 2011 at 7:49 pm
mister.magoo (7/7/2011)
. . . As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...
One should NOT remoive duplicates unless it is an explicit requirement.
Viewing 15 posts - 16 through 30 (of 92 total)
You must be logged in to reply to this topic. Login to reply