Handle NULLs

Question

Post reply

Handle NULLs

Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 1

Both are brilliant!

The string-based solution rocks...how simple!

I'll throw one more approach out there using XML just because it seemed interesting when I thought of it. There is a chance it could be improved but I am no XML guru...the way it stands now it does not compete with either of the others.

Here are all three solutions with slight mods so they all use the same temp tables.

SET STATISTICS TIME OFF

SET NOCOUNT ON ;

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

build temp tables'

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby1')

AND type IN ( N'U' ) )

DROP TABLE hobby1 ;

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby2')

AND type IN ( N'U' ) )

DROP TABLE hobby2 ;

CREATE TABLE hobby1

(

id INT,

hobby1 VARCHAR(20),

hobby2 VARCHAR(20)

) ;

CREATE TABLE hobby2

(

id INT,

hobby1 VARCHAR(20),

hobby2 VARCHAR(20)

) ;

INSERT INTO dbo.hobby1

( id, hobby1, hobby2 )

VALUES ( 1, 'basketball', 'tennis' ),

( 2, 'basketball', NULL ),

( 3, NULL, NULL ),

( 4, NULL, 'basketball' ),

( 5, 'tennis', NULL ) ;

INSERT INTO dbo.hobby2

( id, hobby1, hobby2 )

VALUES ( 1, 'football', 'rugby' ),

( 2, 'football', NULL ),

( 3, NULL, NULL ),

( 4, NULL, 'football' ),

( 5, 'rugby', NULL ) ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

get data as xml'

SET STATISTICS TIME ON ;

DECLARE @xml XML = (

SELECT h1.id AS id,

h1.hobby1 AS 'hobbies/hobby1',

h1.hobby2 AS 'hobbies/hobby2',

h2.hobby1 AS 'hobbies/hobby3',

h2.hobby2 AS 'hobbies/hobby4'

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

FOR

XML PATH('person'),

TYPE,

ROOT('people')

) ;

--SELECT @xml

SET STATISTICS TIME OFF ;

PRINT 'xml parsing'

SET STATISTICS TIME ON ;

SELECT pref.value('(id/text())[1]', 'int') AS id,

pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,

pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,

pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,

pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4

FROM @xml.nodes('/people/person') AS People ( pref )

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

APPLY method'

SET STATISTICS TIME ON ;

-- now transform it by moving all nulls to the end without RBAR

SELECT t.id,

c1toc4 AS newc1,

newc2,

newc3,

newc4

FROM (

SELECT h1.id AS id,

h1.hobby1 AS c1,

h1.hobby2 AS c2,

h2.hobby1 AS c3,

h2.hobby2 AS c4

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

CROSS APPLY (

SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

c2toc4 = COALESCE(c2, c3, c4),

c3toc4 = COALESCE(c3, c4)

) cx

-- second CROSS APPLY figures out what goes in the second column

-- we know the 1st column is always c1toc4 - i.e. the first non-null

-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)

CROSS APPLY (

SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

) cx2

-- and this is for the third column

CROSS APPLY (

SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))

) cx3

-- and this is for the fourth column

CROSS APPLY (

SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

) cx4 ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

String manipulation'

SET STATISTICS TIME ON ;

SELECT t.id,

RTRIM(SUBSTRING(t.s, 1, 20)),

RTRIM(SUBSTRING(t.s, 21, 20)),

RTRIM(SUBSTRING(t.s, 41, 20)),

RTRIM(SUBSTRING(t.s, 61, 20))

FROM (

SELECT h1.id,

ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +

ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +

ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +

ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t ;

SET STATISTICS TIME OFF ;

GO

On a quiet system these are the results I was seeing consistently:

---------------------------------------------------------------------------
build temp tables
---------------------------------------------------------------------------
get data as xml
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
xml parsing
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.
---------------------------------------------------------------------------
APPLY method
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
---------------------------------------------------------------------------
String manipulation
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Nevyn SSChampion Points: 14210 More actions · Answer 2

The string manipulation is probably the fastest.

I wanted to try the UNPIVOT/PIVOT just for the heck of it.

Using Opc.Three's test setup, here is what I did first:

;WITH OneTableCTE AS

(

SELECTa.id,

a.hobby1 AS 'h1',

a.hobby2 AS 'h2',

b.hobby1 AS 'h3',

b.hobby2 AS 'h4'

FROM

hobby1 a

INNER JOIN hobby2 b

ON a.id = b.id

), UnPivotCTE AS

(

SELECTid,

hobby,

ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

FROM OneTableCTE

UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4)) AS Normalized

)

SELECTid,

[1],

[2],

[3],

[4]

FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone

PIVOT

(

min(hobby)

FOR fixedorder IN ([1],[2],[3],[4])

) AS PivotTable;

It works pretty well and is pretty fast about it. My machine was running a bit too quick so I didn't get a good comparison to the string method speed (might try later at home on a slower system).

The problem with this is that it doesn't handle the "all hobbies are null" case. It just excludes that ID on the unpivot. To get around this, I added in an outer join back to the original tables.

The string way is certainly more elegant, but this was a lot of fun.

Here is the finished code for all 4 methods:

SET STATISTICS TIME OFF

SET NOCOUNT ON ;

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

build temp tables'

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby1')

AND type IN ( N'U' ) )

DROP TABLE hobby1 ;

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby2')

AND type IN ( N'U' ) )

DROP TABLE hobby2 ;

CREATE TABLE hobby1

(

id INT,

hobby1 VARCHAR(20),

hobby2 VARCHAR(20)

) ;

CREATE TABLE hobby2

(

id INT,

hobby1 VARCHAR(20),

hobby2 VARCHAR(20)

) ;

INSERT INTO dbo.hobby1

( id, hobby1, hobby2 )

VALUES ( 1, 'basketball', 'tennis' ),

( 2, 'basketball', NULL ),

( 3, NULL, NULL ),

( 4, NULL, 'basketball' ),

( 5, 'tennis', NULL ) ;

INSERT INTO dbo.hobby2

( id, hobby1, hobby2 )

VALUES ( 1, 'football', 'rugby' ),

( 2, 'football', NULL ),

( 3, NULL, NULL ),

( 4, NULL, 'football' ),

( 5, 'rugby', NULL ) ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

get data as xml'

SET STATISTICS TIME ON ;

DECLARE @xml XML = (

SELECT h1.id AS id,

h1.hobby1 AS 'hobbies/hobby1',

h1.hobby2 AS 'hobbies/hobby2',

h2.hobby1 AS 'hobbies/hobby3',

h2.hobby2 AS 'hobbies/hobby4'

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

FOR

XML PATH('person'),

TYPE,

ROOT('people')

) ;

--SELECT @xml

SET STATISTICS TIME OFF ;

PRINT 'xml parsing'

SET STATISTICS TIME ON ;

SELECT pref.value('(id/text())[1]', 'int') AS id,

pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,

pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,

pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,

pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4

FROM @xml.nodes('/people/person') AS People ( pref )

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

APPLY method'

SET STATISTICS TIME ON ;

-- now transform it by moving all nulls to the end without RBAR

SELECT t.id,

c1toc4 AS newc1,

newc2,

newc3,

newc4

FROM (

SELECT h1.id AS id,

h1.hobby1 AS c1,

h1.hobby2 AS c2,

h2.hobby1 AS c3,

h2.hobby2 AS c4

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

CROSS APPLY (

SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

c2toc4 = COALESCE(c2, c3, c4),

c3toc4 = COALESCE(c3, c4)

) cx

-- second CROSS APPLY figures out what goes in the second column

-- we know the 1st column is always c1toc4 - i.e. the first non-null

-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)

CROSS APPLY (

SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

) cx2

-- and this is for the third column

CROSS APPLY (

SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))

) cx3

-- and this is for the fourth column

CROSS APPLY (

SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

) cx4 ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

String manipulation'

SET STATISTICS TIME ON ;

SELECT t.id,

RTRIM(SUBSTRING(t.s, 1, 20)),

RTRIM(SUBSTRING(t.s, 21, 20)),

RTRIM(SUBSTRING(t.s, 41, 20)),

RTRIM(SUBSTRING(t.s, 61, 20))

FROM (

SELECT h1.id,

ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +

ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +

ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +

ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-',75) + '

Unpivot/Pivot'

SET STATISTICS TIME ON ;

;WITH OneTableCTE AS

(

SELECTa.id,

a.hobby1 AS 'h1',

a.hobby2 AS 'h2',

b.hobby1 AS 'h3',

b.hobby2 AS 'h4'

FROM

hobby1 a

INNER JOIN hobby2 b

ON a.id = b.id

), UnPivotCTE AS

(

SELECTid,

hobby,

ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

FROM OneTableCTE

UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4) ) AS Normalized

)

SELECTOneTableCTE.id,

[1],

[2],

[3],

[4]

FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone

PIVOT

(

min(hobby)

FOR fixedorder IN ([1],[2],[3],[4])

) AS PivotTable

RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id

SET STATISTICS TIME OFF ;

GO

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 3

I'd love to hear from the OP again 🙂 Are you out there mikes84?

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Nevyn SSChampion Points: 14210 More actions · Answer 4

Yeah, I'm not sure if he thought he'd get 6 or 7 different working answers.

Mike Scalise SSCrazy Eights Points: 8145 More actions · Answer 5

Yeah, I sure wasn't expecting so many answers. I started to revisit the problem again. What do you think of this (also not that pretty):

SELECT

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 1)), '') AS Hobby1,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 2)), '') AS Hobby2,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 3)), '') AS Hobby3,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 4)), '') AS Hobby4

FROM <tables> etc.

Mike Scalise, PMP
https://www.michaelscalise.com

Mike Scalise SSCrazy Eights Points: 8145 More actions · Answer 6

Actually that won't work. I think some derivative of that would though.

Mike Scalise, PMP
https://www.michaelscalise.com

Mike Scalise SSCrazy Eights Points: 8145 More actions · Answer 7

OK, how about this?

;WITH data(hobby1,hobby2,hobby3,hobby4)

AS

(

SELECT 'basketball',NULL,NULL,NULL UNION ALL

SELECT NULL,'basketball',NULL,NULL UNION ALL

SELECT NULL,NULL,'basketball',NULL UNION ALL

SELECT NULL,NULL,NULL,'basketball' UNION ALL

SELECT 'basketball','tennis',NULL,NULL UNION ALL

SELECT 'basketball',NULL,'tennis',NULL UNION ALL

SELECT 'basketball',NULL,NULL,'tennis' UNION ALL

SELECT NULL,'basketball','tennis',NULL UNION ALL

SELECT NULL,'basketball',NULL,'tennis' UNION ALL

SELECT NULL,NULL,'basketball','tennis' UNION ALL

SELECT 'basketball','tennis','cricket',NULL UNION ALL

SELECT 'basketball','tennis',NULL,'cricket' UNION ALL

SELECT 'basketball',NULL,'tennis','cricket' UNION ALL

SELECT NULL,'basketball','tennis','cricket' UNION ALL

SELECT 'basketball','tennis','cricket','curling'

)

SELECT *

INTO #temp

FROM data d1

SELECT REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

CASE WHEN hobby4 IS NULL

THEN '' ELSE hobby4 END), ' ', '.')), 1)) AS Hobby1,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 2)) AS Hobby2,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 3)) AS Hobby3,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 4)) AS Hobby4

FROM #temp

Mike Scalise, PMP
https://www.michaelscalise.com

Revenant SSC-Forever Points: 42467 More actions · Answer 8

What nerve did these nulls hit, that sop many people are spending so much time to develop and perfect so many solutions?

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 9

keep in mind any string parsing relies on finding a pattern that isn't used. I personally wouldn't be banking on space as a delimiter.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 10

I upped the ante to see how each technique would really do when presented with more than 5 rows of data in each table and in the process I broke the APPLY method (see the NULL in h3 and h4 now) AND the PARSENAME method (nothing coming back at all in any column). Further to that the Union All/Cross-tab method is taking more CPU but less time than the elegant String Manipulation method :unsure:

Please someone tell me what I did wrong:

SET NOCOUNT ON;

GO

SET STATISTICS TIME OFF ;

PRINT REPLICATE('-', 75) + '

build temp tables'

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby1')

AND type IN (N'U') )

DROP TABLE hobby1 ;

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'hobby2')

AND type IN (N'U') )

DROP TABLE hobby2 ;

CREATE TABLE hobby1

(

id INT IDENTITY(1, 1),

hobby1 VARCHAR(36),

hobby2 VARCHAR(36)

) ;

CREATE TABLE hobby2

(

id INT,

hobby1 VARCHAR(36),

hobby2 VARCHAR(36)

) ;

INSERT INTO dbo.hobby1

(

hobby1,

hobby2

)

SELECT DISTINCT TOP (1000)

CAST(NEWID() AS VARCHAR(36)),

CAST(NEWID() AS VARCHAR(36))

FROM master.sys.columns c1

CROSS JOIN master.sys.columns c2 ;

INSERT INTO dbo.hobby2

(

id,

hobby1,

hobby2

)

SELECT id,

hobby2,

hobby1

FROM dbo.hobby1 ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-', 75) + '

APPLY method'

SELECT REPLICATE('-', 75) + '

APPLY method'

SET STATISTICS TIME ON ;

-- now transform it by moving all nulls to the end without RBAR

SELECT t.id,

c1toc4 AS newc1,

newc2,

newc3,

newc4

FROM (

SELECT h1.id AS id,

h1.hobby1 AS c1,

h1.hobby2 AS c2,

h2.hobby1 AS c3,

h2.hobby2 AS c4

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

CROSS APPLY (

SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

c2toc4 = COALESCE(c2, c3, c4),

c3toc4 = COALESCE(c3, c4)

) cx

-- second CROSS APPLY figures out what goes in the second column

-- we know the 1st column is always c1toc4 - i.e. the first non-null

-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

-- if it matches either or those we need to either take column 4 or

-- null (if col4 matches c1toc4 we have already used it for new column 1)

CROSS APPLY (

SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,

c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

) cx2

-- and this is for the third column

CROSS APPLY (

SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),

NULLIF(NULLIF(c4, newc2), c1toc4))

) cx3

-- and this is for the fourth column

CROSS APPLY (

SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

) cx4 ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-', 75) + '

String manipulation'

SELECT REPLICATE('-', 75) + '

String manipulation'

SET STATISTICS TIME ON ;

SELECT t.id,

RTRIM(SUBSTRING(t.s, 1, 36)),

RTRIM(SUBSTRING(t.s, 37, 36)),

RTRIM(SUBSTRING(t.s, 73, 36)),

RTRIM(SUBSTRING(t.s, 109, 36))

FROM (

SELECT h1.id,

ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36

- LEN(h2.hobby1)),'') + ISNULL(h2.hobby2

+ SPACE(36- LEN(h2.hobby2)),'') s

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id

) t ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-', 75) + '

Unpivot/Pivot'

SELECT REPLICATE('-', 75) + '

Unpivot/Pivot'

SET STATISTICS TIME ON ;

;

WITH OneTableCTE

AS (

SELECT a.id,

a.hobby1 AS 'h1',

a.hobby2 AS 'h2',

b.hobby1 AS 'h3',

b.hobby2 AS 'h4'

FROM hobby1 a

INNER JOIN hobby2 b ON a.id = b.id

),

UnPivotCTE

AS (

SELECT id,

hobby,

ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN (h1, h2, h3, h4) )

AS Normalized

)

SELECT OneTableCTE.id,

[1],

[2],

[3],

[4]

FROM (

SELECT id,

hobby,

fixedorder

FROM UnPivotCTE

) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ([1], [2], [3], [4]) ) AS PivotTable

RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-', 75) + '

Union All/Cross tab'

SELECT REPLICATE('-', 75) + '

Union All/Cross tab'

SET STATISTICS TIME ON ;

WITH cte(id, hobby, hobby_num)

AS (

SELECT h.id,

h.hobby1,

1

FROM dbo.hobby1 h

WHERE EXISTS ( SELECT *

FROM dbo.hobby2

WHERE id = h.id )

UNION ALL

SELECT h.id,

h.hobby1,

2

FROM dbo.hobby1 h

WHERE EXISTS ( SELECT *

FROM dbo.hobby2

WHERE id = h.id )

UNION ALL

SELECT h.id,

h.hobby1,

3

FROM dbo.hobby2 h

WHERE EXISTS ( SELECT *

FROM dbo.hobby1

WHERE id = h.id )

UNION ALL

SELECT h.id,

h.hobby2,

4

FROM dbo.hobby2 h

WHERE EXISTS ( SELECT *

FROM dbo.hobby2

WHERE id = h.id )

),

cte2(id, hobby, row_num)

AS (

SELECT id,

hobby,

ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)

FROM cte

WHERE hobby IS NOT NULL

)

SELECT id,

MAX(CASE WHEN row_num = 1 THEN hobby

END) AS hobby1,

MAX(CASE WHEN row_num = 2 THEN hobby

END) AS hobby2,

MAX(CASE WHEN row_num = 3 THEN hobby

END) AS hobby3,

MAX(CASE WHEN row_num = 4 THEN hobby

END) AS hobby4

FROM cte2

GROUP BY id ;

SET STATISTICS TIME OFF ;

GO

------------------------------------------------------------------------------------

PRINT REPLICATE('-', 75) + '

Parsename'

SELECT REPLICATE('-', 75) + '

Parsename'

SET STATISTICS TIME ON ;

SELECT h1.id,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

1)), '') AS Hobby1,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

2)), '') AS Hobby2,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

3)), '') AS Hobby3,

ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

4)), '') AS Hobby4

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

SET STATISTICS TIME OFF ;

GO

Results on my machine:

---------------------------------------------------------------------------
build temp tables
---------------------------------------------------------------------------
APPLY method
SQL Server Execution Times:
CPU time = 63 ms, elapsed time = 232 ms.
---------------------------------------------------------------------------
String manipulation
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 115 ms.
---------------------------------------------------------------------------
Unpivot/Pivot
SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 270 ms.
---------------------------------------------------------------------------
Union All/Cross tab
Warning: Null value is eliminated by an aggregate or other SET operation.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 79 ms.
---------------------------------------------------------------------------
Parsename
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 170 ms.

EDIT: add some line breaks in code window

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 11

mikes84, could you edit your post so the code has some line breaks in it? It's making the page reallllly wide.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Mike Scalise SSCrazy Eights Points: 8145 More actions · Answer 12

opc, what about this for the PARSENAME method?

SELECT h1.id,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

FROM dbo.hobby1 h1

JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

Mike Scalise, PMP
https://www.michaelscalise.com

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 13

mikes84 (7/7/2011)
opc, what about this for the PARSENAME method?
SELECT h1.id,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,
REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(
CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +
CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +
CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +
CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

The result set looks much fuller now 🙂

Perf results:

Parsename
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 884 ms.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

mister.magoo SSC-Forever Points: 47174 More actions · Answer 14

mikes84 (7/6/2011)
I have a situation where I'm querying two tables, each that has up to two hobbies listed. The first table has hobby1 and hobby2, and the second table has hobby3 and hobby4. It's possible for there to be values in any of the fields. So, it could look like:
hobby1, hobby2, hobby3, hobby4
basketball, tennis, NULL, NULL
or
hobby1, hobby2, hobby3, hobby4
NULL, NULL, basketball, tennis
However, it could also look like:
hobby1, hobby2, hobby3, hobby4
basketball, NULL, tennis, NULL
Ideally, I want it to look like the first example, where the values are brought to the left. I could use COALESCE in the first and second examples and have it work, but it wouldn't work on the third example. Any ideas how I could get something like the third example to look like the first one?
Mike

Hi Orlando (and other contributors),

I don't understand the test data you are using... perhaps it's me but the question was about shuffling values into the leftmost columns where there were nulls between them.

The test data you generate has no NULLS and duplicate entries.

As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...

Or am I completely missing the point?

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Revenant SSC-Forever Points: 42467 More actions · Answer 15

mister.magoo (7/7/2011)
. . . As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...

One should NOT remoive duplicates unless it is an explicit requirement.