July 6, 2011 at 10:52 am
I have a situation where I'm querying two tables, each that has up to two hobbies listed. The first table has hobby1 and hobby2, and the second table has hobby3 and hobby4. It's possible for there to be values in any of the fields. So, it could look like:
hobby1, hobby2, hobby3, hobby4
basketball, tennis, NULL, NULL
or
hobby1, hobby2, hobby3, hobby4
NULL, NULL, basketball, tennis
However, it could also look like:
hobby1, hobby2, hobby3, hobby4
basketball, NULL, tennis, NULL
Ideally, I want it to look like the first example, where the values are brought to the left. I could use COALESCE in the first and second examples and have it work, but it wouldn't work on the third example. Any ideas how I could get something like the third example to look like the first one?
Mike
Mike Scalise, PMP
https://www.michaelscalise.com
July 6, 2011 at 11:50 am
This is ugly and it would kill performance on large rowsets, but it works:
DECLARE @hobby1 NVARCHAR(50),
@hobby2 NVARCHAR(50),
@hobby3 NVARCHAR(50),
@hobby4 NVARCHAR(50);
DECLARE hobbiesCursor CURSOR FOR
SELECT hobby1, hobby2, hobby3, hobby4
FROM Hobbies
OPEN hobbiesCursor;
FETCH NEXT FROM hobbiesCursor
INTO @hobby1, @hobby2, @hobby3, @hobby4;
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @i INT = 0;
WHILE @i < 3
BEGIN
IF @hobby1 IS NULL
BEGIN
SET @hobby1 = @hobby2;
SET @hobby2 = @hobby3;
SET @hobby3 = @hobby4;
SET @hobby4 = NULL;
END
SET @i += 1;
END
SET @i = 0;
WHILE @i < 2
BEGIN
IF @hobby2 IS NULL
BEGIN
SET @hobby2 = @hobby3;
SET @hobby3 = @hobby4;
SET @hobby4 = NULL;
END
SET @i += 1;
END
IF @hobby3 IS NULL
BEGIN
SET @hobby3 = @hobby4;
SET @hobby4 = NULL;
END
PRINT 'Hobby 1 ' + @hobby1;
PRINT 'Hobby 2 ' + @hobby2;
PRINT 'Hobby 3 ' + COALESCE( @hobby3, 'NULL' );
PRINT 'Hobby 4 ' + COALESCE( @hobby4, 'NULL' );
FETCH NEXT FROM hobbiesCursor
INTO @hobby1, @hobby2, @hobby3, @hobby4;
END;
CLOSE hobbiesCursor;
DEALLOCATE hobbiesCursor;
July 6, 2011 at 12:09 pm
Why not normalize the data and do away with the entire issue? Having repeating data in multiple columns is a basic violation of first normal form, so you will end up having lots of issues like the one you mentioned.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
July 6, 2011 at 12:15 pm
I suppose I could union the fields between the two tables and remove the NULLs that way, then convert the rows into columns if that's what you mean.
Mike Scalise, PMP
https://www.michaelscalise.com
July 6, 2011 at 12:52 pm
I think he was asking why you don't refactor your tables. Your comment is where I was headed. Here is what I came up with:
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby1')
AND type IN (N'U') )
DROP TABLE hobby1 ;
GO
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'hobby2')
AND type IN (N'U') )
DROP TABLE hobby2 ;
GO
CREATE TABLE hobby1
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
CREATE TABLE hobby2
(
id INT,
hobby1 VARCHAR(20),
hobby2 VARCHAR(20)
) ;
INSERT INTO dbo.hobby1
(id, hobby1, hobby2)
VALUES (1, 'basketball', 'tennis'),
(2, 'basketball', NULL),
(3, NULL, NULL),
(4, NULL, 'basketball') ;
go
INSERT INTO dbo.hobby2
(id, hobby1, hobby2)
VALUES (1, 'basketball', 'tennis'),
(2, 'tennis', NULL),
(3, 'basketball', 'tennis'),
(4, NULL, 'tennis') ;
go
WITH cte(id, hobby, hobby_num)
AS (
SELECT h1.id,
h1.hobby1,
1
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
UNION ALL
SELECT h1.id,
h1.hobby2,
2
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
UNION ALL
SELECT h1.id,
h2.hobby1,
3
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
UNION ALL
SELECT h1.id,
h2.hobby2,
4
FROM dbo.hobby1 h1
JOIN dbo.hobby2 h2 ON h1.id = h2.id
),
cte2(id, hobby, row_num)
AS (
SELECT id,
hobby,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)
FROM cte
WHERE hobby IS NOT NULL
)
SELECT id,
MAX(CASE WHEN row_num = 1 THEN hobby
END) AS hobby1,
MAX(CASE WHEN row_num = 2 THEN hobby
END) AS hobby2,
MAX(CASE WHEN row_num = 3 THEN hobby
END) AS hobby3,
MAX(CASE WHEN row_num = 4 THEN hobby
END) AS hobby4
FROM cte2
GROUP BY id ;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 6, 2011 at 12:55 pm
Yes, you could select individual columns, UNION them, GROUP them (which would remove nulls), and pivot them back into four columns. That would be uglier than the RBAR example I gave you.
July 6, 2011 at 12:59 pm
Sorry - I was in a bit of a hurry - so I wasn't as helpful as I should have been. That said - you did get the overall idea.
If you were to change your table structure to
PersonID (or some way to identify the individual)
Hobby
HobbyRank
So you'd end up with
PersonID Hobby Rank
1 Golf 1
1 Fishing 2
2 Piano 1
3 Video Games 3
3 Knitting 4
etc.....
for example, then it would be easy to completely ignore (or not store) any of the unpopulated/unknown/NULL entries. You could also easily into queries where you find those who have both hobbies in common (just not necessarily entered in the same order.
Interestingly enough - this is the middle state OPC.three is heading for in the CTE.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
July 6, 2011 at 1:01 pm
Revenant (7/6/2011)
Yes, you could select individual columns, UNION them, GROUP them (which would remove nulls), and pivot them back into four columns. That would be uglier than the RBAR example I gave you.
:unsure: ...even with the nested loops introduced by the CTE I just posted I am not sure I would choose the cursor method you posted over them
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 6, 2011 at 1:08 pm
opc.three (7/6/2011)
...even with the nested loops introduced by the CTE I just posted I am not sure I would choose the cursor method you posted over them
Yes, RBAR is not my favorite technique but I had less than ten minutes before I had to go to a meeting, so it was written "under duress." 🙂
July 6, 2011 at 1:11 pm
Revenant (7/6/2011)
opc.three (7/6/2011)
...even with the nested loops introduced by the CTE I just posted I am not sure I would choose the cursor method you posted over themYes, RBAR is not my favorite technique but I had less than ten minutes before I had to go to a meeting, so it was written "under duress." 🙂
I hear ya, not trying to nitpick, just saying I would opt for some crazy RBAR CTEs over a cursor. Meetings not withstanding I am always under duress when writing RBAR processes 🙂
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 6, 2011 at 2:42 pm
Can't you just unpivot and then repivot?
July 6, 2011 at 2:51 pm
Nevyn (7/6/2011)
Can't you just unpivot and then repivot?
You sure could. The only thing that may become apparent is that hobby_num is a value-add column that the UNION ALL method provides that I do not know if the UNPIVOT method can provide.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 6, 2011 at 5:17 pm
Hi,
I didn't like all the RBAR so I had a go at a non-RBAR and came up with this.
It is a bit hard to read because of the complexity of the nested functions but works well and has a nice simple plan:
-- create some test data
;WITH data(c1,c2,c3,c4)
AS
(
SELECT 'basketball',NULL,NULL,NULL UNION ALL
SELECT NULL,'basketball',NULL,NULL UNION ALL
SELECT NULL,NULL,'basketball',NULL UNION ALL
SELECT NULL,NULL,NULL,'basketball' UNION ALL
SELECT 'basketball','tennis',NULL,NULL UNION ALL
SELECT 'basketball',NULL,'tennis',NULL UNION ALL
SELECT 'basketball',NULL,NULL,'tennis' UNION ALL
SELECT NULL,'basketball','tennis',NULL UNION ALL
SELECT NULL,'basketball',NULL,'tennis' UNION ALL
SELECT NULL,NULL,'basketball','tennis' UNION ALL
SELECT 'basketball','tennis','cricket',NULL UNION ALL
SELECT 'basketball','tennis',NULL,'cricket' UNION ALL
SELECT 'basketball',NULL,'tennis','cricket' UNION ALL
SELECT NULL,'basketball','tennis','cricket' UNION ALL
SELECT 'basketball','tennis','cricket','curling'
)
SELECT d1.c1,d1.c2,d1.c3,d1.c4
INTO #temp
FROM data d1
-- now transform it by moving all nulls to the end without RBAR
SELECT c1toc4 AS newc1,
newc2,
newc3,
newc4,
c1,
c2,
c3,
c4
FROM #temp
-- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements
CROSS APPLY (
SELECT c1toc4 = COALESCE(c1,c2,c3,c4) ,
c2toc4 = COALESCE(c2,c3,c4) ,
c3toc4 = COALESCE(c3,c4)
) cx
-- second CROSS APPLY figures out what goes in the second column
-- we know the 1st column is always c1toc4 - i.e. the first non-null
-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4
-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)
CROSS APPLY (
SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,c1toc4),c3toc4),c1toc4),NULLIF(c4,c1toc4))
) cx2
-- and this is for the third column
CROSS APPLY (
SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4,newc2),c1toc4),NULLIF(NULLIF(c4,newc2),c1toc4))
) cx3
-- and this is for the fourth column
CROSS APPLY (
SELECT newc4 = NULLIF(NULLIF(NULLIF(c4,newc3),newc2),c1toc4)
) cx4
I hope you like it and don't want even more columns!
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
July 7, 2011 at 12:39 am
mister.magoo (7/6/2011)
Hi,I didn't like all the RBAR so I had a go at a non-RBAR and came up with this.
It is a bit hard to read because of the complexity of the nested functions but works well and has a nice simple plan:
-- create some test data
;WITH data(c1,c2,c3,c4)
AS
(
SELECT 'basketball',NULL,NULL,NULL UNION ALL
SELECT NULL,'basketball',NULL,NULL UNION ALL
SELECT NULL,NULL,'basketball',NULL UNION ALL
SELECT NULL,NULL,NULL,'basketball' UNION ALL
SELECT 'basketball','tennis',NULL,NULL UNION ALL
SELECT 'basketball',NULL,'tennis',NULL UNION ALL
SELECT 'basketball',NULL,NULL,'tennis' UNION ALL
SELECT NULL,'basketball','tennis',NULL UNION ALL
SELECT NULL,'basketball',NULL,'tennis' UNION ALL
SELECT NULL,NULL,'basketball','tennis' UNION ALL
SELECT 'basketball','tennis','cricket',NULL UNION ALL
SELECT 'basketball','tennis',NULL,'cricket' UNION ALL
SELECT 'basketball',NULL,'tennis','cricket' UNION ALL
SELECT NULL,'basketball','tennis','cricket' UNION ALL
SELECT 'basketball','tennis','cricket','curling'
)
SELECT d1.c1,d1.c2,d1.c3,d1.c4
INTO #temp
FROM data d1
-- now transform it by moving all nulls to the end without RBAR
SELECT c1toc4 AS newc1,
newc2,
newc3,
newc4,
c1,
c2,
c3,
c4
FROM #temp
-- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements
CROSS APPLY (
SELECT c1toc4 = COALESCE(c1,c2,c3,c4) ,
c2toc4 = COALESCE(c2,c3,c4) ,
c3toc4 = COALESCE(c3,c4)
) cx
-- second CROSS APPLY figures out what goes in the second column
-- we know the 1st column is always c1toc4 - i.e. the first non-null
-- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4
-- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)
CROSS APPLY (
SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,c1toc4),c3toc4),c1toc4),NULLIF(c4,c1toc4))
) cx2
-- and this is for the third column
CROSS APPLY (
SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4,newc2),c1toc4),NULLIF(NULLIF(c4,newc2),c1toc4))
) cx3
-- and this is for the fourth column
CROSS APPLY (
SELECT newc4 = NULLIF(NULLIF(NULLIF(c4,newc3),newc2),c1toc4)
) cx4
I hope you like it and don't want even more columns!
Using your excellent testdata i came up with
select
SubString(s, 1, 20),
SubString(s, 21, 20),
SubString(s, 41, 20),
SubString(s, 61, 20)
from (select
IsNull(c1 + Space(20-Len(c1)), '') + IsNull(c2 + Space(20-Len(c2)), '') +
IsNull(c3 + Space(20-Len(c3)), '') + IsNull(c4 + Space(20-Len(c4)), '') s
from #temp) t
Which would atleast make it fairly simple to add more columns to it.
/T
July 7, 2011 at 12:53 am
tommyh (7/7/2011)
Using your excellent testdata i came up with
select
SubString(s, 1, 20),
SubString(s, 21, 20),
SubString(s, 41, 20),
SubString(s, 61, 20)
from (select
IsNull(c1 + Space(20-Len(c1)), '') + IsNull(c2 + Space(20-Len(c2)), '') +
IsNull(c3 + Space(20-Len(c3)), '') + IsNull(c4 + Space(20-Len(c4)), '') s
from #temp) t
Which would atleast make it fairly simple to add more columns to it.
/T
Gosh! I feel so stupid now - nice lateral thinking!
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
Viewing 15 posts - 1 through 15 (of 92 total)
You must be logged in to reply to this topic. Login to reply