October 15, 2008 at 9:04 am
What about this sample data?
SELECT'', '', 'Y', 3 UNION ALL
SELECT'', 'X', 'Y', 3
Should first line be deleted since it is a subset of second line?
N 56°04'39.16"
E 12°55'05.25"
October 15, 2008 at 9:10 am
Thanks Peso,
Yes, The first line should be deleted. Thanks for your codes. I am still trying to understand it.
October 15, 2008 at 9:14 am
Steve Jones - Editor (10/15/2008)
Multiple passes would be looking for a pattern like Row 2, where Col1 and 2 match, but Col3 is a null/blank. Then a second pattern would be col1 and 3 match, but 2 has a null/blank.
Max (10/15/2008)
Yes, The first line should be deleted. Thanks for your codes. I am still trying to understand it.
In that case, use this more generic one step delete algorithm
DECLARE@Sample TABLE
(
recID INT IDENTITY(1, 1),
col1 VARCHAR(1),
col2 VARCHAR(2),
col3 VARCHAR(3),
userID INT
)
INSERT@Sample
SELECT'A', 'B', 'C', 1 UNION ALL
SELECT'A', 'B', '', 1 UNION ALL
SELECT'A', '', 'C', 1 UNION ALL
SELECT'F', '', 'C', 1 UNION ALL
SELECT'', 'M', '', 2 UNION ALL
SELECT'T', 'M', 'O', 2 UNION ALL
SELECT'', 'M', 'O', 2 UNION ALL
SELECT'X', 'M', 'O', 2 UNION ALL
SELECT'', '', 'Y', 3 UNION ALL
SELECT'', 'X', 'Y', 3
SELECT*
FROM@Sample
DELETEs
FROM(
SELECTrecID,
col1,
col2,
col3,
userID,
CASE WHEN col1 = '' THEN 0 ELSE 1 END
+ CASE WHEN col2 = '' THEN 0 ELSE 1 END
+ CASE WHEN col3 = '' THEN 0 ELSE 1 END AS theItems
FROM@Sample
) AS s
INNER JOIN(
SELECTMIN(recID) AS minID,
col1,
col2,
col3,
userID,
CASE WHEN col1 = '' THEN 0 ELSE 1 END
+ CASE WHEN col2 = '' THEN 0 ELSE 1 END
+ CASE WHEN col3 = '' THEN 0 ELSE 1 END AS theItems
FROM@Sample
GROUP BYcol1,
col2,
col3,
userID,
CASE WHEN col1 = '' THEN 0 ELSE 1 END
+ CASE WHEN col2 = '' THEN 0 ELSE 1 END
+ CASE WHEN col3 = '' THEN 0 ELSE 1 END
) AS w ON w.userID = s.userID
AND w.minID <> s.recID
AND w.theItems >= s.theItems
WHEREw.col1 = COALESCE(NULLIF(s.col1, ''), w.col1)
AND w.col2 = COALESCE(NULLIF(s.col2, ''), w.col2)
AND w.col3 = COALESCE(NULLIF(s.col3, ''), w.col3)
SELECT*
FROM@Sample
Also see http://weblogs.sqlteam.com/peterl/archive/2008/10/15/Delete-all-subset-records.aspx
N 56°04'39.16"
E 12°55'05.25"
October 15, 2008 at 9:56 am
Thanks Peso. You're brilliant. The codes look really great. I haven't fully tested it but I believe it will work.
Thanks a lot all others who participated. Have a good day!!
October 15, 2008 at 12:23 pm
It turns out you can write the algorithm even cleaner.
I got a comment on my blog to use an algorithm like this instead.
DELETEa
FROM@Sample AS a
INNER JOIN@Sample AS b ON b.userID = a.userID
AND b.recID <> a.recID
WHERE(a.col1 = b.col1 OR a.col1 = '')
AND (a.col2 = b.col2 OR a.col2 = '')
AND (a.col3 = b.col3 OR a.col3 = '')
It's very nice but it creates several cross joins for each userID (n * m).
I tried to create a triangular join with help of the "count of values" column ( n * m / 2).
N 56°04'39.16"
E 12°55'05.25"
October 15, 2008 at 1:27 pm
Thanks Peso,
The new algorithm looks good, but I need to make sure There are no pure duplicates. If there are, it will delete them all.
INSERT @SAMPLE
SELECT 'A', 'B', 'C', 1 UNION ALL <-- duplicate
SELECT 'A', 'B', 'C', 1 UNION ALL <-- duplicate
SELECT 'A', 'B', ' ', 1 UNION ALL
SELECT 'A', ' ', 'C', 1 UNION ALL
SELECT 'F', ' ', 'C', 1 UNION ALL
SELECT ' ', 'M', ' ', 2 UNION ALL
SELECT 'T', 'M', 'O', 2 UNION ALL
SELECT ' ', 'M', 'O', 2 UNION ALL
SELECT 'X', 'M', 'O', 2 UNION ALL
SELECT ' ', ' ', 'Y', 3 UNION ALL
SELECT ' ', 'X', 'Y', 3
Thanks.
p.s. Maybe I can group it by col1, col2, and col3 again and select only one of them to delete. I will run some tests. Thanks.
October 15, 2008 at 1:45 pm
Easy fixed when you understand the code.
DELETEa
FROM@Sample AS a
INNER JOIN(
SELECTMIN(recID) AS recID,
col1,
col2,
col3,
userID
FROM@Sample
GROUP BYcol1,
col2,
col3,
userID
) AS b ON b.userID = a.userID
AND b.recID <> a.recID
WHERE(a.col1 = b.col1 OR a.col1 = '')
AND (a.col2 = b.col2 OR a.col2 = '')
AND (a.col3 = b.col3 OR a.col3 = '')
And now the code looks very similar to my first suggestion. The thing missing is the WHERE '' NOT IN part 😀
N 56°04'39.16"
E 12°55'05.25"
Viewing 7 posts - 16 through 21 (of 21 total)
You must be logged in to reply to this topic. Login to reply