January 23, 2011 at 9:55 am
ColdCoffee (1/22/2011)
@Jeff, u are absolutely right.. i mis-read the requirement; from morning (its morning in INDIA now, as i write) i was thinking about that only.. So i sat and put together my piece of code to find the unique row . Hope this turns out to be correct
; WITH ROW_NUMBERED_LIST AS
(
SELECT ROW_NUM = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) ,
idMvt1,idMvt2,idMvt3
FROM MVTS
),
Unpivoted_List AS
(
SELECT ROW_NUM , ValInCol
FROM
( SELECT ROW_NUM ,idMvt1,idMvt2,idMvt3 FROM ROW_NUMBERED_LIST ) AS PIVOT_TABLE
UNPIVOT
( ValInCol FOR ColName IN (idMvt1,idMvt2,idMvt3) ) AS PIVOT_HANDLE
--ORDER BY ROW_NUM
) ,
Ranked_List AS
(
SELECT ROW_NUM , ValInCol,
RANKING = ROW_NUMBER() OVER( PARTITION BY ValInCol ORDER BY ROW_NUM )
FROM UNPIVOTED_LIST
) ,
UNIQUE_IDS AS
(
SELECT ROW_NUM
FROM Ranked_List
GROUP BY ROW_NUM
HAVING COUNT( DISTINCT RANKING ) = 1
)
SELECT R_N_LST.ROW_NUM, R_N_LST.idMvt1,R_N_LST.idMvt2,R_N_LST.idMvt3
FROM ROW_NUMBERED_LIST R_N_LST
INNER JOIN UNIQUE_IDS UNIQ
ON R_N_LST.ROW_NUM = UNIQ.ROW_NUM
Very cool, CC... that correctly produces rows 1 and 4 on the original test data. Now I have to study what you did because, frankly, I couldn't get my cold-medicine-induced head around this one. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 10:14 am
Gah! Sorry CC... although your code works with the original data, it doesn't work when we insert a new row. Try yours with the following data...
DROP TABLE MVTS
CREATE TABLE MVTS
(
ID INT IDENTITY(1,1),
idMvt1 int,
idMvt2 int,
idMvt3 int
);
INSERT INTO MVTS
(idMvt1,idMvt2,idMvt3)
SELECT 271, 204, 136 UNION ALL --1 This will always show up because it's the "first"
SELECT 271, 202, 136 UNION ALL --2 This should NOT show because 271 and 136 is contained in 1 (which is the only "active" row at this point)
SELECT 268, 253, 204 UNION ALL --3 This should NOT show because 204 is contained in 1 (which is the only "active" row at this point)
SELECT 268, 001, 204 UNION ALL [font="Arial Black"]--New row added to see if Cold Coffee's algo works. It doesn't.[/font]
SELECT 268, 253, 202 UNION ALL --4 This SHOULD show because even though 268 and 253 are in the previous row,
-- none of the elements live in 1 which is the only active row at this point.
SELECT 268, 256, 204 UNION ALL --5 Should NOT show because at least 1 element lives in 1 or 4 which are "active"
SELECT 268, 256, 202 --6 Should NOT show because at least 1 element lives in 4 which is "active"
Heh... the good part about this is that I don't feel like such an idiot on this problem, now. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 10:50 am
Can I throw in a bone too?
--===== Peso's code
;WITH cteSource(ID, SequenceID)
AS (
SELECTu.ID,
ROW_NUMBER() OVER (PARTITION BY u.theValue ORDER BY u.ID) AS SequenceID
FROMdbo.Mvts AS s
UNPIVOT(
theValue
FOR theCol IN (s.idMvt1, s.idMvt2, s.idMvt3)
) AS u
)
SELECTID
FROMcteSource
GROUP BYID
--HAVINGSUM(SequenceID) = MIN(SequenceID) * COUNT(*)
HAVINGMIN(SequenceID) = MAX(SequenceID)
N 56°04'39.16"
E 12°55'05.25"
January 23, 2011 at 2:56 pm
SwePeso (1/23/2011)
Can I throw in a bone too?
--===== Peso's code
;WITH cteSource(ID, SequenceID)
AS (
SELECTu.ID,
ROW_NUMBER() OVER (PARTITION BY u.theValue ORDER BY u.ID) AS SequenceID
FROMdbo.Mvts AS s
UNPIVOT(
theValue
FOR theCol IN (s.idMvt1, s.idMvt2, s.idMvt3)
) AS u
)
SELECTID
FROMcteSource
GROUP BYID
--HAVINGSUM(SequenceID) = MIN(SequenceID) * COUNT(*)
HAVINGMIN(SequenceID) = MAX(SequenceID)
Yep. It's just the bone is missing a bit of marrow. With the following data, it should return ID's 1 AND 5 (labeled as "4" in the comments because I added an extra row to break CC's code).
DROP TABLE MVTS
CREATE TABLE MVTS
(
ID INT IDENTITY(1,1),
idMvt1 int,
idMvt2 int,
idMvt3 int
);
INSERT INTO MVTS
(idMvt1,idMvt2,idMvt3)
SELECT 271, 204, 136 UNION ALL --1 This will always show up because it's the "first"
SELECT 271, 202, 136 UNION ALL --2 This should NOT show because 271 and 136 is contained in 1 (which is the only "active" row at this point)
SELECT 268, 253, 204 UNION ALL --3 This should NOT show because 204 is contained in 1 (which is the only "active" row at this point)
SELECT 268, 001, 204 UNION ALL --New row added to see if Cold Coffee's algo works. It doesn't.
SELECT 268, 253, 202 UNION ALL --4 This SHOULD show because even though 268 and 253 are in the previous row,
-- none of the elements live in 1 which is the only active row at this point.
SELECT 268, 256, 204 UNION ALL --5 Should NOT show because at least 1 element lives in 1 or 4 which are "active"
SELECT 268, 256, 202 --6 Should NOT show because at least 1 element lives in 4 which is "active"
--===== Peso's code
;WITH cteSource(ID, SequenceID)
AS (
SELECT u.ID,
ROW_NUMBER() OVER (PARTITION BY u.theValue ORDER BY u.ID) AS SequenceID
FROM dbo.Mvts AS s
UNPIVOT (
theValue
FOR theCol IN (s.idMvt1, s.idMvt2, s.idMvt3)
) AS u
)
SELECT ID
FROM cteSource
GROUP BY ID
--HAVING SUM(SequenceID) = MIN(SequenceID) * COUNT(*)
HAVING MIN(SequenceID) = MAX(SequenceID)
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 3:39 pm
DECLARE@Sample TABLE
(
ID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
idMvt1 int,
idMvt2 int,
idMvt3 int
)
INSERT@Sample
(
idMvt1,
idMvt2,
idMvt3
)
SELECT271, 204, 136 UNION ALL --1 This will always show up because it's the "first"
SELECT271, 202, 136 UNION ALL --2 This should NOT show because 271 and 136 is contained in 1 (which is the only "active" row at this point)
SELECT268, 253, 204 UNION ALL --3 This should NOT show because 204 is contained in 1 (which is the only "active" row at this point)
SELECT268, 001, 204 UNION ALL --4 New row added to see if Cold Coffee's algo works. It doesn't.
SELECT268, 253, 202 UNION ALL --5 This SHOULD show because even though 268 and 253 are in the previous row,
-- none of the elements live in 1 which is the only active row at this point.
SELECT268, 256, 204 UNION ALL --6 Should NOT show because at least 1 element lives in 1 or 4 which are "active"
SELECT268, 256, 202 --7 Should NOT show because at least 1 element lives in 4 which is "active"
--===== Peso's code
;WITH cteSource
AS (
SELECTID,
idMvt1,
idMvt2,
idMvt3
FROM@Sample
WHEREID = 1
UNION ALL
SELECTID,
idMvt1,
idMvt2,
idMvt3
FROM(
SELECTs.ID,
s.idMvt1,
s.idMvt2,
s.idMvt3,
ROW_NUMBER() OVER (ORDER BY s.ID) AS YakID
FROM@Sample AS s
INNER JOINcteSource AS c ON c.ID < s.ID
WHEREs.idMvt1 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
AND s.idMvt2 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
AND s.idMvt3 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
) AS d
WHEREYakID = 1
)
SELECTID,
idMvt1,
idMvt2,
idMvt3
FROMcteSource
N 56°04'39.16"
E 12°55'05.25"
January 23, 2011 at 5:34 pm
SwePeso (1/23/2011)
...
Sorry Peso, but while testing mine for more complicated datasets, I came across this one that your latest does not cope with...
INSERT INTO MVTS
(idMvt1,idMvt2,idMvt3)
SELECT 1, 2, 3 UNION ALL
SELECT 4, 5, 6 UNION ALL
SELECT 2, 1, 3 -- this row should NOT be selected.
Hope you can work around this because mine is way too slooooowww! and ugly!
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 23, 2011 at 5:40 pm
Darned good try, Peter. But it doesn't hold up for other data ({edit} Magoo beat me to it while I was writing this post). I know... it's a tough problem to do even with a RBAR loop. Even a "Lasagna" loop is going to take a long time across, say, 5 million rows.
DECLARE @Sample TABLE
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
idMvt1 int,
idMvt2 int,
idMvt3 int
);
--===== Build the test data
INSERT INTO @Sample
(idMvt1, idMvt2, idMvt3)
SELECT TOP 20
idMvt1 = ABS(CHECKSUM(NEWID()))%10+1,
idMvt2 = ABS(CHECKSUM(NEWID()))%10+1,
idMvt3 = ABS(CHECKSUM(NEWID()))%10+1
FROM sys.All_Columns ac1,
sys.All_Columns ac2
;
-----------------------------------------------------------------------------------------------------------------------
--===== Peso's code
;WITH cteSource
AS (
SELECT ID,
idMvt1,
idMvt2,
idMvt3
FROM @Sample
WHERE ID = 1
UNION ALL
SELECT ID,
idMvt1,
idMvt2,
idMvt3
FROM (
SELECT s.ID,
s.idMvt1,
s.idMvt2,
s.idMvt3,
ROW_NUMBER() OVER (ORDER BY s.ID) AS YakID
FROM @Sample AS s
INNER JOIN cteSource AS c ON c.ID < s.ID
WHERE s.idMvt1 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
AND s.idMvt2 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
AND s.idMvt3 NOT IN (c.idMvt1, c.idMvt2, c.idMvt3)
) AS d
WHERE YakID = 1
)
SELECT ID,
idMvt1,
idMvt2,
idMvt3
FROM cteSource
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 5:45 pm
mister.magoo (1/23/2011)
Hope you can work around this because mine is way too slooooowww! and ugly!
Glad to hear that my code isn't the only slow and ugly one on this problem. :blush:
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 5:52 pm
Jeff Moden (1/23/2011)
mister.magoo (1/23/2011)
Hope you can work around this because mine is way too slooooowww! and ugly!Glad to hear that my code isn't the only slow and ugly one on this problem. :blush:
At the moment, my code works but has been running for three minutes and not completed on a 1M row set....:pinch:
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 23, 2011 at 6:56 pm
Jeff Moden (1/23/2011)
mister.magoo (1/23/2011)
Hope you can work around this because mine is way too slooooowww! and ugly!Glad to hear that my code isn't the only slow and ugly one on this problem. :blush:
Ok, this may be ugly, but it processes 1M rows in 1 second and correctly processes all the test cases posted so far.
IF OBJECT_ID ('tempdb..#accepted') IS NOT NULL DROP TABLE #accepted;
CREATE TABLE #accepted
(
ID INT IDENTITY(1,1) ,
VALUE INT
);
CREATE INDEX ix1 ON #accepted(VALUE);
--= allow identity inserts - this is key to the solution
SET IDENTITY_INSERT #accepted ON;
--= pop the first row in
INSERT #accepted(ID,VALUE)
SELECT ID,VALUE
FROM mvts AS o
UNPIVOT (VALUE FOR c IN (idMvt1,idMvt2,idMvt3)) upvt
WHERE id=1;
--== now do a yucky WHILE loop to process the rows one at a time into the accepted table.
--== ROWCOUNT=3 because I unpivot the values
WHILE @@ROWCOUNT=3
INSERT #accepted(ID,VALUE)
SELECT ID, VALUE
FROM (
SELECT TOP 1 ID,idMvt1,idMvt2,idMvt3
FROM mvts AS o
WHERE NOT EXISTS (SELECT 1 FROM #accepted AS a WHERE a.value = o.idMvt1)
AND NOT EXISTS ( SELECT1 FROM #accepted AS a WHEREa.value = o.idMvt2)
AND NOT EXISTS ( SELECT 1 FROM #accepted AS a WHERE a.value = o.idMvt3)
AND id>SCOPE_IDENTITY() --= using SCOPE_IDENTITY to tell me the last ID I stored in the table.
ORDER BY id
) NextRow(ID,idMvt1,idMvt2,idMvt3)
UNPIVOT (
VALUE FOR c IN (idMvt1,idMvt2,idMvt3)
) upvt
SET IDENTITY_INSERT #accepted OFF;
SELECT mvts.id,mvts.idMvt1,mvts.idMvt2,mvts.idMvt3
FROM mvts AS mvts
WHERE EXISTS (SELECT 1 FROM #accepted AS a WHERE a.id=mvts.id)
ORDER BY mvts.id
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 23, 2011 at 9:22 pm
Ugly is OK... especially if Ugly works fast. I'll give it a try.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 9:34 pm
OMG! How did the old cartoon series used to say it? "Ahhhh, Magoo... You've done it again!". Well done. NOW I can go to bed! 😛
Even on my 9 year old, single 1.8Ghz P4 CPU, it still only takes 14 seconds on a million rows. Freakin' awesome job, Magoo. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
January 23, 2011 at 11:52 pm
Ingenious use. 🙂
One word of warning though set MAXDOP 1 , scope_identity can return incorrect results in a parallel query.
January 24, 2011 at 12:19 am
Dave Ballantyne (1/23/2011)
Ingenious use. 🙂One word of warning though set MAXDOP 1 , scope_identity can return incorrect results in a parallel query.
And using partitions, right?
N 56°04'39.16"
E 12°55'05.25"
January 24, 2011 at 2:56 am
Thx everyone I would never hoped for this number of solutions, I will shrotly post the results.
Viewing 15 posts - 16 through 30 (of 75 total)
You must be logged in to reply to this topic. Login to reply