January 24, 2011 at 3:14 pm
mister.magoo (1/24/2011)
Dave,I haven't tested yours, but can see that it does not meet with Sir Jeff's strict guidelines....
BWAA-HAA!!!... I'm just the humble tester here. The OP posted the original guidelines. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
January 24, 2011 at 3:16 pm
Jeff Moden (1/24/2011)
mister.magoo (1/24/2011)
Dave,I haven't tested yours, but can see that it does not meet with Sir Jeff's strict guidelines....
BWAA-HAA!!!... I'm just the humble tester here. The OP posted the original guidelines. 😛
Thought you might like that!
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 24, 2011 at 3:21 pm
mister.magoo (1/24/2011)
TABLOCKX on mvts knock about 0.5 second off the time (14%) for a Pool of 10,000I will add this tweak to the code when I run the performance tests later.
Good call Wayne!
I have to admit... the problem initially had me thinking "quirky update" until I realized that it wouldn't handle it. But the QU stuck in my head, and with the talk about the partitioning and the maxdop 1, it popped back up to the top, and I started wondering if any of the hints in that process would help out here. I'm glad it makes a difference, and I'm interested to see the results of it on the larger set. (Just for the heck of it - try it on the temp table also!)
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
January 24, 2011 at 3:26 pm
WayneS (1/24/2011)
I have to admit... the problem initially had me thinking "quirky update" until I realized that it wouldn't handle it. But the QU stuck in my head, and with the talk about the partitioning and the maxdop 1, it popped back up to the top, and I started wondering if any of the hints in that process would help out here. I'm glad it makes a difference, and I'm interested to see the results of it on the larger set. (Just for the heck of it - try it on the temp table also!)
I also started out thinking QU but soon realised that the potentially massive recursion required would be a double-hard problem for that.
What is needed is a "Double-Barrelled Quirky Update" or an "Ultimate Quirky Update" or something....
I will try the TABLOCKX on both tables and on larger sets.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 24, 2011 at 4:50 pm
Here is another approach. Instead of searching for "no duplicate" I keep track of which values that have been used.
SET NOCOUNT ON
CREATE TABLE#Sample
(
ID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
idMvt1 INT NOT NULL,
idMvt2 INT NOT NULL,
idMvt3 INT NOT NULL
)
INSERT#Sample
(
idMvt1,
idMvt2,
idMvt3
)
SELECTTOP(1000000)
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt1,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt2,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt3
FROMsys.All_Columns ac1,
sys.All_Columns ac2
-- Solution here
CREATE TABLE#Values
(
idMvt INT PRIMARY KEY NONCLUSTERED
)
INSERT#Values
(
idMvt
)
SELECT DISTINCTu.idMvt
FROM#Sample AS s
UNPIVOT(
idMvt
FOR Col IN (s.idMvt1, s.idMvt2, s.idMvt3)
) AS u
CREATE TABLE#Result
(
ID INT PRIMARY KEY CLUSTERED
)
DECLARE@CurrID INT,
@OldID INT,
@idMvt1 INT,
@idMvt2 INT,
@idMvt3 INT
SELECT TOP(1)@CurrID = ID,
@idMvt1 = idMvt1,
@idMvt2 = idMvt2,
@idMvt3 = idMvt3
FROM#Sample
ORDER BYID
INSERT#Result
(
ID
)
VALUES(
@CurrID
)
DELETE
FROM#Values
WHEREidMvt IN (@idMvt1, @idMvt2, @idMvt3)
WHILE @CurrID IS NOT NULL
BEGIN
SELECT@OldID = @CurrID,
@CurrID = NULL
SELECT TOP(1)@CurrID = s.ID,
@idMvt1 = s.idMvt1,
@idMvt2 = s.idMvt2,
@idMvt3 = s.idMvt3
FROM#Sample AS s
WHEREEXISTS(SELECT * FROM #Values AS v WHERE v.idMvt = s.idMvt1)
AND EXISTS(SELECT * FROM #Values AS v WHERE v.idMvt = s.idMvt2)
AND EXISTS(SELECT * FROM #Values AS v WHERE v.idMvt = s.idMvt3)
AND s.ID > @OldID
ORDER BYs.ID
IF @CurrID IS NOT NULL
BEGIN
INSERT#Result
(
ID
)
VALUES(
@CurrID
)
DELETE
FROM#Values
WHEREidMvt IN (@idMvt1, @idMvt2, @idMvt3)
END
END
SELECTID
FROM#Result
DROP TABLE#Sample,
#Values,
#Result
N 56°04'39.16"
E 12°55'05.25"
January 24, 2011 at 5:10 pm
Look at us, all doing WHILE loops ! Someone should take a photograph to preserve the moment 😛
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 24, 2011 at 5:25 pm
Performance Stats as promised
First my solution WITH and WITHOUT TABLOCKX on MVTS and #accepted:
And next a comparison with SwePeso's solution which shows an interesting variance between the two solutions for small and large sets....
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 24, 2011 at 7:09 pm
How about this as a solution?
SET NOCOUNT ON
CREATE TABLE #Sample
(
ID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
idMvt1 INT NOT NULL,
idMvt2 INT NOT NULL,
idMvt3 INT NOT NULL
)
INSERT #Sample
(
idMvt1,
idMvt2,
idMvt3
)
SELECT TOP(1000000)
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt1,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt2,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt3
FROM sys.all_columns ac1,
sys.all_columns ac2;
-- Solution here
set statistics io on;
set statistics time on;
with MyIDs (
Mvt,
ID
) as (
select
idMvt1,
ID
from
#Sample
union
select
idMvt2,
ID
from
#Sample
union
select
idMvt3,
ID
from
#Sample
),
UniqueMvtIDs (
Mvt,
ID
) as (
select
Mvt,
min(ID)
from
MyIDs
group by
Mvt
)
select distinct
s.*
from
#Sample s
inner join UniqueMvtIDs umi
on (s.ID = umi.ID)
;
set statistics time off;
set statistics io off;
drop table #Sample;
Simple testing using statistics time on and io on:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Sample_____________________________________________________________________________________________________________000000000050'. Scan count 4, logical reads 9403, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5875 ms, elapsed time = 6341 ms.
!!!! Nevermind, doesn't work !!!!
January 24, 2011 at 7:18 pm
Lynn Pettis (1/24/2011)
How about this as a solution?
SET NOCOUNT ON
CREATE TABLE #Sample
(
ID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
idMvt1 INT NOT NULL,
idMvt2 INT NOT NULL,
idMvt3 INT NOT NULL
)
INSERT #Sample
(
idMvt1,
idMvt2,
idMvt3
)
SELECT TOP(1000000)
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt1,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt2,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS idMvt3
FROM sys.all_columns ac1,
sys.all_columns ac2;
-- Solution here
set statistics io on;
set statistics time on;
with MyIDs (
Mvt,
ID
) as (
select
idMvt1,
ID
from
#Sample
union
select
idMvt2,
ID
from
#Sample
union
select
idMvt3,
ID
from
#Sample
),
UniqueMvtIDs (
Mvt,
ID
) as (
select
Mvt,
min(ID)
from
MyIDs
group by
Mvt
)
select distinct
s.*
from
#Sample s
inner join UniqueMvtIDs umi
on (s.ID = umi.ID)
;
set statistics time off;
set statistics io off;
drop table #Sample;
Simple testing using statistics time on and io on:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Sample_____________________________________________________________________________________________________________000000000050'. Scan count 4, logical reads 9403, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5875 ms, elapsed time = 6341 ms.
!!!! Nevermind, doesn't work !!!!
The problem is, it doesn't work correctly. I changed the number of rows to 20 and the random number generators to 10 and here's what I got very first time out...
ID idMvt1 idMvt2 idMvt3
----------- ----------- ----------- -----------
1 8 9 1
2 10 3 10
3 10 1 4
4 5 2 6
7 7 9 6
--Jeff Moden
Change is inevitable... Change for the better is not.
January 24, 2011 at 7:28 pm
Agh! Sorry, Lynn, I missed where you said...
!!!! Nevermind, doesn't work !!!!
... at the bottom of your post.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 24, 2011 at 8:37 pm
Jeff Moden (1/24/2011)
Agh! Sorry, Lynn, I missed where you said...!!!! Nevermind, doesn't work !!!!
... at the bottom of your post.
That's okay, I think I fixed it!
Here is the corrected code:
with MyIDs (
Mvt,
ID
) as (
select
idMvt1,
ID
from
#Sample
union all
select
idMvt2,
ID
from
#Sample
union all
select
idMvt3,
ID
from
#Sample
),
UniqueMvtIDs (
Mvt,
ID
) as (
select
Mvt,
min(ID)
from
MyIDs
group by
Mvt
)
select distinct
s.*
into
#Results
from
#Sample s
where
s.idMvt1 in (select umi.Mvt from UniqueMvtIDs umi where s.ID <= umi.ID)
and s.idMvt2 in (select umi.Mvt from UniqueMvtIDs umi where s.ID <= umi.ID)
and s.idMvt3 in (select umi.Mvt from UniqueMvtIDs umi where s.ID <= umi.ID)
;
Use the original test suite from my previous post.
And here is the simple timing of the code with output to the screen:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Sample_____________________________________________________________________________________________________________000000000059'. Scan count 30, logical reads 31020, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 22015 ms, elapsed time = 12492 ms.
And writing to a temp table (#Results):
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Sample_____________________________________________________________________________________________________________00000000005C'. Scan count 30, logical reads 31020, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 21873 ms, elapsed time = 11964 ms.
January 26, 2011 at 10:16 am
Excellent solution. Good Job Mr. Magoo
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 26, 2011 at 10:40 am
CirquedeSQLeil (1/26/2011)
Excellent solution. Good Job Mr. Magoo
Thank you - it would be good to know how the OP got on..hint.hint...
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 26, 2011 at 7:56 pm
Okay, tested my solution against that of mister.magoo and his won. Mine returns the same result set with the same sample data.
I guess I'm going to have to start playing more with pivot and unpivot.
Well done, sir!
February 14, 2011 at 8:13 am
Hi everyone !!
like I've said I opted for a CLR solution, even if we do prefer a sql one. it was basically because I was short in time. but right now I'm working on these solutions and I have to say The results were quite a surprise!
I attached the set I'm using for testing, it contains 64178 rows, this is far from a set with millions of rows but I have the expected results here
I tested 4 solutions :
- Jeff Moden & Peso's solution within 12 sec it throws a max recursivity exception
- the first solution of Mister Magoo and Lynn Pettis's solutions were good but doesn't return the expected result
- ColdCoffee's first solutio turn in 608Â ms but it returns only 158 rows out of the expected 163
- Mister Magoo's final solution was perfect (even if I couldn't turn it on temporay table or table variable) it returns the 163 combinations in 3 seconds
I will shortly post the result on a larger scale.
Viewing 15 posts - 61 through 75 (of 75 total)
You must be logged in to reply to this topic. Login to reply