May 7, 2015 at 8:50 am
It may depend heavily on data distribution but 'not exists' perfoms better in my tests. Here are the test scripts. QU version uses a copy of original data but the time of replicating data is only a small fraction of the update itself. One may anticipate QU will get to the first place having 10 million rows or more in Movements.
CREATE TABLE #BOM (
ItemID INT
,neededQuantity FLOAT
,PRIMARY KEY (ItemID)
);
CREATE TABLE #WhareHouseMovement (
ItemID INT
,Quantity FLOAT
,[Date] DATETIME
,PRIMARY KEY (ItemID,[Date])
);
-- Create covering index
CREATE NONCLUSTERED INDEX wm_ix1 ON #WhareHouseMovement (Quantity) INCLUDE ([Date]);
GO
--== Queries/scripts to test
CREATE PROC qDwain AS
WITH InventoryDates AS
(
SELECT [Date]
FROM #WhareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN #BOM b
CROSS APPLY
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityAvail=1
FROM #WhareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
HAVING SUM(Quantity) >= neededQuantity
) c
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM #BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
GO
CREATE PROC qNotExists AS
WITH dts AS (
SELECT [Date]
FROM #WhareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP (1) DATE AS readytomnfc
FROM dts
WHERE NOT EXISTS (
SELECT 1
FROM #BOM b
WHERE neededQuantity > isnull((
SELECT sum(Quantity) q
FROM #WhareHouseMovement s
WHERE s.ItemId = b.ItemId
AND s.DATE <= dts.DATE
), 0)
)
ORDER BY DATE
GO
CREATE PROC qQU AS
IF object_id('[tempdb].[dbo].#whm') IS NOT NULL
DROP TABLE #whm;
CREATE TABLE #Whm (
ItemID INT
,Quantity FLOAT
,[Date] DATETIME
,rt FLOAT
);
CREATE CLUSTERED INDEX whmidx ON #whm (ItemID,[Date]);
INSERT #whm (ItemID,Quantity,[Date],rt)
SELECT ItemID,Quantity,[Date],0 AS rt
FROM #WhareHouseMovement;
DECLARE @RunningTotal FLOAT = 0;
DECLARE @PrevItemID INT = NULL;
UPDATE #whm
SET @RunningTotal = CASE
WHEN ItemID = @PrevItemID
THEN @RunningTotal + Quantity
ELSE Quantity
END
,rt = @RunningTotal
,@PrevItemID = ItemID
FROM #whm WITH (TABLOCKX)
OPTION (MAXDOP 1);
WITH dts
AS (
SELECT [Date]
FROM #WhareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP (1) DATE AS readytomnfc
FROM dts
WHERE NOT EXISTS (
SELECT 1
FROM #BOM b
WHERE neededQuantity > isnull((
SELECT TOP (1) rt
FROM #whm t
WHERE t.ItemId = b.ItemId
AND t.DATE <= dts.DATE
ORDER BY DATE DESC
), 0)
)
ORDER BY DATE
GO
CREATE PROC RunTests AS
SET NOCOUNT ON;
-- timing appliance
DECLARE @timing TABLE (
evt VARCHAR(200)
,tim DATETIME DEFAULT getdate()
);
DECLARE @counts VARCHAR(200) = (
SELECT cast(count(*) AS VARCHAR(9))
FROM #WhareHouseMovement
) + ', ' + (
SELECT cast(min(neededQuantity) AS VARCHAR(9))
FROM #BOM
);
INSERT @timing (evt) VALUES ('** Dwain ' + @counts);
EXEC qDwain
INSERT @timing (evt) VALUES ('** Dwain ' + @counts);
INSERT @timing (evt) VALUES ('** NotExists ' + @counts);
EXEC qNotExists;
INSERT @timing (evt) VALUES ('** NotExists ' + @counts);
INSERT @timing (evt) VALUES ('** Quirky Update ' + @counts);
EXEC qQU;
INSERT @timing (evt) VALUES ('** Quirky Update ' + @counts);
SELECT evt, datediff(ms, min(tim), max(tim)) AS duration
FROM @timing
GROUP BY evt
GO
--== Populate tables
DECLARE @BOMCnt INT = 200;
WITH Tally (n)
AS (
SELECT TOP(@BOMCnt) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects
)
INSERT INTO #BOM (ItemID, neededQuantity)
SELECT n, 1 + ABS(CHECKSUM(NEWID())) % 10
FROM Tally;
WITH Tally (n)
AS (
SELECT TOP 5000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects a, sys.all_objects b
)
INSERT INTO #WhareHouseMovement (ItemID,Quantity,[Date])
SELECT b.n
,5 + CHECKSUM(NEWID()) % 15
,DATEADD(day, 4 * a.n + 1 + ABS(CHECKSUM(NEWID())) % 4, 0)
FROM Tally a
CROSS APPLY (
SELECT b.n
FROM Tally b
WHERE b.n <= @BOMCnt
) b;
--== Run tests
EXEC RunTests;
-- force seaching all the data, no manufacturing possible
UPDATE #BOM SET neededQuantity *= 1000;
EXEC RunTests;
May 7, 2015 at 7:23 pm
Hi again Serge! I wanted to confirm your findings and see just how much the data in play causes timing differences. So I built my own test harness, not because I don't trust yours but because it avoids setting up all the SPs. Here is that:
-- Create tables and setup data only
CREATE TABLE dbo.BOM
(
ItemID INT
,neededQuantity FLOAT
,PRIMARY KEY (ItemID)
);
CREATE TABLE dbo.WareHouseMovement
(
ItemID INT
,Quantity FLOAT
,[Date] DATETIME
,QuantityInStock FLOAT
,PRIMARY KEY (ItemID, [Date])
-- Note added FK because it should be here
,FOREIGN KEY (ItemID) REFERENCES dbo.BOM (ItemID)
);
GO
DECLARE @BOMItems INT = 100
,@NoOfStockMovementsPerBOM INT = 500;
DELETE FROM dbo.WareHouseMovement;
DELETE FROM dbo.BOM;
-- Set up the BOM
WITH Tally (n) AS
(
SELECT TOP (@BOMItems) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns
)
-- 100 BOM items
INSERT INTO dbo.BOM (ItemID, neededQuantity)
SELECT n, 1+ABS(CHECKSUM(NEWID()))%10
FROM Tally;
-- Set up the stock movements
WITH Tally (n) AS
(
SELECT TOP (@NoOfStockMovementsPerBOM) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
INSERT INTO dbo.WareHouseMovement (ItemID, Quantity, [Date])
-- Remove the 5+ on the Quantity
SELECT b.n, CHECKSUM(NEWID())%15, DATEADD(day, 4*a.n + 1+ABS(CHECKSUM(NEWID()))%4, 0)
FROM Tally a
CROSS APPLY
(
SELECT b.n
FROM Tally b
WHERE b.n <= @BOMItems
) b;
--GO
---- Clean up when ready
--DROP TABLE dbo.WareHouseMovement;
--DROP TABLE dbo.BOM;
I then set up various runs, which you can see here. By adjusting parameters in the data creation step (above) I can achieve various row counts and force whether a solution exists or not. Note that I too played around with an EXISTS version of my solution, and I've included it as one of the methods below because sometimes it does the job a bit better than my original.
-- Various runs/methods saved to a run statistics table
DECLARE @TimingStatistics TABLE
(
RunID INT IDENTITY PRIMARY KEY
,Method VARCHAR(100)
,CoveringIndex VARCHAR(3)
,ElapsedMS INT
,BOMItemCount INT
,DatesInStockMovement INT
);
DECLARE @StartDT DATETIME = GETDATE()
,@QURow INT;
DECLARE @BOMItems INT = (SELECT COUNT(*) FROM dbo.BOM)
DECLARE @NoOfStockMovementsPerBOM INT = (SELECT COUNT(*) / @BOMItems FROM dbo.WareHouseMovement);
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
CROSS APPLY
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityAvail=1
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
HAVING SUM(Quantity) >= neededQuantity
) c
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain 1 - Original', 'NO', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
WHERE EXISTS
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityAvail=1
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
HAVING SUM(Quantity) >= neededQuantity
)
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain 2 - EXISTS', 'NO', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
WITH dts AS (
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP (1) DATE AS readytomnfc
FROM dts
WHERE NOT EXISTS (
SELECT 1
FROM dbo.BOM b
WHERE neededQuantity > isnull((
SELECT sum(Quantity) q
FROM dbo.WareHouseMovement s
WHERE s.ItemId = b.ItemId
AND s.DATE <= dts.DATE
), 0)
)
ORDER BY DATE;
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Serge - NOT EXISTS', 'NO', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
-- Create covering index
CREATE NONCLUSTERED INDEX wm_ix1
ON dbo.WareHouseMovement (Quantity) INCLUDE ([Date]);
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
CROSS APPLY
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityAvail=1
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
HAVING SUM(Quantity) >= neededQuantity
) c
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain 1 - Original', 'YES', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
WHERE EXISTS
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityAvail=1
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
HAVING SUM(Quantity) >= neededQuantity
)
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain 2 - EXISTS', 'YES', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
WITH dts AS (
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP (1) DATE AS readytomnfc
FROM dts
WHERE NOT EXISTS (
SELECT 1
FROM dbo.BOM b
WHERE neededQuantity > isnull((
SELECT sum(Quantity) q
FROM dbo.WareHouseMovement s
WHERE s.ItemId = b.ItemId
AND s.DATE <= dts.DATE
), 0)
)
ORDER BY DATE;
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Serge - NOT EXISTS', 'YES', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
DROP INDEX wm_ix1 ON dbo.WareHouseMovement;
-- Perform the QU
DECLARE @ItemID INT = 0
,@RT FLOAT = 0;
-- Note PRIMARY KEY (CLUSTERED INDEX) is on (ItemID, [Date])
UPDATE dbo.WareHouseMovement WITH(TABLOCKX)
SET @rt = QuantityInStock = CASE ItemID WHEN @ItemID THEN @rt + Quantity ELSE Quantity END
,@ItemID = ItemID
OPTION (MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'QU', 'NO', DATEDIFF(millisecond, @StartDT, GETDATE()), @BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
SELECT @QURow = @@IDENTITY;
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
CROSS APPLY
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityInStock
FROM
(
SELECT TOP 1 QuantityInStock
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
ORDER BY c.[Date] DESC
) a
WHERE QuantityInStock >= neededQuantity
) c
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain QU', 'NO', DATEDIFF(millisecond, @StartDT, GETDATE()) +
(SELECT ElapsedMS FROM @TimingStatistics WHERE RunID = @QURow)
,@BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
-- Create covering index
CREATE NONCLUSTERED INDEX wm_ix1
ON dbo.WareHouseMovement (Quantity) INCLUDE ([Date]);
WITH InventoryDates AS
(
SELECT [Date]
FROM dbo.WareHouseMovement
WHERE Quantity > 0
GROUP BY [Date]
)
SELECT TOP 1 ReadyToMFG=[Date]
FROM InventoryDates a
CROSS JOIN dbo.BOM b
CROSS APPLY
(
-- Returned value is immaterial here
-- CA only serves to reduce the row count
SELECT QuantityInStock
FROM
(
SELECT TOP 1 QuantityInStock
FROM dbo.WareHouseMovement c
WHERE a.[Date] >= c.[Date] AND b.ItemID = c.ItemID
ORDER BY c.[Date] DESC
) a
WHERE QuantityInStock >= neededQuantity
) c
GROUP BY [Date]
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.BOM)
ORDER BY ReadyToMFG
OPTION(MAXDOP 1);
-- Record timings and reset timer
INSERT INTO @TimingStatistics
SELECT 'Dwain QU', 'YES', DATEDIFF(millisecond, @StartDT, GETDATE()) +
(SELECT ElapsedMS FROM @TimingStatistics WHERE RunID = @QURow)
,@BOMItems, @NoOfStockMovementsPerBOM;
SELECT @StartDT = GETDATE();
DROP INDEX wm_ix1 ON dbo.WareHouseMovement;
SELECT *
FROM @TimingStatistics
WHERE RunID <> @QURow
ORDER BY ElapsedMS;
Now for some run results.
-- Date rows 50,000: 100 Items, 500 dates per BOM Item (no solution)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS YES 283
Serge - NOT EXISTS NO 293
Dwain QU NO 830
Dwain QU YES 830
Dwain 2 - EXISTS YES 10026
Dwain 2 - EXISTS NO 10110
Dwain 1 - Original NO 10126
Dwain 1 - Original YES 10176
-- Date rows 100,000: 100 Items, 1000 dates per BOM Item (early solution - 5+)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS NO 63
Dwain 2 - EXISTS YES 63
Serge - NOT EXISTS YES 63
Dwain 2 - EXISTS NO 73
Dwain 1 - Original YES 143
Dwain 1 - Original NO 270
Dwain QU NO 316
Dwain QU YES 316
-- Date rows 500,000: 100 Items, 5000 dates per BOM Item (early solution - 5+)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS YES 96
Dwain 2 - EXISTS YES 130
Serge - NOT EXISTS NO 133
Dwain 2 - EXISTS NO 160
Dwain 1 - Original YES 616
Dwain 1 - Original NO 1300
Dwain QU NO 1309
Dwain QU YES 1396
-- Date rows 1,000,000: 200 Items, 5000 dates per BOM Item (early solution - 5+)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS YES 206
Serge - NOT EXISTS NO 213
Dwain 2 - EXISTS YES 253
Dwain 2 - EXISTS NO 270
Dwain 1 - Original YES 1226
Dwain 1 - Original NO 1330
Dwain QU NO 2362
Dwain QU YES 2892
-- Date rows 100,000: 100 Items, 1000 dates per BOM Item (late solution - 2+)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS NO 166
Serge - NOT EXISTS YES 280
Dwain QU YES 429
Dwain QU NO 516
Dwain 2 - EXISTS NO 560
Dwain 1 - Original YES 630
Dwain 2 - EXISTS YES 703
Dwain 1 - Original NO 753
-- Date rows 500,000: 100 Items, 5000 dates per BOM Item (later solution - 1+)
Method CoveringIndex ElapsedMS
Serge - NOT EXISTS YES 376
Serge - NOT EXISTS NO 390
Dwain QU NO 1666
Dwain QU YES 1692
Dwain 2 - EXISTS YES 3090
Dwain 2 - EXISTS NO 3100
Dwain 1 - Original YES 3546
Dwain 1 - Original NO 4300
Looks like you've got a pretty good solution there sir! Congratulations!
Note that my solution that uses QU is rather a non-thinking one. I simply substituted the QU column into the place where I was doing the triangular join. There might be a better way to do that, which might put the QU ahead on some of these scenarios.
Edit: Forgot to mention the 5+, 2+, 1+ notation on early vs. later solutions. This is where I adjusted the addition to the Quantity that is loaded into the WareHouseMovement table, which I noted in my earlier post.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 2 posts - 16 through 16 (of 16 total)
You must be logged in to reply to this topic. Login to reply