December 6, 2017 at 4:09 am
Hello
Our company is in the garment hire business so we have items on an order that go out and then come back into our warehouse. When they come back their barcode is scanned at various stations within the warehouse. I want to flag items that are overdue (have not came back when they should have) and other statuses so my logic is to look at when the item was meant to be returned (OrderItems.ReturnDate) and find the next scan date we have on the system BarcodeScanHistory.DateTimeScan i can then do a datediff to workout what the status should be Overdue, long Overdue etc. My problem is performance. Each barcode can have up to 50 barcodescanhistory records so currently i'm doing a join on barcode and with DateTimeScan > returndate and taking the min date. doing this for 70k is quite slow. ive attached scripts for sample date and my query.
December 6, 2017 at 4:22 am
So that people don't need to download the file (please post your code into the Topic):--table for holding all the scans of a barcode over its lifetime
CREATE TABLE [dbo].[BarcodeScanHistory](
[DateTimeScan] [datetime] NOT NULL,
[Barcode] [nvarchar](256) NOT NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-10','21431791')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-11','21431791')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-12','21431791')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-10','01756395')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-11','01756395')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-12','01756395')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-01','00297271')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-06','00297271')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-09','00297271')
--table for holding items that go out and come back on orders
CREATE TABLE [dbo].[OrderItems](
[ItemNumber] [NVARCHAR](50) NOT NULL,
[ReturnDate] [DATE] NULL,
[Barcode] [NVARCHAR](50) NULL
) ON [PRIMARY]
INSERT INTO [dbo].[OrderItems]VALUES('T3444083','2015-04-10','21431791')
INSERT INTO [dbo].[OrderItems]VALUES('T3443206','2016-04-11','01756395')
INSERT INTO [dbo].[OrderItems]VALUES('T3444641','2017-04-09','00297271')
--return all the barcodes and the next date the system saw the items
SELECT
oi.*,
(SELECT CAST(MIN(gp.DateTimeScan) AS DATE) FROM [dbo].[BarcodeScanHistory] gp WHERE gp.Barcode = oi.Barcode AND gp.DateTimeScan > oi.[ReturnDate]) [NextScanDate]
FROM
[dbo].[OrderItems] oi
You talk about statuses in your post (flag items that are overdue (have not came back when they should have) and other statuses), however, I can't see anything that has a status column.
Also, from the small data you've supplied, what would be your expected output?
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
December 6, 2017 at 4:40 am
Please also post DDL for any indexes and constraints that you have on the tables.
John
December 6, 2017 at 4:45 am
The status conditions have yet to be decided but will be based on the difference in days between ReturnDate and DateTimeScan so the issue im finding is getting the optimum way of returning the next date we saw each barcode after its return date. The last section of sql produces my expected output ie. Barcode, Returndate, NextScanDate but it obviously dose not perform well so im looking for a better way of doing this.
These tables are staging tables I populate daily, currently with no indexes or constraints.
many thanks
December 6, 2017 at 5:00 am
ps_vbdev - Wednesday, December 6, 2017 4:45 AMThese tables are staging tables I populate daily, currently with no indexes or constraints.
I think that's your problem. For every barcode in OrderItems, you're having to do a scan of the BarcodeScanHistory table. Try indexing the Barcode column in each table, and the DateTimeScan and ReturnDate columns. Drop the indexes before you do your daily load, and recreate them afterwards. At the very least, make sure that you have good statistics on the tables so that the query optimizer can come up with the best execution plan possible.
With no constraints, how can you be sure you're not getting, for example, barcodes in the BarcodeScanHistory table that don't exist in OrderItems?
John
December 6, 2017 at 6:36 am
Yep appreciate the indexing aspect and that was next thing to tackle. I just know the sub query in the select isnt the most elegant way to perform this and was looking to optimize this first before looking at the indexing so was really looking for better options to get the next scan date. I did a quick test by adding index but it didnt have a major impact - didnt update statistics tho.
December 6, 2017 at 7:07 am
Please will you post the actual execution plan (.sqlplan file) for your query when you run it against the full data set?
John
December 6, 2017 at 7:43 am
ps_vbdev - Wednesday, December 6, 2017 6:36 AMYep appreciate the indexing aspect and that was next thing to tackle. I just know the sub query in the select isnt the most elegant way to perform this and was looking to optimize this first before looking at the indexing so was really looking for better options to get the next scan date. I did a quick test by adding index but it didnt have a major impact - didnt update statistics tho.
Any efficient plan is going to require an index on Barcode and DateTimeScan, so you should go ahead and create that index. Here is an alternate method that might perform better once that index is in place.
SELECT *
FROM #OrderItems oi
OUTER APPLY
(
SELECT TOP 1 DateTimeScan
FROM #BarcodeScanHistory bsh
WHERE bsh.Barcode = oi.Barcode
AND bsh.DateTimeScan > oi.ReturnDate
ORDER BY DateTimeScan
) nsd(NextScanDate)
;
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
December 6, 2017 at 8:22 am
ahhhh outer apply off course exactlt the prod i needed. thanks.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply