August 13, 2008 at 6:55 am
Hi forum,
I'm writing a procedure that will receive a few parameters which will update or insert records in a 2 tables. One of the parameters, a barcode value has to be validated to make sure it exists but depending on what is in the barcode, I could find it either in column1 or column2.
So far I've done this as follows.
if exists (select 1 from LookupTable where @Parm in (Column1, Column2))
Begin
...blah blah blah
End
I've been told by someone that it's a very bad way of doing it because SQL can't use indexes when coded like this and that my lookups are slower than they could be.
They recommend that I do this with 2 queries either as;
declare @RecordFound int
set @RecordFound = 0
if exists (select 1 from LookupTable where @Parm = Column1)
set @RecordFound = 1
if exists (select 1 from LookupTable where @Parm = Column2)
set @RecordFound = 1
if (@RecordFound = 1)
Begin
...blah blah blah
End
or
if exists (select 1 from LookupTable where @Parm = Column1) or exists (select 1 from LookupTable where @Parm = 2)
Begin
...blah blah blah
End
If I have unique indexes on column1 and column2, wouldn't my method be faster or at least equal than the other two methods which requires two queries ?
August 13, 2008 at 7:16 am
Well... It depends...
The IN clause in this case is probably going to cause a table scan. I'd say that you're better off using IF EXISTS in a nested fashion such that you check the value that fails the most first. If it fails, you don't have to make the second check at all. You make the second check if the first check passes. That should minimize the number of unecessary reads.
But, what are you doing? Frequently people use EXISTS statement where they should have used a WHERE statement or some other construct. I recently found where a developer had used EXISTS prior to an UPDATE to "save time" on the UPDATE and to determine if no rows were updated. I showed how he was forcing two reads for each update and that the value was most likely going to be there, not missing. So doing an UPDATE and then checking for rows affected was the better approach.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 13, 2008 at 7:57 am
In the case of your query the IN is like OR'ing which will likely cause an table scan because you have 2 indexes to check. In the other options the 2 queries will both do index seeks which will likely be faster than scans.
Here is a little test to show you what I mean (done in adventureworks):
[font="Courier New"]-- clean up test table
IF OBJECT_ID('dbo.Products') IS NOT NULL
BEGIN
DROP TABLE dbo.Products
END
-- create & populate test table
SELECT
ProductId,
CONVERT(VARCHAR(3), productid) + LEFT(ProductNumber, 7) AS prodcode,
CONVERT(VARCHAR(3), productid) + LEFT(Name, 7) AS prodcode2
INTO
dbo.Products
FROM
production.product
-- create primary key on test table
ALTER TABLE dbo.Products
ADD CONSTRAINT PK_dbo_Products PRIMARY KEY CLUSTERED (ProductId)
-- create unique non-clustered indexes on test table
CREATE UNIQUE INDEX UX_Prodcode ON dbo.Products(ProdCode)
CREATE UNIQUE INDEX UX_Prodcode2 ON dbo.Products(ProdCode2)
-- Declare and fill the variable
DECLARE @prodcode VARCHAR(10)
SELECT @prodcode = MIN(prodcode) FROM dbo.products
-- start test getting IO and time stats
PRINT 'Begin In clause'
SET STATISTICS IO ON
SET STATISTICS Time ON
IF EXISTS (SELECT 1 FROM dbo.PRoducts WHERE @prodcode IN (Prodcode, prodcode2))
BEGIN
PRINT 'ENd In clause'
END
ELSE
BEGIN
PRINT 'ENd In clause'
END
SET STATISTICS Io OFF
SET STATISTICS Time OFF
-- clean up cache and buffer so it is a valid test
DBCC FreeProcCache()
DBCC DropCleanBuffers()
PRINT 'Begin = prodcode'
SET STATISTICS IO ON
SET STATISTICS Time ON
IF EXISTS (SELECT 1 FROM dbo.PRoducts WHERE @prodcode = Prodcode)
BEGIN
PRINT 'End = prodcode'
END
ELSE
BEGIN
PRINT 'End = prodcode'
END
SET STATISTICS Io OFF
SET STATISTICS Time OFF
-- clean up cache and buffer so it is a valid test
DBCC FreeProcCache()
DBCC DropCleanBuffers()
PRINT 'Begin = prodcode2'
SET STATISTICS IO ON
SET STATISTICS Time ON
IF EXISTS (SELECT 1 FROM dbo.PRoducts WHERE @prodcode = Prodcode2)
BEGIN
PRINT 'End = prodcode2'
END
ELSE
BEGIN
PRINT 'End = prodcode2'
END
SET STATISTICS Io OFF
SET STATISTICS Time OFF
-- clean up cache and buffer so it is a valid test
DBCC FreeProcCache()
DBCC DropCleanBuffers()
PRINT 'Begin Or Exists'
SET STATISTICS IO ON
SET STATISTICS Time ON
IF EXISTS (SELECT 1 FROM dbo.PRoducts WHERE @prodcode = Prodcode) OR EXISTS (SELECT 1 FROM dbo.PRoducts WHERE @prodcode = Prodcode2)
BEGIN
PRINT 'End Or Exists'
END
ELSE
BEGIN
PRINT 'End Or Exists'
END
SET STATISTICS Io OFF
SET STATISTICS Time OFF
-- clean up cache and buffer so it is a valid test for re-run
DBCC FreeProcCache()
DBCC DropCleanBuffers()
[/font]
Here are the results for TIME and IO Stats for each query:
Begin In clause
Table 'Products'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 19 ms.
ENd In clause
Begin = prodcode
Table 'Products'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
End = prodcode
Begin = prodcode2
Table 'Products'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.
End = prodcode2
Begin Or Exists
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Products'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 12 ms.
End Or Exists
You can see that using IN is the least efficient and actually does a clustered index scan while the others have no scans. I am also attaching the Execution plan.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 13, 2008 at 8:19 am
Grant,
Here's a short description of what I do. The .NET app that I'm writing will run in Windows CE Device. This is a typical physical inventory count.
Each product in our warehouse is barcoded and this can be either our barcode value or the customer's. We sometimes have both but it's a rare case because most of our customers don't want more than 1 barcode on the products so when they have their own we can't have ours.
When the user scans a barcode I need to check that it's valid. If it is, I insert/create records in my tables add it to the count and return details of the products to the app, if the check fails I return a message to alert the user that the barcode isn't valid (misprinted, damaged ticket, bad scan etc).
There is no processing done in the CE device, after each barcode scan, I call a procedure that does whatever needs to be done and only return data to the app that is displayed on screen.
Here's a simplified sample Skeleton (removed some output parms) of the procedure.
Create Procedure UpdatePhysicalCount
(
@Warehouse char(2),
@Location char(10),
@BarcodeValue Char(25),
@ValidBarcode bit output,
@ProdNo char(9) output,
@ProdColor char(5) output,
@ProdSize char(2) output,
@TotBin int output,
@TotSku int output
)
AS
set @ValidBarcode = 1
Set @ProdNo = ''
set @ProdColor = ''
set @ProdSize = ''
if not exists (select 1 from ProductList where BarcodeValue in (OurBarcode, CustomerBarcode)
Begin
Set @ValidBarcode = 0
Return
End
/* Each item in our product table has a primary key called ProdID which is used to retrieve other columns from that table */
declare @ProdID int
set @ProdID = (select prodid from ProductList where BarcodeValue in (OurBarcode, CustomerBarcode)
set @ProdNo = (Select Prodno from ProductList where ProdID = @ProdID)
set @ProdColor = (Select ProdColor from ProductList where ProdId = @ProdID)
set @ProdSize = (Select @ProdSize from ProductList where ProdID = @ProdID)
/* Then I have a few other statements to update
counters in the inventory count tables
and output parameters */
The remote device calls this procedure and upon return display an error message if @ValidBarode = 1, otherwise it displays the content of the other output parameters and the user goes on with the next scan.
This is the base of all my warehouse applications, I have 4 apps that do different things and they all need to validate the barcodes for existence in either the OurBarcode or CustomerBarcode fields and retrieve data for the corresponding ProdID value.
August 13, 2008 at 8:37 am
Jack,
I modified your code to run it with my tables. The results are very telling......and scary :unsure:
The "IN Claude" takes an average of 575ms to run. There's one thing that I find odd though, it took "only" 274ms the first time then it took between 550ms and 625ms and subsequent runs.
The "Or exists" method takes 11ms.
I guess I have my answer. Thanks !
August 13, 2008 at 8:44 am
OK. You may or may not want to continue the existence check. From what I can see though, I think you could skip it. It looks like you're doing a bunch of select statements to populate properties then you use those properties to insert values. Instead, why don't you do a direct INSERT ... SELECT:
INSERT INTO Something
(ProdId
,ProdNo
,ProdColor
,ProdSize)
SELECT p.ProdId
,p.ProdNo
,p.ProdColor
,p.ProdSize
FROM ProductList p
WHERE BarCodeValue IN (OurBarcode, CustomerBarcode)
Although I'd try to find a way to get rid of the IN too.
Or, if you really do have to load parameters, do it in a single statement like this:
SELECT @ProdId = ProdId
,@ProdNo = Prodno
,@ProdColor = ProdColor
,@ProdSize = ProdSize
FROM ProductList
WHERE BarCodeValue IN (OurBarcode, CustomerBarcode)
That will at least radically reduce the number of reads & scans you have in the system.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 13, 2008 at 8:44 am
Glad I could help. That is the difference between SEEKS and SCANS. This is also why you should TEST different code.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 13, 2008 at 8:51 am
I did not look at your code, but as Grant has pointed out, assigning the variables in a single select would work best.
I also think you could eliminate the check as well. You could do your select and then do:
If @@ROWCOUNT = 0
Begin
Set @ValidBarcode = 1
END
ELSE
Begin
Set @ValidBarcode = 0
End
Of course this does not take into account the other logic you have in the procedure that we have not seen.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 13, 2008 at 8:59 am
Thanks guys. I'm a beginner at SQL and I still have a lot to learn. For instance I didn't know that it's possible to populaire multiple variables in 1 select statement as Grant has demonstrated.
I wouldn't have thought of using the @@ROWCOUNT value either 🙂
August 13, 2008 at 9:06 am
Don't worry about it. There's always more to learn, for everyone. Speaking of which, I need to download 2008 and get started learning new stuff.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply