November 11, 2019 at 12:39 pm
Hi Every One
I have query optimization issue as follows.
I have two related tables.
Commodity (This table has 75 Thousand records)
CommodityID (PK)
CommodityName
CommoditySupplier (This table has approx 800 Million records)
CommoditySupplierID (PK)
SupplierID (FK)
CommodityID (FK)
I am executing following query to list SupplierIDs having CommodityName 'SearchString'
SELECT SuppllierID FROM CommoditySupplier WHERE CommodityID IN (SELECT CommodityID FROM Commodity WHERE CommodityName Like '%SearchString%')
Problem is, it is extremely slow. It takes more than 30-40 minutes to execute. Sometimes even more than that.
please guide me to optimize the query. Or Should I change the structure in someway?
Thanks in advance.
November 11, 2019 at 12:54 pm
I would start by using an INNER JOIN, rather than a sub-query
SELECT cs.SuppllierID
FROM CommoditySupplier AS cs
INNER JOIN Commodity AS c
ON c.CommodityID = cs.CommodityID
WHERE c.CommodityName LIKE '%SearchString%';
However, your LIKE '%xxxx%' will always force a table scan, which will make the query slow on such large tables.
You need to stop searching where CommodityName name "contains" some value, and start searching where CommodityName "startsWith" some value.
SELECT cs.SuppllierID
FROM CommoditySupplier AS cs
INNER JOIN Commodity AS c
ON c.CommodityID = cs.CommodityID
WHERE c.CommodityName LIKE 'SearchString%'; -- NOTE no leading wild card
You can then add an index to assist with your search
CREATE NONCLUSTERED INDEX ix_Commodity_CommodityName
ON Commodity (CommodityName)
INCLUDE (CommodityID);
November 11, 2019 at 1:14 pm
Because the logical query processing order will process the join before the where, it MIGHT improve performance slightly if you search for the Commodities before searching for the Suppliers.
IF OBJECT_ID(N'tempdb..#Commodity') IS NOT NULL
BEGIN
DROP TABLE #Commodity;
END;
CREATE TABLE #Commodity (CommodityID int NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO #Commodity ( CommodityID )
SELECT CommodityID
FROM Commodity
WHERE CommodityName LIKE '%SearchString%'; -- THIS IS A BAD IDEA that will cause a full table scan
SELECT cs.SuppllierID
FROM #Commodity AS c
INNER JOIN CommoditySupplier AS cs
ON c.CommodityID = cs.CommodityID
GROUP BY cs.SuppllierID;
November 11, 2019 at 4:12 pm
To help resolve this problem across the board, for all your queries, you really need to re-cluster the CommoditySupplier table. This is a classic case of the "default clustering by identity column" problem. Unfortunately, since it's such a large table, this will take quite a while. You'd almost certainly want to try it first in a test environment anyway. Change the clustered index as shown below, then retry the query. Review the query plan to make sure SQL is searching the Commodity table first, then doing (joined) seeks on the CommoditySupplier table.
--first, drop all nonclustered indexes (if any)
ALTER TABLE dbo.CommoditySupplier DROP CONSTRAINT PK_CommoditySupplier;
CREATE UNIQUE CLUSTERED INDEX CL_CommoditySupplier ON dbo.CommoditySupplier
( CommodityID, SuppliedID /*, CommoditySupplierID */
/*Add CSID *ONLY* IF *REQUIRED* TO MAKE THE FIRST TWO COLUMNS UNIQUE*/ )
WITH ( /*DATA_COMPRESSION = PAGE,*/ FILLFACTOR = 98, SORT_IN_TEMPDB = ON )
ON [<same_filegroup_name_as_original_PK_was_on>];
--re-create the PK, but nonclustered
ALTER TABLE dbo.CommoditySupplier ADD CONSTRAINT PK_CommoditySupplier PRIMARY KEY NONCLUSTERED ( CommoditySupplierID ) WITH ( /*DATA_COMPRESSION = ROW,*/ FILLFACTOR = 99, ... ) ON [...same_fg_name...];
--recreate all nonclustered indexes (if any)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 11, 2019 at 4:23 pm
To help resolve this problem across the board, for all your queries, you really need to re-cluster the CommoditySupplier table. This is a classic case of the "default clustering by identity column" problem.
Yeah... totally agree with that.
But... even that isn't going to help this query because the search criteria is using a leading wildcard character.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 11, 2019 at 4:39 pm
True, but that table only has 75K rows. Create an index with only CommodityID and CommodityName and, yes, SQL will still have to do an index scan, but the overall query still shouldn't take anywhere close to 30 mins, unless there are an unusually large number of matching IDs.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 15, 2019 at 8:26 pm
Might be worthwhile to first query the commodities table and put the results into a #temp table indexed on commodityID, then join to the #temp table. Depends on how many results are usually returned from the %search%.
Apologies, Des. I didn't notice the PRIMARY KEY in your create table statement.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
November 15, 2019 at 11:00 pm
First I would try adding this index to table CommoditySupplier:
CREATE INDEX IX_CommoditySupplier_CommodityID_INC_SupplierID
ON CommoditySupplier(CommodityID) INCLUDE (SupplierID);
If that doesn't improve performance enough I would, in addition to the above index, execute the query like this with the assistance of a temporary table.
IF OBJECT_ID('tempdb..#CommodityID','U') IS NOT NULL DROP TABLE #CommodityID
SELECT DISTINCT CommodityID
INTO #CommodityID
FROM Commodity
WHERE CommodityName Like '%SearchString%'
CREATE UNIQUE INDEX IX_#CommodityID_1 ON #CommodityID(CommodityID)
SELECT cs.SupplierID
FROM CommoditySupplier cs
INNER JOIN #CommodityID c
ON c.CommodityID = cs.CommodityID
DROP TABLE #CommodityID
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply