July 15, 2013 at 8:17 pm
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?
UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')
Thanks!
July 15, 2013 at 9:30 pm
r_noob (7/15/2013)
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')
Thanks!
I think this is going to be a stunningly bad performer regardless. But an index might help. What might be even faster, believe it or not, is to do an EXISTS query instead of a JOIN. Hell, even a cursor approach could be faster than the join, which could hit millions of times on a 45000X800000 join of that type.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 15, 2013 at 10:22 pm
I agree with Kevin. A looped search would probably be faster than a Join because you could then add a WHERE Field2 <> 'Yes" to the code which would speed things up quite a bit especially if there were some intelligent indexing around that column (at least on a temporary basis).
--Jeff Moden
Change is inevitable... Change for the better is not.
July 16, 2013 at 10:00 am
You may want to try full text search here. Together with using WHERE EXISTS instead of JOIN, should be considerably faster than JOIN with non sargable LIKE. For some reason I think it will win over loop-based aproch too.
July 16, 2013 at 10:48 am
I am not sure full text indexing will help with the '%' + column + '%' join ...
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 16, 2013 at 12:12 pm
r_noob (7/15/2013)
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')
Thanks!
Try this:
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable
CREATE TABLE #TempTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[Field1] INT NULL,
[Field2] NVARCHAR(3) NULL,
PRIMARY KEY (ID))
INSERT INTO #TempTable
SELECT 22,'No' UNION ALL
SELECT 31,'No' UNION ALL
SELECT 42,'No' UNION ALL
SELECT 55,'No'
IF OBJECT_ID('tempdb..#LookupTable') IS NOT NULL
DROP TABLE #LookupTable
CREATE TABLE #LookupTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[LookupValue] INT NULL,
[LookDesc] NVARCHAR(30) NULL,
PRIMARY KEY (ID))
INSERT INTO #LookupTable
SELECT 22,'Apple' UNION ALL
SELECT 33,'Orange' UNION ALL
SELECT 44,'Banana' UNION ALL
SELECT 55,'Pear'
UPDATE #TempTable
SET Field2 = 'Yes'
FROM
#TempTable AS A
INNER JOIN
#LookupTable AS B
ON A.ID = B.ID
WHERE
A.Field1 = B.LookupValue
SELECT * FROM #TempTable AS tt
July 16, 2013 at 2:11 pm
r_noob (7/15/2013)
... ON (A.Field1 like '% ' + B.LookupValue + ' %')
That looks like words in a sentence. Try Jeff's splitter: split the sentence on space, remove any leading blanks and seek into your lookup table.
Like this:
SELECT *
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
WHERE EXISTS (SELECT 1 FROM #Lookup b
WHERE b.LookDesc = split.Item)
SELECT *
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
INNER LOOP JOIN #Lookup b
ON b.LookDesc = split.Item
Prep:
IF OBJECT_ID('tempdb..#Table1') IS NOT NULL
DROP TABLE #Table1
CREATE TABLE #Table1 (
[ID] INT IDENTITY(1,1) NOT NULL,
[Field1] VARCHAR(100),
[Field2] VARCHAR(3) NULL)
INSERT INTO #Table1
SELECT 'The quick brown fox','No' UNION ALL
SELECT 'jumped over','No' UNION ALL
SELECT 'the','No' UNION ALL
SELECT 'lazy dog','No' UNION ALL
SELECT 'One day in the life of Ivan Denisovitch', 'No'
INSERT INTO #Table1 select a.Field1, 'No' AS Field2 from #Table1 a, sys.columns b
CREATE UNIQUE CLUSTERED INDEX ucx_ID ON #Table1 (ID)
IF OBJECT_ID('tempdb..#Lookup') IS NOT NULL
DROP TABLE #Lookup
CREATE TABLE #Lookup ([LookDesc] VARCHAR(30) NOT NULL)
INSERT INTO #Lookup
SELECT 'dog' UNION ALL
SELECT 'fox' UNION ALL
SELECT TOP 45000 x = RIGHT(NEWID(),5)+RIGHT(NEWID(),5) FROM SYSCOLUMNS A, SYSCOLUMNS B
CREATE UNIQUE CLUSTERED INDEX ucx_LookDesc ON #Lookup (LookDesc)
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
August 1, 2013 at 4:02 am
The performance of Chris' solution is fantastic.
If we use his setup script, we can use the following code to perform the update:
IF OBJECT_ID('tempdb..#t2') IS NOT NULL drop table #t2
SELECT DISTINCT ID
INTO #t2
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
WHERE EXISTS (SELECT 1 FROM #Lookup b WHERE b.LookDesc = split.Item)
UPDATE #Table1
SET Field2 = 'Yes'
WHERE ID in (
select ID from #t2
)
This code executes in less than 100ms on my machine. I expect similar results for the real data. Not bad to go from 45 minutes to sub-second performance!
Well done!
/SG
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply