January 11, 2011 at 4:17 am
Hi,
I have a query that I am optimising and as part of the optimisation I have removed a leading wildcard from a varchar column in the where clause. There is no index on this column and so the query is still making almost the same number of logical reads on the table (188644) while it runs the table scan on a pretty large table. Before I removed the leading wildcard, the stats showed 187572 reads - almost the same.
All the output is the same (io statistics are almost identical, same query plan, table scan operator shows the same actual and estimated costings and the properties of the table scan operator are the same including the DOP) and yet the query without the wildcard returns CPU time of 2406ms as opposed to the query with the wildcard which takes 75985ms. How can two identical query plans with almost the same amount of data being read, and the same results, have such a different duration?
In case anyone is thinking that it's spilling into TempDB, it isn't - the query with very slightly less data is the one that takes longest.
Does anyone have any theories that I can test for?
Thanks,
Martin
January 11, 2011 at 5:43 am
Post the plans?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 11, 2011 at 6:08 am
Sure.
Stats and plan for the fast query:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'DEAD'. Scan count 17, logical reads 188651, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'COMPANY'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2250 ms, elapsed time = 149 ms.
|--Parallelism(Gather Streams)
|--Nested Loops(Inner Join, OUTER REFERENCES:([MyDB].[dbo].[DEAD].[VALFIRMID]))
|--Clustered Index Scan(OBJECT:([MyDB].[dbo].[DEAD].[Y27ID]), WHERE:([MyDB].[dbo].[DEAD].[LENDERREF] like N'24088404' AND ([MyDB].[dbo].[DEAD].[LENDERID]=(100005) OR [MyDB].[dbo].[DEAD].[LENDERID]=(127551))))
|--Clustered Index Seek(OBJECT:([MyDB].[dbo].[COMPANY].[PK_COMPANY_1]), SEEK:([MyDB].[dbo].[COMPANY].[ID]=[MyDB].[dbo].[DEAD].[VALFIRMID]) ORDERED FORWARD)
Stats and plan for the slow query:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'DEAD'. Scan count 17, logical reads 187587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'COMPANY'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 73328 ms, elapsed time = 5009 ms.
|--Parallelism(Gather Streams)
|--Nested Loops(Inner Join, OUTER REFERENCES:([MyDB].[dbo].[DEAD].[VALFIRMID]))
|--Clustered Index Scan(OBJECT:([MyDB].[dbo].[DEAD].[Y27ID]), WHERE:([MyDB].[dbo].[DEAD].[LENDERREF] like N'%24088404' AND ([MyDB].[dbo].[DEAD].[LENDERID]=(100005) OR [MyDB].[dbo].[DEAD].[LENDERID]=(127551))))
|--Clustered Index Seek(OBJECT:([MyDB].[dbo].[COMPANY].[PK_COMPANY_1]), SEEK:([MyDB].[dbo].[COMPANY].[ID]=[MyDB].[dbo].[DEAD].[VALFIRMID]) ORDERED FORWARD)
January 11, 2011 at 6:17 am
You are comparing chalk and cheese, the processor effort in examining a string to see if it is equal ( your like without a '%' ) is a lot lighter than testing to see if the string ends with your pattern.
January 11, 2011 at 6:43 am
I meant the graphical plan. Include actual execution plan, save as .sqlplan file.
By removing the wildcard you've changed the like to an =. It is less work to do, even though there;s no suitable index to seek on.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 11, 2011 at 6:54 am
Thanks both of you - you've given me the answer. Even though the query engine is still scanning the whole table, using the wildcard is still slower as it has to search the column for any occurrence of the string rather than just finding a match. Hence the high CPU cost.
Thanks for taking the time to answer. 🙂
Martin
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply