August 6, 2008 at 3:16 pm
Hi,
I have following dynamic search query
SELECT TOP 50
(SELECT TOP 1 Name FROM Product WHERE Id IN (SELECT CurrentProductId FROM LicenseProduct WHERE LicenseId = License.Id) ORDER BY ReleaseNumber, Created DESC ) AS ProductName,
(SELECT TOP 1 ProductNumber FROM Product WHERE Id IN (SELECT CurrentProductId FROM LicenseProduct WHERE LicenseId = License.Id) ORDER BY ReleaseNumber, Created DESC) AS ProductNumber,
License.Id AS LicenseId,
License.SerialNumber,
License.ServiceExpires,
License.FirstName,
License.LastName,
License.CompanyName,
License.EmailAddress AS LicenseEmail,
u.LoweredUsername AS AccountEmail,
License.Activations,
License.ReplacedBy,
License.Enabled
FROM
License
LEFT OUTER JOIN
UserLicense ul ON License.Id = ul.LicenseId
LEFT OUTER JOIN
User u ON ul.UserId = u.UserId
WHERE
(@EmailAddress = '' OR (License.EmailAddress LIKE @EmailAddress + '%' OR u.LoweredUsername LIKE @EmailAddress + '%'))
AND
(@CompanyName = '' OR (License.CompanyName LIKE @CompanyName + '%'))
AND
(@LastName = '' OR (License.LastName LIKE @LastName + '%'))
AND
(@FirstName = '' OR (License.FirstName LIKE @FirstName + '%'))
AND
(@OrderNumber = '' OR (SerialNumber IN(SELECT Number FROM v_CombinedOrders WHERE OrderNumber LIKE @OrderNumber + '%')))
AND
(@SerialNumber = '' OR (License.SerialNumber LIKE @SerialNumber + '%'))
Now the problem with this query is......
it runs very slow............and here is the tricky part......
while my testing i figured out that this where clause
(@EmailAddress = '' OR (License.EmailAddress LIKE @EmailAddress + '%' OR u.LoweredUsername LIKE @EmailAddress + '%'))
is slowing down the query.........when i remove one LIke part of the query it ran very quickly.....so what i did was i make two copies of the query and join them using UNION and i use
(@EmailAddress = '' OR (License.EmailAddress LIKE @EmailAddress + '%')) in one query
and
(@EmailAddress = '' OR (u.LoweredUsername LIKE @EmailAddress + '%')) in the second query.....
and it ran very quickly..........now i dont understand the reason.......why one query having both like statements in one clause was running slow and two queries with individual like statements are running faster....
any thoughts??
August 6, 2008 at 4:34 pm
Just like the song says. "The answer, my friend is blowin' in the query plan". Ok, maybe not quite...
In all seriousness, I bet the query plan for the "combined" query has a strange order for which the tables are joined. Wheras the "union" query evaluates each query in a parallel proecess and merges the result sets.
Part of the query optimizer's process uses column distribution statistics to determine an optimal query plan. New to 2005, these statistics objects include textual composites which in theory help the optimizer select good indexes and table join orders for LIKE evaluations. By changing the SQL code, SQL Server compiled and executed a totally different query plan and this time it got it right.
You may want to try updating the stats with a fullscan to see if the first query's performance yield improves.
August 7, 2008 at 6:31 am
The correlated sub-queries at the top of your select statement are going to give you coniptions as this scales. The more data it hits, the slower these will run:
(SELECT TOP 1 Name FROM Product WHERE Id IN (SELECT CurrentProductId FROM LicenseProduct WHERE LicenseId = License.Id) ORDER BY ReleaseNumber, Created DESC ) AS ProductName,
(SELECT TOP 1 ProductNumber FROM Product WHERE Id IN (SELECT CurrentProductId FROM LicenseProduct WHERE LicenseId = License.Id) ORDER BY ReleaseNumber, Created DESC) AS ProductNumber,
Instead, try putting these into a sub-select within the FROM clause and then joining to them. In most cases, that performs much better.
As to the rest, if you post the execution plan, it will be easier to help you identify the bottlenecks and what specifically is causing them.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 7, 2008 at 7:59 am
The other problem is that that format of where clause tends to confuse the optimiser and has no single fixed optimal plan. You're probably getting a table scan for the outer table and the optimiser is unlikely to be able to do better.
Is this SQL 2000 or 2005?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 7, 2008 at 11:44 am
I tried to move the two select subqueries in the FROM clause and join them there but then the returned rows were different.
Gail its sql server 2005, whats the weird thing in where clause??
Please find attached the execution plans of both the queries.
Thanks,
Usman
August 7, 2008 at 12:40 pm
Looking at the two query plans, for Query1 I see that it decided to parallelize the entire plan. Query2's plan does not. See the following for a basic and good explaination of parallelism...
http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx
Basically, in Query1, SQL Server decided to break apart the workload into 4 parallel execution units and combine the results. In most cases, this nets a performance gain for the query. In this case, it wound up hurting you. Why? Well, with the first query, SQL Server decided that the query was complex and expensive enough that it needed to break it up into four parts. Unfortunately in this case, by doing so, it actually slowed down the query performance. More than likely because it didn't dedicate all workers at once for doing the LIKE comparisons. Fewer workers per parallel process meant slower comparisons.
In Query2, you see that by breaking the query up into two smaller queries using a union clause, the query optimizer decided it was better to serialize the query execution and assign all workers to each process. In this way, all workers were thrown at the comparisons, and the query ran much faster.
You can try to run the first query while using WITH (MAXDOP = 1) and see if the performance improves.
August 7, 2008 at 1:06 pm
Or, you might want to consider increasing the cost threshold for parallelism on your server. I've usually found the default is just too low, causing queries that don't need it to have to spawn threads and marshall all the data back together again, making them run slower, not faster.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 7, 2008 at 1:08 pm
Hey Grant, do you have a general value you apply or do you use some kind of formula based on counters? Just curious.
August 7, 2008 at 1:14 pm
Now here is an another finding.........the following problematic part of the where clause
@EmailAddress = '' OR
(
u.LoweredUsername LIKE @EmailAddress + '%'
OR
License.EmailAddress LIKE @EmailAddress + '%'
)
in this part.....the LIKE parts will be running on different tables of DIFFERENT DATABASES.......
i.e Users table is in a seperate database and License table is in another database (these databases are on same server)...............now considering the single query (without the UNION)
1-) if i use like only on Users table it ran faster (under 1 sec)
2-) if i use like only on License table it ran relatively slower (13 sec)
3-) if i use both the LIKE (just like it currently is) it ran very slow...(26 sec)
i have checked the indexes and both the columns are indexed............
in query plan i saw sql server doin an index spooling ............
i dont know....whats goin on......any one may get some hint from above facts......
August 7, 2008 at 1:20 pm
Usually we bump it up to 20 as a general rule.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 7, 2008 at 1:20 pm
The index spooling you saw was the query processor creating an in-process index in tempdb for the LIKE comparison operations. If you look at Query 2, it does the same thing but while using all the worker threads in a serial operation.
I don't think it matters that the tables reside in different databases. Its more of how many workers are working together to process your query. In a parallel query (Query 1), the workers are split up and process parts of the query, then marshall it all back together. In a serial query (Query 2) all the workers work together in unison to process your query and don't have the overhead of marshalling it back.
August 7, 2008 at 1:21 pm
Jeremy Brown (8/7/2008)
Hey Grant, do you have a general value you apply or do you use some kind of formula based on counters? Just curious.
Although, now that I think of it, we'll want to do some pretty careful testing with 2008 to see if the same bad behavior holds true. Thanks for bringing it up.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply