February 25, 2010 at 8:21 am
Chim Kalunta (2/23/2010)
It would appear that the clause:inner join WhitePage w on w.displayname = a.displayname or w.eid = a.eid
is forcing a Clustered Index Scan on the Whitepage table. Thats where the time is spent.
I have had success with splitting out OR clauses in WHERE/JOINs to be two queries (each for just one of the conditions) that are unioned together:
select *
from tableA
INNER JOIN tableB ON tableA.ColA = tableB.ColA
UNION
select *
from tableA
INNER JOIN tableB ON tableA.ColB = tableB.ColB
Though I do like Dave's suggestion below... I'll have to check out how that works on some code...
Another thing I've noticed is your where clause:
WHERE a.account LIKE '%' + @query + '%'
By having the wildcard at the start of the expression, I don't think that you will ever be able to utilize an index on the account column.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 25, 2010 at 8:58 am
How do I know if the stats is up to date?
Have you got auto update stats set on the database ?
If not use UPDATE STATISTICS command
I click on the Estimated Exec Plan icon to generate this one, how do I create an actual plan then?
In SSMS before executing your query check the 'Include actual execution plan' option.
See the links in my sig below for a better explanation of this.
February 25, 2010 at 3:27 pm
Does this perform better ?
SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email
FROM activedirectory a
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
WHERE a.account LIKE '%' + @query + '%'
Thanks to all replies:
Things seem to get more complex:
Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'
I believe this would worse the performance, right?
February 25, 2010 at 3:27 pm
Does this perform better ?
SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email
FROM activedirectory a
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
WHERE a.account LIKE '%' + @query + '%'
Thanks to all replies:
Things seem to get more complex:
Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'
I believe this would worse the performance, right?
February 25, 2010 at 3:27 pm
Does this perform better ?
SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email
FROM activedirectory a
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
WHERE a.account LIKE '%' + @query + '%'
Thanks to all replies:
Things seem to get more complex:
Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'
I believe this would worse the performance, right?
February 25, 2010 at 6:29 pm
Dave's query also can give you incorrect results because of the LEFT joins: you can get records where w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email ARE ALL NULLS.
You can tweak the query to avoid that, but my point is - don't just follow people because they have spare time to earn points 🙂
Anyway, from the query plan I can see that the problem is in scanning WhitePaper table for every qualifying record from ActiveDirectory table (you can verify that by running the query with SET STATISTICS IO ON). And the reason for that is: the query optimizer chooses NestedLoops join.
Add to the end of the query: OPTION(HASH JOIN, MERGE JOIN)
and you will scan WhitePaper only once. That should eliminate 90% of your reads.
If you try and confirm that, then if you want to eliminate the need for the hint - play with indexes to invite the query optimizer to use HASH or MERGE join
Technically it is better to use UNION instead of OR, but with your data it does not really make much of a difference. Your problem is NestedLoops join
February 25, 2010 at 6:52 pm
Alex 469216 (2/25/2010)
Dave's query also can give you incorrect results because of the LEFT joins: you can get records where w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email ARE ALL NULLS.You can tweak the query to avoid that, but my point is - don't just follow people because they have spare time to earn points 🙂
Anyway, from the query plan I can see that the problem is in scanning WhitePaper table for every qualifying record from ActiveDirectory table (you can verify that by running the query with SET STATISTICS IO ON). And the reason for that is: the query optimizer chooses NestedLoops join.
Add to the end of the query: OPTION(HASH JOIN, MERGE JOIN)
and you will scan WhitePaper only once. That should eliminate 90% of your reads.
If you try and confirm that, then if you want to eliminate the need for the hint - play with indexes to invite the query optimizer to use HASH or MERGE join
Technically it is better to use UNION instead of OR, but with your data it does not really make much of a difference. Your problem is NestedLoops join
Thank you very much for the suggestion, actually I do have problem with Dave's query, although in s simple case his solution greatly improve the performance, however, because of splitting same table into multiple tables (w, w2), that means I might have to include more OR in the WHERE clause --- which definitely brings down the performance.
More over, what should I do if I need to consolidate more tables in the future?
February 26, 2010 at 7:27 am
More over, what should I do if I need to consolidate more tables in the future?
You could insert your top 20 ActiveDirectory records into a temp table
and UNION separate selects from joins between that temp table to every table you want to consolidate.
February 26, 2010 at 9:18 am
Try playing with a temp table. This keeps the original employeeid and filters from the activedirectory table first. It should let you use the indexes on WhitePaper:
--------------------------------
create table #ActiveDirectory
(empnum varchar(20) NOT NULL PRIMARY KEY,
account varchar (100) NOT NULL,
displayname varchar (100) NOT NULL)
CREATE NONCLUSTERED INDEX adName ON #ActiveDirectory (displayname)
declare @query varchar(20)
INSERT INTO #ActiveDirectory (employeeid, account, displayname)
SELECT '0000' + a.employeeid AS empnum, a.account, a.displayname as [Name]
FROM activedirectory a WHERE charindex(@query, a.account) > 0
SELECT TOP 20 emp.empnum as [ID], emp.account, emp.displayname,
w.workphonenumber as phone, w.emailaddress as email
FROM emp
INNER JOIN WhitePage w on
w.displayname = a.displayname OR
w.empnum = a.empnum
---------------------------------------------
February 26, 2010 at 9:12 pm
Alex 469216 (2/26/2010)
More over, what should I do if I need to consolidate more tables in the future?
You could insert your top 20 ActiveDirectory records into a temp table
and UNION separate selects from joins between that temp table to every table you want to consolidate.
but my point is - don't just follow people because they have spare time to earn points
😉
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2010 at 3:19 am
Things seem to get more complex:
Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:
left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)
left join WhitePage w2 on w2.eid = a.eid
where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'
I believe this would worse the performance, right?
Possibly , but as with a lot of SQL , it depends.
Have you established weather your stats are up to date or not ? Without good stats , sqlserver will not be able to make good choices.
In terms of the data ,will there always be a row found within the whitepage table for each row in activedirectory ? Is it rare for a match to be found or common ?
February 27, 2010 at 8:37 pm
@halix,
Out of all the solutions, simple and the one thats most likely to improve the performance is the solution
based on UNION.
Yet 'w.empnum = '00000' + a.employeeid' wouldnt use a index. If the query is really important to you then
try adding a computed column, index that and use it perhaps. I realize its going bit too far to get rid of a scan, but if it matters to you then its worth giving it a try.
Also, a quick way to find whether stats are updated is using STATS_DATE function
Regards,
Raj
February 27, 2010 at 8:44 pm
/*
You could insert your top 20 ActiveDirectory records into a temp table
and UNION separate selects from joins between that temp table to every table you want to consolidate.
*/
Isnt that supposed to be 'You insert all qualifying ActiveDirectory records into temp table'?
Regards,
Raj
March 1, 2010 at 2:50 pm
Dave Ballantyne (2/27/2010)
Have you established weather your stats are up to date or not ? Without good stats , sqlserver will not be able to make good choices.
In terms of the data ,will there always be a row found within the whitepage table for each row in activedirectory ? Is it rare for a match to be found or common ?
You actually reminded me this: I can do some auto nightly job to check the data and consolidate different tables, that should speed up everything. I would think one single table is the final solution, I mean, if possible. In the case I have to dynamically consolidate different tables, I don't know, I suspect any query would take long time to execute.
Viewing 14 posts - 16 through 28 (of 28 total)
You must be logged in to reply to this topic. Login to reply