Query plan - largest cost

Question

Query plan - largest cost

rgtft

Hall of Fame

Points: 3518
More actions
September 14, 2012 at 10:13 am

#262003

This query takes about 45 seconds to complete (down from 10+ minutes by adding clustered index to table). Looking at the query plan, the largest cost is a clustered index seek -- which, as I understand it, is what I'm supposed to be getting rather than table or index scans.
What other indexes may be missing to speed this up?
I know it's a big topic, where should I start reading for more information on how to read & understand a query plan and tune a query.
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
GO
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT DISTINCT vwea.FranchiseNumber,
vwea.CustomerID,
vwea.EmailBefore,
vwea.EmailAfter,
vwea.ChangeDate,
j.ScheduledEndDate AS MoveDate,
CASE
WHEN j.ScheduledEndDate < vwea.ChangeDate THEN 'After Move Change'
ELSE 'Before move change'
END AS Change
FROM dbo.tblJob j
INNER JOIN dbo.tblWorkOrder wo ON j.DatabaseInfoID = wo.DatabaseInfoID
AND j.WorkOrderID = wo.WorkOrderID
INNER JOIN dbo.EmailAudit vwea ON wo.CustomerID = vwea.CustomerID
INNER JOIN dbo.tblDatabaseInfo ON j.DatabaseInfoID = dbo.tblDatabaseInfo.DatabaseInfoID
INNER JOIN dbo.tblFranchise ON j.DatabaseInfoID = dbo.tblFranchise.DatabaseInfoID
AND vwea.FranchiseNumber = dbo.tblFranchise.FranchiseNumber
WHERE (vwea.FranchiseNumber = 253)
AND j.ScheduledEndDate < vwea.ChangeDate
ORDER BY vwea.ChangeDate DESC;
SET STATISTICS IO OFF;
GO
SET STATISTICS TIME OFF;
GO
Thanks,
Rob
(Query plan attached)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 1

Looking at this, the first thing I'd suggest is you update your statistics. They're pretty badly out of wack. That might be why you're getting such an expensive plan. You're processing huge amounts of rows to only return 144. Also, I only see five tables the query, but, quick count, eight access points in the plan. Are you dealing with views or nested views here? That could also be problematic.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

rgtft Hall of Fame Points: 3518 More actions · Answer 2

I'm updating statistics on those tables right now; dbo.EmailAudit is a view. I'll look into that.

Thanks,

Rob

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 3

The simplification step in the optimization process can clean up issues with views, but not always and not always all the way. Depending on what's there, you might be better off accessing those tables more directly through this query.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

rgtft Hall of Fame Points: 3518 More actions · Answer 4

I did an update of the statistics on all of the related tables. That didn't change the execution time.

The view ([font="Courier New"]EmailAudit[/font]) duplicates many of the tables in the original query. Just running the view takes about 25 seconds.

The view definition is:

DBCC FREEPROCCACHE;

DBCC DROPCLEANBUFFERS;

GO

SET STATISTICS IO ON

SET STATISTICS TIME ON

--CREATE VIEW [dbo].[EmailAudit]

--AS

SELECT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

a.[CustomerID],

a.[EmailAddress] AS [EmailBefore],

c.EmailAddress AS [EmailAfter],

a.[ChangeDate]

FROM [dbo].[tblEmailAudit] a

INNER JOIN dbo.tblCustomer c ON C.DatabaseInfoID = a.DatabaseInfoID

AND c.CustomerID = a.CustomerID

AND ISNULL(c.FranUniqueID, c.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

LEFT JOIN dbo.tblFranchise f ON a.DatabaseInfoID = f.DatabaseInfoID

AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

LEFT JOIN dbo.tblDatabaseInfo di ON a.DatabaseInfoID = di.DatabaseInfoID

WHERE ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'');

SET STATISTICS IO OFF;

GO

SET STATISTICS TIME OFF;

GO

So I rewrote the original query removing the view and I'm still getting about the same time (40-45 seconds).

-- original query rewritten to remove EmailAudit view --

DBCC FREEPROCCACHE;

DBCC DROPCLEANBUFFERS;

GO

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT DISTINCT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

a.CustomerID,

a.EmailAddress AS EmailBefore,

c.EmailAddress AS EmailAfter,

a.ChangeDate,

j.ScheduledEndDate AS MoveDate,

CASE

WHEN j.ScheduledEndDate < a.ChangeDate THEN 'After Move Change'

ELSE 'Before move change'

END AS Change

FROM dbo.tblEmailAudit a

INNER JOIN dbo.tblCustomer c ON c.CustomerID = a.CustomerID

AND c.DatabaseInfoID = a.DatabaseInfoID

INNER JOIN dbo.tblWorkOrder wo ON c.CustomerId = wo.CustomerId

AND wo.DatabaseInfoId = a.DatabaseInfoId

INNER JOIN dbo.tblJob j ON j.WorkOrderID = wo.WorkOrderID

AND j.DatabaseInfoID = a.DatabaseInfoID

INNER JOIN dbo.tblDatabaseInfo di ON di.DatabaseInfoID = a.DatabaseInfoID

LEFT JOIN dbo.tblFranchise f ON a.DatabaseInfoID = f.DatabaseInfoID

AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

WHERE (COALESCE(f.FranchiseNumber, di.FranchiseNumber) = 253)

AND j.ScheduledEndDate < a.ChangeDate

AND ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'')

ORDER BY a.ChangeDate DESC;

SET STATISTICS IO OFF;

GO

SET STATISTICS TIME OFF;

GO

The two biggest costs in the query plan are both index seeks (one clustered and the other non-clustered). The third largest cost is an index scan on dbo.tblJob.

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 5

rgtft (9/16/2012)
The two biggest costs in the query plan are both index seeks (one clustered and the other non-clustered). The third largest cost is an index scan on dbo.tblJob.

There's really only one way for the seeks to be taking longer than a scan... the seeks are being repeated. Look at the properties of the seeks and see that they are executed many times. A better index capable of doing an initial seek and a "range scan" following that would be a tremendous amount faster.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

rgtft Hall of Fame Points: 3518 More actions · Answer 6

Silly question, how do I read the execution plan to know what columns to index?

I created two indexes (just eyeballed query and guessed). The first one seems to have improved the query, but the second one (though being used) still doesn't seem to improve anything.

-- Two indexes created --

CREATE NONCLUSTERED INDEX [IX_tblCustomer_CustomerId_EmailAddress] ON [dbo].[tblCustomer]

(

[CustomerID] ASC,

[EmailAddress] ASC

);

CREATE NONCLUSTERED INDEX [IX_tboWorkOrder_CustomerID_DatabaseInfoId] ON [dbo].[tblWorkOrder]

(

[CustomerID] ASC,

[DatabaseInfoId] ASC

);

Thanks,

Rob

rgtft Hall of Fame Points: 3518 More actions · Answer 7

I see a couple of websites that explain some of this:

I also see that Grant has a book on execution plans:

http://www.sqlservercentral.com/articles/books/65831/

Anyways, I've now got query down to 15-20 seconds; it looks like I've got some reading to do.

Rob

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 8

ISNULL and COALESCE on columns in the where clause are going to hurt performance. I still think your stats are off. I'm seeing estimated 4.5 million, actual 15,000. That's a pretty big disparity and is usually indicative of out of date or badly sampled statistics.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 9

Grant Fritchey (9/17/2012)
ISNULL and COALESCE on columns in the where clause are going to hurt performance....

This should be close to a SARGable version of the query:

SELECT DISTINCT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

a.CustomerID,

a.EmailAddress AS EmailBefore,

c.EmailAddress AS EmailAfter,

a.ChangeDate,

j.ScheduledEndDate AS MoveDate,

CASE

WHEN j.ScheduledEndDate < a.ChangeDate THEN 'After Move Change'

ELSE 'Before move change'

END AS Change

FROM dbo.tblEmailAudit a

INNER JOIN dbo.tblCustomer c

ON c.CustomerID = a.CustomerID

AND c.DatabaseInfoID = a.DatabaseInfoID

and (

c.EmailAddress <> a.EmailAddress

OR (c.EmailAddress IS NULL AND a.EmailAddress IS NOT NULL)

OR (c.EmailAddress IS NOT NULL AND a.EmailAddress IS NULL)

)

INNER JOIN dbo.tblWorkOrder wo

ON c.CustomerId = wo.CustomerId

AND wo.DatabaseInfoId = a.DatabaseInfoId

INNER JOIN dbo.tblJob j

ON j.WorkOrderID = wo.WorkOrderID

AND j.DatabaseInfoID = a.DatabaseInfoID

LEFT JOIN dbo.tblDatabaseInfo di

ON di.DatabaseInfoID = a.DatabaseInfoID

AND di.FranchiseNumber = 253

LEFT JOIN dbo.tblFranchise f

ON (f.FranUniqueID = a.FranUniqueID OR f.DatabaseInfoID = a.DatabaseInfoID)

--a.DatabaseInfoID = f.DatabaseInfoID

AND f.FranchiseNumber = 253

--AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

WHERE 1 = 1

--(COALESCE(f.FranchiseNumber, di.FranchiseNumber) = 253)

AND j.ScheduledEndDate < a.ChangeDate

--AND ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'')

ORDER BY a.ChangeDate DESC;

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

rgtft Hall of Fame Points: 3518 More actions · Answer 10

Grant Fritchey (9/17/2012)
ISNULL and COALESCE on columns in the where clause are going to hurt performance. I still think your stats are off. I'm seeing estimated 4.5 million, actual 15,000. That's a pretty big disparity and is usually indicative of out of date or badly sampled statistics.

I manually ran (below) on all of the associated tables:

UPDATE STATISTICS dbo.tblCustomers WITH FULLSCAN;

As I understand it, without specifying an index or statistic, it should update all of them on the table. Or am I off here?

Thanks,

Rob

rgtft Hall of Fame Points: 3518 More actions · Answer 11

ChrisM@Work (9/17/2012)
Grant Fritchey (9/17/2012)
ISNULL and COALESCE on columns in the where clause are going to hurt performance....
This should be close to a SARGable version of the query:

Chris M:

Your code is SARGable and faster (aprox. 9 seconds); the only issue is it's bringing back a few thousand records rather than the 144 I expected. I'll look into your code as to why.

Thanks,

Rob

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

rgtft (9/17/2012)
ChrisM@Work (9/17/2012)
Grant Fritchey (9/17/2012)
ISNULL and COALESCE on columns in the where clause are going to hurt performance....
This should be close to a SARGable version of the query:
Chris M:
Your code is SARGable and faster (aprox. 9 seconds); the only issue is it's bringing back a few thousand records rather than the 144 I expected. I'll look into your code as to why.
Thanks,
Rob

Hi Rob

The join on dbo.tblFranchise is complex - I've set it up as a left join for safety. The code that this was cloned from - your last code iteration - showed it as a left join, however it was referenced in the WHERE clause, which would usually flag it up as an inner join. Try it with an inner join - but I suspect it will be a little more complex than that. Returning all the values from dbo.tblFranchise in the result set will help you decide how to filter.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden