October 5, 2011 at 8:03 am
Hi,
I have a problematic query and need your guidance.
Let's that I have 3 tables A, B and C and all 3 of them have large amount of data(lets say 4 million each).
In the query I have, A is joined to B with a on clause on a column, B to C on another column.
The where clause just have a column from table C to meet the business demand and to restrict the number of results.
Now the problem starts, the execution plan shows that in table A and B, clustered index scan is done. I can create non clustered indexes on these tables but will they work since their columns are not in where clause.
Will it help me to create indexes for these tables based on 'ON' clause on which they are joined.
Please let me know what tried and tested theory says. I will try creating index in off-business hours but need some experts opinion here.
Regards
Chandan
October 5, 2011 at 8:08 am
Can you post the query and actual execution plan?
Also if your goal is to go from an Index scan to a seek , then this is not always possbile or even an improvment
October 5, 2011 at 9:41 am
No real way to tell you anything without more information. Minimum an execution plan.
You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 5, 2011 at 12:50 pm
Grant Fritchey (10/5/2011)
No real way to tell you anything without more information. Minimum an execution plan.You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?
Sorry Grant and Steve. I was having dinner so could not reply for sometime.
I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.
Regards
Chandan
October 5, 2011 at 1:06 pm
chandan_jha18 (10/5/2011)
Grant Fritchey (10/5/2011)
No real way to tell you anything without more information. Minimum an execution plan.You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?
Sorry Grant and Steve. I was having dinner so could not reply for sometime.
I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.
Regards
Chandan
Looking through it a few things stand out. Stuff like this:
CAST(dbo.WH_INET_VendorCustomerRelation.VendorId AS VARCHAR(6))
When used in a JOIN or a WHERE clause automatically cause scans. You can't get away with it otherwise. Your table structure joins on columns that are not the same data type? I'd start working there. Fix that. You can't tune this otherwise.
If you look at the execution plan, you're getting bitmap filtering
PROBE([Opt_Bitmap1025],[CardData].[dbo].[WH_CarrierData].[Id],N'[IN ROW]')
It's a mechanism to enhance performance in parallel execution plans when dealing with large data sets. From what I can tell, your hitting so much data all at once that SQL Server is forced to use this approach.
I'd start with gettng rid of that CAST and then see where the plan takes you. All the scans right now are killing you. You do have a bit of disparity on the statistics too, but that might not be an issue.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 5, 2011 at 1:31 pm
Your outer query has the WHERE clause:
where PostedDate='1/19/2011'
and PostedDate is defined in the view as:
CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate
You will always get a scan on the WH_Status table unless you can filter on an indexed field.
October 5, 2011 at 1:40 pm
paul_ramster (10/5/2011)
Your outer query has the WHERE clause:
where PostedDate='1/19/2011'
and PostedDate is defined in the view as:
CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate
You will always get a scan on the WH_Status table unless you can filter on an indexed field.
Yeah, there are functions all over the place leading to scans. Data types are there for you to use. Not using them will cost you.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 5, 2011 at 1:55 pm
Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.
But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?
As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.
@Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.
Thanks
Chandan
October 5, 2011 at 3:54 pm
chandan_jha18 (10/5/2011)
Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?
As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.
@Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.
Thanks
Chandan
Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 5, 2011 at 6:04 pm
Grant Fritchey (10/5/2011)
chandan_jha18 (10/5/2011)
Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?
As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.
@Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.
Thanks
Chandan
Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.
What about a join condition like this:
ON DAY(t.Tdate) = DAY(p.PRD_ValidDate)
AND MONTH(t.Tdate) = MONTH(p.PRD_ValidDate)
AND YEAR(t.Tdate) = YEAR(p.PRD_ValidDate)
October 5, 2011 at 6:19 pm
vince_sql (10/5/2011)
Grant Fritchey (10/5/2011)
chandan_jha18 (10/5/2011)
Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?
As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.
@Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.
Thanks
Chandan
Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.
What about a join condition like this:
ON DAY(t.Tdate) = DAY(p.PRD_ValidDate)
AND MONTH(t.Tdate) = MONTH(p.PRD_ValidDate)
AND YEAR(t.Tdate) = YEAR(p.PRD_ValidDate)
Nope. Calculations on columns result in scans. Period. SQL Server can't know what the values of those calculations are going to be, so it can't use an index.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 6, 2011 at 11:21 am
I changed the data type of 2 columns to match the data type of other column so that varchar can be compared with varchar but still it didn't make much difference in performance.
Any other suggestions please.
Regards
Chandan
October 6, 2011 at 11:48 am
I've been watching. I've learned all I know from Grant and I haven't read more than 10% of his stuff (yet). So if he says there's nothing to do unless... then I'd listen 😉
October 6, 2011 at 11:59 am
Grant Fritchey (10/5/2011)
chandan_jha18 (10/5/2011)
Grant Fritchey (10/5/2011)
No real way to tell you anything without more information. Minimum an execution plan.You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?
Sorry Grant and Steve. I was having dinner so could not reply for sometime.
I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.
Regards
Chandan
Looking through it a few things stand out. Stuff like this:
CAST(dbo.WH_INET_VendorCustomerRelation.VendorId AS VARCHAR(6))
When used in a JOIN or a WHERE clause automatically cause scans. You can't get away with it otherwise. Your table structure joins on columns that are not the same data type? I'd start working there. Fix that. You can't tune this otherwise.
If you look at the execution plan, you're getting bitmap filtering
PROBE([Opt_Bitmap1025],[CardData].[dbo].[WH_CarrierData].[Id],N'[IN ROW]')
It's a mechanism to enhance performance in parallel execution plans when dealing with large data sets. From what I can tell, your hitting so much data all at once that SQL Server is forced to use this approach.
I'd start with gettng rid of that CAST and then see where the plan takes you. All the scans right now are killing you. You do have a bit of disparity on the statistics too, but that might not be an issue.
I changed the data type of vendor id and another column to avoid a cast and stored them as varchar but still index scan is forced.
Any other suggestion?
Thanks
Chandan
October 6, 2011 at 12:01 pm
Grant Fritchey (10/5/2011)
paul_ramster (10/5/2011)
Your outer query has the WHERE clause:
where PostedDate='1/19/2011'
and PostedDate is defined in the view as:
CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate
You will always get a scan on the WH_Status table unless you can filter on an indexed field.
Yeah, there are functions all over the place leading to scans. Data types are there for you to use. Not using them will cost you.
Can you please explain me what advice you gave me here.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply