March 24, 2014 at 9:03 am
Hi Guys,
I have a query I have written to join 3 tables, run a CASE statement for two columns and perform an aggregate SUM on the two columns. The results are grouped by the SUM. This table has over a billion records and is currently indexed by date. The query looks like this ..
SELECT DISTINCT site.site_number AS Site, site.city AS City, site.state AS State,
SUM(CASE WHEN historypoints.histConfID LIKE '%Generator$20Start$2fStop' THEN CONVERT(DECIMAL(10, 2), history.numeric_value / 100 * 15) ELSE 0 END) AS [Runtime(min)],
SUM(CASE WHEN historypoints.histConfID LIKE '%Generator$20Command' THEN CONVERT(DECIMAL(10, 2), history.numeric_value / 100 * 15) ELSE 0 END) AS [Command(min)]
FROM history INNER JOIN
historypoints ON history.config_id = historypoints.config_id INNER JOIN
xcm ON historypoints.xcm_id = xcm.xcm_id INNER JOIN
site ON xcm.site_id = site.site_id
WHERE (history.timestamp BETWEEN GETDATE() - 14 AND GETDATE())
GROUP BY site.site_number, site.city, site.state
This query takes a VERY long time to run. Any suggestions on improvement would be greatly appreciated!
Thanks!
March 24, 2014 at 9:12 am
Welcome to SSC.
Unfortunately the query alone is rarely enough to help with performance issues. However, this type of question is quite common and so there is an article on what to do in order to provide enough information to the rest of us.
Have a look at http://www.sqlservercentral.com/articles/SQLServerCentral/66909/ and then you should be able to help us to help you with your problem.
March 24, 2014 at 9:12 am
Your GROUP BY eliminates the need for DISTINCT
SELECT --DISTINCT
s.site_number AS Site,
s.city AS City,
s.state AS State,
SUM(CASE WHEN hp.histConfID LIKE '%Generator$20Start$2fStop'
THEN CONVERT(DECIMAL(10, 2), h.numeric_value / 100 * 15) ELSE 0 END) AS [Runtime(min)],
SUM(CASE WHEN hp.histConfID LIKE '%Generator$20Command'
THEN CONVERT(DECIMAL(10, 2), h.numeric_value / 100 * 15) ELSE 0 END) AS [Command(min)]
FROM history h
INNER JOIN historypoints hp ON h.config_id = hp.config_id
INNER JOIN xcm ON hp.xcm_id = xcm.xcm_id
INNER JOIN [site] s ON xcm.site_id = s.site_id
WHERE (h.timestamp BETWEEN GETDATE() - 14 AND GETDATE())
GROUP BY s.site_number, s.city, s.state
Can you post the estimated execution plan as a .sqlplan attachment please - then if you can, the actual execution plan.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 10:31 am
Thanks for the quick replies guys! Sorry for confusion, I havent posted a question here before and wasnt entirely sure what you would though am not surprised to hear you need more info. I have attached the Estimated Execution Plan, Table DDL for all 4 tables and the indexes for the table in question. I dont have the actual execution plan yet, as i have not had success running this query on the production server yet. It is currently running and if needed, i will attach when its finally completes.
Thanks again!
March 24, 2014 at 10:47 am
mrupperman (3/24/2014)
Thanks for the quick replies guys! Sorry for confusion, I havent posted a question here before and wasnt entirely sure what you would though am not surprised to hear you need more info. I have attached the Estimated Execution Plan, Table DDL for all 4 tables and the indexes for the table in question. I dont have the actual execution plan yet, as i have not had success running this query on the production server yet. It is currently running and if needed, i will attach when its finally completes.Thanks again!
Actually, the Actual Execution Plan would be of greater benefit as it provides information on the actual execution of query.
March 24, 2014 at 10:53 am
Also, it could just be me, but after downloading the estimated execution plan I was unable to open it.
March 24, 2014 at 10:54 am
Lynn Pettis (3/24/2014)
Also, it could just be me, but after downloading the estimated execution plan I was unable to open it.
Me neither, think it was truncated.
However: very high row counts suggest your statistics may be out of date. Also, there's an index hint - have you tried the query with and without the suggested index?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 11:02 am
I am still running the query, its at 45min+ :-D.
I wanted to try the suggested index, but there is already one on the table very similar and our other team suggested we not add any others due to already poor performance. In regards to the index. As noted in the attachment referencing indexes, it appears to have most of the columns included. I am not familiar with index creation enough to know if this is efficient or not, but it seems to me that one would not want that many columns in an index as the whole point is to limit the data searched while querying .. right? Could this index be working against me?
I'll try to re-run the estimated plan and see if it works this time.
March 24, 2014 at 11:07 am
Ok .. try this one. If this doesnt work, then I will just for the Actual Execution Plan and post later.
March 24, 2014 at 11:11 am
mrupperman (3/24/2014)
Ok .. try this one. If this doesnt work, then I will just for the Actual Execution Plan and post later.
No luck for me.
March 24, 2014 at 11:31 am
Nope. I keep getting errors in the XML when I try to open it. Did you edit the plan?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 24, 2014 at 11:46 am
Grant Fritchey (3/24/2014)
Nope. I keep getting errors in the XML when I try to open it. Did you edit the plan?
No, I just created the plan and saved it. Should I save it as XML and attach that or just wait for Actual Plan?
Any thoughts on the index question I mentioned a couple posts ago? Basically is there any benefit to creating an index with most of the table columns in it? Seems inefficient.
March 24, 2014 at 11:59 am
mrupperman (3/24/2014)
Grant Fritchey (3/24/2014)
Nope. I keep getting errors in the XML when I try to open it. Did you edit the plan?No, I just created the plan and saved it. Should I save it as XML and attach that or just wait for Actual Plan?
Any thoughts on the index question I mentioned a couple posts ago? Basically is there any benefit to creating an index with most of the table columns in it? Seems inefficient.
Hmmm. It should just work. Save a .SQLPLAN file and post it here. Can you try reopening the files before you post them?
In answer to the index question... it depends. It is inefficient in terms of storage because you're effectively storing everything twice. But, it can help when selecting data. Covering indexes are always better. But there is a high cost to that since you'll have to store everything, sort it, rearrange it on page splits, defragment it. Nothing is free. It really depends on what you get out of it in support of the query.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 24, 2014 at 12:37 pm
mrupperman (3/24/2014)
Grant Fritchey (3/24/2014)
Nope. I keep getting errors in the XML when I try to open it. Did you edit the plan?No, I just created the plan and saved it. Should I save it as XML and attach that or just wait for Actual Plan?
Any thoughts on the index question I mentioned a couple posts ago? Basically is there any benefit to creating an index with most of the table columns in it? Seems inefficient.
On your history table, you have the clustered index on your primary key, history_id. If most of your queries against this table are based on date ranges, it may make more sense to make the primary key a nonclustered index and put the clustered index on the timestamp column.
March 25, 2014 at 3:42 am
I can't open the plan either.
This kind of trouble shooting can be a little difficult without visibility of what's going on in the plan.
It should "just work" exactly as Grant says.
Get the plan, right click in the plan pane and save as .sqlplan.
This assumes you are using Sql Server Management Studio, by the way... unless you are using some other editor.
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply