May 22, 2013 at 2:01 am
Hi Guys,
The below one is my query. It's taking 12 seconds for the execution process. The count(distinct()) and SUM() functions are taking long time for execution. I tried it after create the non-cluster index but i was getting the same time. How can I avoid this issue?.
select T.Name as name,T.Id as id,COUNT(distinct(DD.dynamictableid)) as counts ,ROUND(SUM(D.[employees]),0) as measure1 from dbo.TreeHierarchy T
left join dbo.DynamicDataTableId DD on T.Id= DD.HierarchyId AND T.DataViewId=DD.DataViewId
left join dbo.Demo1 D on D.[Demo1Id] = DD.DynamicTableId
where T.DataViewId=2 AND T.ParentId=0
group by T.Id, T.Name
Thanks in Advance!!!!.
May 22, 2013 at 2:08 am
Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 22, 2013 at 2:13 am
It depends, you did not give tables definition and actual execution plan, but this query looks like a good candidate to try indexed views.
May 22, 2013 at 3:35 am
Thanks your reply. I created the view and try to create the index but i am getting the below error.
"The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.MyView' and the index name 'idx_MyView'. The duplicate key value is (1)".
The below one is my created view,
CREATE VIEW MyView WITH SCHEMABINDING AS
SELECT dbo.DynamicDataTableId.HierarchyId, dbo.DynamicDataTableId.DynamicTableId, dbo.TreeHierarchy.HierarchyId AS Expr1, dbo.TreeHierarchy.Name, dbo.TreeHierarchy.chkCondition, dbo.TreeHierarchy.DataGroupId AS Expr2, dbo.TreeHierarchy.DataViewId, dbo.TreeHierarchy.DataViewType,
dbo.TreeHierarchy.OriginalNodeName, dbo.DG1.DG1Id, dbo.TreeHierarchy.Id, dbo.DG1.employees, dbo.TreeHierarchy.ParentId
FROM dbo.TreeHierarchy INNER JOIN
dbo.DynamicDataTableId ON dbo.DynamicDataTableId.HierarchyId = dbo.TreeHierarchy.Id INNER JOIN
dbo.DG1 ON dbo.DynamicDataTableId.DynamicTableId = dbo.DG1.DG1Id
GO
CREATE UNIQUE CLUSTERED INDEX idx_MyView ON MyView(HierarchyId)
May 22, 2013 at 4:55 am
e4d4 (5/22/2013)
It depends, you did not give tables definition and actual execution plan, but this query looks like a good candidate to try indexed views.
Can you explain why? This is important - the OP has followed your advice and is now struggling with creating an indexed view when it may not be the appropriate action to take.
I would recommend that the OP follows Gail's advice and posts further information so that folks can make a more informed decision.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 22, 2013 at 5:05 am
Difficult to comment without table / plan info.
One suggestion Skipping char/varchar columns from groupby clause gives you better query performance, for the final output result where you need Name in your case rejoin the table with aggegrated results.
May 22, 2013 at 5:35 am
One suggestion Skipping char/varchar columns from groupby clause gives you better query performance, for the final output result where you need Name in your case rejoin the table with aggegrated results.
It's really depends. May be yes and may be not.
Depends on what kind of indexes OP has. It's quite often to have the "Name" column indexed. In such case it will be hard to determine without trial what will perform better, having it in GROUP BY or rejoining table to itself to get it back,
May 22, 2013 at 5:44 am
thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.
May 22, 2013 at 5:50 am
sivaraman8282 (5/22/2013)
thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.
Be careful with the conclusions you draw from this because if there's a significant difference in execution time then it's far more likely to be related to indexing than to the datatype of T.Name. Have a look at the execution plans for the two queries with and without T.Name in the GROUP BY clause. If you can see any difference, then the performance hike is not related to the datatype. Better still, post the plans (actual, not estimated) for the batch as a .sqlplan attachment.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 22, 2013 at 5:50 am
Problem is, by doing that you've changed what the query does and it may well return differernt data after your modification.
One of the worst things you can do while tuning is trying random stuff without a reason. Shot gun query tuning may have some effect, but will leave you with a query that may or may not return the same data and you may well not know why it's faster if it even is.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 22, 2013 at 6:37 am
ChrisM@Work (5/22/2013)
sivaraman8282 (5/22/2013)
thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.Be careful with the conclusions you draw from this because if there's a significant difference in execution time then it's far more likely to be related to indexing than to the datatype of T.Name. Have a look at the execution plans for the two queries with and without T.Name in the GROUP BY clause. If you can see any difference, then the performance hike is not related to the datatype. Better still, post the plans (actual, not estimated) for the batch as a .sqlplan attachment.
Why do we need to add the char column in the group by, when you dont need it and can be easily get back by requery it, adding indexes on name may give the results faster, but adding indexes for each and every column mince adding overheads on index maintanance also.
May 22, 2013 at 8:17 am
Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.
You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.
The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 22, 2013 at 8:23 am
GilaMonster (5/22/2013)
Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.
The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.
What I was referring to is that, if the ID and Name are pointing to the same row, then why one should add name in groupby condition, as it will increase a unnecessary overhead on sql server to group by name also.
May 22, 2013 at 8:41 am
Bhaskar.Shetty (5/22/2013)
GilaMonster (5/22/2013)
Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.
The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.
What I was referring to is that, if the ID and Name are pointing to the same row, then why one should add name in groupby condition, as it will increase a unnecessary overhead on sql server to group by name also.
As I've said before - it does depend!
If you need both columns ID and Name in the output in the query which also performs some sort of aggregation, you have few choices:
1. Add both columns into GROUP BY (overhead on grouping)
2. Perform grouping on ID only, then join back to table to grab the Name (overhead on JOIN)
3. Use aggregate function over NAME column (eg. MAX(NAME)) (overhead on aggragation)
4. CROSS APPLY to itself to grab the Name (overhead on JOIN)
5. Another more exotic ways... (overhead on something other)
All of the above will produce the same result. Which one of them will have the best performance will depend on many factors such as indexes, column sizes, data volumes and many other.
May 22, 2013 at 8:43 am
Original query form (with some tables I happen to have lying around, around 50k rows in the larger table)
SELECT s.id ,
s.LookupColumn ,
COUNT(DISTINCT SomeArbDate)
FROM dbo.SecondaryTable_Medium AS s
INNER JOIN dbo.PrimaryTable_Medium AS p ON s.LookupColumn = p.SomeColumn
GROUP BY s.id ,
s.LookupColumn
Table 'SecondaryTable_Medium'. Scan count 1, logical reads 19, physical reads 0.
Table 'PrimaryTable_Medium'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 102 ms.
The recommended faster alternative (faster because avoid unnecessary group by on a char column)
SELECT s.id ,
( SELECT LookupColumn
FROM dbo.SecondaryTable_Medium s_inner
WHERE s.id = s_inner.id
) AS LookupColumn ,
COUNT(DISTINCT s.SomeArbDate)
FROM dbo.SecondaryTable_Medium AS s
INNER JOIN dbo.PrimaryTable_Medium AS p ON s.LookupColumn = p.SomeColumn
GROUP BY s.id
Table 'SecondaryTable_Medium'. Scan count 2, logical reads 38, physical reads 0.
Table 'PrimaryTable_Medium'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0.
SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 202 ms.
Not really enough rows to draw any meaningful conclusions from.
To help the OP with his performance problem, we need to see the query's execution plan and the indexes on the table so that we can see what's wrong and make the appropriate recommendations.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply