SQl Server 2008 Query Performance

Question

SQl Server 2008 Query Performance

sivaraman8282

SSC Enthusiast

Points: 153
More actions
May 22, 2013 at 2:01 am

#278101

Hi Guys,
The below one is my query. It's taking 12 seconds for the execution process. The count(distinct()) and SUM() functions are taking long time for execution. I tried it after create the non-cluster index but i was getting the same time. How can I avoid this issue?.
select T.Name as name,T.Id as id,COUNT(distinct(DD.dynamictableid)) as counts ,ROUND(SUM(D.[employees]),0) as measure1 from dbo.TreeHierarchy T
left join dbo.DynamicDataTableId DD on T.Id= DD.HierarchyId AND T.DataViewId=DD.DataViewId
left join dbo.Demo1 D on D.[Demo1Id] = DD.DynamicTableId
where T.DataViewId=2 AND T.ParentId=0
group by T.Id, T.Name
Thanks in Advance!!!!.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

e4d4 SSCertifiable Points: 5777 More actions · Answer 2

It depends, you did not give tables definition and actual execution plan, but this query looks like a good candidate to try indexed views.

sivaraman8282 SSC Enthusiast Points: 153 More actions · Answer 3

Thanks your reply. I created the view and try to create the index but i am getting the below error.

"The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.MyView' and the index name 'idx_MyView'. The duplicate key value is (1)".

The below one is my created view,

CREATE VIEW MyView WITH SCHEMABINDING AS

SELECT dbo.DynamicDataTableId.HierarchyId, dbo.DynamicDataTableId.DynamicTableId, dbo.TreeHierarchy.HierarchyId AS Expr1, dbo.TreeHierarchy.Name, dbo.TreeHierarchy.chkCondition, dbo.TreeHierarchy.DataGroupId AS Expr2, dbo.TreeHierarchy.DataViewId, dbo.TreeHierarchy.DataViewType,

dbo.TreeHierarchy.OriginalNodeName, dbo.DG1.DG1Id, dbo.TreeHierarchy.Id, dbo.DG1.employees, dbo.TreeHierarchy.ParentId

FROM dbo.TreeHierarchy INNER JOIN

dbo.DynamicDataTableId ON dbo.DynamicDataTableId.HierarchyId = dbo.TreeHierarchy.Id INNER JOIN

dbo.DG1 ON dbo.DynamicDataTableId.DynamicTableId = dbo.DG1.DG1Id

GO

CREATE UNIQUE CLUSTERED INDEX idx_MyView ON MyView(HierarchyId)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

e4d4 (5/22/2013)
It depends, you did not give tables definition and actual execution plan, but this query looks like a good candidate to try indexed views.

Can you explain why? This is important - the OP has followed your advice and is now struggling with creating an indexed view when it may not be the appropriate action to take.

I would recommend that the OP follows Gail's advice and posts further information so that folks can make a more informed decision.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Bhaskar.Shetty Hall of Fame Points: 3054 More actions · Answer 5

Difficult to comment without table / plan info.

One suggestion Skipping char/varchar columns from groupby clause gives you better query performance, for the final output result where you need Name in your case rejoin the table with aggegrated results.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 6

One suggestion Skipping char/varchar columns from groupby clause gives you better query performance, for the final output result where you need Name in your case rejoin the table with aggegrated results.

It's really depends. May be yes and may be not.

Depends on what kind of indexes OP has. It's quite often to have the "Name" column indexed. In such case it will be hard to determine without trial what will perform better, having it in GROUP BY or rejoining table to itself to get it back,

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

sivaraman8282 SSC Enthusiast Points: 153 More actions · Answer 7

thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

sivaraman8282 (5/22/2013)
thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.

Be careful with the conclusions you draw from this because if there's a significant difference in execution time then it's far more likely to be related to indexing than to the datatype of T.Name. Have a look at the execution plans for the two queries with and without T.Name in the GROUP BY clause. If you can see any difference, then the performance hike is not related to the datatype. Better still, post the plans (actual, not estimated) for the batch as a .sqlplan attachment.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Problem is, by doing that you've changed what the query does and it may well return differernt data after your modification.

One of the worst things you can do while tuning is trying random stuff without a reason. Shot gun query tuning may have some effect, but will leave you with a query that may or may not return the same data and you may well not know why it's faster if it even is.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Bhaskar.Shetty Hall of Fame Points: 3054 More actions · Answer 10

ChrisM@Work (5/22/2013)
sivaraman8282 (5/22/2013)
thanks Newbie. I removed nvarchar field(T.Name) from group by area. Now, the performance is good compare to previous one.
Be careful with the conclusions you draw from this because if there's a significant difference in execution time then it's far more likely to be related to indexing than to the datatype of T.Name. Have a look at the execution plans for the two queries with and without T.Name in the GROUP BY clause. If you can see any difference, then the performance hike is not related to the datatype. Better still, post the plans (actual, not estimated) for the batch as a .sqlplan attachment.

Why do we need to add the char column in the group by, when you dont need it and can be easily get back by requery it, adding indexes on name may give the results faster, but adding indexes for each and every column mince adding overheads on index maintanance also.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.

You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.

The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Bhaskar.Shetty Hall of Fame Points: 3054 More actions · Answer 12

GilaMonster (5/22/2013)
Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.
You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.
The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.

What I was referring to is that, if the ID and Name are pointing to the same row, then why one should add name in groupby condition, as it will increase a unnecessary overhead on sql server to group by name also.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 13

Bhaskar.Shetty (5/22/2013)
GilaMonster (5/22/2013)
Bhaskar.Shetty (5/22/2013)
but adding indexes for each and every column mince adding overheads on index maintanance also.
You wouldn't want to do that and it would be entirely useless to do. Adding name to whatever index SQL's already using for the query is what Chris was referring to.
The other point still stands, removing a column from the group by may well change the performance, but it's a different query with different results and joining back to the table a second time to get the results back may make it slower than the original query was.
What I was referring to is that, if the ID and Name are pointing to the same row, then why one should add name in groupby condition, as it will increase a unnecessary overhead on sql server to group by name also.

As I've said before - it does depend!

If you need both columns ID and Name in the output in the query which also performs some sort of aggregation, you have few choices:

1. Add both columns into GROUP BY (overhead on grouping)

2. Perform grouping on ID only, then join back to table to grab the Name (overhead on JOIN)

3. Use aggregate function over NAME column (eg. MAX(NAME)) (overhead on aggragation)

4. CROSS APPLY to itself to grab the Name (overhead on JOIN)

5. Another more exotic ways... (overhead on something other)

All of the above will produce the same result. Which one of them will have the best performance will depend on many factors such as indexes, column sizes, data volumes and many other.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

Original query form (with some tables I happen to have lying around, around 50k rows in the larger table)

SELECT s.id ,

s.LookupColumn ,

COUNT(DISTINCT SomeArbDate)

FROM dbo.SecondaryTable_Medium AS s

INNER JOIN dbo.PrimaryTable_Medium AS p ON s.LookupColumn = p.SomeColumn

GROUP BY s.id ,

s.LookupColumn

Table 'SecondaryTable_Medium'. Scan count 1, logical reads 19, physical reads 0.

Table 'PrimaryTable_Medium'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0.

SQL Server Execution Times:

CPU time = 16 ms, elapsed time = 102 ms.

The recommended faster alternative (faster because avoid unnecessary group by on a char column)

SELECT s.id ,

( SELECT LookupColumn

FROM dbo.SecondaryTable_Medium s_inner

WHERE s.id = s_inner.id

) AS LookupColumn ,

COUNT(DISTINCT s.SomeArbDate)

FROM dbo.SecondaryTable_Medium AS s

INNER JOIN dbo.PrimaryTable_Medium AS p ON s.LookupColumn = p.SomeColumn

GROUP BY s.id

Table 'SecondaryTable_Medium'. Scan count 2, logical reads 38, physical reads 0.

Table 'PrimaryTable_Medium'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0.

SQL Server Execution Times:

CPU time = 32 ms, elapsed time = 202 ms.

Not really enough rows to draw any meaningful conclusions from.

To help the OP with his performance problem, we need to see the query's execution plan and the indexes on the table so that we can see what's wrong and make the appropriate recommendations.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass