January 7, 2011 at 4:46 pm
Hello,
We've created the following view (1.) in a database that is no more than 350 MB :w00t:, but the select statement (2.) takes around 5 hours to return results :crazy:
I'm completely lost and I really don't understand what part of the code I need to change :blush:
Sorry!!! Can you help me, or give me any idea how to change the code of this "crazy view"?...
Happy new year
Thanks and regards,
JMSM 😉
1.
------------- Create View Code ------------------
use [DBName]
GO
create view [dbo].[VwNAME] as
select [ID]
,[Login]
,FLPExp
,FLKHw
,PUACnt
,FLWRArt
,FLWRBPst
,FLWDnlds
,case when FLPExp = 1 and FLKHw = 1 and PUACnt > 1 then 100
when FLPExp = 1 and FLKHw = 1 and FLWRArt = 1 then 75
when FLPExp = 1 and FLKHw = 1 and FLWDnlds = 1 then 50
when FLPExp = 1 and FLKHw = 1 then 25
else 0
end 'PercentPrf'
from (
select
a.id as 'ID',
a.ACCName as 'Login',
case when count(distinct b.title)>0 then 1 else 0 end FLPExp,
case when count(distinct f.idknowhow)>0 then 1 else 0 end FLKHw,
count(distinct d.id)+count(distinct e.IDArtcl) PUACnt,
case when (count(distinct g.id) + count(distinct h.id))>0 then 1 else 0 end FLWRArt,
case when count(distinct i.id)>0 then 1 else 0 end FLWRBPst,
case when count(distinct j.CntDnlds)>0 then 1 else 0 end FLWDnlds
from Athr a
left join PstExp b on a.id = b.idAthr
left join Artcl d
on a.id = d.idAthrSubMtr
and d.status = 2 -- Approved
left join (select e1.id, e2.idAthr, e2.IDArtcl
from Artcl e1,
ArtclcoAthr e2
where e1.id = e2.IDArtcl
and e1.status = 2) e -- CO Artcls
on a.id = e.idAthr
left join Athrknowhow f on a.id = f.idAthr
left join Artcl g -- Artcls less 30 days
on a.id = g.idAthrSubMtr
and datediff(d, g.PblDte, getdate()) < 30
and g.Status = 2
left join (select h1.id, h2.idAthr
from Artcl h1,
ArtclcoAthr h2
where h1.id = h2.IDArtcl
and datediff(d, h1.PblDte, getdate()) < 30
and h1.status = 2) h -- CO Artcls less 30 days
on a.id = h.idAthr
left join UsgWkiBlg i -- Posts less than 30 days
on a.id = i.idAthr
and i.ShPntTypSWeb = 1
and datediff(d, i.LstMdfation, getdate()) < 30
left join UsgeAthrArtcl j -- downloads
on a.id = j.idAthr
where a.locked = 0
group by a.id, a.ACCName ) a
go
---------------- End View Code ------------------
2.
---------------- Select Statement ---------------
SELECT [ID]
,[Login]
,FLPExp
,FLKHw
,PUACnt
,FLWRArt
,FLWRBPst
,FLWDnlds
,[PercentPrf]
FROM [DBName].[dbo].[VwNAME]
GO
------------ End Select Statement ---------------
January 9, 2011 at 6:35 am
It's hard to make sense of your create view statement as it doesn't have a proper layout, which makes it hard to read.
At first glance I see you have a lot of joins, including outer joins, with functions in the where clause. You should try to avoid functions in the where clause, as indexes will probably not be used. The many COUNT DISTINCT aggregates can also slow things down.
If possible and necessary, replace the "on-the-fly" calculated columns in the where clauses with real persisted computed columns.
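A sketch of the idea (PblYear is a made-up column for illustration; note that expressions based on getdate(), like your 30-day filters, are nondeterministic and can't be persisted):
-- hypothetical example: if a where clause computed, say, year(PblDte) on the fly
alter table dbo.Artcl
add PblYear as year(PblDte) persisted
go
-- a regular index can then be built on the persisted column
create index IX_Artcl_PblYear on dbo.Artcl (PblYear)
go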
You should probably also check whether there are any indexes and, if there are, whether they are useful.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
January 9, 2011 at 7:06 am
Thanks a lot Koen.
I'll do the right checks and I'll tell you the solution.
Once more thanks.
Regards,
JMSM 😉
January 9, 2011 at 7:26 am
There are several approaches you can use:
Most important:
instead of applying a function to your columns, rewrite it as a SARGable argument:
datediff(d, i.LstMdfation, getdate()) < 30
-- would become
i.LstMdfation > dateadd(dd, -30, getdate())
Another option that usually helps is preaggregation.
Find the table that would be reduced the most by your where conditions, populate an indexed temp table from it, and use that temp table in your query.
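For example, a rough sketch with your UsgWkiBlg table, combining the SARGable rewrite above with preaggregation (note a view can't reference a temp table, so this implies moving the logic into a stored procedure):
select idAthr
into #RecentPosts
from UsgWkiBlg
where ShPntTypSWeb = 1
and LstMdfation > dateadd(dd, -30, getdate())

create clustered index IX_RecentPosts on #RecentPosts (idAthr)
-- then left join #RecentPosts i on a.id = i.idAthr instead of UsgWkiBlg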
Furthermore, you're calling Artcl at least four times. Maybe the query itself can be rewritten to be much more efficient.
And finally, there might be a missing index issue, too. Check the execution plan for best index usage.
January 9, 2011 at 9:00 am
I agree with what everyone else has said. I think they've caught the obvious stuff. Now, to keep going, can you post the execution plan for the query, preferably after you fix the functions that are preventing index use? That can help us figure out whether you have good indexes in place and whether they're getting used properly.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 9, 2011 at 9:10 am
Thanks a lot LutzM.
I'll change the function as you suggested.
Regards,
JMSM 😉
January 9, 2011 at 6:07 pm
JMSM (1/9/2011)
Thanks a lot LutzM. I'll change the function as you suggested.
Regards,
JMSM 😉
The function issue is not the only problem (might not even be the most relevant one).
So, please follow Grant's advice and post the actual execution plan.
And don't worry too much about Celko's post... If you took the time to read and understand what his real intention is, you'd probably have rewritten your code by then anyway... 😉
Besides really knowing his SQL (at least so rumour has it), he's got that "special" online personality. The bottom line: he doesn't like the way you name your columns, tables, and views. That doesn't change performance at all, but it would help readability and maintainability, I admit. But that's not the major concern here, I guess... 😉
He's right about the DISTINCT issue, though...
January 10, 2011 at 7:40 am
Hello again,
I've attached the execution plan and the code of both views, the oldest and the new one.
I've already changed the column names and the table names in both views.
Thanks and regards,
JMSM 😉
January 10, 2011 at 8:04 am
JMSM (1/10/2011)
Hello again, I've attached the execution plan and the code of both views, the oldest and the new one.
I've already changed the column names and the table names in both views.
Thanks and regards,
JMSM 😉
There's just a ton of things going on here. First off, all those DISTINCT operations, they're killing your performance. You need to reevaluate if you really need them, and if so, I'd suggest building a materialized view with this data already aggregated (DISTINCT is an aggregation operation).
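In SQL Server terms that's an indexed view. Note an indexed view can't contain DISTINCT or COUNT(DISTINCT), so the aggregations would have to be reworked; a minimal sketch of the pattern, with a made-up view name:
create view dbo.VwAthrArtclCnt
with schemabinding
as
select idAthrSubMtr, count_big(*) as ArtclCnt
from dbo.Artcl
where status = 2
group by idAthrSubMtr
go
create unique clustered index IX_VwAthrArtclCnt on dbo.VwAthrArtclCnt (idAthrSubMtr)
go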
It also looks like you're missing indexes all over the place. Just picking one out:
left join pastexperience b on a.id = b.idauthor
This results in a merge join on just a few records. From what I can see, I'd try an index on the PastExperience table on the column IdAuthor and probably would add the Title column to the INCLUDE list.
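Something along these lines, as a sketch (using the renamed table and column names from your attachment):
create nonclustered index IX_PastExperience_IdAuthor
on dbo.PastExperience (IdAuthor)
include (Title)
go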
There are probably lots of other opportunities for indexes. On a guess, you don't have any indexes on your foreign keys, do you?
That's my 2 cents worth. I'm sure others will pick up on more.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 10, 2011 at 9:45 am
Grant Fritchey (1/10/2011)
This results in a merge join on just a few records. From what I can see, I'd try an index on the PastExperience table on the column IdAuthor and probably would add the Title column to the INCLUDE list.
Grant,
I'm suddenly worried that once again a core item of my understanding is invalid. Merge joins are supposed to be the fastest: it's the "card-shuffle" join, one pass on both sides of the data and bam, done. Why would you go out of your way to avoid that?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
January 10, 2011 at 10:09 am
Hello again,
I'm really, really sorry, but I'm completely lost and hope that someone can explain to me what I should do, as if I were a donkey :crying:
Sorry, but I really don't know what is wrong or what is right... :sick:
Thanks and regards,
JMSM 😉
January 10, 2011 at 10:40 am
*nearly sprays his tea all over the monitor upon opening the execution plan*
Yikes. Wow. :w00t:
Now, THAT'S a Table Spool. Just the estimate, I know, but a 101 million row spool? No wonder those DISTINCTs are crying.
The materialized view that Grant's talking about above is an indexed view. Creating one will cause tremendous pain whenever you edit the underlying data, though. What I would be more inclined to do is 'snapshot' this data into a result table overnight, and then use it as the baseline for anything during the day. It'll be up to 24 hours out of date... but that hopefully isn't much of a concern here, since this query runs from the beginning of time; I didn't see any filters on it, anyway...
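A rough sketch of that snapshot approach (the table name is made up; the column list comes from your view), to be run from a nightly Agent job:
if object_id('dbo.AuthorProfileSnapshot') is not null
drop table dbo.AuthorProfileSnapshot

select [ID], [Login], FLPExp, FLKHw, PUACnt, FLWRArt, FLWRBPst, FLWDnlds, PercentPrf
into dbo.AuthorProfileSnapshot
from dbo.VwNAME

create unique clustered index IX_AuthorProfileSnapshot on dbo.AuthorProfileSnapshot ([ID])
-- daytime queries then read dbo.AuthorProfileSnapshot instead of the view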
JMSM, this is not a small, easy-to-fix item. To quote Grant: "There's just a ton of things going on here." To go further, we'd need to see the DDL and indexing for all the involved tables. We'd need the expected use cases, since it looks like you're building 'one view to rule them all' and intend to apply WHERE clauses afterwards. We'd need sample data, and we'd need access to some of the selectivity histograms to really fix this properly.
There is no 'quick fix' for this. You've got a ton of data and a ton of different concurrent calculations.
Really, this is the kind of thing you usually hire specialized consultants for. Is this view eventually used for some kind of data warehouse or business intelligence? It's practically screaming that it should be going to SSAS for final reporting.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
January 10, 2011 at 10:49 am
It's just not as easy as you might think to work on such a large query without any sample data to play with... 😉
All we have is the view definition and the execution plan. So there will be some questions to help us understand as well as a discussion between the people trying to help you.
One question would be:
How many rows are in Artcl in total, and how many with Status = 2?
I noticed you select only data with that specific status, so it might help to build a temp table holding only the rows from Artcl with status = 2.
It might be worth including a join to ArtclcoAthr when building the temp table, and adding a column that represents the result of datediff(d, h1.PblDte, getdate()) < 30. Maybe some aggregation can be done at this level, too. The temp table would replace the two subqueries.
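As a sketch, using the table and column names from the posted view:
select a1.id,
a1.idAthrSubMtr,
a2.idAthr as coAthrId,
case when datediff(d, a1.PblDte, getdate()) < 30 then 1 else 0 end as IsRecent
into #ApprovedArtcl
from Artcl a1
left join ArtclcoAthr a2 on a2.IDArtcl = a1.id
where a1.status = 2

create index IX_ApprovedArtcl_coAthr on #ApprovedArtcl (coAthrId)
-- the datediff is now computed once per row instead of per join evaluation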
It would be much more efficient, though, if we could get the table definitions for all related tables, including all indexes, as well as some representative (but fake) sample data...
January 10, 2011 at 11:13 am
A couple more points to ponder on :
StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="TimeOut"
That means the server gave up on finding a "decent" plan to execute the query. So breaking it into a couple of steps will definitely help here.
CompileTime="16342"
That means that even after 16 seconds of compilation, the server couldn't come up with a good plan.
Is the server powerful enough to run your normal day-to-day loads???
We can help all day long, but in the end I think you're better off hiring a consultant for this problem, to really solve it correctly and to get a ton of pointers and experience in the process. I think you're in way over your head, way over what you can even imagine.
January 10, 2011 at 11:21 am
... and 1 more point.
All those spools are generating 59 GB of data... 172 times the total size of the DB.