June 4, 2011 at 10:56 pm
i have seen some strange behaviour with this sproc...this sproc has bunch of select/insert/ update statements. Occasionally this sproc hangs at one particular place (select statement) and does nothing. There is no blocking at all. When this proc is hung, if i just run the select statement itself results come back very quickly. I am so lost as to why does it occasionally hung at this select statement? Can someone please advice as to where else should i look?
June 5, 2011 at 2:28 am
If the spid isn't blocked, it's in the tempdb spooling. Look for contention on the raid for the tempdb.
I realize that's kind of a leap, but it's the only case I can think of when you can't see the actual hang on the spid. The spid's not hung, but tempdb is being a pain in the butt. 🙂
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
June 5, 2011 at 8:42 am
Craig Farrell (6/5/2011)
If the spid isn't blocked, it's in the tempdb spooling. Look for contention on the raid for the tempdb.I realize that's kind of a leap, but it's the only case I can think of when you can't see the actual hang on the spid. The spid's not hung, but tempdb is being a pain in the butt. 🙂
could because select actually creates a new temp table and inserts data into in it. But the confusing part is that it happens only once in while randomly..how do i see if there is a tempdb pressure going on?
June 6, 2011 at 1:18 am
A similar thing was happeing to one of my sql server.
The reason was that the sproc was reading from a huge heap table (no primary key - clustered idnex) a lot of defragmentation. The delay was happening not entirely randomly but on the first 2-3 days of a new month. (the querries were manipulating data based on months)
The sollution was to create a Primary key (This took several hours and for an 100GB table the database grew 100Gb duting the process and afterwards it released those 100GB - internally not to the os , wich means that we needed to Shrink Datab se also afterwards).
In order to troubleshoot , use perfmon to setup some basic counters focusing mostly on I/O read write. We saw that on the days the sproc hang the I/O time was 100% for more than 10 times the average...
June 6, 2011 at 8:09 am
Get a copy of sp_whoisactive (amazing freebie from Adam Machanic found on sqlblog.com) and check for tempdb usage, IOs, query plan, etc. Parameter sniffing could be at fault here since you mention it runs fast directly from SSMS. PFS/SGAM latching could be to blame if you have tempdb object creation.
Honestly there are 1001 things that could be at fault here.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
June 6, 2011 at 9:27 am
Thanks everyone. I don't think it is parameter sniffing issue because it doesn't take any parameters at all. Mentioned below is the query where it occasionally hangs.
sELECT Row_number() OVER ( PARTITION BY c.rid ORDER BY HEID ASC, CovStat DESC ) AS ident ,
t.rid ,
t.h_cdr AS cdr ,
c.id AS MCDI ,
t.id AS TempCoverageId ,
c.FPC AS C_FPC ,
pp_a.PYC AS C_PYC ,
t.h_cdr AS C_cdr
INTO #TempTable_2
FROM TMPCOV t
INNER JOIN COV c ON t.rid = c.rid
LEFT OUTER JOIN PPYS pp_a ON pp_a.FPC = c.FPC
LEFT OUTER JOIN PPYS pp_h ON pp_h.FPC = t.h_FPC
LEFT OUTER JOIN TMPCOV tempcovjoin ON c.id = tempcovjoin.MCDI
WHERE t.h_cdr = 2
AND tempcovjoin.MCDI IS NULL
AND ( ( t.h_SubC = c.SubC
AND pp_a.PYC IS NOT NULL
AND pp_a.PYC = pp_h.PYC
)
OR ( c.FPC = t.h_FPC )
OR ( pp_h.ptype = pp_a.ptype
AND pp_a.ptype = 'SLF'
)
)
AND NOT EXISTS ( SELECT *
FROM TMPCOV
WHERE MCDI = c.id )
Also when it hangs it is blocking other processes which use Tempdb. I am leaning more towards something related to tempdb. The indexes look fine from the source tables.
June 6, 2011 at 10:20 am
sqldba_icon (6/6/2011)
Thanks everyone. I don't think it is parameter sniffing issue because it doesn't take any parameters at all. Mentioned below is the query where it occasionally hangs.sELECT Row_number() OVER ( PARTITION BY c.rid ORDER BY HEID ASC, CovStat DESC ) AS ident ,
t.rid ,
t.h_cdr AS cdr ,
c.id AS MCDI ,
t.id AS TempCoverageId ,
c.FPC AS C_FPC ,
pp_a.PYC AS C_PYC ,
t.h_cdr AS C_cdr
INTO #TempTable_2
FROM TMPCOV t
INNER JOIN COV c ON t.rid = c.rid
LEFT OUTER JOIN PPYS pp_a ON pp_a.FPC = c.FPC
LEFT OUTER JOIN PPYS pp_h ON pp_h.FPC = t.h_FPC
LEFT OUTER JOIN TMPCOV tempcovjoin ON c.id = tempcovjoin.MCDI
WHERE t.h_cdr = 2
AND tempcovjoin.MCDI IS NULL
AND ( ( t.h_SubC = c.SubC
AND pp_a.PYC IS NOT NULL
AND pp_a.PYC = pp_h.PYC
)
OR ( c.FPC = t.h_FPC )
OR ( pp_h.ptype = pp_a.ptype
AND pp_a.ptype = 'SLF'
)
)
AND NOT EXISTS ( SELECT *
FROM TMPCOV
WHERE MCDI = c.id )
Also when it hangs it is blocking other processes which use Tempdb. I am leaning more towards something related to tempdb. The indexes look fine from the source tables.
1) sp_whoisactive can actually help you find the PFS/SGAM latching issue I mentioned if it is occuring.
2) Not only are you populating a temp table here, you are also doing a SORT for the ROW_NUMBER operation. Double hit. Could be HASHING on the joins for more tempdb hits.
3) ORs are horrible for the optimization process. NOT EXISTS too. Ugly stuff there.
4) Note that SSMS has some different SET values for ANSI crap that can cause significantly different plans from ADO.NET or other connections. You can view these values down in the guts of one of the query execution DMVs. I would actually get the plans for both executions and examine for differences. It is possible you could use a plan guide since there are no parameters to force the optimal plan for every execution.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
June 7, 2011 at 4:07 am
Hello sqldba_icon,
seeing your query there are two things focused my attention and looked strange to me:
The NOT EXISTS clause. As pointed by Kevin it could be bad for the optimization process. You can try to substitute it for another LEFT OUTER JOIN; maybe it is done by SQL Server when creating a plan, or maybe don't.
The ROW_NUMBER operator, it implies a SORT over the data result. You create a temp table so you will use it sooner or later, and at first glance I can imagine two scenarios:
- When you use this temp table you need it sorted by c.rid, Row_Number. In this case you will sort the temp table so you don't need to sort your original query, you can calculate this row_number after when needed.
- When you use this temp table you need it filtered by row_number (i.e. Row_number = 1). In this case you can filter data at first, in your original query.
Maybe you use other scenarios but anyway you should detect these scenarios, perceive the type of process you are doing with the temp table, and think about if some process can be optimized in the original query.
Regards,
Francesc
June 7, 2011 at 9:17 am
Thanks everyone.
June 7, 2011 at 11:33 am
sqldba_icon (6/6/2011)
Thanks everyone. I don't think it is parameter sniffing issue because it doesn't take any parameters at all. Mentioned below is the query where it occasionally hangs.sELECT Row_number() OVER ( PARTITION BY c.rid ORDER BY HEID ASC, CovStat DESC ) AS ident ,
t.rid ,
t.h_cdr AS cdr ,
c.id AS MCDI ,
t.id AS TempCoverageId ,
c.FPC AS C_FPC ,
pp_a.PYC AS C_PYC ,
t.h_cdr AS C_cdr
INTO #TempTable_2
FROM TMPCOV t
INNER JOIN COV c ON t.rid = c.rid
LEFT OUTER JOIN PPYS pp_a ON pp_a.FPC = c.FPC
LEFT OUTER JOIN PPYS pp_h ON pp_h.FPC = t.h_FPC
LEFT OUTER JOIN TMPCOV tempcovjoin ON c.id = tempcovjoin.MCDI
WHERE t.h_cdr = 2
AND tempcovjoin.MCDI IS NULL
AND ( ( t.h_SubC = c.SubC
AND pp_a.PYC IS NOT NULL
AND pp_a.PYC = pp_h.PYC
)
OR ( c.FPC = t.h_FPC )
OR ( pp_h.ptype = pp_a.ptype
AND pp_a.ptype = 'SLF'
)
)
AND NOT EXISTS ( SELECT *
FROM TMPCOV
WHERE MCDI = c.id )
Also when it hangs it is blocking other processes which use Tempdb. I am leaning more towards something related to tempdb. The indexes look fine from the source tables.
I had something like this come up just yesterday that I fixed. It was very strange. The SProc would hang sometimes. I would copy/paste the code from the stored procedure into SSMS and run it in steps and it would run just fine - each step ran in a second or two. This was SQL 2000 so I don't know if this is the same situation as yours.
The logic had a similar LEFT JOIN and a check for SomeColumn IS NULL in the WHERE clause - i.e. finding unmatched records from one table to the other. I changed the LEFT JOIN to a NOT EXISTS in the WHERE clause and it would still hang. Very mysterious. I was under a lot of pressure to just get the thing running so I couldn't fully investigate the phenomenon.
I changed the initial insert into the temp table to just dump everything on the matching criteria. Then I did an INNER JOIN on a DELETE to get rid of the ones that were matched when I wanted only the unmatched records. The procedure ran just fine then - no hang.
I've found that sometimes doing things in steps - divide and conquer - can work better than trying to do it all at once.
I'm still going back to the original LEFT JOIN/WHERE SomeColumn IS NULL and try to figure out why the heck it would hang like that sometimes.
Give it a try and see if it works for you.
Todd Fifield
June 8, 2011 at 8:24 am
There are few more details i am working on. The same sproc works perfectly fine on other server, i am still leaning towards tempdb. I have a case opened with Microsoft, will let you know how it goes. Thanks
June 20, 2011 at 10:46 pm
I have some more details now. The sproc is stuck again at the same batch and mentioned below are few details:
i) this sql (http://www.sqlservercentral.com/Forums/Topic1119857-146-1.aspx) is stuck inside the sproc, however if i manually run the same sql from management studio while the batch of sql is stuck , runs perfectly fine. I also created a temp sproc with just the sql which is getting hung and ran through job and runs fine.
ii) sp_lock o/p for the spid is attached, please take a look
Can someone please help in identifying as to where else should i look. This is so weird query is occasionally hung at a batch but runs perfectly normal through management studio whenever hung? MSFT is also working on this, but i would rely more on this forum than MSFT.
June 21, 2011 at 12:03 am
You can set the profiler trace. and see where its creating the issue?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
June 21, 2011 at 12:14 am
sqldba_icon (6/20/2011)
Can someone please help in identifying as to where else should i look. This is so weird query is occasionally hung at a batch but runs perfectly normal through management studio whenever hung? MSFT is also working on this, but i would rely more on this forum than MSFT.
Locks look fine.
As Bhuvnesh mentioned above, I'd start a serverside trace filtered to this proc and pull all the compile events along with start/completes.
With that, and the xml as the plan sits when it hangs, we can probably see what it's doing, especially comparing it to a 'good' plan.
I realize it's not parameter sniffing, but something's changing. Either its internal, which would cause the compiles, or it's external, in which case you won't see a new plan when this occurs.
Also, sorry I lost track of this thread. You'll need to run perfmon counters or the equivalent on the tempdb LUN for read/write and other disk I/O items to determine if that's your hang. If you see any indicators there, you'll probably need to drag the SAN team into the process, as you're not going to be able to directly see past the Fiber.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
June 21, 2011 at 8:01 am
I am sure MSFT has captured exec plans.But if you look at my sql statement it is pretty straight select and insert...one thing what i am not sure is i am doing select into type...would it make any difference if i create a temp table first and then insert into it?Again whenever the issue exists the same sql at the same time works perfectly fine.I am surprised why does sp_lock info is showing a bulk import...
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply