February 22, 2016 at 2:30 pm
I'm doing a data conversion job, and I have a large stored procedure (almost 7000 lines) that does various analytical and conversion tasks. I have it all in one proc because I'm slowly pecking away at the data, fixing errors as my analysis routines uncover them, then re-running the proc to see what has been fixed and what still needs attention. The database is live, so there are new errors accumulating as I fix the old ones, but I'm gaining.
Just in the last few days, I added a set of tasks that are bogging the run down in an absolutely insane fashion. The tasks are not at all complex - a simple update with a join. When I execute the statements in immediate mode, they run fairly quickly. The table on which I'm operating is not especially efficient (no indexes), but neither is it very large. Only around 70,000 records, and a simple self-join. I'm not especially keen to waste a lot of time tuning indexes, since it will be used only for this conversion job and then completely discarded. Even when the number of records modified is relatively large (30,000 is the most I have seen so far), the process is quick - a second or so at the most. But when I run the entire procedure, it takes absolutely forever. The sum of the statement runtimes should be on the order of a minute or two, but it runs for over an hour. I don't know exactly how long, because I have it set to print an error message and terminate when it detects a problem with the data, so it never finishes.
I went through it earlier this afternoon and executed it step by step, highlighting the individual statements and pressing F5. No problems, no hesitations, response times on the order of a second, all the way through the proc. But now, set loose to execute the entire proc again, it's the same - hung somewhere inside. I have RAISERROR statements throughout, so I know approximately where it is, but there is nothing special there. There is a set of statements like this:
if (select count(1)
    from PaleoData.Tax.PodrobnostiTmp P1
    inner join PaleoData.Tax.PodrobnostiTmp P2
        on P1.Family_13000 = P2.Family_13000
    where P1.Family_13000 is not null and P1.SuperClass_6000 is not null
      and P2.Family_13000 is not null and P2.SuperClass_6000 is null) > 0
begin
    update P2
    set P2.SuperClass_6000 = P1.SuperClass_6000
    from (select Family_13000, SuperClass_6000
          from PaleoData.Tax.PodrobnostiTmp
          where Family_13000 is not null and SuperClass_6000 is not null
          group by Family_13000, SuperClass_6000) P1
    inner join PaleoData.Tax.PodrobnostiTmp P2
        on P1.Family_13000 = P2.Family_13000
    where P2.Family_13000 is not null and P2.SuperClass_6000 is null

    select @Tmp = @@ROWCOUNT
    set @errmsg = 'Upfill SuperClass_6000 from Family_13000 - ' + cast(@Tmp as varchar(10))
end
else
    set @errmsg = 'No upfill to SuperClass_6000 from Family_13000'

RAISERROR(@errmsg, 0, 1) WITH NOWAIT
that all execute with no hesitation when run in immediate mode.
It's a four-core processor, and one of the cores is running at a fairly high percentage, occasionally bumping against 100%. The other three are idle.
Is there anything different about running a statement in immediate mode that would cause it to bog down when run as part of a stored procedure? Is there a limit on the number of statements that can be run in a row? Is there something like the VBA DoEvents statement that would give the OS a moment to pause for breath and dump any display buffers?
I realize this is kind of vague, but it's kind of a vague problem. I don't want to spam the forum with 7000 lines of T-SQL, but I will post anything that anyone thinks might help diagnose this behavior. I do have SET NOCOUNT ON right at the beginning.
Whoops - update here. All four cores just flatlined to 100% and they're staying there. Several minutes now and they're all holding at max. No idea what the machine could be doing - this database is the only thing on it.
Okay, just dropped back to three cores idle and one high, occasionally touching 100%.
February 22, 2016 at 2:38 pm
There could be a problem with parameter sniffing or the proc could have been compiled with old stats.
Try putting WITH RECOMPILE at the top of the procedure.
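For example, a sketch of the placement (the procedure name here is just a placeholder):

ALTER PROCEDURE dbo.MyConversionProc  -- placeholder name, substitute your own
WITH RECOMPILE  -- forces a fresh plan on every execution
AS
BEGIN
    SET NOCOUNT ON;
    -- ... existing analysis and conversion steps ...
END;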
February 22, 2016 at 3:02 pm
Ken McKelvey (2/22/2016)
There could be a problem with parameter sniffing or the proc could have been compiled with old stats.
Try putting WITH RECOMPILE at the top of the procedure.
Nope - just tried it, still slower than molasses and flatlining all four cores again.
February 22, 2016 at 3:22 pm
Is this procedure using transactions? Large ones can seriously bloat your logs. Kind of grasping at straws here. Are you able to get an actual execution plan, or does it kill your server trying to produce one?
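If you want to rule the log out, this stock report shows log size and percent used for every database on the instance:

-- Transaction log size and space used, per database.
DBCC SQLPERF(LOGSPACE);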
February 22, 2016 at 3:53 pm
Your code has to actually execute the count in your first query, and since you are only testing whether it is greater than 0, there is no need to count those records at all. Why not use IF EXISTS instead? Here's the refactored portion that you supplied, with the change in place:
IF EXISTS (
SELECT 1
FROM PaleoData.Tax.PodrobnostiTmp AS P1
INNER JOIN PaleoData.Tax.PodrobnostiTmp AS P2
ON p1.Family_13000 = p2.Family_13000
WHERE P1.Family_13000 IS NOT NULL
AND p1.SuperClass_6000 IS NOT NULL
AND p2.Family_13000 IS NOT NULL
AND p2.SuperClass_6000 IS NULL
)
BEGIN
UPDATE P2
SET p2.SuperClass_6000 = p1.SuperClass_6000
FROM (
SELECT Family_13000, SuperClass_6000
FROM PaleoData.Tax.PodrobnostiTmp
WHERE Family_13000 IS NOT NULL
AND SuperClass_6000 IS NOT NULL
GROUP BY Family_13000, SuperClass_6000
) AS P1
INNER JOIN PaleoData.Tax.PodrobnostiTmp AS P2
ON p1.Family_13000 = p2.Family_13000
WHERE P2.Family_13000 IS NOT NULL
AND p2.SuperClass_6000 IS NULL;
SELECT @Tmp = @@ROWCOUNT;
SET @errmsg = 'Upfill SuperClass_6000 from Family_13000 - ' + cast(@tmp as varchar(10));
END
ELSE
BEGIN
    SET @errmsg = 'No upfill to SuperClass_6000 from Family_13000';
END;
RAISERROR(@errmsg,0,1) WITH NOWAIT;
Steve (aka sgmunson)
February 22, 2016 at 4:26 pm
IF (SELECT COUNT...)>0 actually gets converted to an EXISTS check by the optimizer. http://blogs.technet.com/b/wardpond/archive/2007/08/27/if-exists-select-vs-if-select-count-1-0.aspx
Like you, I'd still prefer the explicit EXISTS, for clarity if for no other reason.
I wrote up a demo for this almost exactly a year ago on another forum post: http://www.sqlservercentral.com/Forums/FindPost1664312.aspx.
I knew I'd written that up, but finding it was a pain; at some point it became clear it would be faster to rewrite the demo than find the old post, but I was stubborn 🙂
Some of it has ridiculous leftovers from copy/pasted code (like the TOP 1 inside the EXISTS), but it illustrates the point nevertheless. One day when my connection's not having issues with posts with lots of code in them I'll go prettify the code in that post 🙂
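In the meantime, here's the gist as a minimal sketch (throwaway temp table; run it with actual plans on and compare the two shapes):

-- Throwaway demo table; both IF forms below should compile to the same plan shape.
CREATE TABLE #Demo (Id INT NOT NULL, Val INT NULL);
INSERT INTO #Demo (Id, Val) VALUES (1, 10), (2, NULL), (3, 30);

IF (SELECT COUNT(1) FROM #Demo WHERE Val IS NULL) > 0
    PRINT 'count form: rows found';

IF EXISTS (SELECT 1 FROM #Demo WHERE Val IS NULL)
    PRINT 'exists form: rows found';

DROP TABLE #Demo;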
Cheers!
February 22, 2016 at 9:54 pm
Jacob, that doesn't surprise me, but I was also thinking there might be conditions that could mess with that query plan and produce a bad one before the optimizer got the chance to make the conversion... And I agree - why get in the habit of writing a count that isn't actually necessary?
February 22, 2016 at 11:48 pm
Sean - no, not using transactions. Just started it with execution plans enabled, but since there are several hundred independent SELECT and UPDATE statements in the procedure, I suspect the plans will overwhelm something in the SSMS window. It's running now; I'll see what happens. So far the plans are just what I would expect - mostly table scans and sorts. Not the greatest, but with this amount of data and nothing else running, entirely acceptable.
Steve & Jacob - you're right, of course, the EXISTS predicate is more sensible. I initially wrote it as a count, then spun it out to the numerous copies that do almost identical tasks through the process, then realized that an EXISTS predicate is a better way, but didn't feel like changing all the places I would have to. The performance is acceptable for my purposes and it's not going into production - it's only for my conversion job.
The issue that's giving me fits is not performance in general, it's performance while running the entire stored procedure. Every individual statement, including these counts, executes promptly and correctly when run individually. It's only when the entire procedure is turned loose that the server goes nuts. It's doing it again now - all four cores flatlined at 100%.
February 23, 2016 at 12:06 am
Looking at the query plans - some of the estimates are wildly off. Estimated rows 331,232, actual rows 597.
February 23, 2016 at 1:00 am
I've got it, sort of. I've been single-stepping through it in debug mode, and it appears to be the counts WITHIN THE IFs that are causing the problem. Here is one such offending statement. When I executed it via F10 in debug mode, it maxed out the processor for over 4 minutes.
if (select count(1)
    from PaleoData.Tax.PodrobnostiTmp P1
    inner join PaleoData.Tax.PodrobnostiTmp P2
        on P1.Family_13000 = P2.Family_13000
    where P1.Family_13000 is not null and P1.Suborder_11000 is not null
      and P2.Family_13000 is not null and P2.Suborder_11000 is null) > 0
When I copied the text to a new SSMS window and added a bit of dressing:
if (select count(1)
    from PaleoData.Tax.PodrobnostiTmp P1
    inner join PaleoData.Tax.PodrobnostiTmp P2
        on P1.Family_13000 = P2.Family_13000
    where P1.Family_13000 is not null and P1.Suborder_11000 is not null
      and P2.Family_13000 is not null and P2.Suborder_11000 is null) > 0
    print 'x'
else
    print 'y'

Same result.
But when I highlight only the core of the statement,
select count(1)
from PaleoData.Tax.PodrobnostiTmp P1
inner join PaleoData.Tax.PodrobnostiTmp P2
    on P1.Family_13000 = P2.Family_13000
where P1.Family_13000 is not null and P1.Suborder_11000 is not null
  and P2.Family_13000 is not null and P2.Suborder_11000 is null
that executes instantly - even when I copy that core out to yet another window and run it WHILE the one wrapped in the IF statement is still grinding along with four cores at 100%. The execution plans both show two table scans, no surprise, except the one with the IF statement uses nested loops to gather the data, while the bare one uses a hash match. The number of rows is off, with an estimate of close to one million.
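One workaround occurs to me - capture the count into a variable first, so the plan for the bare COUNT gets used (a sketch, not yet tested):

declare @cnt int;

-- Same count as above, but run as a standalone statement rather than inside the IF.
select @cnt = count(1)
from PaleoData.Tax.PodrobnostiTmp P1
inner join PaleoData.Tax.PodrobnostiTmp P2
    on P1.Family_13000 = P2.Family_13000
where P1.Family_13000 is not null and P1.Suborder_11000 is not null
  and P2.Family_13000 is not null and P2.Suborder_11000 is null;

if @cnt > 0
    print 'x'
else
    print 'y'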
Next step is to try changing it to EXISTS, see if that helps.
February 23, 2016 at 1:09 am
The estimated number of rows was almost a million; the ACTUAL was over 59 million. It seems clear what the problem is, but WHY is it doing that? Apparently it is creating a full cross join of 70K * 70K rows, then selecting the appropriate ones and counting them, instead of joining only the rows specified in my WHERE clause. The optimizer seems to have had some good drugs for breakfast this morning.
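A rough sanity check on that theory (a sketch - it assumes the blowup comes from duplicate Family_13000 values multiplying on both sides of the self-join):

-- Raw fan-out of the self-join on Family_13000, before any Suborder filters.
select sum(cast(cnt as bigint) * cnt) as joined_rows
from (select Family_13000, count(*) as cnt
      from PaleoData.Tax.PodrobnostiTmp
      where Family_13000 is not null
      group by Family_13000) d;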
February 23, 2016 at 1:18 am
Nope - EXISTS gives the same behavior - all four cores at 100%, grinding away.
February 23, 2016 at 1:56 am
Just tried an alternate version, using subqueries with their own WHERE clauses. No improvement.
if exists (select 1
           from (select Family_13000, Suborder_11000
                 from PaleoData.Tax.PodrobnostiTmp
                 where Family_13000 is not null and Suborder_11000 is not null) P1
           inner join (select Family_13000, Suborder_11000
                       from PaleoData.Tax.PodrobnostiTmp
                       where Family_13000 is not null and Suborder_11000 is null) P2
               on P1.Family_13000 = P2.Family_13000)
    print 'x'
else
    print 'y'
February 23, 2016 at 2:30 am
Do statistics need updating? Maybe with FULLSCAN rather than the default sample?
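For example, something like:

-- Full-scan statistics rebuild on the table from this thread.
UPDATE STATISTICS PaleoData.Tax.PodrobnostiTmp WITH FULLSCAN;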