April 5, 2005 at 2:40 am
I have a question, that has been bugging me since yesterday. I don't even know, if the question belongs in this section - if not feel free to move it where it belongs!!!
We are developing a webapplication with C# and SQL Server 2000 sp3. We use stored procedures to retreive the data. Now one of these stored procs suddenly started behaving weird. it time out repeatedly - we upped the connection timeout as some of the procs do take a while and it still timed out.
When I started the proc with realistic params from QA I saw it took over 8 minutes. it consists of a select joining 4 tables - I would post the statement, but as I am not allowed to post the table schema I don't know if it would help you understand the problem and anyway it doesn't seem to lay within the statement.
Anyway - I played around a bit trying to optimize the statement within the proc but I didn't get the time down any further. I thought for some reason that the Query Plan had become kind of corrupt, so I recompiled the proc - it brought the time down to 5 minutes - still too long.
When I started just the select statement in QA declaring and setting the params manually the proc executed in under 1 second. the exact same statement with the exact same params executed in the proc still took 5 minutes.
I then had an idea: I boxed the params passed into the proc in local variables within the proc - and presto it worked - it brought the proc down to under 1 second, just as it had done in QA. Now my question is why is this effective? my colleagues are all eager to starting boxing all params into local variables to see if it will make any more differences - but I warned the not to, until I know why it worked and if it would always be an effective solution, which I somehow don't think it will be.
Can anyone explain to me what happened? And why? What is the advantage of boxing the params? Is there one at all? I have looked the issue up, but don't seem to find any answers. Perhaps I am looking in the wrong direction?
Thanks for your time
nano
April 5, 2005 at 5:45 am
I remember having a conversation with someone here that stated local variables do perform better but can not remember why...
How are you defining the variables? VARCHAR v INT?
How are you using them in your WHERE statements? i.e. using NOT IN etc...
Good Hunting!
AJ Ahrens
webmaster@kritter.net
April 5, 2005 at 6:41 am
Please post the query that you can. It will give us a better idea of what you are doin and why there may be such a delay. And keep in mind the more details you can provide the better, including the execution plan (text version)
April 5, 2005 at 6:48 am
I am just declaring the following local:
CREATE PROCEDURE mag_selectMaxProteinResultsTest
@jobID int,
@processingModuleNr int,
@scoreName varchar(50),
@scoreValue float,
@peptidenumber int
AS
DECLARE @jobID2 int
DECLARE @processingModuleNr2 int
DECLARE @scoreName2 varchar(50)
DECLARE @scoreValue2 float
SET @jobID2 = @jobID
SET @processingModuleNr2 = @processingModuleNr
SET @scoreName2 = @scoreName
SET @scoreValue2 = @scoreValue
....
and using @jobID2 etc int the select statement as opposed to using the direct parameter @jobid ... I can't see where the difference is, of why this should be so much more effective?
I do see from the different Execution Plans, that the slower one uses nested loops whereas the faster one uses merge joins ... but my questions is why this is so? What is the diffence when using local variables as opposed to the incoming parameters?
thanks in advance for any answers that help clear this up
nano
April 5, 2005 at 6:56 am
Here's something that might help you :
http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=65&messageid=16197
check the 2nd post of ckempste
http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=6&messageid=13768
April 5, 2005 at 6:57 am
The actual query is the following:
SELECT Proteins.ProteinID,
Proteins.FastaID,
PROTAB2.SumMaxScore AS MainScore,
Proteins.Description
FROM (
SELECT PROTAB.ProteinID,
SUM(PROTAB.MaxScore) AS SumMaxScore,
PROTAB.NumOfPeptides
FROM (
SELECT PeptidesProteins.ProteinID,
MAX(JobModuleScores.ScoreValue) AS MaxScore,
COUNT(DISTINCT Peptides.Sequence) AS NumOfPeptides
FROM PeptidesProteins
INNER JOIN JobModuleScores
ON PeptidesProteins.JobID = JobModuleScores.JobID
INNER JOIN Peptides
ON PeptidesProteins.JobID = Peptides.JobID
AND PeptidesProteins.ProcessingModuleNr = Peptides.ProcessingModuleNr
AND PeptidesProteins.PeptideID = Peptides.PeptideID
AND JobModuleScores.JobID = Peptides.JobID
AND JobModuleScores.ProcessingModuleNr = Peptides.ProcessingModuleNr
AND JobModuleScores.PeptideID = Peptides.PeptideID
WHERE (PeptidesProteins.ProcessingModuleNr = @processingModuleNr2)
AND (JobModuleScores.ScoreName = @scoreName2)
AND (PeptidesProteins.JobID = @jobID2)
GROUP BY Peptides.Sequence, PeptidesProteins.ProteinID
HAVING (MAX(JobModuleScores.ScoreValue) > @scoreValue2)
) PROTAB
GROUP BY PROTAB.ProteinID,
PROTAB.NumOfPeptides
)PROTAB2
INNER JOIN Proteins
ON PROTAB2.ProteinID = Proteins.ProteinID
ORDER BY PROTAB2.SumMaxScore DESC,
PROTAB2.NumOfPeptides DESC
I don't know if this helps you understand the situation better ...
thanks ...
nano
April 5, 2005 at 7:20 am
Thanks for those links ... it sure has cleared up my question ...
nano
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply