Query: SLOW then FAST then SLOW (SQL 2000 SP3, Large Tables)

  • Any specific answers or "things to try" will be appreciated!  Let me know if you want any more info.

    Note the database tables are fairly large and compatiability set to 6.5 (we need to move!)

    I have a query which is called from a VB6 app using ADO 2.5 to create a client side READONLY disconnected recordset.  A large number (20ish) of recordsets are requested at the same time, both before and after.  There will be between 0 and 20 rows returned, but in the specific example there are no rows returned (as expected).

    The query below is called twice (by code with a 0.5 second gap) per "user call".  This is what happened in test and the evidence suggests happens in live (350 users, but only 2 would be using this functionality at any one time).

    (LIVE Server: IBM xSeries 365 Server with Quad Intel Xeon MP 3.0 GHz Processors and 8GB memory, Win 2003Ent, SQL2000 SP3, 45% Av. CPU, 75% Peak)

    User 1: SLOW (approx. 60secs), then FAST (6secs)

    User 1: FAST (6secs), then FAST (6secs)

    User 1: SLOW (approx. 60secs), then FAST (6secs)

    Note Users click in parallel and functionality completed within the same minute.

    Test Using SQLAnalsyer

    When I ran just the queries one after the other within 5 mins (plucked via profiler) in SQLAnalyser,

    Query 1 took 61;Query 2 took 6

    CHECKPOINT, DBCC DROPCLEANBUFFERS, Query 1 took 7

    CHECKPOINT, DBCC DROPCLEANBUFFERS, Query 2 took 6

    Query 1 took 7;Query 2 took 6

    Wait half an hour,

    CHECKPOINT, DBCC DROPCLEANBUFFERS, Query 1 took 56sec

    CHECKPOINT, DBCC DROPCLEANBUFFERS, Query 1 took 7sec

    CHECKPOINT, DBCC DROPCLEANBUFFERS, Query 2 took 7sec

    --Query 1: 61secs In Test

    CHECKPOINT

    DBCC DROPCLEANBUFFERS

    SELECT PB.PBTxnTriggerID From

    (SELECT E.PBTxnTriggerID from PlanBasedTxnTrigger PB, Plan_ P, Group_ G, Scheme S, ExitDet E Where

    PB.PBTxnInstructionId = E.ExitNo AND

    P.PlanNo = E.PlanNo AND

    PB.PlanNo = P.PlanNo AND

    P.GrpNo = G.GrpNo  AND

    G.SchemeNo = S.SchemeNo AND

    PB.PBTxnTriggerActionedDate >= S.SchemeAnnivYearStartDate AND

    PB.PBTxnTriggerActionedDate <= S.SchemeAnnivYearEndDate AND

    PB.PBTxnTriggerStatusCode = 1 AND

    PB.PBTxnTypeId = 54 

    AND

    P.PlanNo = '1001000068  '

    )a

    inner join PlanBasedTxnTrigger PB on a.PBTxnTriggerID = PB.PBTxnTriggerID and   --PB.planno = '1001000068  ' and

    PB.PBTxnTypeId = 36 AND

    PB.PBTxnTriggerStatusCode = 1

    inner join plan_ P on p.planno = pb.planno and p.planno = '1001000068  '

    inner join Group_ G on P.GrpNo = G.GrpNo

    inner join Scheme S on S.SchemeNo = G.SchemeNo

    and PB.PBTxnTriggerActionedDate >= S.SchemeAnnivYearStartDate AND

    PB.PBTxnTriggerActionedDate <= S.SchemeAnnivYearEndDate

    --Query 2: 6secs In Test

    CHECKPOINT

    DBCC DROPCLEANBUFFERS

    SELECT PB.PBTxnTriggerID From

    (SELECT E.PBTxnTriggerID from PlanBasedTxnTrigger PB, Plan_ P, Group_ G, Scheme S, ExitDet E Where

    PB.PBTxnInstructionId = E.ExitNo AND

    P.PlanNo = E.PlanNo AND

    PB.PlanNo = P.PlanNo AND

    P.GrpNo = G.GrpNo  AND

    G.SchemeNo = S.SchemeNo AND

    PB.PBTxnTriggerActionedDate >= S.SchemeAnnivYearStartDate AND

    PB.PBTxnTriggerActionedDate <= S.SchemeAnnivYearEndDate AND

    PB.PBTxnTriggerStatusCode = 1 AND

    PB.PBTxnTypeId = 54 

    AND

    P.PlanNo = '1001000046  '

    )a

    inner join PlanBasedTxnTrigger PB on a.PBTxnTriggerID = PB.PBTxnTriggerID and   --PB.planno = '1001000046  ' and

    PB.PBTxnTypeId = 36 AND

    PB.PBTxnTriggerStatusCode = 1

    inner join plan_ P on p.planno = pb.planno and p.planno = '1001000046  '

    inner join Group_ G on P.GrpNo = G.GrpNo

    inner join Scheme S on S.SchemeNo = G.SchemeNo

    and PB.PBTxnTriggerActionedDate >= S.SchemeAnnivYearStartDate AND

    PB.PBTxnTriggerActionedDate <= S.SchemeAnnivYearEndDate

    Execution Plan:

    StmtText

      |--Nested Loops(Inner Join, WHERE[P].[GRPNO]=[G].[GRPNO]))

           |--Clustered Index Seek(OBJECT[Lamda].[dbo].[PLAN_].[PK_PLAN_] AS [P]), SEEK[P].[PLANNO]='1001000068  ') ORDERED FORWARD)

           |--Nested Loops(Inner Join, OUTER REFERENCES.[SchemeNo]) WITH PREFETCH)

                |--Nested Loops(Inner Join, WHERE[PB].[PBTxnTriggerActionedDate]>=.[SchemeAnnivYearStartDate] AND [PB].[PBTxnTriggerActionedDate]<=.[SchemeAnnivYearEndDate]))

                |    |--Filter(WHERE([PB].[PlanNo]='1001000068  ' AND [PB].[PBTxnTriggerStatusCode]=1) AND [PB].[PBTxnTypeId]=54))

                |    |    |--Bookmark Lookup(BOOKMARK[Bmk1000]), OBJECT[Lamda].[dbo].[PlanBasedTxnTrigger] AS [PB]))

                |    |         |--Nested Loops(Inner Join, OUTER REFERENCES[E].[EXITNO]))

                |    |              |--Nested Loops(Inner Join, OUTER REFERENCES[PB].[PBTxnTriggerId]))

                |    |              |    |--Nested Loops(Inner Join, WHERE[PB].[PBTxnTriggerActionedDate]>=.[SchemeAnnivYearStartDate] AND [PB].[PBTxnTriggerActionedDate]<=.[SchemeAnnivYearEndDate]))

                |    |              |    |    |--Nested Loops(Inner Join, OUTER REFERENCES[G].[SCHEMENO]))

                |    |              |    |    |    |--Nested Loops(Inner Join, OUTER REFERENCES[P].[GRPNO]))

                |    |              |    |    |    |    |--Clustered Index Seek(OBJECT[Lamda].[dbo].[PLAN_].[PK_PLAN_] AS [P]), SEEK[P].[PLANNO]='1001000068  ') ORDERED FORWARD)

                |    |              |    |    |    |    |--Clustered Index Seek(OBJECT[Lamda].[dbo].[GROUP_].[PK_GROUP_] AS [G]), SEEK[G].[GRPNO]=[P].[GRPNO]) ORDERED FORWARD)

                |    |              |    |    |    |--Clustered Index Seek(OBJECT[Lamda].[dbo].[Scheme].[PK_Scheme] AS ), SEEK.[SchemeNo]=[G].[SCHEMENO]) ORDERED FORWARD)

                |    |              |    |    |--Filter(WHERE[PB].[PlanNo]='1001000068  '))

                |    |              |    |         |--Bookmark Lookup(BOOKMARK[Bmk1005]), OBJECT[Lamda].[dbo].[PlanBasedTxnTrigger] AS [PB]))

                |    |              |    |              |--Index Seek(OBJECT[Lamda].[dbo].[PlanBasedTxnTrigger].[PBTxnTriggerStatusType] AS [PB]), SEEK[PB].[PBTxnTriggerStatusCode]=1 AND [PB].[PBTxnTypeId]=36) ORDERED FORWARD)

                |    |              |    |--Index Seek(OBJECT[Lamda].[dbo].[EXITDET].[ExitDetPBTxnTriggerId] AS [E]), SEEK[E].[PBTxnTriggerId]=[PB].[PBTxnTriggerId] AND [E].[PLANNO]='1001000068  ') ORDERED FORWARD)

                |    |              |--Index Seek(OBJECT[Lamda].[dbo].[PlanBasedTxnTrigger].[PBTxnTriggerInstructionId] AS [PB]), SEEK[PB].[PBTxnInstructionId]=Convert([E].[EXITNO])) ORDERED FORWARD)

                |    |--Clustered Index Scan(OBJECT[Lamda].[dbo].[Scheme].[PK_Scheme] AS ))

                |--Index Seek(OBJECT[Lamda].[dbo].[GROUP_].[GroupSchemeNo] AS [G]), SEEK[G].[SCHEMENO]=.[SchemeNo]) ORDERED FORWARD)

    Approx Table Size (rows)

    PlanBasedTxnTrigger PB 37 million, Plan_ P 190,000, Group_ G 5000, Scheme S 5000, ExitDet 45000

  • Initial thoughts are that the first call you are making it is pumping all the data into the cache - once cached subsequent reads are fast. after your 30 minute wait the cache has emptied again so the first query now needs to fill the cache again. There may be other reasons - but this was my first impression. - Also are you using all your available memory - do you have the /3GB or /POE switch in the boot.ini file and the AWE flag set on the sp_configure ?...

    If you are running SP4 ( you are currently doing SP3 ) - remember there is a patch required to fix a bug with the memory allocation when using the /AWE switch - http://support.microsoft.com/default.aspx?kbid=899761 ... Bug #: 474343

    ** What you see, Depends on what you Thought, Before, You looked! **

  • flushing data out of cache will cause physical disk i/o which may degrade performance. If you can reproduce your slow and fast queries then capture and compare the query plans to see if there's any difference.

    Are we talking dynamic queries here or calls to procs?

    Does this happen for ALL users or just some - you might want to check connection string properties are the identical in all cases .

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Just reaching here, but sometimes you might see this depending on the data distribution. 

    In this example plan 1001000046 may have a different data distribution than 1001000068 - possibly producing massively different datasets? 

    You could test this by running some tests using these values and see if there is a correlation between the parameter value and the query time.

    Also if you haven't already, try updating the stats for all the tables in the query (especially any small lookup tables, or tables with a 'status' type field that changes frequently).

  • - how is the load of other queries in your system ?

    - is this a dedicated sqlserver server ?

    - 45 % avg cpu is not that alarming, but gives you an idea there may be troubles to expect

    - what's the I/O pressure of your server (physical/virtual disc)

    - how heavy is your memory usage ?

    - I'd start a trace during the whole elaps periode of your tests and the after your test hour, analyse it for queries consuming many reads/writes/cpu and elaps. Maybe you'll find out a certain pattern or sequence of events causing your delays.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I don't have access to the Prod DB, but I understand:-

    AWE is enabled

    "check connection string properties" - same connection string used over the life of the user session and is opened then closed.

    "dedicated sqlserver server" -  It is a dedicated DB running just this app.  (Hopefully) with minimum allowed monitoring/virus checking.

    "what's the I/O pressure of your server (physical/virtual disc) - It's a SAN.  Although I have not dug deeply and maybe being "blagged", I am told that the IBM hardware optimises the disks it uses.  The Logs and Data are written to two logically different drives: S and R.  SQLServer runs from on D and system on C.

    It's dynamic rather than SP - is an SP cached differently.  Is it possible to caluculate how much of the cached memoy the SQLServer temp tables take up (and perhaps explain why it does not stay cached very long?)

    "How is the load of other queries in your system?" - not sure what you mean, do you mind elaborating?

    "how heavy is your memory usage ?" - I don't have access to the Prod DB, but can get the figures - which win2003/sql counters.  Would a days usage be OK? I thought that SQLServer reserves the memory in someway, so would it show fluctuations? In the Technical Test area I have witnessed periodic writing (I/O) activity - a DBA told me this was normal as it is where SQLServer writes the "buffer" to disk. a) do you agree? b) can it be "flattened" by writing to disk slightly more frequently, but in small chunks (so that any one wait due to this activity is shorter?)

    Thanks for ALL comments.

    Gary

  •  "I don't have access to the Prod DB" That's why you need good collaboration with your productiondba. It eases things to get hold on figures smoothly.

    "AWE is enabled" so sqlserver may use more than 3 Gb. Can it ? What are the settings for min and max servermemory ?

    "check connection string properties" - what kind of locking do they connect with ? Let's hope "read committed" ! The default is "repeatable read", which is bad for performance, but good for isolation.

    "dedicated sqlserver server" -  doublecheck that at least your sqlserver data and logfiles are excluded from viruschecking.

    "SAN" a san should perform well. And you may have to get back to that if you cannot find salvation in the other areas. But I wouldn't focus on it wright now.

    It's dynamic rather than SP - is an SP cached differently.  yes. Chances of resusability are better ! You'll have to investigate how dynamic your dynamic queries are. Are they composed at runtime or are they fixes with only the values for the predicates provided at runtime ? If it is the latter choice, advise to switch to SP's or at least parameterized queries so planreuse is optimized. With sp's you also have the advantage of easy querytuning without applicationmodification !

    "How is the load of other queries in your system?" - Are there other applications/queries that run on your sqlserver ?

    "how heavy is your memory usage ?" - which win2003/sql counters.  Check for the counters at http://www.sqlservercentral.com/columnists/abressi/2634.asp.

    "Would a days usage be OK? "Should be

    "I thought that SQLServer reserves ..."

    a) do you agree? yes. Layzy writerprocesses perform this task. 

    b) can it be "flattened" by writing to disk slightly more frequently, but in small chunks (so that any one wait due to this activity is shorter?) That would be a to advanced impact for this moment !

    Advise : ask your productiondba to start a SQLServer profiler trace with the TSQL-template. Make sure you include the figures for elapstime, cpu, reads, writes, applicationname, sqluser, windowsuser and get the sqltext. Have it run for let's start with an hour, to reduce its impact.

    mayby http://www.sqlservercentral.com/columnists/bkelley/perfmonexcerpt.asp can enlighten a bit regarding focus areas.

    http://www.sqlservercentral.com/columnists/RDyess/obtainingqueryexecutionplansthroughsqlprofilertrac.asp can get you some help regarding sqlprofiler !

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply