Odd Peformance Testing Results

  • We’ve just installed a new dedicated SQL box with 64-bit OS and 64-bit SQL Server 2005.  The box has two processors with dual cores and 16 Gb RAM.  Direct attached storage with two external disk enclosures each on it’s own channel and each enclosure populated with 6-15k RPM drives.  The initial configuration is with each enclosure having a 6-disk RAID 0 array and the two enclosures/channels mirroring each other.

     

    A copy of our production database (SQL Server 2000) has been restored to this machine after which I executed the following statements:

     

    exec sp_updatestats

    dbcc updateusage (0)

     

    This is a data warehouse application with several tables with millions of rows repetitively queried.  I specifically targeted these tables with:

     

    update statistics table_name

    with fullscan

     

    dbcc dbreindex (‘table_name’)

    go

     

    No further updating has occurred and the database has only been used performance testing of our long-running procs.  I chose three stored procedures and re-compiled them on the new server and started testing. Prior to each test, the following statements were executed:

     

    Dbcc freeproccache

     

    Dbcc dropcleanbuffers

     

    For individual tests, the statements were executed before each proc; for simultaneous tests, the statements were executed once before executing the three procs simultaneously in separate SSMS windows.

     

    Test 1 (times are mins:seconds) one at a time:

     

    Proc1: 11:09

    Proc2: 3:04

    Proc3: 11:44

     

    Test 2 – three procs simultaneously:

     

    Proc1: 4:22

    Proc2: 3:16

    Proc3: 11:38

     

    This didn’t make sense – that Proc1 would complete in much less time with more work on the box than when it was running by itself.

     

    I retested Proc1 by itself:

     

    Proc1: 11:14

     

    So clearly, to have the shortest execution time for Proc1, I need to run the other procs simultaneously…???

     

    Again, this is a totally controlled environment, I am the ONLY person on this box and laborious attention to ensure each test is identical.

     

    I have repeated all of these tests 3 times and the results only vary by a few seconds with the exeption of Proc1 in the simultaneous scenario – it has ran as fast as 2:17 !!!  4:22 is the longest it has ran with the other procs executing simultaneously.

     

    Is my methodology flawed??  Is there a rational explanation??  If there was such a fast way to execute Proc1, why wouldn’t SQL Server do that when it is running alone??

     

    Arrrgggghhhhhhhh...I just ran Proc1 alone again...11:03  Consistent if nothing else...

     

     

  • I am only guessing here.. But do procs 1 and 2 reference similar tables/indexes? Proc 3 seems to be consistent either way so I'll leave that out.. If that is the case perhaps when executing 1&2, proc 2 is dragging that stuff into memory(cache) so proc one references that instead of going out to the drives hence the faster performance?

    Thanks

    Rich

  • Actually all three procs access the same three large tables, albeit with different data ranges, groupings and selection criteria.  This to simulate our routine environment of looking at the same data from different perspectives and time periods.

     

    I had considered your suggestion – and I think you have to be correct – the results are the results and what else could it be?  However, given the magnitude of difference – Proc1 runs 2.5 to nearly 5 times faster in concert with Procs 2 & 3 (which show virtually no change in run time) – this would seem to be an enormous untapped performance potential if one is simply relying on SQL Server to determine efficient execution plans.  In the end, we’re still talking about accessing the same data and I would think SQL should be able to determine an effective data acquisition/caching methodology that would work nearly as well as that achieved by running a wholly independent proc…???

     

    I don’t know if this is simply a fluke combination of elements – but given the dramatic results of Proc1 (from 11 minutes to 2.25 minutes) – I would want to investigate more applications of this but it would seem to be a very ‘hit and miss’ endeavor…

     

    Another item on these procs, they do not use any HINT elements to override SQL execution plan determination…

     

    Wyatt

  • We are indeed talking about the the same data.. however the 1st time you run the proc it has to do a bunch of disk IOs to grab all the data etc. once its in memory then sql will perform the operations lgoically which is heck of a lot faster than doing physical reads. Try Clearing the cache, run proc 1 then run proc one again and you should see an improvement similar to the execution of procs 1&2 simultanelously..

    Thanks!

    Rich

  • I understand what you’re saying regarding data caching.  I still think it odd that this dramatic of a result occurs with the processes running simultaneously!!

     

    In any case, the ‘magic’ is now gone.  I had detached the database and moved the files to different physical locations for testing and then moved them back to the original locations where these results were obtained.

     

    I am now getting consistent and repeatable results that are substantially different than the original results that I’d measured.  The ‘magic’ 2-4 minute simultaneous runs are now taking 9.5 minutes.  This with no updates to the database other than detaching/attaching.

     

    Unfortunately, obtaining predictable results seems elusive.  Even with tightly controlled conditions, I don’t seem to get repeatable results with I/O configuration.  When compared to our existing production server, the results are even more contentious – Proc3 runs about 25% longer on the ‘old’ box (SQL Server 2000; 3 GB; two-processor; single SCSI channel and fewer spindles); Proc2 runs in the same time and Proc1 runs in 6.5 minutes on the old box versus 11 minutes (singlely) on the new box.  Again – with the DBCC statements prior to each run on both boxes.

     

    Less complex queries (but still I/O intensive) show dramatic improvements on the new box (ie. old box = 4 minutes; new box = 14 seconds).  However, these complex stored procs, even though they are processing large amounts of data, are not seeing much improvement with the improved I/O performance of the new box.  I expect production use to see significant improvements with the caching available with greatly increased memory.

     

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply