Identical Servers, but different performance puzzle

  • Guys,

    We have MSSQL 2000 Server instance installed and working well on Windows 2003 Server machine [IBM X series-366] with 16GB RAM, 3.67GHZ cpu power, and 400GB hard disk space.

    We further created an identical server instance on a new machine.  More specifically, on Windows 2003 Server machine [Intel (R) Xeon (TM)] with  16GB RAM, 3.67GHZ cpu power, and 400GB hard disk space, we installed MSSQL 2000 Server and copied over all the dbs, applications ...

    We were expecting same or similar performance (since processor speed, ram, hd, server and database configurations are all the same, with same indexes on same tables.  However, for some reason, there is a noticeable difference in performance.

    More specifically, I ran Profiler for 30 minutes on both servers simultaneously [same trace parameters].  The trace file of the new server is 3 times as large as that of the old one (i.e. It looks like more items are being processed).  However, the average duration of the executed stored procedures is much longer on the new server than that of the old server.

    Moreover, when I run same queries on 2 servers.  The query on the new server always takes longer than that on the old server.  And for tables where we don't have indexes, it takes much longer.

    Following advice here(http://support.microsoft.com/kb/274750/), we configured our new server (just as was our old one configured) to use 15GB of RAM.  I further compared the configurations of 2 servers by executing sp_configure (with advance options).  The only difference I saw was that "remote proc trans" is set to off on the new server and on on the old server.  I don't think it could affect this issue though.

    Furthermore, the new server appears to have many more locks, as compared to the old server.  Could it be because it is processing more items?

    I cannot figure what is causing the queries to be slower on the new server.

    Can anyone suggest anything?

    Thanks a lot

  • Couple of questions:

    Is RAID level on the new server the same as the old?

    How did you build the new database?

    Are the doing the same thing and how is the traffic being balanced between them?

    Have deep have you dug into the indexes and stats (dbcc showcontig, dbcc show_statistics)

    Have you used a Sql Compare type product to validate the schemas are as identical as you think they are?

    Please post or PM a response and I'll check back, this sounds challenging.

    David

  • Things to look for:

    Cache hit Ratios (other queries pre-populating the cache)

    Disk Queues (RAID levels etc)

    sp recompiles (Always worth a look)

    Disk fragmentation (depending on how you copied everything on this could be significant)

    Any one (or all) of these things could be affecting your results.

  • More things. Check AWE/PAE and be sure they are the same. I bet they are since you mention comparing configs, but double check.

    2nd, was this a backup/restore? Did you rebuild all indexes? Sometimes these don't translate well to the new server, so I'd do this. I'd also update the statstics.

    Are the tables the exact same? Can you compare two profiler traces and query plans for one query that is performing differently. It might zero in on where the issue is.

  • Use SQL Server Performance Analysis Utility - Read80Trace and check how your queries are performing at an aggregate level.

    On disk end, monitor Avg Disk sec/Transfer, CPU usage, buffer cache hit ratio.

    Take a snapshot of DBCC SQLPERF( WAITSTATS) before and after the trace.

    This can give good idea on possible bottleneck areas.


    -- Amit



    "There is no 'patch' for stupidity."



    Download the Updated SQL Server 2005 Books Online.

  • Guys,

    I believe RAID level is the same or better on the new server.  The new MSSQL 2000 server was created by restoring a back up of the old server.  As such, all the structures and indexes are identical.

    I performed a simple analysis, in both environments, simultaneously, and below are the results:

    1. CHECKPOINT

    2. DBCC DROPCLEANBUFFERS

    3. DBCC FREEPROCCACHE

    4. SET STATISTICS TIME ON

    5. SET STATISTICS IO ON

    6. Query - SHOW EXECUTION PLAN

    7. Ran a simple select query with a single condition

    OLD ENVIRONMENT

    ---------------

    Table 'Test'. Scan count 1, logical reads 648672, physical reads 0, read-ahead reads 553699.

    SQL Server Execution Times:

    CPU time = 7125 ms, elapsed time = 150508 ms.

    NEW ENVIRONMENT

    ---------------

    Table 'Test'. Scan count 14, logical reads 647119, physical reads 167, read-ahead reads 646585.

    SQL Server Execution Times:

    CPU time = 9907 ms, elapsed time = 52602 ms.

    Interestingly, withough cleaning the buffers and such, the old environment took less time.  Now it took more time, although new environment has more physical reads, read-ahead reads, and scan count.

    Thanks for all the great advises ... I am exploring all suggested areas!!!

     

  • Thanks for the update. I'd really rebuild all indexes and then see. Also let us know.

  • Steve,

    We perform INDEXDEFRAG daily on both servers - wouldn't that be enough ... or should I still rebuild all the indexes?

     

    Thanks a lot

  • I agree with Steve but would recommand a showcontig on both servers so you can compare the indexes. That could help you narrow it to specific tables if the problem comes back.

  • I will do as suggested.

    In the meantime, I ran the following command on both servers: DBCC SQLPERF (WAITSTATS).

    The numbers are drastically different, with the new/problematic server consistently having much greater numbers for all the different wait types.

    I am currently trying to figure out how to interpret this information.

    Thanks a lot

  • I think it is pretty clear that your indexes are the problem.  Your waitstats is probably just reflecting that.   I run the dbcc dbreindex and then take a fresh look at it.

     

  • David,

    We run INDEXDEFRAG daily ... is it not sufficient?

     

    Thanks a lot!

  • In in opinion and in my shop No.

    Run this command and private message me the results or post for the tables:

    DBCC SHOWCONTIG WITH ALL_INDEXES,FAST, TABLERESULTS

    Depending on how big your database and indexes are this could be intense so RUN IT AT A TIME WHEN YOU CAN TAKE A POTENTIAL PERFORMANCE HIT.

  • David,

    I ran this command in both - old and new environments and have the results (~100 rows of data).  I don't think its good idea to post so much stuff here.  Should I post some specific column values?

    I was looking at scan density mostly - > 80% for most clustered indexes, but much lower in many cases for non-clustered indexes - true in both environments.

    Another thing we noticed today by running 3rd party software (what's up) is that our old environment is using ~96% of memory on the average with min begin close to same and only 20% on the new environment, although settings are exactly the same for both, with min = 0 and max = 15GB.

     

    Thanks a lot!

  • I sent you a PM with my e-mail address. I would double check to make sure AWE was setup properly.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply