Application Slow

  • Hi,

    I have an application in use at a customer site that uses SQL Server 2000 and the customer is complaining about slow performance. The normal issues I receive is that a particular part of the application is running slow. So I request a profiler trace and identify the slow performing query and resolve the problem. This customer says its all slow. I have requested Profiler Traces and examined them and at times I can see queries that normally run in less than 1 second taking 20-30 seconds to complete.

    I have also asked the customer for perfmon logs and examined these. The logs show that the database disk (not the log disk) has sometimes queues of 100's of I/O's. The Avg. Disk Sec/Reads show response times for the disk varying from less than 20ms upto 2 seconds.

    I believe the issue is with the poor I/O performance, but the customer says its the application.

    can anyone advise on how to proceed with an issue like this.

    Ronnie

  • 1. Compile the data to show that there is a problem with disk access.

    2. How much memory is on the server? What is the Buffer Cache Hit Ratio? as it seems the server is going to disk too often.

    Here is a good article on what to look for and contains some things you do not mention:

    http://www.devarticles.com/c/a/SQL-Server/How-to-Perform-a-SQL-Server-Performance-Audit/

    Also look at locks and blocks. It is possible that a simple thing like the order of table access in the queries is not consistent.

  • I believe the issue is with the poor I/O performance,

    I agree with Jack, but I'll say it a differnet way... WHY do you believe that? What proof do you have? In order to convince the customer, you will need to show a comparison between systems under similar load or come up with an irrefutable proof that that is the problem. Until then, you'll be going around in circles with the customer forever.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The profiler trace I have, covering a 90 minute period, shows two queries that on two ocassions took 20+ seconds to run, at separate times. This was out of 250+ runnings of the same queries, the rest of the time the queries completed in less than 2 seconds, returning between 900-1200 rows. The cpu used by the queries is nearly always the same and so are the reads.

    How do I find out why the queries have slowed down so dramtically.

    I have this application on many other sites and have no problem. The only difference with this customer is the application is on a Cluster with a SAN.

    Ronnie

  • Although there are a huge number of possible reasons for this behavior, in my experience it is almost always due to one of two probable influences:

    1) Irregularities in the optimizer/statistics facilities. SQL Server 2000 had a number of issues in this area which you can research at Microsoft, however, it is usually easy enough to tell if this may be the cause. Compare the query plan of 20 sec responses to the 2 sec instances. If they are significantly different then that is your likely cause. The easiest Quick & Dirty short-term fix is to add query hints to force them to choose the plan you know they should use.

    However, the fact that CPU & Read IO's are the same is an indication that this is probably not what is happening. Which leaves you with...

    2) Contention. Contention is a broad category that can take many forms, both obvious and subtle. The high queue size on your data disk makes it likely that you are experiencing physical contention for that disk (and not contention for other resources, such as CPU and not logical application contention such as Locking).

    At this point, the possible sources of contention break down as follows:

    A) Application Internal: In other words, you have too many users trying to do the same thing at the same time. That disk has become the bottleneck that everyone is backing up behind. You need to either put less demand on that disk or give that disk more IO capacity (ie., optimize the App, or the queries/sProcs, tune the tables/indexes, move the tables to less used disks, OR reconfigure the disk (see below)).

    B) SQL Server internal: other usages of the SQL server are putting load on that disk that are contending with the app's usage of it. You need to eliminate, reduce or redirect that other SQL usage.

    C) Server Internal: non-SQL usages of that disk, but from the same server, are competing for that disk. You need to eliminate, reduce or redirect that other usage.

    D) SAN-concealed: SAN's are commonly carved up into separate Logical volumes that are nonetheless competing for the same physical resources. So for instance, your Log disk and your Data disk may actually be sharing the same physical devices (this is a BAD thing, but much more common than you might think). More likely, some completely unrelated server(s) have heavily-used logical volumes that are using the same physical drives as your data disk. Solution: reconfigure your data disk on the SAN.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • All,

    I have had another look through the profiler trace I have to try and get to the bottom of this issue. The worst performing query took 41 seconds to return its data using 0.2 seconds of CPU time. Now my initial theory was that it was an I/O problem, I am not so sure now. During the profile session the query ran 200 plus times with and average run time of 0.5 seconds.

    During the 40 seconds that the query took other queries where running and completing in less that 1 second. In fact another user ran the same query with different selection criteria and it returned in less than 1 second. If my I/O theory was right I would have expected other queries to slow down during the 40 seconds period.

    When I run the query on the customer site it returns in 1 second or less.

    Help

    Ronnie

  • Ronnie.Doggart (3/12/2008)


    All,

    I have had another look through the profiler trace I have to try and get to the bottom of this issue. The worst performing query took 41 seconds to return its data using 0.2 seconds of CPU time. Now my initial theory was that it was an I/O problem, I am not so sure now. During the profile session the query ran 200 plus times with and average run time of 0.5 seconds.

    During the 40 seconds that the query took other queries where running and completing in less that 1 second. In fact another user ran the same query with different selection criteria and it returned in less than 1 second. If my I/O theory was right I would have expected other queries to slow down during the 40 seconds period.

    When I run the query on the customer site it returns in 1 second or less.

    Help

    Ronnie

    Hi Ronnie,

    Ok, let's look at this from a slightly different perspective. Its sounds like you have identified the "slow" query, which runs fine on other sites, but on this particular site runs "slow". Only difference between this site and other sites is that this site has a SAN? Have you isolated the offending query or queries and unit test each one in terms of system resource usage? If not, I would suggest you look into doing this, as this will allow you to understand query behavior. You will need to review the execution plan between where the query runs in 40 seconds vs where the query runs in 1 second. Have you checked for table or index scans? Has the query always run slow on this particular site? Have you checked for blocking during query execution? Have you look at the MAXDOP setting and compared between system(s)?

    Thanks,

    Phillip Cox

    MCITP - DBAdmin

  • All,

    I have noticed that on the customer site the following query uses different plans for different data. So as an example:

    Table has two indexes:

    index1 is on column x

    index2 is on columns x and y

    select * from table1 where x = 'xyz' and y='ab'

    produces a plan that uses index1 and does more reads

    select * from table1 where x = 'def' and y='st'

    produces a plan that uses index2 and does fewer reads

    Why does it not use index2 everytime as its columns are used in the 'where' clause of the select.

    Ronnie

  • Ronnie.Doggart (3/12/2008)


    All,

    I have noticed that on the customer site the following query uses different plans for different data. So as an example:

    Table has two indexes:

    index1 is on column x

    index2 is on columns x and y

    select * from table1 where x = 'xyz' and y='ab'

    produces a plan that uses index1 and does more reads

    select * from table1 where x = 'def' and y='st'

    produces a plan that uses index2 and does fewer reads

    Why does it not use index2 everytime as its columns are used in the 'where' clause of the select.

    Ronnie

    Hi Ronnie,

    How often are statistics updated? How often are indexes rebuilt or re-organized?

    Index2 includes both columns in SARG (e.g. Where clause), thus would be faster in terms of identifying matching rows, as less lookups are required. In comparision, index1 would force a lookup of x first, then lookup of y, so more reads.

    Can you past execution plan from both queries?

    Thanks,

    Phillip Cox

  • All,

    Info lgncc_enquiry has 434523 rows and had statistics collected yesterday.

    [font="Courier New"]SELECT * FROM LGNCC_ENQUIRY WHERE xref1 = '1001183168' AND objecttype = 'D4'

    (971 row(s) affected)

    Table 'LGNCC_ENQUIRY'. Scan count 1, logical reads 3060, physical reads 43, read-ahead reads 0.

    |--Filter(WHERE:([LGNCC_ENQUIRY].[ObjectType]='D4'))

    |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([LGNCC_ENQUIRY]))

    |--Index Seek(OBJECT:([LGNCC_ENQUIRY].[LGNCC_ENQUIRY_XREF1]), SEEK:([LGNCC_ENQUIRY].[xref1]='1001183168') ORDERED FORWARD)

    SELECT * FROM LGNCC_ENQUIRY WHERE xref1 = '101000599728' AND objecttype = 'C1'

    (1 row(s) affected)

    Table 'LGNCC_ENQUIRY'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0.

    |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([LGNCC_ENQUIRY]))

    |--Index Seek(OBJECT:([LGNCC_ENQUIRY].[lgncc_enquiry_t2]), SEEK:([LGNCC_ENQUIRY].[xref1]='101000607124' AND [LGNCC_ENQUIRY].[ObjectType]='C1') ORDERED FORWARD)

    [/font]

    Ronnie

  • Hi Ronnie,

    Please try to run this code and post execution plan:

    SELECT * FROM LGNCC_ENQUIRY WHERE xref1 = '1001183168' AND objecttype = 'D4' WITH (INDEX(lgncc_enquiry_t2))

    Also, what data type is xref1 column?

    Thanks,

    Phillip Cox

    MCITP - DBAdmin

  • Philip,

    xref1 is varchar(80).

    [font="Courier New"]

    SELECT * FROM LGNCC_ENQUIRY WITH (INDEX(lgncc_enquiry_t2))

    WHERE xref1 = '1001183168' AND

    objecttype = 'D4'

    (971 row(s) affected)

    Table 'LGNCC_ENQUIRY'. Scan count 1, logical reads 3058, physical reads 0, read-ahead reads 0.

    |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY]))

    |--Index Seek(OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY].[lgncc_enquiry_t2]), SEEK:([LGNCC_ENQUIRY].[xref1]='1001183168' AND [LGNCC_ENQUIRY].[ObjectType]='D4') ORDERED FORWARD)

    [/font]

    Ronnie

  • Ronnie.Doggart (3/12/2008)


    Philip,

    xref1 is varchar(80).

    [font="Courier New"]

    SELECT * FROM LGNCC_ENQUIRY WITH (INDEX(lgncc_enquiry_t2))

    WHERE xref1 = '1001183168' AND

    objecttype = 'D4'

    (971 row(s) affected)

    Table 'LGNCC_ENQUIRY'. Scan count 1, logical reads 3058, physical reads 0, read-ahead reads 0.

    |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY]))

    |--Index Seek(OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY].[lgncc_enquiry_t2]), SEEK:([LGNCC_ENQUIRY].[xref1]='1001183168' AND [LGNCC_ENQUIRY].[ObjectType]='D4') ORDERED FORWARD)

    [/font]

    Ronnie

    Hi Ronnie,

    Ok, from what I can see, the first query isn't highly selective as you have many rows being returned vs query 2 and in the original query, there is a combination of both logical and physical reads, but we managed to circumvent this by forcing query to use index lgncc_enquiry_t2, which improved things. Out of curiosity, how long did the query take to run when forcing index? Also, its good practice to include column names in your statements not "*", so if you can re-code using required column names and test again.

    Thanks,

    Phillip Cox

    MCITP - DBAdmin

  • Philip,

    [font="Courier New"]select

    ENQUIRY.CASEID,

    ENQUIRY.CASEREF,

    ENQUIRY.ENQUIRYTYPE,

    ENQUIRY.OBJECTTYPE,

    ENQUIRY.XREF1,

    ENQUIRY.XREF2,

    ENQUIRY.XREF3,

    ENQUIRY.OBJECTDESC

    FROM LGNCC_ENQUIRY ENQUIRY WITH (INDEX(lgncc_enquiry_t2))

    WHERE xref1 = '1001183168' AND

    objecttype = 'D4'

    (971 row(s) affected)

    Table 'LGNCC_ENQUIRY'. Scan count 1, logical reads 4676, physical reads 15, read-ahead reads 465.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 397 ms.

    |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY] AS [ENQUIRY]))

    |--Index Seek(OBJECT:([FrontlineLive].[FrontlineLive].[LGNCC_ENQUIRY].[lgncc_enquiry_t2] AS [ENQUIRY]), SEEK:([ENQUIRY].[xref1]='1001183168' AND [ENQUIRY].[ObjectType]='D4') ORDERED FORWARD)

    [/font]

    Ronnie

  • I know I'm showing late to the party - but is there any chance to make that a covering index? Assuming those extra fields aren't ludicrously long - you'd be able to get rid of the bookmark lookups....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply