November 21, 2002 at 3:20 pm
Hello everyone, I have a server with 4 drives. SQL Server is on C, logs on D, indexes on E, and data on F.
For the past two days, one of my users has been getting very slow response from his procedure. He has been running this procedure for months, and it takes 3 hours on average. It has now been running for over 40 hours. There are a lot of CPU cycles, but IO is minimal. I ran Performance Monitor and looked at PhysicalDisk > Avg. Disk Read Queue Length and Avg. Disk Write Queue Length for each of the drives. I see that at all times one or more drives have a very high queue (if the line is up, that means there is more in the queue, right?)
Sometimes it is the C drive that has a high queue, sometimes it is the log drive (though not usually), and sometimes it is the index and/or data drives.
Am I interpreting the graph correctly? What can I do to speed up IO? Do you think that the IO will get better if I add more drives and spread the data around, and move the tempdb files to their own drive?
What gets me is that one day the process ran in approximately 3 hours, and the next day the problems started and it has been running for 40+ hours. There are no major changes, just the usual data inserts and updates (2,000 inserts a night).
Drive C has 11 GB free and 6 GB used. Drive D has 1 GB free and 16 GB used.
Drive E has 6 GB free and 10 GB used.
Drive F has 32 GB free and 36 GB used.
Any help is very much appreciated.
November 21, 2002 at 3:56 pm
What does the execution plan show?
Do you update statistics or rebuild indexes?
Are these drives single drives or raid arrays?
Open Task Manager and add columns for IO, page faults, etc.
Which processes hold the top IO? Page faults? Is virtual memory being used, and by what?
The most common causes of what you're describing are virtual memory swapping, page splits in indexes, or other processes keeping your server IO-bound. With the information from above, you should be able to identify or eliminate these as possibilities.
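As a rough sketch, the index-health checks mentioned above could look like this on SQL Server 2000 (the table name is taken from the script posted later in this thread; substitute your own):

```sql
-- Check fragmentation; heavy page splitting shows up as low scan
-- density and high logical fragmentation
DBCC SHOWCONTIG ('sepNOBO_AnalysisTable')

-- Refresh the optimizer's statistics for the table
UPDATE STATISTICS sepNOBO_AnalysisTable

-- Rebuild all indexes on the table (SQL 2000 syntax)
DBCC DBREINDEX ('sepNOBO_AnalysisTable')
```

Run these during a quiet window; DBCC DBREINDEX in particular locks the table while it works.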
November 22, 2002 at 5:03 am
Run sp_who and see what is happening on SPIDs 1-5. A while back I started getting a BACKUP LOG on tempdb and my server screeched, not quite to a halt, but it seemed like it. Ultimately it turned out my tempdb was doing this because of the space remaining on the drive where it is located: it needed to grow. Fortunately I had 30 GB I could free up of stuff I wanted to be rid of anyway, and I have not had an issue since. Sometimes other processes can be the root of your issues.
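For reference, the checks described above are quick to run:

```sql
-- See what the system SPIDs (1-5) and everyone else are doing
EXEC sp_who

-- Check how big tempdb has grown and how much of it is unused
USE tempdb
EXEC sp_spaceused
```

If sp_spaceused shows tempdb nearly filling its drive, that points at the same growth problem described above.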
November 22, 2002 at 5:44 am
Not criticising your code, but in my experience anything that takes 3 hours to run can be improved upon, and if there are improvements to be made then this generally means that the optimiser is not getting a good plan.
What can result, because of this, is that a small change can cause the bad plan to become really, really bad, and thus take 40+ hours to run.
Are you on 7.0 or 2000? With 7.0 there is a known issue with parallelism that can cause this.
What does the SP do?
Simon Sabin
SQL Server MVP
Co-author of SQL Server 2000 XML Distilled
http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
http://sqlblogcasts.com/blogs/simons
November 22, 2002 at 12:14 pm
What should I look for in SPIDs 1-5? Also, if tempdb is growing and taking up the whole drive, can't I just look at the free space remaining through Explorer? I hope that this is the problem; it would be an easy fix.
Simon, you are right. I looked at the execution plan and found out that there is a table scan. I checked the table; there is no index. I have created several indexes and tested the script, but the performance doesn't pick up. I created several combination and single-column non-clustered indexes, but SQL Server didn't want to use them. I gave a hint in the statement to use the index, and SQL Server did use it, but then the plan showed a bookmark lookup step, so my user's query runs just as slow as before.
I created clustered indexes, once on the PK column and once on the column in the select statement's where clause. The same performance there too.
Here is the T-SQL; can you help? If you want, I can add whatever indexes you tell me to add and post the resulting plan. Also, to let you know, the plan shows that the sub-procedure does not hurt the performance; it's the select statement in the cursor that is the problem.
---------------------------------------------------------------------
-- GUID assignment pass 1
---------------------------------------------------------------------
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
-- set-up
declare @result int, @ierror int, @irowcount int
-- counters
declare @iAssignedUPID int, @iNewUPID int
select @iAssignedUPID = 0, @iNewUPID = 0
-- Iterate through all non-guid assigned records, try to find guids, or assign new ones as needed
declare @foundUPID int
declare @pk int, @pFirstName varchar(40), @pLastName varchar(40),
@N_ADDRESS1 varchar(40), @N_ADDRESS2 varchar(40), @N_HOUSENUM varchar(11),
@N_PREDIR varchar(2), @N_STREET varchar(40), @N_STRSUF varchar(11),
@N_POSTDIR varchar(2), @N_UNIT varchar(11), @N_UNITDES varchar(4),
@N_CITY varchar(30), @N_STATE varchar(2),
@N_ZIP varchar(10), @N_ZIP4 varchar(8), @N_ZIP10 varchar(18),
@N_COUNTY varchar(44), @N_COUNTRY varchar(50), @N_MATCHCODE char(4),
@emailaddress varchar(60),
@LFIRM char(6), @LUSER char(10), @crSource char(3), @crChannel char(3),
@crCellCode varchar(16), @crListCode varchar(16), @returnedUPID int
set nocount on
-- set-up a cursor with all the records that do not have GUID
declare newUPID_cursor cursor for
select PK, mf_firstName, mf_lastName, dp_addr1, dp_addr2, dp_housenum,
dp_predir, dp_street, dp_strsuf,
dp_postdir, dp_unit, dp_unitdes,
dp_city, dp_state,
dp_zip, dp_zip4, dp_zip10,
dp_county, '', dp_mcode,
'', ''
from sepNOBO_AnalysisTable
where a_UPIDonDPUL = -1
for update of a_UPIDonDPUL
open newUPID_cursor
fetch next from newUPID_cursor
into @pk, @pFirstName, @pLastName, @N_ADDRESS1, @N_ADDRESS2, @N_HOUSENUM,
@N_PREDIR, @N_STREET, @N_STRSUF,
@N_POSTDIR, @N_UNIT, @N_UNITDES,
@N_CITY, @N_STATE,
@N_ZIP, @N_ZIP4, @N_ZIP10,
@N_COUNTY, @N_COUNTRY, @N_MATCHCODE,
@emailaddress, @crListCode
-- iterate through all of them and attempt to find guids, or assign new ones when needed
while (@@fetch_status = 0)
begin
    -- attempt to get a valid UPID from dpUL
    exec @result = dbo.findUPIDMatch2 @pFirstName, @pLastName, @N_HOUSENUM, @N_PREDIR,
        @N_STREET, @N_STRSUF, @N_POSTDIR, @N_UNIT,
        @N_ZIP, @N_ZIP4, @emailaddress, @foundUPID output
    if @foundUPID is not null
    -- if a valid UPID is found, use it; but also check to see if dupes were available
    begin
        update sepNOBO_AnalysisTable
        set a_UPIDonDPUL = @foundUPID
        where current of newUPID_cursor
        set @iAssignedUPID = @iAssignedUPID + 1
    end
    else
    -- if no valid UPID is found, simply assign upid -2 to the table
    begin
        set @crSource = ''
        set @crChannel = ''
        set @crCellCode = ''
        update sepNOBO_AnalysisTable
        set a_UPIDonDPUL = -2
        where current of newUPID_cursor
        set @iNewUPID = @iNewUPID + 1
    end
    -- get the next candidate record in the set
    fetch next from newUPID_cursor
    into @pk, @pFirstName, @pLastName, @N_ADDRESS1, @N_ADDRESS2, @N_HOUSENUM, @N_PREDIR, @N_STREET, @N_STRSUF,
        @N_POSTDIR, @N_UNIT, @N_UNITDES,
        @N_CITY, @N_STATE,
        @N_ZIP, @N_ZIP4, @N_ZIP10,
        @N_COUNTY, @N_COUNTRY, @N_MATCHCODE,
        @emailaddress, @crListCode
end
close newUPID_cursor
deallocate newUPID_cursor
set nocount off
Thanks everyone, really appreciate it.
November 22, 2002 at 4:02 pm
All I can say is: CURSOR. What does findUPIDMatch2 do?
Firstly, I hate cursors so I don't use them much. I might be wrong in these statements, but I would do this:
Don't use a cursor; do a SELECT TOP 1 where you do the fetch.
Try to replace it all with a set-based UPDATE statement.
Simon Sabin
SQL Server MVP
Co-author of SQL Server 2000 XML Distilled
http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
http://sqlblogcasts.com/blogs/simons
November 22, 2002 at 5:07 pm
Simon, I am confused about your instruction below:
"Don't use a cursor; do a SELECT TOP 1 where you do the fetch."
Why do a select top 1?
Also, what is a set based update? Sorry, I am new to sql server and have not spent much time in programming yet.
The match2 procedure tries to find out if the person already exists in our database. It truncates a staging table and tries to match the information that was passed to it against rows in our main table. If it finds some matches, it inserts them into the staging table. Then it does a few more simple tests to see if one or more of the rows in the staging table are for sure the same person. It then returns success or failure depending on whether it found an exact match.
November 23, 2002 at 3:50 am
A set-based process processes all the data in a set at once, e.g.:
UPDATE MyTable
SET guid = MyMainTable.NewGuid,
    otherColumn = MyMainTable.otherColumn
FROM MyTable
LEFT JOIN MyMainTable ON MyMainTable.name = MyTable.name
WHERE a_UPIDonDPUL = -1
So this updates all your records in one go (obviously it needs to be more complex).
What I mean by using SELECT TOP 1 is that if you remove the cursor stuff and put a SELECT TOP 1 statement where the fetches were, it will do exactly what you have now without the use of a cursor. In addition, use @@ROWCOUNT rather than @@FETCH_STATUS.
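Using the column names from the posted script, the TOP 1 pattern might look roughly like this. This is a sketch only, not tested against the real schema; the counters from the original have been dropped for brevity, and the second SELECT simply repeats the first:

```sql
declare @result int, @foundUPID int, @pk int
declare @pFirstName varchar(40), @pLastName varchar(40), @N_HOUSENUM varchar(11),
    @N_PREDIR varchar(2), @N_STREET varchar(40), @N_STRSUF varchar(11),
    @N_POSTDIR varchar(2), @N_UNIT varchar(11), @N_ZIP varchar(10),
    @N_ZIP4 varchar(8), @emailaddress varchar(60)

-- grab the first unprocessed row (replaces OPEN / FETCH NEXT)
select top 1 @pk = PK, @pFirstName = mf_firstName, @pLastName = mf_lastName,
    @N_HOUSENUM = dp_housenum, @N_PREDIR = dp_predir, @N_STREET = dp_street,
    @N_STRSUF = dp_strsuf, @N_POSTDIR = dp_postdir, @N_UNIT = dp_unit,
    @N_ZIP = dp_zip, @N_ZIP4 = dp_zip4, @emailaddress = ''
from sepNOBO_AnalysisTable
where a_UPIDonDPUL = -1

while @@rowcount > 0
begin
    set @foundUPID = null
    exec @result = dbo.findUPIDMatch2 @pFirstName, @pLastName, @N_HOUSENUM, @N_PREDIR,
        @N_STREET, @N_STRSUF, @N_POSTDIR, @N_UNIT,
        @N_ZIP, @N_ZIP4, @emailaddress, @foundUPID output

    -- stamp the row; once a_UPIDonDPUL <> -1 it drops out of the next SELECT
    update sepNOBO_AnalysisTable
    set a_UPIDonDPUL = coalesce(@foundUPID, -2)
    where PK = @pk

    -- next unprocessed row; @@ROWCOUNT = 0 here ends the loop
    select top 1 @pk = PK, @pFirstName = mf_firstName, @pLastName = mf_lastName,
        @N_HOUSENUM = dp_housenum, @N_PREDIR = dp_predir, @N_STREET = dp_street,
        @N_STRSUF = dp_strsuf, @N_POSTDIR = dp_postdir, @N_UNIT = dp_unit,
        @N_ZIP = dp_zip, @N_ZIP4 = dp_zip4, @emailaddress = ''
    from sepNOBO_AnalysisTable
    where a_UPIDonDPUL = -1
end
```

Note this still processes one row at a time; the real win is folding the whole thing into a single set-based UPDATE if the matching logic allows it.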
Simon Sabin
SQL Server MVP
Co-author of SQL Server 2000 XML Distilled
http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
http://sqlblogcasts.com/blogs/simons
November 23, 2002 at 8:33 am
great call, simonsabin.
ALWAYS prefer set-based over a cursor, if possible. In my opinion, the time to use a cursor is when NO other approach will work, as SQL Server LIKES to be a set-based creature.
November 23, 2002 at 4:50 pm
A bit of history that I am not 100% sure on: CURSOR support was only maintained in SQL Server to keep ANSI compliance. It was not the biggest priority, and thus has very, very poor performance, especially compared with Oracle, where cursors are (if I remember) really quite good.
Simon Sabin
SQL Server MVP
Co-author of SQL Server 2000 XML Distilled
http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
http://sqlblogcasts.com/blogs/simons
November 25, 2002 at 5:37 am
My understanding is that most of the large SQL transactions in Oracle use Cursors in processing under the hood.
November 25, 2002 at 6:03 am
I have had a similar issue with a stored proc. The procedure ran in under 10 seconds, and it ran fine for over 18 months. One day it stopped performing, took over the resources of the server, and ran for an hour plus before we killed it.
It took us half a day to figure out what was wrong (including having MS on-site).
Final resolution:
We changed one input variable from varchar to char, ran sp_recompile, changed the variable back to varchar, redid the sp_recompile, and BOOM, it started working perfectly again.
However, in your case it looks like a MAJOR re-write: get rid of the CURSOR and possibly farm some parts out to other SPs for processing.
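For reference, the recompile step described here is a one-liner; the procedure name below is just a placeholder taken from this thread:

```sql
-- Throw away the cached plan; the next execution builds a fresh one
EXEC sp_recompile 'dbo.findUPIDMatch2'
```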
Good Luck,
AJ Ahrens
SQL DBA, Custom Billing AT&T Labs
webmaster@kritter.net
November 25, 2002 at 8:32 am
Actually - cursors were extremely bad in SQL 6.5 (I don't know about 7.0) but they perform very well in SQL 2000. At least in some of our routines, the CURSOR out-performed a SELECT TOP 1 by about 10%. That said - I personally prefer the set based routine unless you absolutely MUST deal with a single row at a time.
Another thing that will catch you off guard is the use of database constraints and triggers. If your updates always go through a stored procedure, performance improves greatly by having your own code verify foreign keys or column constraint values rather than leaving that work to the database engine.
Guarddata
November 27, 2002 at 9:38 am
The other guys are right!
Don't use the cursor!
Looking at your T-SQL, I can tell there will effectively be no index usage on sepNOBO_AnalysisTable (a_UPIDonDPUL) because of the poor selectivity; there will be a table scan.
In my tests, running a cursor can be at least 10 times slower than a set-based select/update. If, in addition, you are running a procedure on each row, then 3 hours of runtime is normal.
With a set-based approach you could finish this job within a few minutes, especially if you put a table lock hint on your update statement.
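To illustrate just the table lock hint syntax (the WHERE and SET here are simplified; the real statement would carry the matching logic):

```sql
-- One table lock instead of many row/page locks for the bulk update
UPDATE sepNOBO_AnalysisTable WITH (TABLOCK)
SET a_UPIDonDPUL = -2
WHERE a_UPIDonDPUL = -1
```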
If you give me (by email if you want) the DDL of your tables and the dbo.findUPIDMatch2 procedure, maybe we could help you find a better way to do the job.
Bye
Gabor
November 27, 2002 at 12:23 pm
I share the consensus with everyone else regarding the use of cursors. The only time I really use cursors is when I am manually handling data; it's rare to see a piece of production work from me that uses them.
Aside from rewriting the query, there are a few external factors you could look at too. Is it possible that your server is out of memory and busy swapping? This is a killer. The other thing is a personal pet peeve of mine: make sure you turn off document indexing. With NT I had to deal with idiots installing Office on my servers, which in turn installs FindFast and sets it up to run for you (FindFast runs forever and does nothing but consume IOs). After eliminating this I found out that the indexing feature included in Win2k and XP (possibly .NET also) is the same thing as FindFast. Unless you are also using your server as a file server, MAKE SURE the Indexing Service is disabled. You can either disable the service entirely (through Control Panel, as with other services) or through the right-click properties on each individual drive.
-Ken