High response times and queue length during log backup

Question

gmcrouch

SSC-Addicted

Points: 442
March 2, 2012 at 1:21 pm

#254170

During the log backup the server almost hangs. The job takes 14 minutes and runs every half hour.
Disk queue length is well over 1,500.
Response time is over 15,000 ms.
Lots of event ID 833 errors in the Application log (I/O requests taking longer than 15 seconds).
The server has 8 databases of 100 GB plus each.
Data files are on 2 RAID 1+0 logical drives, 8 drives each.
The log drive is also on RAID 1+0, 8 drives.
None of our other servers have this issue.
Hardware looks good: no timeouts, bus errors or hard read errors on the drives.
Everything is fine after the log backup finishes.
Any ideas on how to fix this?
During normal ops disk I/O goes over 100 MB a second.
During backups it only gets as high as 8 MB a second.

Gail Shaw SSC Guru Points: 1004504 · Answer 1

Investigate the poor throughput and high latency on the log drive. If that's a SAN, look at the SAN diagnostics, check the switch usage stats, etc.

Also maybe check your transaction log VLFs, see Kimberly Tripp's blog post on optimising transaction log throughput.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

gmcrouch SSC-Addicted Points: 442 · Answer 2

The drives are SAS attached.

The server is an HP SE326M1; the attached storage is an HP MSA50.

Perry Whittle SSC Guru Points: 234013 · Answer 3

Run DBCC LOGINFO against each database and check the number of rows returned for each. Post the counts back here if you can.

It would also be advisable to check the configuration of the locally attached disks and make sure you haven't got any failed disks in the array.

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

gmcrouch SSC-Addicted Points: 442 · Answer 4

Rows returned from DBCC LOGINFO

DB528: 116 rows

DB529: 116 rows

DB530: 116 rows

DB531: 113 rows

db506: 116 rows

DB510: 116 rows

DB514: 153 rows

DB518: 153 rows

Let me know if you want me to post the full output. Here are the first few lines:

FileId  FileSize  StartOffset  FSeqNo  Status  Parity  CreateLSN
2       2555904   8192         748261  0       64      0
2       2555904   2564096      748262  0       128     0
2       2555904   5120000      748263  0       128     0
2       2809856   7675904      748264  0       128     0
2       253952    10485760     748265  0       64      8180000000457600592
2       253952    10739712     748266  0       64      8180000000457600592
2       253952    10993664     748267  0       64      8180000000457600592
2       286720    11247616     748268  0       64      8180000000457600592
2       262144    11534336     748269  0       64      8183000000037600590
2       262144    11796480     748270  0       64      8183000000037600590
2       262144    12058624     748271  0       64      8183000000037600590
2       393216    12320768     748272  0       64      8183000000037600590
2       327680    12713984     748273  0       64      8188000000010700595
2       327680    13041664     748274  0       64      8188000000010700595
2       327680    13369344     748275  0       64      8188000000010700595
2       327680    13697024     748276  0       64      8188000000010700595
2       327680    14024704     748277  0       128     8208000000025600595
2       327680    14352384     748278  0       128     8208000000025600595
2       327680    14680064     748279  0       128     8208000000025600595
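
A rough sanity check on the excerpt above, as a minimal Python sketch. It assumes the fused digits split into the seven DBCC LOGINFO columns with FileId 2 on every row. That split is a guess on my part, although it does make each StartOffset equal the previous StartOffset plus FileSize. The function names are made up for illustration.

```python
# (FileSize, StartOffset) in bytes, transcribed from the posted DBCC LOGINFO rows.
# ASSUMPTION: the column boundaries are inferred, since the posted digits ran together.
rows = [
    (2555904, 8192), (2555904, 2564096), (2555904, 5120000), (2809856, 7675904),
    (253952, 10485760), (253952, 10739712), (253952, 10993664), (286720, 11247616),
    (262144, 11534336), (262144, 11796480), (262144, 12058624), (393216, 12320768),
    (327680, 12713984), (327680, 13041664), (327680, 13369344), (327680, 13697024),
    (327680, 14024704), (327680, 14352384), (327680, 14680064),
]


def is_contiguous(vlfs):
    """True if every VLF starts exactly where the previous one ends."""
    return all(
        size + offset == next_offset
        for (size, offset), (_, next_offset) in zip(vlfs, vlfs[1:])
    )


def summarize(vlfs):
    """Count, smallest/largest VLF in KB, and MB of log file covered so far."""
    sizes = [size for size, _ in vlfs]
    last_size, last_offset = vlfs[-1]
    return {
        "count": len(vlfs),
        "min_kb": min(sizes) / 1024,
        "max_kb": max(sizes) / 1024,
        "covered_mb": (last_offset + last_size) / (1024 * 1024),
    }


if __name__ == "__main__":
    print(is_contiguous(rows))
    print(summarize(rows))
```

If the split is right, these first 19 VLFs are between 248 KB and about 2.7 MB each and together cover only the first 14.3 MB of the log file. That says nothing about the remaining rows, which were not posted.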

Gail Shaw SSC Guru Points: 1004504 · Answer 5

500 000+ rows for each DB?????

That's seriously bad. Find Kimberly Tripp's article on transaction log throughput and read it.

gmcrouch SSC-Addicted Points: 442 · Answer 6

Sorry, those first 3 digits are DB numbers (the DB name).

I read the article.

Row counts are between 113 and 153.

That's not too many, right?

I am just a hardware guy (break-fix tech).

I think the hardware is good.

The admin has had us change everything twice, except the drives.

HP server manager shows no timeouts, no bus errors or any other errors.

The admin compares this to his hundreds of other servers, which have much smaller DBs (20 GB) versus this server's 200 GB DBs, and says this is the only one with issues.

On the other servers the log backup takes about 1 minute.

On this server the log backup takes 15 minutes, and the server is unusable during the log backup; it almost hangs.

This server has 8 databases of 200+ GB. Is that maybe too much for it to handle? Or does this server have a config issue?

The transaction logs are large: 1 is 50 GB and the rest are about 20 GB.

Server has 48 GB of memory.

SQL Server 2008 R2

OS: Windows Server 2008

Only 4 of the DBs are being backed up. The backup files are 30 MB each. A checkpoint is being run with the backups.

Perry Whittle SSC Guru Points: 234013 · Answer 7

gmcrouch (3/3/2012)
Row counts are between 113 and 153.
That's not too many, right?

That's still excessive. What size is each of the log files physically?

gmcrouch SSC-Addicted Points: 442 · Answer 8

One is almost 50 GB.

The rest are just over 20 GB.

I just updated the post. I am just a break-fix tech.

This admin thinks I am an expert and know how to fix everything but this is a little too much for me.

I am sure the hardware is good.
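
For anyone following along, the log sizes and row counts quoted in this thread give a rough average VLF size. This is a minimal Python sketch, assuming 1 GB = 1024 MB and that the VLFs are evenly sized, which real logs usually are not. The thread does not say which database has the 50 GB log, so both ends of the 113 to 153 range are shown. `avg_vlf_mb` is a made-up helper name.

```python
def avg_vlf_mb(log_gb, vlf_count):
    """Average VLF size in MB for a log of log_gb gigabytes split into vlf_count VLFs."""
    return log_gb * 1024 / vlf_count


if __name__ == "__main__":
    # A roughly 20 GB log with 116 VLFs (the most common count posted).
    print(round(avg_vlf_mb(20, 116), 1))
    # The roughly 50 GB log, at either end of the posted 113 to 153 range.
    print(round(avg_vlf_mb(50, 153), 1))
    print(round(avg_vlf_mb(50, 113), 1))
```

These are averages only; the actual per-VLF sizes come from the FileSize column of DBCC LOGINFO.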

Gail Shaw SSC Guru Points: 1004504 · Answer 9

The thing here is that there's not much that can cause very slow disk response other than the hardware. Doesn't have to be faulty, just slow or overused or incorrectly configured somewhere.

If you're not sure where to look, maybe consider getting someone in to look into the problem for you?

Perry Whittle SSC Guru Points: 234013 · Answer 10

GilaMonster (3/4/2012)
If you're not sure where to look, maybe consider getting someone in to look into the problem for you?

Agreed, all we have at present is an assurance the hardware is good and configured correctly.

Have you checked to ensure the array has no failed disks?

Where are you located geographically?

gmcrouch SSC-Addicted Points: 442 · Answer 11

I tested copying a 100 GB backup folder after the log backup finished.

I copied the backup folder (100 GB) to the log disk: response time was about 10 ms

and I/O was 150 MB a second.

Then I tested a copy from the log disk to the backup disk.

Same results: 10 ms response, 150 MB/s I/O.

Both ways the 100 GB took 9 minutes, with queue length less than 2.

During the log backup, I/O does not get over 10 MB a second.

Response time is over 15,000 ms and queue length is over 1,500.

The problem occurs only during the log backup.

I tried the copy during a log backup: estimated time 2 days plus.
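
To put the figures above side by side, here is a small Python sketch of the arithmetic. It assumes 1 GB = 1024 MB and a steady transfer rate, and the helper names are made up for illustration.

```python
def transfer_seconds(size_gb, mb_per_s):
    """Seconds needed to move size_gb gigabytes at a steady mb_per_s."""
    return size_gb * 1024 / mb_per_s


def implied_mb_per_s(size_gb, seconds):
    """Average MB/s implied by moving size_gb gigabytes in the given time."""
    return size_gb * 1024 / seconds


if __name__ == "__main__":
    # 100 GB at the observed 150 MB/s, in minutes.
    print(round(transfer_seconds(100, 150) / 60, 1))
    # Rate implied by 100 GB in 9 minutes.
    print(round(implied_mb_per_s(100, 9 * 60), 1))
    # 100 GB at the 10 MB/s ceiling seen during the log backup, in hours.
    print(round(transfer_seconds(100, 10) / 3600, 2))
    # Rate implied by the "2 days" copy estimate.
    print(round(implied_mb_per_s(100, 2 * 24 * 3600), 2))
```

The 9 minute copy and the 150 MB/s counter are in the same ballpark (about 11 minutes at exactly 150 MB/s). A 2 day estimate for the same 100 GB works out to well under 1 MB/s, which is below even the 10 MB/s seen for the log backup itself.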

michael.albert SSC Veteran Points: 245 · Answer 12

It's always good to make sure you're on the most recent firmware - if you end up calling your HW vendor's support, I am sure that they will point to this first.

gmcrouch SSC-Addicted Points: 442 · Answer 13

We have hundreds of these servers; they all manage a slice of a very large DB. The only difference I noticed is that this server's DBs are 10x larger than the DBs on the other servers. I looked at a few of the other servers and they all have 20 GB DBs; this one's DBs are 200 GB. All are identical hardware with the same firmware, and this is the only one with high response times. I think the issue has to do with this server having much larger log files to handle, and they are very fragmented.

Does this sound like a logical conclusion?

On the other servers the log backup takes a minute.

On this server the log backup takes 14 minutes. During the first minute of the backup there are no problems, but after the log job has been running for over a minute, response time and queue length jump.

This server has been removed from the process; its replacement only has 20 GB databases.

Again, I am just a hardware guy. I don't know much about the whole process; I just know the hardware looks good.

I am going to try to get them to rebuild the databases.

All the hardware except the drives has been changed.

The server has been re-imaged twice. The only thing that has changed is the DB files.

David Benoit SSC-Dedicated Points: 34562 · Answer 14

How are you doing your log backups? Are you using a third-party program for those backups? How large are the log backups?

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot