March 2, 2012 at 1:21 pm
During the log backup the server almost hangs. The job takes 14 minutes and runs every half hour.
Disk Queue Length is well over 1500
Response time is over 15,000 ms
Lots of event ID 833 errors in the Application log (I/O requests taking longer than 15 seconds)
Server has 8 DBs of 100 GB plus
Data files are on 2 RAID 1+0 logical drives, 8 drives each
Log drive is also on RAID 1+0, 8 drives
None of our other servers have this issue
Hardware looks good: no timeouts, bus errors, or hard read errors on the drives
Everything is good after the log backup finishes
Any ideas on how to fix this?
During normal ops disk I/O goes over 100 MB a second
During backups it only gets as high as 8 MB a second
March 2, 2012 at 1:28 pm
Investigate the poor throughput and high latency on the log drive. If that's a SAN, look at the SAN diagnostics, check the switch usage stats, etc.
Also maybe check your transaction log VLFs, see Kimberly Tripp's blog post on optimising transaction log throughput.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 2, 2012 at 1:39 pm
Drives are SAS attached.
Server is an HP SE326M1; attached storage is an HP MSA50.
March 3, 2012 at 11:52 am
Run DBCC LOGINFO against each database and check the number of rows returned for each; post back here if you can.
It would also be advisable to check the configuration of the locally attached disks and make sure you haven't got any failed disks in the array.
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs"
March 3, 2012 at 1:20 pm
Rows returned from DBCC LOGINFO
DB528: 116 rows
DB529: 116 rows
DB530: 116 rows
DB531: 113 rows
db506: 116 rows
DB510: 116 rows
DB514: 153 rows
DB518: 153 rows
Let me know if you want me to post the full output. Here are the first few lines:
FileId  FileSize  StartOffset  FSeqNo  Status  Parity  CreateLSN
2       2555904   8192         748261  0       64      0
2       2555904   2564096      748262  0       128     0
2       2555904   5120000      748263  0       128     0
2       2809856   7675904      748264  0       128     0
2       253952    10485760     748265  0       64      8180000000457600592
2       253952    10739712     748266  0       64      8180000000457600592
2       253952    10993664     748267  0       64      8180000000457600592
2       286720    11247616     748268  0       64      8180000000457600592
2       262144    11534336     748269  0       64      8183000000037600590
2       262144    11796480     748270  0       64      8183000000037600590
2       262144    12058624     748271  0       64      8183000000037600590
2       393216    12320768     748272  0       64      8183000000037600590
2       327680    12713984     748273  0       64      8188000000010700595
2       327680    13041664     748274  0       64      8188000000010700595
2       327680    13369344     748275  0       64      8188000000010700595
2       327680    13697024     748276  0       64      8188000000010700595
2       327680    14024704     748277  0       128     8208000000025600595
2       327680    14352384     748278  0       128     8208000000025600595
2       327680    14680064     748279  0       128     8208000000025600595
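Output like this is easier to judge when summarised than when read row by row. Below is a minimal Python sketch, assuming the DBCC LOGINFO result has been pasted as whitespace-separated text in the column order shown in the header above (FileId, FileSize, StartOffset, FSeqNo, Status, Parity, CreateLSN). The `summarise_vlfs` helper is hypothetical, and the sample rows are a reading of the first four lines of the output posted above.

```python
from typing import NamedTuple


class VlfSummary(NamedTuple):
    count: int        # number of VLFs (rows)
    active: int       # rows with Status = 2
    total_bytes: int  # sum of FileSize
    smallest: int     # smallest VLF, in bytes
    largest: int      # largest VLF, in bytes


def summarise_vlfs(text: str) -> VlfSummary:
    """Summarise pasted DBCC LOGINFO rows (one VLF per line)."""
    sizes = []
    active = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        sizes.append(int(fields[1]))   # FileSize column, in bytes
        if int(fields[4]) == 2:        # Status 2 means the VLF is still in use
            active += 1
    if not sizes:
        raise ValueError("no DBCC LOGINFO rows found")
    return VlfSummary(len(sizes), active, sum(sizes), min(sizes), max(sizes))


# Assumed sample: the first four rows of the output posted above.
SAMPLE = """\
FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
2 2555904 8192 748261 0 64 0
2 2555904 2564096 748262 0 128 0
2 2555904 5120000 748263 0 128 0
2 2809856 7675904 748264 0 128 0
"""

summary = summarise_vlfs(SAMPLE)
print(f"{summary.count} VLFs, {summary.active} active, "
      f"{summary.total_bytes / 1024**2:.1f} MB in total")
```

Running it against the full output for each database would give the row count alongside the total and largest VLF sizes, which is the information being asked for in this thread.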
March 3, 2012 at 3:45 pm
500 000+ rows for each DB?????
That's seriously bad. Find Kimberly Tripp's article on transaction log throughput and read it.
Gail Shaw
March 3, 2012 at 5:07 pm
Sorry, those first 3 digits are the DB numbers (DB name).
I read the article.
Row counts are between 113 and 153.
That's not too many, right?
I am just a hardware guy (break/fix tech).
I think the hardware is good.
The admin has had us change everything twice, except the drives.
HP Server Manager shows no timeouts, no bus errors, or any other errors.
The admin compares this to his hundreds of other servers, which have much smaller DBs (20 GB, vs. 200 GB DBs on this server), and says this is the only one with issues.
On the other servers the log backup takes about 1 minute.
On this server the log backup takes 15 minutes, and the server is unusable during the log backup; it almost hangs.
This server has 8 databases of 200+ GB. Maybe that is too much for it to handle? Or does this server have a config issue?
The transaction logs are large: one is 50 GB, the rest are about 20 GB.
Server has 48 GB of memory
SQL Server 2008 R2
OS: Windows Server 2008
Only 4 of the DBs are being backed up. The backup files are 30 MB each. A checkpoint is being run with the backups.
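A rough back-of-the-envelope check on these figures, as a Python sketch. It assumes the 30 MB per backup file and the 15 minute job time quoted in this post, plus the 100 MB a second normal-operation figure from the first post. All of these are approximate numbers as reported, and the variable names are only illustrative.

```python
# Figures quoted in the thread (approximate, as reported by the poster).
BACKUP_FILES = 4           # databases actually being backed up
MB_PER_FILE = 30           # size of each log backup file
JOB_SECONDS = 15 * 60      # duration of the log backup job on this server
NORMAL_MB_PER_SEC = 100    # disk throughput seen during normal operations

total_mb = BACKUP_FILES * MB_PER_FILE
effective_rate = total_mb / JOB_SECONDS
seconds_at_normal_rate = total_mb / NORMAL_MB_PER_SEC

print(f"Backup output: {total_mb} MB")
print(f"Effective write rate over the job: {effective_rate:.2f} MB/s")
print(f"Same data at the normal rate: {seconds_at_normal_rate:.1f} s")
```

If those figures are right, the amount of backup data written is tiny compared with what the disks manage in normal use, which would suggest the job's time is going somewhere other than writing the backup files themselves.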
March 3, 2012 at 5:38 pm
gmcrouch (3/3/2012)
Rows are between 113 and 153. That's not too many, right?
That's still excessive. What size is each of the log files physically?
March 3, 2012 at 5:47 pm
One is almost 50 GB.
The rest are just over 20 GB.
I just updated the post. I am just a break/fix tech.
This admin thinks I am an expert and know how to fix everything, but this is a little too much for me.
I am sure the hardware is good.
March 4, 2012 at 2:26 am
The thing here is that there's not much that can cause very slow disk response other than the hardware. Doesn't have to be faulty, just slow or overused or incorrectly configured somewhere.
If you're not sure where to look, maybe consider getting someone in to look into the problem for you?
Gail Shaw
March 4, 2012 at 2:51 am
GilaMonster (3/4/2012)
If you're not sure where to look, maybe consider getting someone in to look into the problem for you?
Agreed, all we have at present is an assurance the hardware is good and configured correctly.
Have you checked to ensure the array has no failed disks?
Where are you located geographically?
March 5, 2012 at 9:01 am
After the log backup finished, I tested copying a 100 GB backup folder.
I copied the backup folder (100 GB) to the log disk: response time about 10 ms,
I/O 150 MB a second.
Then I tested a copy from the log disk to the backup disk:
same results, 10 ms response and 150 MB a second I/O.
Both ways, 100 GB took 9 minutes, with Queue Length less than 2.
During the log backup I/O does not get over 10 MB a second,
response time is over 15,000 ms, and Queue Length is over 1500.
The problem is only during the log backup.
I tried the copy during a log backup: estimated time 2 days+.
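The queue length and response time figures can be cross-checked with Little's Law (average queue length = completion rate x average time per request). The Python sketch below is only a rough estimate: it assumes the quoted counters are averages over a steady period, which the post does not confirm, and the function names are hypothetical.

```python
def ios_per_second(queue_length: float, latency_ms: float) -> float:
    """Little's Law rearranged: completion rate = queue length / latency."""
    return queue_length * 1000 / latency_ms


def hours_to_copy(gigabytes: float, mb_per_sec: float) -> float:
    """Time to move the given amount of data at a constant rate."""
    return gigabytes * 1024 / mb_per_sec / 3600


# During the log backup: queue of 1500, responses taking 15,000 ms.
backup_iops = ios_per_second(1500, 15000)

# During the copy test: queue "less than 2" at 10 ms, so this is an upper bound.
copy_iops = ios_per_second(2, 10)

# 100 GB at the 10 MB/s ceiling seen during the log backup.
hours_at_10mb = hours_to_copy(100, 10)

# A 2-day estimate for 100 GB implies roughly this sustained rate.
implied_mb_per_sec = 100 * 1024 / (2 * 24 * 3600)

print(f"I/Os completing per second during backup: {backup_iops:.0f}")
print(f"I/Os completing per second during copy (at most): {copy_iops:.0f}")
print(f"100 GB at 10 MB/s: {hours_at_10mb:.1f} hours")
print(f"Rate implied by a 2-day estimate: {implied_mb_per_sec:.2f} MB/s")
```

On those assumptions the array is completing on the order of a hundred I/Os a second while the log backup runs, and the 2-day copy estimate implies well under 1 MB a second was left over for the copy.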
March 5, 2012 at 9:09 am
It's always good to make sure you're on the most recent firmware. If you end up calling your HW vendor's support, I am sure that they will point to this first.
March 5, 2012 at 9:23 am
We have hundreds of these servers; they all manage a slice of a very large DB. The only difference I noticed is that this server's DBs are 10x larger than the DBs on the other servers. I looked at a few of the other servers and they all have 20 GB DBs; this one's DBs are 200 GB. All are identical hardware with the same firmware, and this is the only one with high response time. I think the issue has to do with this server having much larger log files to handle, and they are very fragmented.
Does this sound like a logical conclusion?
On the other servers the log backup takes a minute.
On this server the log backup takes 14 minutes. During the first minute of the backup there are no problems, but after the log job has been running for over a minute, response time and queue length jump.
This server has been removed from the process. Its replacement only has 20 GB databases.
Again, I am just a hardware guy. I don't know much about the whole process; I just know the hardware looks good.
I'm going to try to get them to rebuild the databases.
All the hardware except the drives has been changed.
The server has been re-imaged twice. The only thing that has not been replaced is the DB files.
March 5, 2012 at 9:29 am
How are you doing your log backups? Are you using a third-party program for those backups? How large are the log backups?
David
@SQLTentmaker "He is no fool who gives what he cannot keep to gain that which he cannot lose" - Jim Elliot