July 24, 2016 at 1:15 am
There is something bogging down our server and causing job schedules to hang or fail, and I cannot find it. Everything on that server is performing badly, even simple actions. We tried rebooting the computer and restarting SQL Server and the job schedules, but the problem persisted.
I'm a developer, not a DBA, so I'm having trouble finding the cause.
I assume it's related to one of our jobs. The job in question is a daily schedule that calls a stored procedure acting as an ETL: it imports data from one database to another and makes some updates. It used to run within an hour, but about 7 days ago it started taking 10-15 hours to run. Then, for the last 3 days, it failed altogether after running for some time. We let it run again today, but after 10 hours we cancelled it.
Here's what I've tried/found:
First I reviewed the error message for the failed runs and found it was failing because the log file was out of space.
So I tried to shrink the log file using the code below (at the end of this post). The command completed, but it didn't reduce the file size at all.
Then, since that code didn't work, we tried to shrink the log file using SSMS, but that failed with the error "Lock request time out period exceeded". To find what was holding the lock, I ran "sp_who2", and for the relevant database I saw one user/SPID with the following:
SPID: 63, Status: Suspended, Command: Delete, CPU Time: 1142382, DiskIO: 1254258
Without knowing for sure, I thought that could be the issue, so I tried to end that session with "Kill 63". However, it appears that didn't work, because "sp_who2" now reads:
SPID: 63, Status: Suspended, Command: Killed/Rollback, CPU Time: 1142803, DiskIO: 1261601
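For reference, a sketch of how the killed session can be checked on while it rolls back (session ID 63 taken from the sp_who2 output above; the DMV query is just one way to see the current wait):
-- Report rollback progress for the killed session
KILL 63 WITH STATUSONLY;
GO
-- See what the session is currently doing or waiting on
SELECT session_id, status, command, wait_type, wait_time, percent_complete
FROM sys.dm_exec_requests
WHERE session_id = 63;
GO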
What I need help with is the following:
1. How can I find what is causing the bad performance?
2. How can I shrink the log file? Could that be causing the bad performance?
Here's the code I tried:
USE MyDatabase;
GO
-- Switch to SIMPLE recovery so the inactive portion of the log can be truncated
ALTER DATABASE MyDatabase
SET RECOVERY SIMPLE;
GO
-- Shrink the log file to a 1 MB target
DBCC SHRINKFILE (MyDatabase_log, 1);
GO
-- Switch back to FULL recovery (note: this breaks the log backup chain
-- until a full or differential backup is taken)
ALTER DATABASE MyDatabase
SET RECOVERY FULL;
GO
July 24, 2016 at 1:04 pm
As to this specific database and your current issue, it looks like the log just isn't being managed as well as it should be. When you have a database in full recovery and you run out of space for the log, you would usually want to back up the log first rather than trying to shrink it.
If the log file has grown too large for whatever reason (not backing up often enough, a one-off process that blows up the log, or possibly log backup failures), you may want to shrink it in those cases. Shrinking the log regularly is not a good practice. The idea is to size it correctly and manage the size with log backups, in conjunction with the business needs for recovery.
If you need to immediately reduce the size of the log, try backing up the log and then shrinking it. I would also check the log backup history for that database to see if something is going on with the backups: confirm they are happening regularly without failures, and check the space available at the backup destination.
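As a rough sketch of that sequence (using the database and log file names from the script you posted; the backup path is a placeholder):
-- 1. Back up the log so the inactive portion can be reused
BACKUP LOG MyDatabase TO DISK = N'X:\Backups\MyDatabase_log.trn';
GO
-- 2. Then shrink the log file to a reasonable target, e.g. 1024 MB
DBCC SHRINKFILE (MyDatabase_log, 1024);
GO
-- 3. Review recent log backup history for the database
SELECT TOP (20) backup_start_date, backup_finish_date, type, backup_size
FROM msdb.dbo.backupset
WHERE database_name = N'MyDatabase'
  AND type = 'L'
ORDER BY backup_start_date DESC;
GO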
Sue
July 24, 2016 at 1:51 pm
1. What are the log file settings for initial size and growth? (See the query after this list for one way to pull them.)
2. What are you deleting and why?
3. Are you doing proper backups that include log file backups? If so, what's the schedule? If not, when are you going to make one?
4. Are you doing statistics maintenance? If so, what is it?
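For question 1, a query along these lines (run in the problem database) will show the current size and growth settings:
-- File sizes are stored in 8 KB pages; growth is either a percentage or pages
SELECT name, type_desc,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' %'
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
       END AS growth_setting
FROM sys.database_files;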
--Jeff Moden
Change is inevitable... Change for the better is not.
July 25, 2016 at 6:52 am
It seems that you're not managing the log adequately, as previously stated. You can either change the recovery model to simple and lose the benefit of minimal data loss, or you can improve the log management.
Here's a guide for the second option: http://www.sqlservercentral.com/stairway/73776/
The first option can make sense in certain situations, but it all depends on how much data you are willing to lose in a disaster.
July 25, 2016 at 7:17 am
Data loss is not as much of a concern because there are daily backups of the server, and all the data comes from a source system that has its own backups in place. This DB (set up by someone else long ago) is "trying" to act as a DW.
The log file finally reduced from 7 GB to 1 GB (which still seems too high) and has since grown back up to 3 GB.
I don't get why it's so high. There are about 10 active DBs on the same server with similar processes that don't get larger than 300-500 MB, with the exception of one at 55 GB (which is also having problems I'll have to tackle next).
July 25, 2016 at 8:48 am
If a daily backup is enough for this database, I'd suggest changing the recovery model to simple. Be sure that you don't have any uncommitted transactions that would prevent the log from truncating.
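A quick sketch of how you could check that (database name taken from your earlier script):
-- What, if anything, is currently preventing log truncation
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'MyDatabase';
GO
-- Report the oldest open (uncommitted) transaction, if there is one
DBCC OPENTRAN (MyDatabase);
GO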
Size the log correctly to prevent the need for auto-growth when running your daily processes.
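For example (the 4 GB size and 512 MB growth increment below are placeholders; base the real numbers on what the daily load actually uses):
-- Pre-size the log and use a fixed growth increment rather than a percentage
ALTER DATABASE MyDatabase
MODIFY FILE (NAME = MyDatabase_log, SIZE = 4096MB, FILEGROWTH = 512MB);
GO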
I still recommend that you read the article series (stairway) I linked in my previous post, and that you answer Jeff's questions to get detailed advice.
July 25, 2016 at 10:37 am
I'm working on Jeff's questions and will respond later today
July 25, 2016 at 11:10 am
In addition to working on managing the log files in general, you may want to take a look at the import process itself. You likely have some large transactions which are affecting the log files. You could use BULK INSERT, SSIS, or other processes, as well as changing the batch size and commit size. Addressing those can improve performance and help with some of the log file growth. You would likely benefit from taking some time to understand the log files in general first, though, as that's the part that seems to be tripping you up a bit.
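As a rough illustration of the batching idea (the table name, filter, and batch size below are placeholders, not your actual process): working in smaller committed batches keeps each transaction, and the log space it needs, small.
-- Delete in small committed batches so the log can be reused between batches
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (10000)
    FROM dbo.StagingTable
    WHERE LoadDate < DATEADD(DAY, -30, GETDATE());

    SET @rows = @@ROWCOUNT;
END;
GO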
Sue
July 28, 2016 at 9:20 am
Ok. After days of reviews and trying different things it appears to have been a hardware problem all along. Thanks everyone for your help
July 29, 2016 at 1:32 pm
ptownbro (7/28/2016)
Ok. After days of reviews and trying different things it appears to have been a hardware problem all along. Thanks everyone for your help
What kind of hardware problem, if you don't mind me asking?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 29, 2016 at 5:40 pm
There was a bad disk in one of the drives on the server. Once it was removed, performance skyrocketed. We're now going through the process of moving everything to a server that can handle the processing requirements.
July 30, 2016 at 11:32 am
That would do it. Thank you for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.