Unable to Shrink Log file

  • db size ~83Gb, log1 ~40Gb, log2 ~64Gb.

    Over the weekend the log file grew to consume the 40Gb partition it is located on. Filled up and started throwing errors. Massive delete was done Fri afternoon from BOMCALCTRANS table to free some db space, this is what caused the log to grow out of proportion. I do a Tlog backup every hour, 24 hours. Sunday I tried to shrink the log file, got errors saying it could not write a checkpoint. Many other data integrity errors happening. I could not get any space recovered with any method. This disk has only 9Mb space free. I added another log file to the db on another disk slice and SQL started performing well again, no errors. I still cannot reclaim any space in log file1. DBCC LOGINFO reports ~272k rows at status 2, so these transactions will not go into a log backup. DBCC SHRINKFILE on log file1 produces 'Cannot shrink log file 2 (AXDB30_SP5_DEV_log) because all logical log files are in use.' DBCC SHRINKFILE on log file2 produces 'Cannot shrink log file 3 (AXDB30_SP5_DEV_log2) because of minimum log space required.' so it seems to have shrunk to its minimally allowed amount successfully. select * from master..sysprocesses where open_tran <> 0 returns no records. Available file space for this blade is all but consumed, so a full backup is now failing.

    One possible remedy: Can we do a shrink file and empty the contents of log file 1 into log file2 (combining them)in order to regain control over the first log file? The disk it is on is most likely too full to allow any operations to be performed on it.

    Any advice is appreciated.

  • Please query sys.databases and see what the value of log_reuse_wait_descr is for that database. Also run DBCC SQLPERF(LogSpace) and post the results here.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I forgot to mention that I have this db as a production ERP db, and I have it replicated (partially) to another server so that we can report on it using SSRS. So now the results:

    AX30_SP6_LIVEREPLICATION

    and the replica publication, for reference,

    AXdistributionCHECKPOINT

    Just so you have all the info, here is the entire table from my query, 'select name, log_reuse_wait_desc from sys.databases', see attachment ScreenShot20.jpg.

    As for DBCC SQLPERF (LogSpace):

    master0.992187568.110240

    tempdb28.8046959.343640

    model0.492187540.476190

    msdb46.4921931.801380

    ReportServer1.99218848.382350

    ReportServerTempDB1082.8052.9146590

    eBridge_KWS_LIVE0.742187537.302630

    AX30_SP6_LIVE97428.9899.998890

    ebridge_IRM_LIVE1.49218831.937170

    AXdistribution2.49218861.402820

    eBridge_Axapta30-RM0.742187542.960530

    or see ScreenShot21.jpg, if the attachments actually make it to you.

    Any ideas?

    EDIT - The snapshot replication seems to be working fine, from snap to subscription. It fires at 4am and I just verified I have invoices from yesterday in there.

  • Do you have any transactional replication configured?

    Can you run DBCC OPENTRAN and post the results?

    The distribution DB doesn't appear to have a problem with the log, just AX30_SP6_LIVE

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Transaction information for database 'AX30_SP6_LIVE'.

    Replicated Transaction Information:

    Oldest distributed LSN : (0:0:0)

    Oldest non-distributed LSN : (404298:9743:1)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    There were no errors.

    BTW, thanks a bunch for helping me track this down.

  • Do you have transactional replication configured on this database, or is it just snapshot replication?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sorry, I did omit that. Just snapshot replication, once a day.

  • Ok. What you've got there, and I really wish I knew how this happened, is a partially created transactional replication setup. The fix is fairly simple.

    Create a transactional replication publication.

    Add a single article. It doesn't matter what table you add. Just pick one.

    Finish the wizard but don't select the option to create the snapshot. It's unnecessary.

    Delete the publication you just created.

    Doing that will let SQL clean up properly. Once you've done all that, check DBCC OPENTRAN and sys.databases again. You also should be able to shrink the log down to a reasonable size.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • OK I am not sure how you arrived at this conclusion, and I am quite sure that I did not set up and partial trans replication, so let me give you a little more info on how this happened. I am not doubting your skills, I just want to make sure you are armed with all the facts.

    The tlog disk slice is 40Gb with hourly tlog backups. This has worked well for about a year. However, the db (on another slice) sometimes grows a lot as a result of BOM calculations, mostly localized to 2-3 specific days in the month. This can create 2-8 million rows in that BOMCALCTRANS table for that day, and Microsoft says we can clean that table out periodically. When it is cleaned, tlogs grow astronomically fast and an hourly spin-off may not handle it. I have had to manually shrink the log on a few occasions.

    On this particular occasion, that 40G slice ran totally out of space and SQL all but locked up, threw errors, etc. No method of shrink would touch it. In fact, the byte count has remained identical for the past 3 days. Adding that second log file was the only thing that got me back into production. And that second log file is the only one that will now respond to a shrink - original log file made of steel.

    Are you certain that the reason it will not budge is not because all usable space on that slice is consumed?? no room to move?? I am sorry, but it is hard for me to accept that all the self-flagellation over this issue comes with such a simple fix, and one so obvious to all but myself.

    Please accept my gratitude for your patience with me on this issue.

  • log_reuse_wait_descr column states what is preventing log truncation. In your case, its Replication which means there is a unfinished replication. This may happen if you've restored this database from a backup of a replicated database.

    Just create a dummy transaction replication and delete it(as suggested by Gail), then u can backup ur log file which will release space, post which you can shrink the log file.



    Pradeep Singh

  • Ken Watson (7/15/2009)


    OK I am not sure how you arrived at this conclusion, and I am quite sure that I did not set up and partial trans replication, so let me give you a little more info on how this happened. I am not doubting your skills, I just want to make sure you are armed with all the facts.

    Experience. I've seen this so many times that I've lost count. The key piece of info is the results from DBCC OPENTRAN, with the Oldest distributed LSN of (0:0:0). There will only ever be mention of oldest distributed LSN and oldest non-distributed LSN when SQL thinks there is transactional replication set up. Neither snapshot nor merge use the transaction log.

    I know you didn't set up transactional replication. As I said, I don't know what causes this, I wish I did. I do know, from experience, that the solution I posted works. I can find you a number of other threads here where exactly that solution worked. There may be other ways to resolve this that I don't know of.

    Are you certain that the reason it will not budge is not because all usable space on that slice is consumed?? no room to move??

    All the usable space in the log is consumed. DBCC SQLPERF(logspace) shows that clearly. The reason that space is not been reused is replication, as per sys.databases, the log_reuse_wait_descr column. To clear some space in the log, we have to resolve what is preventing the space from been reused, ie, the replication that SQL thinks is set up.

    I am sorry, but it is hard for me to accept that all the self-flagellation over this issue comes with such a simple fix, and one so obvious to all but myself.

    It's only obvious to me because I've seen this again and again and again, both here and on servers that I worked directly with. If you prefer to wait for a better solution, that's fine with me.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • THIS is why I subscribe to this site. Experience, professionalism, and a modest amount of hand-holding when it's called for. I appreciate your advice immensely, and I will attempt your suggestions soon. For now, my IT manager has me in on a VoIP meeting. I wear many hats, and my expertise is broad. That is why I am glad I can rely on those with deep expertise on these type subjects.

    I will report my results back here when I am done.

    Thanks to all again.

  • Ok, I followed you instructions but am sad to say nothing changed. I have provided screen shots of each step of the way. I even did a tlog spin off just to see. No change - LSN numbers the same and sys.databases still shows REPLICATION.

    Any ideas?

  • Odd. That normally works perfectly. Think this is first case that I've seen where it hasn't.

    Try creating the transactional publication again then, instead of dropping it from the gui, run these:

    EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @reset = 1

    GO

    CHECKPOINT

    GO

    exec sp_removedbreplication @dbname = 'AX30_SP6_LIVE', @type = 'tran'

    If it gives any errors, post them. This shouldn't affect your snapshot replication but, just to be safe, script out the snapshot publication first.

    The sp_repldone marks all records in the log as replicated. This will allow the space in the log to be reused giving some free space. In case the dropping of the replication failed because there wasn't enough available log space. Also a possibility is that SQL's taking a while to go through that massive log and clean off the 'pending replication' flags.

    Run them individually, in case one throws an error. Maybe wait a while between doing the repl_done and the drop if waiting's an option.

    Run alone, sp_repldone would be a temporary fix. Hopefully the dropping of the replication will then clean things up properly. Check DBCC OPENTRAN when you're done, see if there's any reference fo distributed LSNs. log_reuse_wait probably won't be replication anymore because of the repldone. If the opentran doesn't mention distributed LSNs, the fix is permanent. If it does nut the log_reuse is not showing replication, then all we've managed is a temporary fix.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Before you run all that, can you run this in the DB in question.

    exec sp_helppublication

    If it returns anything other than the name of your snapshot publication, can you paste the results into excel and post them. I'm only going to have a chance to look at them late tomorrow though.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply