March 28, 2023 at 9:59 pm
Hello All
Recently we upgraded our production MSSQL Server from 2019 to 2022 and faced an issue with truncation of log file.
Our environment is AAG with several replicas and we noticed that truncation_lsn randomly stucks on Secondaries and sometime on Primary. Currently we have just one workaround - it is server restart (for secondary) or failover (if happens on Primary), but we'd like to find a permanent solution.
We noticed that truncation_lsn stucks right after below warnings
Controller thread will stall to allow the close thread to catchup
Close thread is falling behind: 4 checkpoints outstanding.
We think that the issue is somehow relates to checkpoint and In-Memory, as we have XTP_Checkpoint wait if above warnings happen on Primary.
If we run manual Checkpoint it never ends.
I'll appreciate if someone can provide a clue or share some experience how to find a solution here.
Thank you in advance
March 29, 2023 at 10:10 pm
Thanks for posting your issue and hopefully someone will answer soon.
This is an automated bump to increase visibility of your question.
April 10, 2023 at 1:22 pm
If you are seeing XTP_Checkpoint wait when these warnings occur on the primary, it may indicate that there is a problem with the checkpoint process.
Here are a few suggestions you could try to resolve the issue:
April 10, 2023 at 8:40 pm
check your vlfs as well
DBCC LOGINFO
I generally aim to have less than or equal to 40 or 50 vlfs, I wouldn't of my way to do anything about it unless there were several hundred or I had a window where I could easily deal with it.
Given that it is happening after reboot, am curious if long recovery time is interfering with checkpoints.
have you configured indirect checkpoints? If the database originally came from before whenever it was made default (2016? 2014?) it may have never been done.
April 28, 2023 at 10:05 am
Did you find a solution to this?
We are seeing the same behavior on one of our databases, and I also believe that it may be related to in-memory tables.
Did you, by any chance, move the database to another server recently?
June 5, 2023 at 3:24 pm
Hi. We migrated our production databases from a SQL 2017 instance to a new SQL 2022 instance. We just started getting this exact issue this past weekend. It is only happening with our database that does have in-memory enabled and is part of an availability group.
We ran out of disk space for the transaction log. Performing log backups and restarting SQL service on the replicas did not resolve the issue. We had to remove the affected database from the AG and change the database recovery model from Full to Simple in order to successfully truncate our log. We didn't want to do this, but that was our only option to get things back up. The database was then changed back to Full recovery model and re-added to the AG.
We are on SQL 2022 CU1. I'm thinking about installing CU4 soon, but none of the release notes for the CU's mention fixing this so I don't think it'll help.
June 5, 2023 at 3:41 pm
That isn't the same issue. The logs wouldn't truncate because the memory optimized tables needed log space to commit, and with a full log file there was neither space to commit those changes or checkpoint. Expanding the log file a little bit then backing it up would have resolved the full log.
June 5, 2023 at 3:50 pm
I should have mentioned that we have no in-memory tables in the database. So technically there's nothing in-memory that should need log space. We did have one initially but deleted it years ago.
If you're wondering why we still have in-memory enabled for that database, it's because you can't easily delete an in-memory filegroup.
June 27, 2023 at 8:17 pm
We're having the same issue with one of our environments. The only workaround until now is to reboot. We just installed the latest CU, but there is nothing in the documentation that may make us think that this will solve the problem.
June 27, 2023 at 8:30 pm
There is a bug related to in-memory tables and Lock pages in memory.
A workaround is to remove lock pages in memory for the instance (if possible) until it is fixed.
It is unfortunately not fixed in CU5.
June 30, 2023 at 6:08 pm
Thanks for the reply. Do you have documentation in regards to the Bug you're mentioning?
July 3, 2023 at 3:12 pm
an update on this issue
We set up a load testing environment and easily reproduced the issue after some hours. We halted the test and recover the database to start with lock pages in memory setting disabled. The test has been running since Saturday and we haven't had the issue until now.
July 14, 2023 at 3:37 am
This was removed by the editor as SPAM
August 2, 2023 at 7:54 pm
We also have temporarily disabled "Lock pages in memory" and have not seen the issue crop up yet.
August 15, 2023 at 1:08 pm
This was fixed with the CU7
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply