August 29, 2014 at 9:40 pm
Environment Background: Running a Dev and Prod environment of SharePoint 2013. Environments are essentially identical, except for that fact that Dev is virtual and Prod is physical. The databases for both environments reside on the same physical SQL server, but separated into different named instances. The SQL server is running version 2012 and is fairly hefty. The production instance of SharePoint has about 120GB of RAM allocated to it's SQL instance, the Dev environment has about 60GB.
Background: Fellow staff member attempted to perform a bulk upload of about 20k documents through web front end. Realizes documents are not being added correctly, and tires to perform bulk delete. Documents get stuck in the SharePoint recycle bin. Upon doing so, the site that was supposed to ingest the document starts being slugish. Temporarily goes down, but come up without admin intervention. Issue goes on for about a day or so.
I am not very familiar with MS SQL 2012, but I decided to help since I am the primary SharePoint admin. Looked through SharePoint logs, and found several mentions of database locks. I ran the Top Database Locks reports for the production database associated to the site, and found over 200k locks. Since this is our production SharePoint and SQL instance I asked the staff member to recreate the issue in our dev. environment of SharePoint so I would have a little more freedom to work. Ran the bulk delete in Dev using PowerShell, and the production site that was experience the issue goes down. Bulk delete eventually completes for our SharePoint dev environment and the production site comes back up. I don't understand how my actions in dev affected the sites in prod? The instances have memory limits enforced. My initial thought was that there were not enough resources to perform the bulk delete.
Since I could not duplicate the true scenario in dev I moved back to prod. Ran Top Locks report and started killing off the processes that were holding onto the locks. Cleared all locks and tried running the bulk delete in prod. Still no luck, every attempt to delete documents causes locks and they are not being released. Eventually, I got the crazy idea of detaching the DB. After detaching and reattaching I was able to perform the bulk delete successfully. It seems like the table that handle the movement of documents in and out of SharePoint was hit hard by the bulk upload and delete attempts. All of the SQL statements that were producing locks referenced one specific table.
I have a feeling I will run into this again, and would like to know if there is a way to perform a database "reset"? Similar, to an IISRESET were connections are killed and memory resources are freed. The detach option was really my last resort, and unfortunately my organization does not have a true DBA.
August 29, 2014 at 9:52 pm
No. And don't detach the database ever unless you a good reason to (this isn't one).
The answer is, don't do large bulk processes. Delete/update in batches.
August 29, 2014 at 10:04 pm
Any idea why Dev had an impact on Prod? I believe it is due to memory contention, but I assumed with named instances all actions should stay isolated within the resources allocated to the instance. I didn't configure the SQL instances, but I am assuming the memory is allocation is correct. I will be reviewing it next week.
Do you mind sharing your suggested steps for resolution? The DB associated with the site was dogged during the bulk jobs. One table to be specific seemed to be hit the hardest. It just so happens that this table is responsible for moving documents in and out of SharePoint. After I cleared the recycle bin in prod the database shrunk about 2 GB, and the database stopped suffering from Locks. The queries that perform the inserts and delete jobs were running smoothly. Releasing all locks as soon the query executed.
August 29, 2014 at 11:04 pm
You said one of them is virtual. As in a VM on the physical machine?
It didn't unless you have something configured poorly like all the database files and the VM VHD all on the same drive.
August 29, 2014 at 11:16 pm
The web front end servers for the Dev environment of SharePoint are virtual. The web front end servers for the production SharePoint environment are physical. The SQL server that hosts the databases for both environments is physical, and the server is built with separate disk partions for the dB's and assoacited logs.
The environment are essentially isolated. The only commonality is the physical SQL server, but each environment has its own named instanced. I am really leaning toward improperly configured instances.
August 29, 2014 at 11:38 pm
Are the instances sharing the same db file and log file drives? What about tempdb files? If so, that could definitely cause contention on the other instance.
August 30, 2014 at 12:57 pm
Yeap. All of the db files and logs are located on their own drive partitions. There is plenty of space available, but now that I consider IOPS there might not have been enough disk resources.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply