Row locks not escalating to table locks after 5000

  • I've got an INSERT that's selecting data from a linked server and attempting to push 10 million rows into an empty table. More or less, it looks like this:

    insert into ReceivingTable (
        Field1, Field2, Field3, Field4
        , Field5, Field6, Field7, Field8
        , Field9, Field10, Field11, Field12
        , Field13, Field14, Field15
    )
    select
        t.Field1, t.Field2, t.Field3, t.Field4
        , t.Field5, t.Field6, t.Field7, t.Field8
        , t.Field9, t.Field10, t.Field11, t.Field12
        , t.Field13, t.Field14, t.Field15
    from [linkedserver].nameddb.dbo.Table1 t with (nolock)
    inner join [linkedserver].nameddb.dbo.Table2 t2 with (nolock) on t2.Field1 = t.FK_Table2_Field1
    inner join [linkedserver].nameddb.dbo.Table3 t3 with (nolock) on t3.Field1 = t2.FK_Table3_Field1

    The receiving table has a clustered index and three nonclustered indexes. After about 10 minutes of running, it errors:

    Msg 1204, Level 19, State 4, Line 2

    The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

    There are no other active users. I ran it again and monitored the following DMO to watch the growth of locks for that spid:

    SELECT request_session_id, COUNT(*) num_locks
    -- select *
    FROM sys.dm_tran_locks
    --where request_session_id = 77
    GROUP BY request_session_id
    ORDER BY COUNT(*) DESC

    The number of locks started small and held steady for a while at around 4-7 locks, but at about 5 minutes in, the number of locks held by that spid grew dramatically to more than 8 million before finally erroring again with the same message. Researching, I can't figure out why it's not escalating from row locks to table locks at the appropriate threshold. The Locks setting was 0 to begin with (Server Properties > Advanced > Parallelism > Locks). I set it to 5000, and it still didn't seem to help. Rewriting the INSERT to include a WITH (TABLOCK) hint allows it to finish successfully in testing (sketched below). My problem is that the statement comes out of an ETL whose source code I can't edit, so I need to figure out how to force it to escalate to locking the entire table via table- or server-level settings.
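
    For reference, a minimal sketch of the TABLOCK variant that completes in testing (same placeholder table and column names as the query above; only the hint on the target table is new):

    insert into ReceivingTable with (tablock) (
        Field1, Field2, Field3, Field4
        , Field5, Field6, Field7, Field8
        , Field9, Field10, Field11, Field12
        , Field13, Field14, Field15
    )
    select
        t.Field1, t.Field2, t.Field3, t.Field4
        , t.Field5, t.Field6, t.Field7, t.Field8
        , t.Field9, t.Field10, t.Field11, t.Field12
        , t.Field13, t.Field14, t.Field15
    from [linkedserver].nameddb.dbo.Table1 t with (nolock)
    inner join [linkedserver].nameddb.dbo.Table2 t2 with (nolock) on t2.Field1 = t.FK_Table2_Field1
    inner join [linkedserver].nameddb.dbo.Table3 t3 with (nolock) on t3.Field1 = t2.FK_Table3_Field1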

    A colleague suggested that installing service packs may take care of it (the client is running SQL Server 2008 R2 (RTM)), but I haven't found anything online to support that theory. Before I take a production SQL Server offline and commit the client to another day without their data warehouse for a fix that might not do the trick, I thought I'd try this place as you've all been a great resource for my lurking in the past.

    Many thanks!

    JT

  • Pushing 10 million rows across a linked server is going to be slow no matter how you do it. I would suggest that doing your inserts in batches would be a better approach instead of trying to do all 10 million in a single query (rough sketch below).

    Also, unless you don't care how accurate your query is, you should drop those NOLOCK hints: http://blogs.sqlsentry.com/aaronbertrand/bad-habits-nolock-everywhere/
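
    Something like this is what I mean by batching, purely as a sketch: it assumes (hypothetically) that Table1 has an increasing integer key such as Field1 that you can range over, and the batch size is arbitrary:

    declare @BatchSize int = 100000;
    declare @LastKey int = 0;
    declare @MaxKey int;

    -- Find the highest source key once, then walk up to it in fixed-size ranges
    -- so each INSERT stays small instead of taking millions of locks at once.
    select @MaxKey = max(t.Field1) from [linkedserver].nameddb.dbo.Table1 t;

    while @LastKey < @MaxKey
    begin
        insert into ReceivingTable (
            Field1, Field2, Field3, Field4
            , Field5, Field6, Field7, Field8
            , Field9, Field10, Field11, Field12
            , Field13, Field14, Field15
        )
        select
            t.Field1, t.Field2, t.Field3, t.Field4
            , t.Field5, t.Field6, t.Field7, t.Field8
            , t.Field9, t.Field10, t.Field11, t.Field12
            , t.Field13, t.Field14, t.Field15
        from [linkedserver].nameddb.dbo.Table1 t
        inner join [linkedserver].nameddb.dbo.Table2 t2 on t2.Field1 = t.FK_Table2_Field1
        inner join [linkedserver].nameddb.dbo.Table3 t3 on t3.Field1 = t2.FK_Table3_Field1
        where t.Field1 > @LastKey and t.Field1 <= @LastKey + @BatchSize;

        set @LastKey = @LastKey + @BatchSize;
    end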


  • Possibly because there are other locks on the table. Escalation isn't guaranteed: if there are any incompatible locks on the table, the escalation will fail and the original locks will remain. Escalation will then be attempted again after a certain number of additional locks have been taken.
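
    A quick way to check is to reuse sys.dm_tran_locks from your monitoring query and look for object-level locks that other sessions hold on the destination table. A rough sketch (ReceivingTable is the placeholder name from your post):

    -- Object-level locks on the destination table; an incompatible lock held by
    -- another session here is what would make the escalation attempt fail.
    SELECT request_session_id, request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE resource_type = 'OBJECT'
        AND resource_database_id = DB_ID()
        AND resource_associated_entity_id = OBJECT_ID('dbo.ReceivingTable');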

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sean Lange (7/16/2015)


    Pushing 10 million rows across a linked server is going to be slow no matter how you do it. I would suggest that doing your inserts in batches would be a better approach instead of trying to do all 10 million in a single query.

    Also, unless you don't care how accurate your query is, you should drop those NOLOCK hints: http://blogs.sqlsentry.com/aaronbertrand/bad-habits-nolock-everywhere/

    I definitely agree with you, but I can't change the query itself. All I can do is lube up the environment so it handles the load better.

  • Try using openquery instead of the 4-part naming. That way you can be sure that the query is executed on the remote server and the results sent across, not the three complete tables sent across and the query executed locally.

    Oh, and ditch the nolocks.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/16/2015)


    Possibly because there are other locks on the table. Escalation isn't guaranteed: if there are any incompatible locks on the table, the escalation will fail and the original locks will remain. Escalation will then be attempted again after a certain number of additional locks have been taken.

    I just set the database to single-user mode and reattempted the insert to try to eliminate the possibility of other locks on the table. It failed with the same error.

    GilaMonster (7/16/2015)


    Try using openquery instead of the 4-part naming. That way you can be sure that the query is executed on the remote server and the results sent across, not the three complete tables sent across and the query executed locally.

    Oh, and ditch the nolocks.

    With the DB in single-user mode, I converted the query to this:

    insert into ReceivingTable (
        Field1, Field2, Field3, Field4
        , Field5, Field6, Field7, Field8
        , Field9, Field10, Field11, Field12
        , Field13, Field14, Field15
    )
    select
        t.Field1, t.Field2, t.Field3, t.Field4
        , t.Field5, t.Field6, t.Field7, t.Field8
        , t.Field9, t.Field10, t.Field11, t.Field12
        , t.Field13, t.Field14, t.Field15
    from openquery([linkedserver], '
        select
            t.Field1, t.Field2, t.Field3, t.Field4
            , t.Field5, t.Field6, t.Field7, t.Field8
            , t.Field9, t.Field10, t.Field11, t.Field12
            , t.Field13, t.Field14, t.Field15
        from nameddb.dbo.Table1 t
        inner join nameddb.dbo.Table2 t2 on t2.Field1 = t.FK_Table2_Field1
        inner join nameddb.dbo.Table3 t3 on t3.Field1 = t2.FK_Table3_Field1
    ') t

    It failed with the same error and behavior: just 3 locks for the majority of the runtime (approx 5.5 minutes) and then the number of locks ballooned to several million before breaking after 16 minutes. I've attached a screenshot of sp_whoisactive and the count of locks that I captured around the 9 minute mark.

  • Ack, I just read your first post again. Change the locks value back to 0. That setting is not the number of locks on an object that triggers an escalation (which is roughly 5000).

    The locks option sets the maximum number of available locks, thereby limiting the amount of memory the SQL Server Database Engine uses for them. The default setting is 0, which allows the Database Engine to allocate and deallocate lock structures dynamically, based on changing system requirements.

    By setting it to the minimum, you've told SQL that it may not ever take more than 5000 locks, which is probably crippling for most systems.
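
    If it's easier than going through the GUI, a minimal sketch of putting it back from T-SQL ('locks' is an advanced option, and if memory serves a change to it only takes effect after a service restart):

    -- Reset the 'locks' server option to its default of 0 (dynamic lock memory).
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'locks', 0;
    RECONFIGURE;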

    Do you have any trace flags enabled? What are the row lock and page lock properties for the indexes on the destination table?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/16/2015)


    Ack, I just read your first post again. Change the locks value back to 0. That setting is not the number of locks on an object that triggers an escalation (which is roughly 5000).

    The locks option sets the maximum number of available locks, thereby limiting the amount of memory the SQL Server Database Engine uses for them. The default setting is 0, which allows the Database Engine to allocate and deallocate lock structures dynamically, based on changing system requirements.

    By setting it to the minimum, you've told SQL that it may not ever take more than 5000 locks, which is probably crippling for most systems.

    Do you have any trace flags enabled? What are the row lock and page lock properties for the indexes on the destination table?

    Ack, indeed! I've moved that locks value back to 0.

    I do not have any trace flags enabled.

    The sys.tables and sys.indexes results for that object_id are attached.
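
    For anyone following along, a sketch of the kind of query that pulls those properties from sys.tables and sys.indexes (ReceivingTable is still the placeholder name):

    -- Lock-related settings for the destination table and each of its indexes.
    SELECT t.name AS table_name, t.lock_escalation_desc,
        i.name AS index_name, i.type_desc,
        i.allow_row_locks, i.allow_page_locks
    FROM sys.tables t
    INNER JOIN sys.indexes i ON i.object_id = t.object_id
    WHERE t.object_id = OBJECT_ID('dbo.ReceivingTable');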

  • Which service pack is your database server running? I seem to remember some locking problems pre-service pack 1...

  • jotate (7/16/2015)


    GilaMonster (7/16/2015)


    Ack, I just read your first post again. Change the locks value back to 0. That setting is not the number of locks on an object that triggers an escalation (which is roughly 5000).

    The locks option sets the maximum number of available locks, thereby limiting the amount of memory the SQL Server Database Engine uses for them. The default setting is 0, which allows the Database Engine to allocate and deallocate lock structures dynamically, based on changing system requirements.

    By setting it to the minimum, you've told SQL that it may not ever take more than 5000 locks, which is probably crippling for most systems.

    Do you have any trace flags enabled? What are the row lock and page lock properties for the indexes on the destination table?

    Ack, indeed! I've moved that locks value back to 0.

    I do not have any trace flags enabled.

    The sys.tables and sys.indexes results for that object_id are attached.

    Thinking more about these results, I altered the clustered index for the destination table and disabled the allow_row_locks and allow_page_locks. I re-ran the insert, and it worked! It's still not clear to me why it didn't jump to just locking the table without those settings disabled, but that at least did the trick for the immediate need of getting the table populated.
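
    For the record, a sketch of that change (the index name is a placeholder; substitute whatever the clustered index on the destination table is actually called):

    -- Disallow row and page locks on the clustered index so the insert has to
    -- take table-level locks instead.
    ALTER INDEX [PK_ReceivingTable] ON dbo.ReceivingTable
        SET (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF);

    -- ...run the insert / ETL...

    -- Then put the settings back afterwards.
    ALTER INDEX [PK_ReceivingTable] ON dbo.ReceivingTable
        SET (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);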

  • Just make sure you change them back immediately after the insert, or you may have some nasty blocking problems.

    As for service pack, that server really needs to be upgraded. There have been a lot of fixes, bugs, performance problems, security patches, etc, in the 4 service packs that 2008 R2 has. Schedule some downtime, get that DB instance up to date.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/17/2015)


    Just make sure you change them back immediately after the insert, or you may have some nasty blocking problems.

    As for service pack, that server really needs to be upgraded. There have been a lot of fixes, bugs, performance problems, security patches, etc, in the 4 service packs that 2008 R2 has. Schedule some downtime, get that DB instance up to date.

    Will do. The destination table is a staging table for an ETL process. With those disabled, the ETL processed successfully (for the first time in several days!). Hooray!

    We pushed them up to Service Pack 2 last night, but the error continues to occur. I'll work with their team to get SP 3 and 4 installed.

    I was able to test this on another client site with a much beefier underlying server. The ballooning number of row locks happened again in that environment, but it didn't throw the error. I'm thinking the error on Client #1 may just be a memory limitation (64 bytes per lock on their 32-bit system * 10 million locks = roughly 640 MB of the available 8 GB), and the ballooning number of row locks may just be how SQL Server handles inserts of data pulled from a linked server with allow_row_locks enabled on the destination table. Just for curiosity's sake, I'm going to try to recreate that "huge number of row locks upon insert" scenario with as few variables as possible to verify that's just how it works under some specific set of conditions.

  • jotate (7/17/2015)


    Just for curiosity's sake, I'm going to try to recreate that "huge number of row locks upon insert" scenario with as few variables as possible to verify that's just how it works under some specific set of conditions.

    Interesting stuff coming out of this testing. I ran this to set it all up:

    create database [JTTest]
    go
    alter database [JTTest] set recovery simple
    go
    alter database [JTTest] set single_user with rollback immediate
    go
    use [JTTest]
    go
    create table SourceTable (
        ID int identity primary key not null
        , RecordType varchar(50) not null
        , RecordFlag bit not null
        , RecordDescription varchar(255) null
        , OtherInteger int not null
        , Created datetime not null
        , CreatedBy varchar(50) not null
        , LastModified datetime not null
        , LastModifiedBy varchar(50) not null
    )
    go

    Then I ran this to begin inserting 2 million records into that table:

    declare @FillString varchar(50) = 'XXXXXXXXXX'
    set @FillString = @FillString + @FillString + @FillString + @FillString + @FillString

    insert into SourceTable (
        RecordType
        , RecordFlag
        , RecordDescription
        , OtherInteger
        , Created
        , CreatedBy
        , LastModified
        , LastModifiedBy
    )
    select top 2000000
        @FillString as RecordType
        , 1 as RecordFlag
        , @FillString + @FillString + @FillString + @FillString + @FillString as RecordDescription
        , o1.object_id as OtherInteger
        , getdate() as Created
        , @FillString as CreatedBy
        , getdate() as LastModified
        , @FillString as LastModifiedBy
    from sys.objects o1, sys.objects o2, sys.objects o3, sys.objects o4
    go

    While that was running, I ran this query against master for that instance and monitored the total number of locks for that inserting spid:

    SELECT request_session_id, COUNT(*) num_locks
    FROM sys.dm_tran_locks
    GROUP BY request_session_id
    ORDER BY COUNT(*) DESC

    For that initial insert of data, the total number of locks stayed very low (between 5 and 8).

    Then I ran the insert again to add another 2 million rows and bring our total up to 4 million. Again, I monitored the number of locks for that spid. It immediately began growing rapidly. It grew to a little more than 100,000 before the insert query finally finished. I ran the insert again to bring the total up to 6 million, and the number of locks again grew rapidly to over 100,000 before the insert finished.

    Next, I dropped and recreated the SourceTable. During my insert to that fresh table, the number of locks again stayed very low. The next several inserts after the initial 2 million rows saw the number of locks grow to 100,000+.

    Next, I truncated the SourceTable and ran another insert. During this insert into the empty table, the number of locks grew 100,000+.

    So, it looks like this doesn't have anything to do with linked servers. An insert into a freshly created clustered index appears to take a table lock, whereas all subsequent inserts (even those against a freshly truncated table) demand potentially millions of row or page locks.

    All of that testing was against a SQL 2008 R2 (SP1) instance. I just ran the same test against a 2012 (SP1) instance and got some surprising results. The locks appeared to grow to 5000 before collapsing back down to below 10. That seems to indicate that the 2012 instance grabbed a table lock once it surpassed 5000 row/page locks.

    Interesting stuff. Somewhere between 2008 R2 SP2 and 2012 SP1, this behavior seems to have changed (unless I'm missing some key setting in the model database that's driving the difference in behavior between my two test instances). I'm going to find a test box with 2008 R2 SP4 and see what happens.
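
    If anyone wants to see the escalation directly rather than inferring it from the lock counts, a minimal Extended Events sketch that should work on the 2012 instance (the session name is arbitrary; as far as I know the lock_escalation event isn't available on 2008 R2):

    -- Capture lock escalation attempts into a ring buffer while the test runs.
    CREATE EVENT SESSION [WatchLockEscalation] ON SERVER
        ADD EVENT sqlserver.lock_escalation
        ADD TARGET package0.ring_buffer;
    GO
    ALTER EVENT SESSION [WatchLockEscalation] ON SERVER STATE = START;
    GO

    -- ...run the test insert, then read what was captured...
    SELECT CAST(t.target_data AS xml) AS escalation_events
    FROM sys.dm_xe_sessions s
    INNER JOIN sys.dm_xe_session_targets t ON t.event_session_address = s.address
    WHERE s.name = 'WatchLockEscalation';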

  • jotate (7/17/2015)


    I'm going to find a test box with 2008 R2 SP4 and see what happens.

    It looks like the latest SP for 2008 R2 is SP3 (unless I'm missing something: https://support.microsoft.com/en-us/kb/2527041). I installed that locally and ran the test again. The initial insert after table creation performed a table lock. The subsequent inserts performed around 100,000 page locks before query completion. It looks like this may just be the way that it worked in 2008 R2, and the "grab a table lock after 5000 page locks" was implemented in 2012 for this particular scenario.

    I think that puts to bed my concern with the issue. We'll just have to install more memory to that server or force the staging tables in question to disallow page locks when we're performing those inserts. Long term, I'll get our development team to explicitly call for a table lock in the code.

    Thanks for all your help!

  • jotate (7/17/2015)


    It looks like this may just be the way that it worked in 2008 R2, and the "grab a table lock after 5000 page locks" was implemented in 2012 for this particular scenario.

    No, lock escalation has been in the product for a very long time, since before I started, and that was with SQL 2000. Why it's not escalating in this case is a mystery. I don't have a working 2008 R2 instance any longer or I'd poke a bit and see what's up.

    As a long-term fix I'd recommend the tablock hint and converting the query to use OPENQUERY, not because of effects on locking, but to ensure that the tables are joined remotely and the results then sent across. Linked servers tend to be more than a little finicky w.r.t. performance.

    And you're right about the SP. 2008 has an SP4, 2008 R2 has SP3.

    Edit: You're not the only person who's run into this: http://sqlblog.com/blogs/adam_machanic/archive/2009/10/30/sql-server-2008-lock-escalation-inserts-and-a-potential-bug.aspx. At this point, my money's on 'bug in SQL 2008'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
