Running out of Identity values

  • sql-lover (10/3/2012)


    bt,

That's a very nice piece of code, thank you so much. However, I wonder how many locks (if any) it will create. Like I said, the production server has serious IO issues and a very bad drive layout. I'm actually in the process of migrating to a new environment I designed.

    But I may try and test it on my ITG box. Also, how do I know when it's complete? The live table gets incremented sporadically. I guess the script will outpace the insert rate.

    Still, I wonder how I can make BCP work via XML.

    You can add a WITH (NOLOCK) to your SELECT if it makes you feel better, but since it's just reading records that aren't being updated, that NOLOCK will have virtually zero impact; it won't hurt, either. As it is (sans NOLOCK), it will just put a non-exclusive lock on the pages as it reads them, to verify that no other process holds an exclusive lock on those same pages (which would mean an update is in the works, so it would wait until that update is done). The non-exclusive lock is released as soon as the pages are read.

    You'll know it's done when the script completes. You could add a PRINT statement at the end as well if you want more clarity on that. Also, with this process, you can kill it at any time (just be sure to manually execute the rollback statement if you do so) with very little impact, since it only deals with N records at a time. If your box is getting bogged down, simply kill it and roll back. The script will then pick up wherever it left off when you're ready to start it up again.
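    For readers following the discussion, here is a minimal sketch of the restartable chunk-copy pattern being described; the table and column names are hypothetical stand-ins, not bt's actual script:

    set identity_insert dbo.NewTable on;

    declare @i bigint;
    declare @rc int;
    declare @batch int = 5000;

    while 1 = 1
    begin
        -- restart point: the highest ID already copied (0 if nothing copied yet)
        select @i = coalesce((select top (1) ID from dbo.NewTable order by ID desc), 0);

        begin transaction;

        insert into dbo.NewTable (ID, Col2, Col3)
        select top (@batch) ID, Col2, Col3
        from dbo.OldTable with (nolock)
        where ID > @i
        order by ID;

        set @rc = @@rowcount;  -- capture before the commit resets it

        commit transaction;

        if @rc = 0 break;  -- nothing left to copy

        waitfor delay '00:00:00.500';  -- let other work breathe between batches
    end;

    set identity_insert dbo.NewTable off;
    print 'Copy complete.';

    Because each pass re-derives its starting point from the target table, killing the script (after rolling back the open transaction) and re-running it resumes where it left off.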

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help

  • I wanna die ... :crying:

    I just discovered there is no CI on that table, just an NCI. So bt ... your script is doing an ugly table scan (I checked the execution plan in my test environment, with real data).

    I guess I will have to incorporate that into the new one, after moving the records.
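    As a side note, a quick way to confirm a table is a heap is to check sys.indexes; a heap shows up with index_id 0 and type_desc 'HEAP' (the table name here is a hypothetical stand-in):

    select i.index_id, i.type_desc, i.name
    from sys.indexes as i
    where i.object_id = object_id('dbo.OldTable');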

  • Wow, they built a billion record heap? Ouch. Sorry man.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help

  • ScottPletcher (10/3/2012)

    ...

    Why would the relative position of the column make the slightest bit of difference at all?

    It may affect page splits and re-writes.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • bteraberry (10/3/2012)


    Wow, they built a billion record heap? Ouch. Sorry man.

    I've seen a few similar tables. I am in the process of fixing those issues.

    Anyway ... I tested your script. But it looks like it does not loop automatically, am I right? I mean, it processes whatever is in the batch and then you need to invoke it again for the next batch?

    I was confused by the WAITFOR. I guess that's there to let SQL Server "breathe" and log the changes ... and to allow people to use the table in between batches ... right?

  • A WaitFor in such a process is commonly there to give the server a chance to finish other threads/processes in between loops.

    However, on a process like this, if you do batches of 5k against 1B rows, just the half-second WaitFors add up to 27 hours (1B / 5k = 200,000 batches; at 0.5 seconds each, that's 100,000 seconds, about 27.8 hours), not including the time to actually process the rows. So you may need to manage that.

    And definitely make it loop till it's done.

    An easy way to do that is:

    select 1 as Col into #T;
    while @@rowcount > 0
    begin
        ... whatever you're looping ...
    end;

    That way, as long as the process has rows to run, it will keep going.

    If you have statements after the INSERT...SELECT, you'll need to either move them before it in the loop (if possible), or assign the @@rowcount value to a variable and loop on that instead. Otherwise, @@rowcount can easily return 0 on every pass.
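    A minimal sketch of that variable-capture variant, with hypothetical names (IDENTITY_INSERT handling omitted for brevity):

    declare @rc int = 1;       -- must start > 0 (not null) so the loop runs once
    declare @lastID bigint = 0;

    while @rc > 0
    begin
        insert into dbo.NewTable (ID, Col2)
        select top (5000) ID, Col2
        from dbo.OldTable
        where ID > @lastID
        order by ID;

        set @rc = @@rowcount;  -- capture immediately, before anything else runs

        if @rc > 0
            select @lastID = max(ID) from dbo.NewTable;
    end;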

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • while 1 = 1 loops just fine. It will break out of the loop when applicable because of the break statement.

    My bad on @@ROWCOUNT ... that value should be assigned to a variable before the commit, then checked against the block size after the commit. I got lazy in the code I wrote for you and forgot that after the commit, @@ROWCOUNT will be 0, which then causes the loop to break every time.

    A batch size of 5000 records and a wait of half a second are neither right nor wrong; they weren't offered as such. They're examples. If someone is really concerned about strain on an already-struggling system, then a lighter-weight but longer-running load is preferable to a shorter but more intense one. If it needs to be done in the next three weeks and the primary goal is to reduce impact, then 27 hours is irrelevant. Our servers are not struggling, so normally I would wait only five one-hundredths of a second in each loop. The right record count for each batch will also vary tremendously based on the size of each record.
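    For reference, those two pauses in WAITFOR DELAY syntax would be:

    waitfor delay '00:00:00.500';  -- half a second between batches
    waitfor delay '00:00:00.050';  -- five one-hundredths of a second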

    But in this case, it's all irrelevant if he's working with a heap and cannot get a TOP (N) without a scan. That being the case, any chunking method would be a problem.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help

  • GSquared (10/3/2012)


    A WaitFor in such a process is commonly there to give the server a chance to finish other threads/processes in between loops.

    However, on a process like this, if you do batches of 5k against 1B rows, just the half-second WaitFors add up to 27 hours (1B / 5k = 200,000 batches; at 0.5 seconds each, that's 100,000 seconds, about 27.8 hours), not including the time to actually process the rows. So you may need to manage that.

    And definitely make it loop till it's done.

    An easy way to do that is:

    select 1 as Col into #T;
    while @@rowcount > 0
    begin
        ... whatever you're looping ...
    end;

    That way, as long as the process has rows to run, it will keep going.

    If you have statements after the INSERT...SELECT, you'll need to either move them before it in the loop (if possible), or assign the @@rowcount value to a variable and loop on that instead. Otherwise, @@rowcount can easily return 0 on every pass.

    Part of the goal in a successful chunking strategy is to prevent one humongous transaction that's going to destroy the system. The BEGIN TRAN and COMMIT TRAN need to be contained inside the loop. The most straightforward way I've seen to do this is simply to use "while 1 = 1" and then work out inside the loop when it's time to break out. It's really no different, though, from assigning @@ROWCOUNT to a variable immediately after the insert and using 'while @var > 0'; but if you do that, there is an extra step to make sure the initial value of @var is > 0 so the loop runs the first time. If it's left null, the loop won't execute.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help

  • So.... no interest in trying the 65 millisecond method I proposed to see if it'll work? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eugene Elutin (10/3/2012)


    ScottPletcher (10/3/2012)

    ...

    Why would the relative position of the column make the slightest bit of difference at all?

    It may affect page splits and re-writes.

    ?? How is that??

    Fixed-length columns come first, then variable-length columns. Increasing the length increases the length, regardless of where in the row the column is, right??
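    For context, the length increase under discussion (widening the int identity column to bigint) would look something like the line below, on a hypothetical table; any primary key, indexes, or constraints on the column would have to be dropped and recreated around it:

    alter table dbo.OldTable alter column ID bigint not null;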

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Jeff Moden (10/3/2012)


    So.... no interest in trying the 65 millisecond method I proposed to see if it'll work? 😉

    Took a less elegant approach 🙂 ... and I may use bt's script for another table (same problem) so I can try it live.

    Today, I copied the data into a new table, with the new data type. I will export the new table, which is a heap, via bcp, move that file to the live server, create the new table there, and then import via bcp.

    On the testing server, though, I still need to test the logic to break the FK from the source table, recreate it on the new table, and add the rows not caught by today's bcp via a SELECT ... NOT IN or something similar.

    The advantage of copying the data into a new table, in a different environment, is that I minimize the production impact.

    If I can copy 90% or more outside the production environment, I will be a happy man. And I believe I accomplished that today. So after testing the FK and index-dropping logic, it will just be a matter of adding a few missing records, which of course I will do during the maintenance window, and which hopefully will take only a few minutes.
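    As a hedged sketch of those steps, with hypothetical server, database, and file names: first the native-format export and reload (-n native format, -T trusted connection, -S server, -b batch size):

    bcp MyDB.dbo.NewTable out D:\export\NewTable.dat -n -T -S TestServer
    bcp MyDB.dbo.NewTable in D:\export\NewTable.dat -n -T -S LiveServer -b 100000

    Then the catch-up insert for rows that arrived after the export; NOT EXISTS is swapped in here for the NOT IN mentioned above, since it behaves better around NULLs:

    set identity_insert dbo.NewTable on;

    insert into dbo.NewTable (ID, Col2, Col3)
    select o.ID, o.Col2, o.Col3
    from dbo.OldTable as o
    where not exists (select 1 from dbo.NewTable as n where n.ID = o.ID);

    set identity_insert dbo.NewTable off;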

  • sql-lover (10/3/2012)


    Jeff Moden (10/3/2012)


    So.... no interest in trying the 65 millisecond method I proposed to see if it'll work? 😉

    Took a less elegant approach 🙂 ... and I may use bt's script for another table (same problem) so I can try it live.

    Today, I copied the data into a new table, with the new data type. I will export the new table, which is a heap, via bcp, move that file to the live server, create the new table there, and then import via bcp.

    On the testing server, though, I still need to test the logic to break the FK from the source table, recreate it on the new table, and add the rows not caught by today's bcp via a SELECT ... NOT IN or something similar.

    The advantage of copying the data into a new table, in a different environment, is that I minimize the production impact.

    If I can copy 90% or more outside the production environment, I will be a happy man. And I believe I accomplished that today. So after testing the FK and index-dropping logic, it will just be a matter of adding a few missing records, which of course I will do during the maintenance window, and which hopefully will take only a few minutes.

    Thanks for the feedback. Guess I'll try the method I suggested on some test tables. If it works, there wouldn't be any downtime per se.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • bteraberry (10/3/2012)


    while 1 = 1 loops just fine. It will break out of the loop when applicable because of the break statement.

    My bad on @@ROWCOUNT ... that value should be assigned to a variable before the commit, then checked against the block size after the commit. I got lazy in the code I wrote for you and forgot that after the commit, @@ROWCOUNT will be 0, which then causes the loop to break every time.

    A batch size of 5000 records and a wait of half a second are neither right nor wrong; they weren't offered as such. They're examples. If someone is really concerned about strain on an already-struggling system, then a lighter-weight but longer-running load is preferable to a shorter but more intense one. If it needs to be done in the next three weeks and the primary goal is to reduce impact, then 27 hours is irrelevant. Our servers are not struggling, so normally I would wait only five one-hundredths of a second in each loop. The right record count for each batch will also vary tremendously based on the size of each record.

    But in this case, it's all irrelevant if he's working with a heap and cannot get a TOP (N) without a scan. That being the case, any chunking method would be a problem.

    bt,

    I was almost ready to run your script, but found that the min identity values do not match. In other words, the script is not grabbing, or starting the copy from, the min value. I am still checking why.
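    A quick, hedged way to check that kind of mismatch (hypothetical names) is to compare the ID ranges on both sides:

    select min(ID) as MinID, max(ID) as MaxID from dbo.OldTable with (nolock);
    select min(ID) as MinID, max(ID) as MaxID from dbo.NewTable;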

    After a lot of testing, I will go with this logic ...

    DECLARE @i INT;
    DECLARE @chunk INT;
    DECLARE @maxNewTable INT;

    SET @i = (SELECT MIN(ID) FROM OldTable WITH (NOLOCK));
    SET @chunk = 5;
    SET @maxNewTable = (SELECT MAX(ID) FROM NewTable);

    IF @maxNewTable IS NULL  -- first run: start from the old table's min ID
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;
            SET IDENTITY_INSERT dbo.NewTable ON;

            INSERT INTO dbo.NewTable (ID, Col2, Col3, Col4, Col5)
            SELECT ID, Col2, Col3, Col4, Col5
            FROM dbo.OldTable WITH (NOLOCK)
            WHERE ID BETWEEN @i AND @i + @chunk;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            PRINT ERROR_MESSAGE();
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        END CATCH
    END
    ELSE  -- subsequent runs: resume past the highest ID already copied
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;
            SET IDENTITY_INSERT dbo.NewTable ON;

            INSERT INTO dbo.NewTable (ID, Col2, Col3, Col4, Col5)
            SELECT ID, Col2, Col3, Col4, Col5
            FROM dbo.OldTable WITH (NOLOCK)
            WHERE ID > @maxNewTable AND ID < @maxNewTable + @chunk;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            PRINT ERROR_MESSAGE();
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        END CATCH
    END

    SET IDENTITY_INSERT dbo.NewTable OFF;

    Certainly not as clean as bt's, but that one is not grabbing the lower ID values properly. I believe it is something related to this line:

    select @i = coalesce((select top (1) PrimaryKeyID from dbo.YourNewTable order by PrimaryKeyID desc), 0);

    But I really do not have time for more testing.

    I am still testing the FK-dropping part. I do need to drop the FK from the old table and recreate it on the new table, after all rows are copied, during the short downtime window.

    I also need to rename the tables, after re-creating the FKs on the new one.
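    A hedged sketch of that cutover, with hypothetical constraint, table, and column names:

    -- during the maintenance window, after the last rows are copied:
    ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_OldTable;

    EXEC sp_rename 'dbo.OldTable', 'OldTable_Retired';
    EXEC sp_rename 'dbo.NewTable', 'OldTable';

    ALTER TABLE dbo.ChildTable
        ADD CONSTRAINT FK_ChildTable_OldTable
        FOREIGN KEY (ParentID) REFERENCES dbo.OldTable (ID);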
