Date Conversion Fills the Transaction log

  • I have 2 sql server 2005 instances running on 2 different boxes, one is 32 bit Win Server 2003, the other is a 64 bit.

    now, I load a table with 10 million rows, one column of which is a datetime which needs converting to a datetime,103 type

    --Add a New Column

    USE DBSUPPORT

    ALTER TABLE extract ADD NewLastLoginDate datetime NULL

    GO

    --Convert the existing column

    USE DBSUPPORT

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    GO

    this works fine on the 64 bit machine, but fills the transaction log on the 32 bit, regardless of how big i make the T-Log, leaving the NewLastLoginDate Column full of NULL's

    Msg 9002, Level 17, State 4, Server XXXXXXXXX, Line 3

    The transaction log for database 'DBSupport' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

    ....also it worked fine on the 32 bit until I rebooted it about a month ago, although this could be a red herring.

    Any Ideas ???

  • Do it in batches of a few hundred thousand rows at a time. If you do that as a single statement, the log must be big enough to contain the before and after values (plus metadata) for the entire 10 million rows.

    What's the recovery model?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The recovery model is SIMPLE

    would you batch it with a WHERE...

    USE DBSUPPORT

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103) WHERE LastLoginDate < 2010

    GO

    USE DBSUPPORT

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103) WHERE LastLoginDate > 2010

    GO

    for instance???

  • You can, it's a little more complex that way as you need to know the distribution of the data. I'd typically do it like this. (not tested!)

    DECLARE @Done Bit;

    Set @Done = 1;

    While (@Done = 0)

    Begin

    UPDATE TOP (100000) extract -- Tweak if necessary. Tradeoff between time and log usage.

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    WHERE NewLastLoginDate IS NULL;

    CHECKPOINT;

    If (@@RowCount = 0) -- Last update found no null values to update

    SET @Done = 1;

    End

    Just one question. Why are you storing the string representation of the date? Conversions and formatting should typically be done on display, not stored in the table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/17/2010)


    You can, it's a little more complex that way as you need to know the distribution of the data. I'd typically do it like this. (not tested!)

    DECLARE @Done Bit;

    Set @Done = 1;

    While (@Done = 0)

    Begin

    UPDATE TOP (100000) extract -- Tweak if necessary. Tradeoff between time and log usage.

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    WHERE NewLastLoginDate IS NULL;

    CHECKPOINT;

    If (@@RowCount = 0) -- Last update found no null values to update

    SET @Done = 1;

    End

    Just one question. Why are you storing the string representation of the date? Conversions and formatting should typically be done on display, not stored in the table.

    Nice, However, NewLastLoginDate is all NULL's and lastlogindate CAN and DOES Contain Nulls so this would run forever I guess, Do you think I need to consider using CURSOR?

  • No, you absolutely do not want to consider a cursor. You want to do this in reasonable size batches, not one row at a time (which will take forever). If there are possibly null values for LastLoginDate, all you need to do is add another predicate to the where clause of the update.

    DECLARE @Done Bit;

    Set @Done = 1;

    While (@Done = 0)

    Begin

    UPDATE TOP (100000) extract -- Tweak if necessary. Tradeoff between time and log usage.

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    WHERE NewLastLoginDate IS NULL AND LastLoginDate IS NOT NULL;

    CHECKPOINT;

    If (@@RowCount = 0) -- Last update found no null values to update

    SET @Done = 1;

    End

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • R0gu3Tr4d3r (8/17/2010)


    I have 2 sql server 2005 instances running on 2 different boxes, one is 32 bit Win Server 2003, the other is a 64 bit.

    now, I load a table with 10 million rows, one column of which is a datetime which needs converting to a datetime,103 type

    --Add a New Column

    USE DBSUPPORT

    ALTER TABLE extract ADD NewLastLoginDate datetime NULL

    GO

    --Convert the existing column

    USE DBSUPPORT

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    GO

    this works fine on the 64 bit machine, but fills the transaction log on the 32 bit, regardless of how big i make the T-Log, leaving the NewLastLoginDate Column full of NULL's

    Msg 9002, Level 17, State 4, Server XXXXXXXXX, Line 3

    The transaction log for database 'DBSupport' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

    ....also it worked fine on the 32 bit until I rebooted it about a month ago, although this could be a red herring.

    Any Ideas ???

    If you think that's badk, just wait until you try to use that VARCHAR date column you're building for something other than a place holder. You never did answer the question... what makes you think you need a VARCHAR date column?

    The other thing is... why are you updating two tables? Do the work on the 64 bit machine and transfer it to the 32 bit machine.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ideally Id love to store the date time as it is, and get the MI guys to convert the data in their queries...but I am where I am.

    In the solution above the loop only executes twice,and only converts the top 100000 rows, first time through it converts the 1st 100000 rows, then the second time through reads down the first 100000 rows again, but does no conversions as they have already been done. The WHERE clause doesnt apply to the 'top 100000'

  • R0gu3Tr4d3r (8/18/2010)


    The WHERE clause doesnt apply to the 'top 100000'

    The Where applies to the table, filters out the rows where the conversion has not been done, that results in a set of records. The TOP then takes a sample of that. Second time around the where filters out the rows that have just been updated resulting in only the un-updated rows. Top then takes a subset of that.

    I did make a couple of mistake in the code, one of which was putting the checkpoint before the check of the rowcount. Since Checkpoint returns no rows, @@rowcount straight after is 0. This is corrected and tested code.

    DECLARE @Done Bit;

    Set @Done = 0;

    While (@Done = 0)

    Begin

    UPDATE TOP (100000) extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    WHERE NewLastLoginDate IS NULL AND LastLoginDate IS NOT NULL;

    If (@@RowCount = 0) -- Last update found no null values to update

    SET @Done = 1;

    CHECKPOINT;

    End

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok, tested and works perfectly, thanks for all the help.

    The wierd thing is that it works fine as a standalone T-SQL command in Management Studio as

    ALTER TABLE extract ADD NewLastLoginDate datetime NULL

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    and only fails when running from a script !!

    Anyways, all working now so thanks once again.

  • R0gu3Tr4d3r (8/17/2010)


    I have 2 sql server 2005 instances running on 2 different boxes, one is 32 bit Win Server 2003, the other is a 64 bit.

    now, I load a table with 10 million rows, one column of which is a datetime which needs converting to a datetime,103 type

    --Add a New Column

    USE DBSUPPORT

    ALTER TABLE extract ADD NewLastLoginDate datetime NULL

    GO

    --Convert the existing column

    USE DBSUPPORT

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    GO

    Have you considered adding a computed column instead ?

    Like this:

    ALTER TABLE extract ADD NewLastLoginDate (convert(datetime, lastlogindate, 103))

    Adding a computed column is instant regardless of the table size.

  • R0gu3Tr4d3r (8/18/2010)


    Ok, tested and works perfectly, thanks for all the help.

    The wierd thing is that it works fine as a standalone T-SQL command in Management Studio as

    ALTER TABLE extract ADD NewLastLoginDate datetime NULL

    UPDATE extract

    SET NewLastLoginDate = convert(datetime, lastlogindate, 103)

    and only fails when running from a script !!

    Anyways, all working now so thanks once again.

    Consider Stefan_G's good suggestion (previous post above) of using a Computed Column. Since the convert forumula results in something deterministic, it can also be indexed.

    As a correction, the correct calculation for a format 103 column is CONVERT(CHAR(10),LastLoginDate,103). The first operand in that should NOT be "DATETIME".

    Using a computed column also overcomes my itch (and a lot of other people's itches) about storing an actual formatted date in a database (which is considered by most to be a "worst practice"). Think of a computed column as a single column view. Still, this type of formatting really shouldn't be done in the DB and the folks in Ml should know better.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh yeah. Almost forgot. Since the formula is "deterministic", a lot of folks may be tempted to make the computed column PERSISTED. That will take just as long as adding a column to the table because it will need to materialize a column. It won't help all that much for speed and still won't be updateable, so I wouldn't persist the column in this case.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/21/2010)


    Using a computed column also overcomes my itch (and a lot of other people's itches) about storing an actual formatted date in a database (which is considered by most to be a "worst practice").

    He's not.

    The new column is a datetime column. From the comment about the MI guys, I assume that the original column we're converting from is the formatted varchar. Using the format code in the convert to datetime tells SQL what format the value it's converting is in.

    SELECT convert (datetime, '05/12/2010')

    SELECT convert (datetime, '05/12/2010', 103)

    The first defaults to converting from the mm/dd/year format and gives 2010-05-12

    The second uses the 103 format and gives 2010-12-05

    Of course, if that assumption was wrong and the original column is also datetime, this has been a colossal waste of time.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/21/2010)


    He's not.

    Heh... crud... I missed that. Thanks, Gail.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply