Problem when trying to move 65 million records from table A to table B

  • Hi all,

    I have a problem with a table. It's a logging table where one column, for some reason, is an NTEXT field, and it never contains more than 40 characters.

    The table def.

    CREATE TABLE [dbo].[WebService_Security_Log](
        [WSLog_GUID] [uniqueidentifier] NOT NULL,
        [WSLog_Account] [nvarchar](50) NULL,
        [WSLog_Username] [nvarchar](50) NULL,
        [WSLog_Password] [nvarchar](50) NULL,
        [WSLog_Method] [nvarchar](100) NULL,
        [WSLog_RelayMethod] [nvarchar](100) NULL,
        [WSLog_LoginResult] [tinyint] NULL,
        [WSLog_MethodReturnCode] [int] NULL,
        [WSLog_Begin] [datetime] NULL,
        [WSLog_End] [datetime] NULL,
        [WSLog_Details] [ntext] NULL
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    The table now contains 65 million records and takes up about 25 GB of data. I really want this table moved to another database (and to change the type of WSLog_Details).

    But for some reason, when I try to copy the data, the transaction runs extremely slowly, at about 10,000 records per 90 seconds. The odd thing is that I've got a table with the same kind of data but only 6 million records, and there's no problem there: it can copy all its records in about 90 seconds. If I omit the NTEXT field when copying from the large table, there's no problem either, except that I would like to keep the data in that field.

    I believe I've tried about everything short of asking here.

    Attempt 1:

    INSERT INTO archive..WebService_Security_Log
    SELECT * FROM prod..WebService_Security_Log

    This code runs at about 10,000 records per 90 seconds.

    I've tried to set the hint WITH (NOLOCK) on the source. This does not have any effect.

    I've tried to set the recovery model to BULK_LOGGED before starting the operation. This does not have any effect.

    I've tried to run the copy via DTS and SSIS - but neither improves the speed.

    Attempt 2:

    INSERT INTO archive..WebService_Security_Log
    SELECT
        s.[WSLog_GUID],
        s.[WSLog_Account],
        s.[WSLog_Username],
        s.[WSLog_Password],
        s.[WSLog_Method],
        s.[WSLog_RelayMethod],
        s.[WSLog_LoginResult],
        s.[WSLog_MethodReturnCode],
        s.[WSLog_Begin],
        s.[WSLog_End],
        NULL
    FROM prod..WebService_Security_Log s

    This runs at blazing speed, but of course I don't get the contents of the Details field.

    I've also tried adding a new NVARCHAR column to the table and updating it with the data from the Details field, but that never completes on the 65 million row table; no problem on the smaller table.
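
    For reference, that attempt looked roughly like this (the new column name is illustrative):

    ALTER TABLE WebService_Security_Log
    ADD WSLog_Details_New nvarchar(50) NULL;

    -- NTEXT has to be converted explicitly before it fits in an NVARCHAR column:
    UPDATE WebService_Security_Log
    SET WSLog_Details_New = CONVERT(nvarchar(50), WSLog_Details);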

    I have most likely forgotten some of the things I've tried, but I'm starting to think there's something wrong with the data in the table. None of the data in either table is in use, so none of it is cached. And to me the 6 million record table seems large enough that any general problem with copying all the data would show up there too; it's not just a small sample table.

    Does anyone have any idea what the problem could be, or where I should be looking to find out what's slowing the copy down?

    Thanks,

    Kresten


    I don't have a solid answer for you, but I do know that your NTEXT data is stored off-row and SQL Server uses a pointer to find it. Those lookups are often random I/O on the filesystem and slow things down quite a bit. You may look at the DMVs or sysprocesses to check the wait types on that query.
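
    For example, something along these lines while the copy is running (the session id 53 is a placeholder for the spid doing the insert):

    -- Current wait type for the copying session (SQL 2005+ DMV):
    SELECT r.session_id, r.status, r.wait_type, r.wait_time, r.last_wait_type
    FROM sys.dm_exec_requests AS r
    WHERE r.session_id = 53;

    -- The same information from sysprocesses on older versions:
    SELECT spid, status, lastwaittype, waittime
    FROM master..sysprocesses
    WHERE spid = 53;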

    Also try simple recovery if possible, with (NOLOCK) on the source table.

    Bulk logged won't have any effect on the statement you wrote.
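
    For context, a minimal sketch of the variants that can actually be minimally logged, assuming the destination is a new or empty heap:

    -- SELECT ... INTO is minimally logged under SIMPLE or BULK_LOGGED
    -- recovery and creates the destination table itself:
    SELECT *
    INTO archive.dbo.WebService_Security_Log
    FROM prod.dbo.WebService_Security_Log;

    -- On SQL Server 2008 and later, INSERT ... SELECT into an empty heap
    -- can also be minimally logged, but only with a table-level lock:
    INSERT INTO archive.dbo.WebService_Security_Log WITH (TABLOCK)
    SELECT *
    FROM prod.dbo.WebService_Security_Log;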

    If you can't go to simple, you might just try WRITETEXT.

    There's also the possibility that the db is growing and space allocations are chewing you up.

    dunno.

    Make sure you have instant file initialization on, just in case, and set your autogrowth setting to grow by a reasonable amount for ~26 GB of growth (512 MB vs. 1 KB).
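
    Instant file initialization itself is a Windows privilege (Perform Volume Maintenance Tasks) granted to the service account rather than a T-SQL setting, but the growth increment can be set directly; a sketch with placeholder database and file names:

    -- Pre-size the archive data file for the expected ~26 GB and grow in
    -- 512 MB steps instead of tiny increments:
    ALTER DATABASE archive
    MODIFY FILE (NAME = archive_data, SIZE = 30GB, FILEGROWTH = 512MB);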

    Anyway, good luck!

    Try breaking up the move into chunks: first move rows 1-1,000,000, then 1,000,001 to 2,000,000, then 2,000,001 to 3,000,000, etc. See the sketch below.

    Also, you might disable or remove any and all indexes until after the move and then bring them back online (even though that'll probably take a long time).
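
    Since the table has no identity column to range over, here is one way the chunking might be sketched, using a date window on WSLog_Begin (the window bounds are placeholders):

    -- Move roughly a month at a time; widen or narrow the window so each
    -- batch lands near the size you want.
    DECLARE @from datetime, @to datetime;
    SET @from = '20080101';
    SET @to   = '20080201';

    INSERT INTO archive..WebService_Security_Log
    SELECT *
    FROM prod..WebService_Security_Log WITH (NOLOCK)
    WHERE WSLog_Begin >= @from
      AND WSLog_Begin < @to;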

  • Hi,

    Thanks for the replies. I'm currently trying this:

    (I've added a column to keep track of the sets I'm copying.)

    SET NOCOUNT ON;
    GO
    ALTER DATABASE GBPROD SET RECOVERY SIMPLE
    GO
    -- Flag the next batch of 500,000 not-yet-moved rows
    -- (the new [state] column defaults to NULL)
    SET ROWCOUNT 500000
    UPDATE GBPROD..Archive_WebService_Security_Log
    SET [state] = 0
    WHERE [state] IS NULL
    GO
    -- Copy the flagged batch to the destination
    INSERT INTO tempstorage..Archive_WebService_Security_Log
    SELECT
        s.[WSLog_GUID],
        s.[WSLog_Account],
        s.[WSLog_Username],
        s.[WSLog_Password],
        s.[WSLog_Method],
        s.[WSLog_RelayMethod],
        s.[WSLog_LoginResult],
        s.[WSLog_MethodReturnCode],
        s.[WSLog_Begin],
        s.[WSLog_End],
        s.[WSLog_Details]
    FROM GBPROD..Archive_WebService_Security_Log s WITH (NOLOCK)
    WHERE [State] = 0
    GO
    -- Mark the batch as done
    SET ROWCOUNT 0
    UPDATE GBPROD..Archive_WebService_Security_Log
    SET [State] = 1
    WHERE [State] = 0
    GO
    ALTER DATABASE GBPROD SET RECOVERY FULL
    GO

    This executes in 6:45, and since this is a move-once-and-then-forget operation, I can move a set of 500,000 records whenever I have some time and the system is not too busy. Then, when I'm done, I'll just drop the table.

    The first few times I was fiddling with this I was using DTS, and I usually stopped the job after 50,000 records. Oddly, on restarting, the first 50,000 records were copied in a second, but once the job got past the data that was already cached, the speed dropped again.

    There are no indexes at all - so no worries there.

    We have also tried creating multiple data files in the destination database with a growth of 1024 MB, just to be sure. And of course this did not have any effect whatsoever.

    So my best guess is that something prevents the data from being read fast enough. For the fun of it I tried to copy another table with 110 million+ records and 10 GB of data; that took about 10 minutes (or maybe even less), and all my other tests show the same speed, as long as there are no NTEXT fields.

    I have one question about WRITETEXT. I've never used it, but would it require me to copy all the data except the NTEXT field to the destination table, and then afterwards use WRITETEXT to copy that field from the source to the destination table? And what makes WRITETEXT cool?

    Thanks,

    Kresten


    Build an SSIS package to do it. It will almost certainly run faster. INSERT/SELECT is a slow operation; you want to get it to do a bulk/fast insert.

    The package would be little more than an Execute SQL task to truncate the new table (especially useful while building the process) and a data flow task. You will also have two connections. Simple.

    CEWII

  • Hi,

    I've tried using SSIS already; my normal go-to DBA guru was extremely surprised that it did not help at all. OK, maybe it would take 30 seconds off the 6:45, but no more than that.

    Everything that touches that table just seems to more or less fall asleep...

    And just as I'm writing this, I tried to move 6 million records from the small table to the new destination. That took 2:20, whereas moving 500,000 from the big table to the SAME destination took 6:45 :S

    Give me some hardware to kick hard 🙂

    Thanks for all inputs!

    /Kresten


    Then that leads me to question what it is about that table. In the query that pulls the data, are you doing an ORDER BY, or is it just SELECT * FROM dbo.table?

    Any computed columns?

    I would still do it in SSIS..

    CEWII

    I have one question about WRITETEXT. I've never used it, but would it require me to copy all the data except the NTEXT field to the destination table, and then afterwards use WRITETEXT to copy that field from the source to the destination table? And what makes WRITETEXT cool?

    It essentially minimally logs BLOB types, which saves writing to the log. Instead of logging the actual changes to the data pages in case of rollback, it just acknowledges that a page was changed. The problem is you can't roll back if you encounter a problem; probably not a big deal in this case.

    Even in simple recovery, database writes are fully logged, and once the transaction commits, the log is truncated.

    By removing the logging of the NTEXT changes, you reduce the amount of log writes, which can speed you up.

    WRITETEXT could be used in conjunction with TEXTPTR to update the columns with their corresponding NTEXT values /after/ you write the in-row data to the table, which you said was much faster. My guess is that's because (1) it is in-row data and (2) it doesn't have to log as much.
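
    A minimal sketch of that two-pass approach, assuming the in-row columns were copied first and WSLog_GUID uniquely identifies a row:

    DECLARE @id uniqueidentifier, @val nvarchar(4000), @ptr varbinary(16);

    DECLARE blob_cur CURSOR FAST_FORWARD FOR
        SELECT WSLog_GUID, CONVERT(nvarchar(4000), WSLog_Details)
        FROM prod.dbo.WebService_Security_Log
        WHERE WSLog_Details IS NOT NULL;

    OPEN blob_cur;
    FETCH NEXT FROM blob_cur INTO @id, @val;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- A text pointer only exists once the column is non-NULL,
        -- so initialise it before asking for the pointer:
        UPDATE archive.dbo.WebService_Security_Log
        SET WSLog_Details = ''
        WHERE WSLog_GUID = @id;

        SELECT @ptr = TEXTPTR(WSLog_Details)
        FROM archive.dbo.WebService_Security_Log
        WHERE WSLog_GUID = @id;

        -- WRITETEXT overwrites the value in place and is minimally
        -- logged under SIMPLE or BULK_LOGGED recovery:
        WRITETEXT archive.dbo.WebService_Security_Log.WSLog_Details @ptr @val;

        FETCH NEXT FROM blob_cur INTO @id, @val;
    END
    CLOSE blob_cur;
    DEALLOCATE blob_cur;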

    Cheers!

    Well, I'm not sure what that table's problem is, but it's strange.

    I've been sitting here hitting F5 to run the transaction with 500,000 records, and it was running at about 6-7 minutes per round. While I was getting mentally ready to run this 120 times, all of a sudden the execution time dropped to 1:30 for 500,000, and my last execution was 4:30 for 2,000,000 records. Still far from the speed of the smaller table, but at least it will make it possible to have the table moved while I'm still 30 🙂

    Could this be due to very messed-up data at the beginning of the table? It's a 7-year-old database with 300+ tables and 130 GB of data in 300 million+ records. We have not done much cleaning on the database until now, so maybe the data got scattered all over the place at the beginning of the table?

    I surely will be using SSIS in the future, but for now I'll stick with my script and the F5 button 🙂

    /Kresten

