Data Loading Performance Issues

  • Hi all

    This is based on this article: http://msdn.microsoft.com/en-us/library/dd425070(SQL.100).aspx

    General information

    Trace Flag 610 is on, the database is in Simple recovery, and the server is SQL Server 2008 SP1.

    What I am doing

    I have a table with a clustered index, and it is empty. I run my first batch of inserts into the table; minimal logging works and the data looks good. I run my second batch of inserts, and minimal logging does not seem to work. Just so we are clear, the clustered index we are using is very simple for this test: four values A, B, C, D with 10 million rows each, and we insert the data in the order of the clustered key, 10M 'A' then 10M 'B' and so on.

    Here is a sample of the insert; any thoughts would be helpful.

    Thanks

    Scott

    CREATE TABLE OutPutTable
    (
        IDRow int NULL
        ,ColInt int NULL
        ,ExpRow char(1) NULL
        ,ColVarchar varchar(20) NULL
        ,Colchar char(2) NULL
        ,ColCSV varchar(80) NULL
        ,ColMoney money NULL
        ,ColNumeric numeric(16,4) NULL
        ,ColDate datetime NULL
        ,AutoId int IDENTITY(1,1) NOT NULL
    );

    CREATE CLUSTERED INDEX Clust_IDX ON OutPutTable (ExpRow) WITH (FILLFACTOR = 100);
    GO

    DBCC TRACEON (610);
    GO

    --First Batch
    INSERT INTO OutPutTable WITH (TABLOCKX)
    (
        IDRow
        ,ColInt
        ,ExpRow
        ,ColVarchar
        ,Colchar
        ,ColCSV
        ,ColMoney
        ,ColNumeric
        ,ColDate
    )
    SELECT
        IDRow
        ,ColInt
        ,ExpRow
        ,ColVarchar
        ,Colchar
        ,ColCSV
        ,ColMoney
        ,ColNumeric
        ,ColDate
    FROM SAMPLEDATA
    WHERE ExpRow = 'A';
    GO

    DBCC TRACEOFF (610);
    GO

    DBCC TRACEON (610);
    GO

    --Second Batch
    INSERT INTO OutPutTable WITH (TABLOCKX)
    (
        IDRow
        ,ColInt
        ,ExpRow
        ,ColVarchar
        ,Colchar
        ,ColCSV
        ,ColMoney
        ,ColNumeric
        ,ColDate
    )
    SELECT
        IDRow
        ,ColInt
        ,ExpRow
        ,ColVarchar
        ,Colchar
        ,ColCSV
        ,ColMoney
        ,ColNumeric
        ,ColDate
    FROM SAMPLEDATA
    WHERE ExpRow = 'B';
    GO

    DBCC TRACEOFF (610);
    GO

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • If my understanding of Sunil's blog is correct, you don't need to add any TABLOCK hint when using Trace Flag 610.

    You might even be able to load the data in parallel with minimal logging.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Hi

    I am not saying that Sunil's blog is incorrect, but in real-world testing it is not working with TF610 on, either with TABLOCKX or without TABLOCKX.

    Scott

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • There are a couple of ways in which you might not get minimal logging in your example where ExpRow = 'B'. Are you able to run that query (with Actual Execution Plan ON), save the graphical execution plan (right-click on it), and upload it, please?
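    In case it's useful, one way to capture the actual plan from T-SQL rather than from SSMS is SET STATISTICS XML, which returns the plan as XML after execution; the XML result can be saved as a .sqlplan file (a sketch; the SELECT below is just a trivial stand-in for the real INSERT...SELECT):

    -- Sketch: capture the actual execution plan as XML
    SET STATISTICS XML ON;
    GO
    -- Run the statement under test here; trivial stand-in below
    SELECT TOP (10) name FROM sys.objects;
    GO
    SET STATISTICS XML OFF;
    GO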

  • In the meantime, I knocked together a demo based on your example, which shows TF610 working as advertised:

    -- REQUIRES 2008
    -- USE A DATABASE IN SIMPLE OR BULK_LOGGED MODE
    -- A suitable database on my machine is called 'Sandpit'
    USE Sandpit;
    GO

    -- This is the sample data for bulk-loading
    CREATE TABLE dbo.SampleData
    (
        exp_row CHAR(1) NOT NULL
    );
    GO

    -- Load 9,160 test rows
    -- 3,000 each in exp_row 'A', 'B', 'C'
    -- 160 rows in exp_row 'D'
    WITH Numbers (n)
    AS (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1
        FROM master.sys.all_columns AC1,
             master.sys.all_columns AC2,
             master.sys.all_columns AC3
    )
    INSERT dbo.SampleData
        (exp_row)
    SELECT TOP (3000 * 3 + 160)
        exp_row = CHAR(65 + (n / 3000))
    FROM Numbers;
    GO

    -- Show the sample data summary
    SELECT SD.exp_row,
           row_count = COUNT_BIG(*)
    FROM dbo.SampleData SD
    GROUP BY SD.exp_row
    ORDER BY SD.exp_row;

    -- Create the target of the bulk load
    CREATE TABLE dbo.ToLoad
    (
        row_id INTEGER IDENTITY NOT NULL,
        exp_row CHAR(1) NOT NULL,
        padding CHAR(1000) NOT NULL DEFAULT SPACE(1000)
    );
    GO

    -- The clustered index on the target table
    CREATE CLUSTERED INDEX c ON dbo.ToLoad (exp_row);
    GO

    -- Truncate the log
    CHECKPOINT;
    GO

    -- Show that the only log entries are the start and end of the checkpoint
    SELECT L.Operation,
           L.[Checkpoint Begin],
           L.[Checkpoint End]
    FROM sys.fn_dblog(NULL, NULL) L;
    GO

    -- Ensure TF610 is off
    DBCC TRACEOFF (610);
    GO

    -- ===============
    -- === TEST 1 ====
    -- ===============
    -- Load exp_row A
    -- Minimally-logged load into empty clustered table
    -- Only TABLOCK required
    INSERT dbo.ToLoad WITH (TABLOCK)
        (exp_row)
    SELECT SD.exp_row
    FROM dbo.SampleData SD
    WHERE SD.exp_row = 'A'
    ORDER BY SD.exp_row
    OPTION (RECOMPILE);
    GO

    -- Show largest log records relating to the target table
    -- (All far too small to be fully logged)
    SELECT TOP (10)
           test = 'A',
           L.Operation,
           L.Context,
           L.[Log Record Fixed Length],
           L.[Log Record Length]
    FROM sys.fn_dblog(NULL, NULL) L
    WHERE L.AllocUnitName = N'dbo.ToLoad.c'
    AND L.Context <> N'LCX_INDEX_INTERIOR'
    ORDER BY L.[Log Record Length] DESC;
    GO

    -- ===============
    -- === TEST 2 ====
    -- ===============
    -- Load exp_row B
    -- Clear the log again
    CHECKPOINT;
    GO

    -- Enable TF610
    DBCC TRACEON (610);
    GO

    -- TABLOCK not required
    INSERT dbo.ToLoad
        (exp_row)
    SELECT SD.exp_row
    FROM dbo.SampleData SD
    WHERE SD.exp_row = 'B'
    ORDER BY SD.exp_row
    OPTION (RECOMPILE);
    GO

    -- Log records show fully-logged inserts
    -- until a new page is allocated, then
    -- minimal logging for the remaining rows
    SELECT TOP (10)
           test = 'B',
           L.Operation,
           L.Context,
           L.[Log Record Fixed Length],
           L.[Log Record Length]
    FROM sys.fn_dblog(NULL, NULL) L
    WHERE L.AllocUnitName = N'dbo.ToLoad.c'
    AND L.Context <> N'LCX_INDEX_INTERIOR'
    ORDER BY L.[Log Record Length] DESC;
    GO

    -- Clear the log again
    CHECKPOINT;
    GO

    -- ===============
    -- === TEST 3 ====
    -- ===============
    -- Load exp_row C
    -- Disable TF610
    DBCC TRACEOFF (610);
    GO

    -- Minimal logging not possible now
    -- (even with TABLOCK)
    INSERT dbo.ToLoad WITH (TABLOCK)
        (exp_row)
    SELECT SD.exp_row
    FROM dbo.SampleData SD
    WHERE SD.exp_row = 'C'
    ORDER BY SD.exp_row
    OPTION (RECOMPILE);
    GO

    -- Log records show all inserts were fully logged
    -- even after a new page/extent is allocated
    SELECT TOP (10)
           test = 'C',
           L.Operation,
           L.Context,
           L.[Log Record Fixed Length],
           L.[Log Record Length]
    FROM sys.fn_dblog(NULL, NULL) L
    WHERE L.AllocUnitName = N'dbo.ToLoad.c'
    AND L.Context <> N'LCX_INDEX_INTERIOR'
    ORDER BY L.[Log Record Length] DESC;
    GO

    -- Clear the log again
    CHECKPOINT;
    GO

    -- ===============
    -- === TEST 4 ====
    -- ===============
    -- Load exp_row D (small number of rows)
    -- Enable TF610
    DBCC TRACEON (610);
    GO

    INSERT dbo.ToLoad
        (exp_row)
    SELECT SD.exp_row
    FROM dbo.SampleData SD
    WHERE SD.exp_row = 'D'
    ORDER BY SD.exp_row
    OPTION (RECOMPILE);
    GO

    -- Log records show inserts were fully logged
    SELECT TOP (10)
           test = 'D',
           L.Operation,
           L.Context,
           L.[Log Record Fixed Length],
           L.[Log Record Length]
    FROM sys.fn_dblog(NULL, NULL) L
    WHERE L.AllocUnitName = N'dbo.ToLoad.c'
    AND L.Context <> N'LCX_INDEX_INTERIOR'
    ORDER BY L.[Log Record Length] DESC;
    GO

    DBCC TRACEOFF (610);

    -- Clean up
    DROP TABLE dbo.SampleData,
               dbo.ToLoad;
    GO

    edit: error in comment block

  • I haven't looked at the article yet, but BOL states that minimal logging will work on a table with a clustered index only if the table starts out empty. A second load on the table will use minimal logging only if you truncate the table first... trace flag or no trace flag, TABLOCKX or no TABLOCKX... period. 🙂
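    Just to make the pattern concrete, here is a sketch with made-up names (dbo.FinalTable and dbo.StagingTable are hypothetical):

    -- Sketch: the clustered target must start out empty,
    -- so truncate it before each minimally-logged load
    TRUNCATE TABLE dbo.FinalTable;

    INSERT dbo.FinalTable WITH (TABLOCK)  -- empty clustered target
        (some_col)
    SELECT some_col
    FROM dbo.StagingTable;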

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts. Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table. I always put them into a new or newly truncated staging table first so I can validate the data, etc, before it goes anywhere near the final table. Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.
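    For what it's worth, that workflow is roughly the following (a sketch; the object names and file path are made up):

    -- Sketch: bulk load into a freshly truncated staging table,
    -- validate there, then move the clean rows to the final table
    TRUNCATE TABLE dbo.CustomerStage;     -- hypothetical staging table

    BULK INSERT dbo.CustomerStage
    FROM 'C:\Loads\customer.dat'          -- hypothetical source file
    WITH (TABLOCK);

    -- Move only the rows that pass validation
    INSERT dbo.Customer (customer_id, customer_name)
    SELECT customer_id, customer_name
    FROM dbo.CustomerStage
    WHERE customer_id IS NOT NULL;        -- stand-in validation rule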

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/14/2010)


    I haven't looked at the article yet, but BOL states that minimal logging will work on a table with a clustered index only if the table starts out empty. A second load on the table will use minimal logging only if you truncate the table first... trace flag or no trace flag, TABLOCKX or no TABLOCKX... period. 🙂

    You really do need to read that article. It's a major change in 2008, and a huge performance win in many cases.

  • Jeff Moden (11/14/2010)


    Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts.

    Yes, it does. I spent quite some time on the script I posted to make that point.

    Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table.

    You *never* use INSERT...SELECT with several hundred rows or more?

    Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.

    I'm interested to know why you need full logging when bulk loading the final table.

  • Thanks for the time. I will pull the plan for you and test your script in my environment.

    Scott

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • shump66 (11/15/2010)


    Thanks for the time. I will pull the plan for you and test your script in my environment.

    I'll look forward to it.

  • Jeff Moden (11/14/2010)


    Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts. Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table. I always put them into a new or newly truncated staging table first so I can validate the data, etc, before it goes anywhere near the final table. Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.

    We work with very large data sets. At times it is necessary to re-create base tables to match a client's new extract; we have tables in house that start at 100M rows and go to and past 10B rows. If I have a table, say with 50M rows and 10 columns, that has a clustered index, I can use INSERT...SELECT into a new table with the same clustered index and it will take about 2-4 min, and the log file will grow to about 1 GB. The issue I am having is that I need to break a 1.8B-row table up into batches, as we are changing the clustered index, and I would like to INSERT...SELECT in the correct order to take advantage of minimal logging with TF610. Roughly what I have in mind is sketched below (all names are placeholders).
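    -- Sketch of one batch (table, column, and boundary values are placeholders):
    -- insert in the order of the new clustered key, with TF610 on,
    -- so batches into the non-empty target can still be minimally logged
    DBCC TRACEON (610);
    GO
    DECLARE @BatchStart int = 1,
            @BatchEnd   int = 100000001;  -- hypothetical slice of the key range

    INSERT dbo.NewBaseTable WITH (TABLOCK)
        (key_col, payload)
    SELECT key_col, payload
    FROM dbo.OldBaseTable
    WHERE key_col >= @BatchStart
        AND key_col < @BatchEnd
    ORDER BY key_col                      -- matches the new clustered index
    OPTION (RECOMPILE);
    GO
    DBCC TRACEOFF (610);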

    Scott

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • Hi Paul

    Here is some of the info you asked for. As you can see from the log size, it is acting as if it is fully logged.

    As before

    Thanks for your help

    Scott

    ****Before any data insert; just contains source data

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      4288         4284         4                1

    ****After First Run With Type A

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      5488         5452         36               426

    ****After Second Run With Type B

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      6688         6626         62               5551

    test Operation       Context       Log Record Fixed Length Log Record Length
    ---- --------------- ------------- ----------------------- -----------------
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      4216
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      204

    --Here is the code as is
    CHECKPOINT;
    GO

    DBCC TRACEON (610);
    GO

    INSERT INTO SPHTest.dbo.OutPutTable WITH (TABLOCKX)
    (
        IDRowNum
        ,SomeInt
        ,ExpRow
        ,SomeVarchar
        ,Somechar
        ,SomeCSV
        ,SomeMoney
        ,SomeNumeric
        ,SomeDate
        ,SomeHex12
    )
    SELECT
        IDRowNum
        ,SomeInt
        ,ExpRow
        ,SomeVarchar
        ,Somechar
        ,SomeCSV
        ,SomeMoney
        ,SomeNumeric
        ,SomeDate
        ,SomeHex12
    FROM SPHTest.dbo.SAMPLEDATA
    WHERE ExpRow = 'B'
    ORDER BY ExpRow
    OPTION (RECOMPILE);

    DBCC TRACEOFF (610);
    GO

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • Paul White NZ (11/14/2010)


    In the meantime, I knocked together a demo based on your example, which shows TF610 working as advertised:

    Here is the result of your script run on my server; I have some logging going on. I changed the script to insert a few more rows, but the log file is over 8 GB after the fact. Is this correct?

    Before Run

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      54           1            53               1

    After I created the sample data and shrunk the log

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      117          117          0                1

    After I ran type 'A'; note the size of the log file (minimal logging)

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      3517         3442         75               151

    After I ran type 'B'; note the size of the log file (this cannot be minimal logging)

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      6917         6795         122              8976

    After full run of your script

    DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
    ------------ ------------ ------------ ---------------- ----------
    SPHTest      12054        11982        72               8976

    --Server Version
    Microsoft SQL Server 2008 (SP1) - 10.0.2723.0 (X64)
    Jul 9 2009 23:46:07
    Copyright (c) 1988-2008 Microsoft Corporation
    Developer Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2)

    --Database Setup
    SET COMPATIBILITY_LEVEL = 100
    SET ANSI_NULL_DEFAULT ON
    SET ANSI_NULLS OFF
    SET ANSI_PADDING OFF
    SET ANSI_WARNINGS OFF
    SET ARITHABORT OFF
    SET AUTO_CLOSE OFF
    SET AUTO_CREATE_STATISTICS ON
    SET AUTO_SHRINK OFF
    SET AUTO_UPDATE_STATISTICS ON
    SET CURSOR_CLOSE_ON_COMMIT OFF
    SET CURSOR_DEFAULT GLOBAL
    SET CONCAT_NULL_YIELDS_NULL OFF
    SET NUMERIC_ROUNDABORT OFF
    SET QUOTED_IDENTIFIER OFF
    SET RECURSIVE_TRIGGERS OFF
    SET DISABLE_BROKER
    SET AUTO_UPDATE_STATISTICS_ASYNC OFF
    SET DATE_CORRELATION_OPTIMIZATION OFF
    SET TRUSTWORTHY OFF
    SET ALLOW_SNAPSHOT_ISOLATION OFF
    SET PARAMETERIZATION SIMPLE
    SET READ_COMMITTED_SNAPSHOT OFF
    SET HONOR_BROKER_PRIORITY OFF
    SET READ_WRITE
    SET RECOVERY SIMPLE
    SET MULTI_USER
    SET PAGE_VERIFY TORN_PAGE_DETECTION
    SET DB_CHAINING OFF

    (10600000 row(s) affected)

    exp_row row_count
    ------- --------------------
    A       3000000
    B       3000000
    C       3000000
    D       1600000

    (4 row(s) affected)

    Operation      Checkpoint Begin        Checkpoint End
    -------------- ----------------------- -----------------------
    LOP_BEGIN_CKPT 2010/11/15 14:38:52:930 NULL
    LOP_END_CKPT   NULL                    2010/11/15 14:38:59:530

    (2 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (3000000 row(s) affected)

    test Operation       Context       Log Record Fixed Length Log Record Length
    ---- --------------- ------------- ----------------------- -----------------
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92
    A    LOP_MODIFY_ROW  LCX_PFS       62                      92

    (10 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (3000000 row(s) affected)

    test Operation       Context       Log Record Fixed Length Log Record Length
    ---- --------------- ------------- ----------------------- -----------------
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    B    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1100
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92
    B    LOP_MODIFY_ROW  LCX_PFS       62                      92

    (10 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (3000000 row(s) affected)

    test Operation       Context       Log Record Fixed Length Log Record Length
    ---- --------------- ------------- ----------------------- -----------------
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104
    C    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1104

    (10 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (1600000 row(s) affected)

    test Operation       Context       Log Record Fixed Length Log Record Length
    ---- --------------- ------------- ----------------------- -----------------
    D    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    D    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    D    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    D    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1108
    D    LOP_INSERT_ROWS LCX_CLUSTERED 62                      1100
    D    LOP_MODIFY_ROW  LCX_PFS       62                      92
    D    LOP_MODIFY_ROW  LCX_PFS       62                      92
    D    LOP_MODIFY_ROW  LCX_PFS       62                      92
    D    LOP_MODIFY_ROW  LCX_PFS       62                      92
    D    LOP_MODIFY_ROW  LCX_PFS       62                      92

    (10 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Scott
    I am a Senior Data Analyst/DBA
    I work with MS SQL, Oracle & several ETL Tools

    @thewolf66

  • Scott,

    The good news is that you are seeing minimal logging, but read on.

    Minimal logging means that SQL Server does not log the individual row changes; it logs changes to pages instead. The crucial point is that SQL Server performs many fewer logging operations when using minimal logging. Minimal logging does not necessarily mean that less log space will be used overall (it might use more).

    When changing one row at a time, SQL Server has to build the log record for that change (including any compensating actions needed to undo it), write the log record, wait for the storage system to report that the log record is safely stored, and then make the data change. With minimal logging, SQL Server can make a whole page of changes at once.

    The primary advantage of minimal logging, then, is that the number of logged operations is roughly proportional to the number of pages written rather than to the number of rows. It is often more efficient to log a smaller number of (potentially slightly larger) records than to log a larger number of perhaps slightly smaller records.

    So, under minimal logging, operations that change a large number of rows will complete faster than under full logging. Minimal logging therefore optimizes for speed - though it might end up using more log space, under some conditions. For a small number of rows, minimal logging might actually be slower - that is the reason that test D in my script does not use minimal logging.

    Inserting into an empty clustered table allows more minimal-logging optimizations than when the same table already contains records. This is the reason that test A generates many fewer log operations compared to test B. For an empty clustered table, the structure of the clustered index can be created along with the pages. When data already exists, we can't just create the clustered structure as we go; we have to link it into the existing structure.

    I modified my script to show the differences between full and minimally logged behaviour:

    Test A (empty clustered table)

    Minimal Logging: 35 log records; 23,800 bytes used; 43,876 bytes reserved

    Full Logging: 464 log records; 3,220,716 bytes used; 52,158 bytes reserved

    Test B (non-empty clustered table)

    Minimal Logging: 74 log records; 321,064 bytes used; 11,521,977 bytes reserved

    Full Logging: 3069 log records; 3,689,584 bytes used; 970,684 bytes reserved

    As you can see, minimal logging generates many fewer records, which use less log space. Notice though that more log space is reserved under minimal logging. This log reservation is used to guarantee there will be enough log space if the server needs to roll the changes back. As Sunil mentions on his blog, the algorithm that reserves space in the log is deliberately very conservative, and reserves space as if we were using full logging.
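    Numbers like those can be gathered with a simple aggregate over fn_dblog, along these lines (a sketch; [Log Reserve] is the per-record reservation column, and the filter matches the demo table above):

    -- Sketch: summarize log usage for the demo table's clustered index
    SELECT log_records    = COUNT_BIG(*),
           bytes_used     = SUM(L.[Log Record Length]),
           bytes_reserved = SUM(L.[Log Reserve])
    FROM sys.fn_dblog(NULL, NULL) AS L
    WHERE L.AllocUnitName = N'dbo.ToLoad.c';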

    Paul
