November 13, 2010 at 5:58 am
Hi all
This is based on this article: http://msdn.microsoft.com/en-us/library/dd425070(SQL.100).aspx
General information
Trace Flag 610 is on, the database is in SIMPLE recovery, and the server is SQL Server 2008 SP1.
What I am doing
I have a table with a clustered index, and it is empty. I run my first batch of inserts into the table, minimal logging works, and the data looks good. I run my second batch of inserts and minimal logging does not seem to work. Just so we are clear, the clustered index we are using is very simple for this test: four values A, B, C, D, 10 million rows each, and we insert the data in clustered-key order: 10M 'A', then 10M 'B', and so on.
Here is a sample of the insert; any thoughts would be helpful.
Thanks
Scott
CREATE TABLE OutPutTable
(
IDRow int NULL
,ColInt int NULL
,ExpRow Char(1) NULL
,ColVarchar varchar(20) NULL
,Colchar char(2) NULL
,ColCSV varchar(80) NULL
,ColMoney money NULL
,ColNumeric numeric(16,4) NULL
,ColDate datetime NULL
,AutoId int IDENTITY(1,1) NOT NULL
)
CREATE CLUSTERED INDEX Clust_IDX ON OutPutTable (ExpRow) WITH (FILLFACTOR = 100)
GO
DBCC TRACEON (610)
GO
--First Batch
INSERT INTO OutPutTable WITH (TABLOCKX)
(
IDRow
,ColInt
,ExpRow
,ColVarchar
,Colchar
,ColCSV
,ColMoney
,ColNumeric
,ColDate
)
SELECT
IDRow
,ColInt
,ExpRow
,ColVarchar
,Colchar
,ColCSV
,ColMoney
,ColNumeric
,ColDate
FROM
SAMPLEDATA
WHERE
ExpRow = 'A'
GO
DBCC TRACEOFF (610)
GO
DBCC TRACEON (610)
GO
--Second Batch
INSERT INTO OutPutTable WITH (TABLOCKX)
(
IDRow
,ColInt
,ExpRow
,ColVarchar
,Colchar
,ColCSV
,ColMoney
,ColNumeric
,ColDate
)
SELECT
IDRow
,ColInt
,ExpRow
,ColVarchar
,Colchar
,ColCSV
,ColMoney
,ColNumeric
,ColDate
FROM
SAMPLEDATA
WHERE
ExpRow = 'B'
GO
DBCC TRACEOFF (610)
GO
November 13, 2010 at 6:46 am
If my understanding of Sunil's blog is correct, you don't need to add any TABLOCK hint when using Trace Flag 610.
You might even be able to load the data in parallel with minimal logging.
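For what it's worth, here is a minimal sketch of that idea, reusing Scott's table names (the fuller demo later in the thread shows the same thing with log-record evidence):
-- Sketch only: with TF 610 on, an ordered INSERT...SELECT into a
-- non-empty clustered table can be minimally logged without TABLOCK
DBCC TRACEON (610);
GO
INSERT OutPutTable (ExpRow)  -- note: no TABLOCK hint
SELECT ExpRow
FROM SAMPLEDATA
WHERE ExpRow = 'B'
ORDER BY ExpRow;             -- clustered-key order
GO
DBCC TRACEOFF (610);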
November 14, 2010 at 5:46 pm
There are a couple of ways in which you might not get minimal logging in your example where ExpRow = 'B'. Are you able to run that query (with Actual Execution Plan ON), save the graphical execution plan (right-click on it), and upload it, please?
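If saving the graphical plan is awkward, a sketch of an alternative that captures the actual plan as XML (runnable text you can paste into a reply):
-- Capture the actual execution plan as XML instead of the graphical plan
SET STATISTICS XML ON;
GO
-- ...run the ExpRow = 'B' insert here...
GO
SET STATISTICS XML OFF;
GO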
November 14, 2010 at 5:49 pm
In the meantime, I knocked together a demo based on your example, which shows TF610 working as advertised:
-- REQUIRES 2008
-- USE A DATABASE IN SIMPLE OR BULK_LOGGED MODE
-- A suitable database on my machine is called 'Sandpit'
USE Sandpit;
GO
-- This is the sample data for bulk-loading
CREATE TABLE dbo.SampleData
(
exp_row CHAR(1) NOT NULL
);
GO
-- Load 9,160 test rows
-- 3,000 each in exp_row 'A', 'B', 'C'
-- 160 rows in exp_row 'D'
WITH Numbers (n)
AS (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1
FROM master.sys.all_columns AC1,
master.sys.all_columns AC2,
master.sys.all_columns AC3
)
INSERT dbo.SampleData
(exp_row)
SELECT TOP (3000 * 3 + 160)
exp_row = CHAR(65 + (n / 3000))
FROM Numbers;
GO
-- Show the sample data summary
SELECT SD.exp_row,
row_count = COUNT_BIG(*)
FROM dbo.SampleData SD
GROUP BY
SD.exp_row
ORDER BY
SD.exp_row;
-- Create the target of the bulk load
CREATE TABLE dbo.ToLoad
(
row_id INTEGER IDENTITY NOT NULL,
exp_row CHAR(1) NOT NULL,
padding CHAR(1000) NOT NULL DEFAULT SPACE(1000)
);
GO
-- The clustered index on the target table
CREATE CLUSTERED INDEX c ON dbo.ToLoad (exp_row);
GO
-- Truncate the log
CHECKPOINT;
GO
-- Show that the only log entries are the start and end of the checkpoint
SELECT L.Operation,
L.[Checkpoint Begin],
L.[Checkpoint End]
FROM sys.fn_dblog(NULL, NULL) L
GO
-- Ensure TF610 is off
DBCC TRACEOFF (610);
GO
-- ===============
-- === TEST 1 ====
-- ===============
-- Load exp_row A
-- Minimally-logged load into empty clustered table
-- Only TABLOCK required
INSERT dbo.ToLoad WITH (TABLOCK)
(exp_row)
SELECT SD.exp_row
FROM dbo.SampleData SD
WHERE SD.exp_row = 'A'
ORDER BY
SD.exp_row
OPTION (RECOMPILE);
GO
-- Show largest log records relating to the target table
-- (All far too small to be fully logged)
SELECT TOP (10)
test = 'A',
L.Operation,
L.Context,
L.[Log Record Fixed Length],
L.[Log Record Length]
FROM sys.fn_dblog(NULL, NULL) L
WHERE L.AllocUnitName = N'dbo.ToLoad.c'
AND L.Context <> N'LCX_INDEX_INTERIOR'
ORDER BY
L.[Log Record Length] DESC;
GO
-- ===============
-- === TEST 2 ====
-- ===============
-- Load exp_row B
-- Clear the log again
CHECKPOINT;
GO
-- Enable TF610
DBCC TRACEON (610);
GO
-- TABLOCK not required
INSERT dbo.ToLoad
(exp_row)
SELECT SD.exp_row
FROM dbo.SampleData SD
WHERE SD.exp_row = 'B'
ORDER BY
SD.exp_row
OPTION (RECOMPILE);
GO
-- Log records show fully-logged inserts
-- until a new page is allocated then
-- minimal logging for the remaining rows
SELECT TOP (10)
test = 'B',
L.Operation,
L.Context,
L.[Log Record Fixed Length],
L.[Log Record Length]
FROM sys.fn_dblog(NULL, NULL) L
WHERE L.AllocUnitName = N'dbo.ToLoad.c'
AND L.Context <> N'LCX_INDEX_INTERIOR'
ORDER BY
L.[Log Record Length] DESC;
GO
-- Clear the log again
CHECKPOINT;
GO
-- ===============
-- === TEST 3 ====
-- ===============
-- Load exp_row C
-- Disable TF610
DBCC TRACEOFF (610);
GO
-- Minimal logging not possible now
-- (even with TABLOCK)
INSERT dbo.ToLoad WITH (TABLOCK)
(exp_row)
SELECT SD.exp_row
FROM dbo.SampleData SD
WHERE SD.exp_row = 'C'
ORDER BY
SD.exp_row
OPTION (RECOMPILE);
GO
-- Log records show all inserts were fully logged
-- even after a new page/extent is allocated
SELECT TOP (10)
test = 'C',
L.Operation,
L.Context,
L.[Log Record Fixed Length],
L.[Log Record Length]
FROM sys.fn_dblog(NULL, NULL) L
WHERE L.AllocUnitName = N'dbo.ToLoad.c'
AND L.Context <> N'LCX_INDEX_INTERIOR'
ORDER BY
L.[Log Record Length] DESC;
GO
-- Clear the log again
CHECKPOINT;
GO
-- ===============
-- === TEST 4 ====
-- ===============
-- Load exp_row D (small number of rows)
-- Enable TF610
DBCC TRACEON (610);
GO
INSERT dbo.ToLoad
(exp_row)
SELECT SD.exp_row
FROM dbo.SampleData SD
WHERE SD.exp_row = 'D'
ORDER BY
SD.exp_row
OPTION (RECOMPILE);
GO
-- Log records show inserts were fully logged
SELECT TOP (10)
test = 'D',
L.Operation,
L.Context,
L.[Log Record Fixed Length],
L.[Log Record Length]
FROM sys.fn_dblog(NULL, NULL) L
WHERE L.AllocUnitName = N'dbo.ToLoad.c'
AND L.Context <> N'LCX_INDEX_INTERIOR'
ORDER BY
L.[Log Record Length] DESC;
GO
DBCC TRACEOFF (610);
-- Clean up
DROP TABLE
dbo.SampleData,
dbo.ToLoad;
GO
edit: error in comment block
November 14, 2010 at 7:27 pm
I haven't looked at the article yet, but BOL states that minimal logging will work on a table with a clustered index only if the table starts out empty. A second load on the table will use minimal logging only if you truncate the table first... trace flag or no trace flag, TABLOCKX or no TABLOCKX... period. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
November 14, 2010 at 7:47 pm
Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts. Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table. I always put them into a new or newly truncated staging table first so I can validate the data, etc, before it goes anywhere near the final table. Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 14, 2010 at 7:50 pm
Jeff Moden (11/14/2010)
I haven't looked at the article yet but BOL states that minimal logging will only work on a table with a clustered index if and only if the table starts out empty. A second load on the table will use minimal logging only if you truncate the table first... trace flag or no trace flag, TABLOCKX or no TABLOCKX... period. 🙂
You really do need to read that article. It's a major change in 2008, and a huge performance win in many cases.
November 14, 2010 at 7:53 pm
Jeff Moden (11/14/2010)
Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts.
Yes, it does. I spent quite some time on the script I posted to make that point.
Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table.
You *never* use INSERT...SELECT with several hundred rows or more?
Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.
I'm interested to know why you need full logging when bulk loading the final table.
November 15, 2010 at 5:48 am
Thanks for the time. I will pull the plan for you and test your script in my environment.
November 15, 2010 at 6:16 am
shump66 (11/15/2010)
Thanks for the time. I will pull the plan for you and test your script in my environment.
I'll look forward to it.
November 15, 2010 at 6:30 am
Jeff Moden (11/14/2010)
Heh... OK... Now I get to pry the shoe out of my mouth. :blush: It looks like TF 610 does, in fact, help with converting fully logged inserts to minimally logged inserts. Oddly enough, it still doesn't matter for anything I do because I never do bulk inserts directly to the final table. I always put them into a new or newly truncated staging table first so I can validate the data, etc, before it goes anywhere near the final table. Once I start moving data from the staging table to the final table, I'm going to want it all to be logged, anyway.
We work with very large data sets. At times it is necessary to re-create base tables to match a client's new extract; we have tables in house that start at 100M rows and go past 10B rows. If I have a table, say 50M rows and 10 columns, with a clustered index, I can use INSERT...SELECT into a new table with the same clustered index and it takes about 2-4 minutes, with the log file growing to about 1 GB. The issue I have is that I need to break a 1.8B row table up into batches, as we are changing the clustered index, and I would like to INSERT...SELECT in the correct order to take advantage of minimal logging with TF 610.
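A sketch of the shape of one such batch, with Trace Flag 610 on; all object and column names here (dbo.OldTable, dbo.NewTable, NewKey, Payload) are hypothetical placeholders:
-- One batch of the ordered reload described above (hypothetical names)
DECLARE @BatchStart int = 1, @BatchEnd int = 50000000;
DBCC TRACEON (610);
INSERT dbo.NewTable (NewKey, Payload)  -- target already has the new clustered index
SELECT S.NewKey, S.Payload
FROM dbo.OldTable S
WHERE S.NewKey >= @BatchStart
AND S.NewKey < @BatchEnd
ORDER BY S.NewKey                      -- insert in new clustered-key order
OPTION (RECOMPILE);
DBCC TRACEOFF (610);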
Scott
November 15, 2010 at 11:57 am
Hi Paul
Here is some of the info you asked for. As you can see from the log size, it is acting as if it is fully logged.
As before
Thanks for your help
Scott
****Before any data insert (DB just contains source data)
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
----------------- ------------ ------------ ---------------- -----------
SPHTest 4288 4284 4 1
****After First Run With Type A
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
----------------- ------------ ------------ ---------------- -----------
SPHTest 5488 5452 36 426
****After Second Run With Type B
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
---------------- ------------ ------------ ---------------- -----------
SPHTest 6688 6626 62 5551
test Operation Context Log Record Fixed Length Log Record Length
---- ------------------------------- ------------------------------- ----------------------- -----------------
B LOP_INSERT_ROWS LCX_CLUSTERED 62 4216
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
B LOP_INSERT_ROWS LCX_CLUSTERED 62 204
--Here is the code as is
CHECKPOINT;
GO
DBCC TRACEON (610)
GO
INSERT INTO SPHTest.dbo.OutPutTable WITH (TABLOCKX)
(
IDRowNum
,SomeInt
,ExpRow
,SomeVarchar
,Somechar
,SomeCSV
,SomeMoney
,SomeNumeric
,SomeDate
,SomeHex12
)
SELECT
IDRowNum
,SomeInt
,ExpRow
,SomeVarchar
,Somechar
,SomeCSV
,SomeMoney
,SomeNumeric
,SomeDate
,SomeHex12
FROM
SPHTest.dbo.SAMPLEDATA
WHERE
ExpRow = 'B'
ORDER BY
ExpRow
OPTION (RECOMPILE);
DBCC TRACEOFF (610)
GO
November 15, 2010 at 1:36 pm
Here are the results of running your script on my server; I have some logging going on.
I changed the script to insert a few more rows.
But the log file is over 8 GB after the fact? Is this correct?
Before Run
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
----------------- ------------ ------------ ---------------- -----------
SPHTest 54 1 53 1
After I created the sample data and shrank the log
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
---------------- ------------ ------------ ---------------- -----------
SPHTest 117 117 0 1
After I ran type 'A'; note the size of the log file: minimal logging
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
--------------- ------------ ------------ ---------------- -----------
SPHTest 3517 3442 75 151
After I ran type 'B'; note the size of the log file: this cannot be minimal logging
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
-------------- ------------ ------------ ---------------- -----------
SPHTest 6917 6795 122 8976
After a full run of your script
DataBaseName TotalSize_MB SpaceUsed_MB FreeSpaceinDB_MB LogSize_MB
----------------- ------------ ------------ ---------------- -----------
SPHTest 12054 11982 72 8976
--Server Version
Microsoft SQL Server 2008 (SP1) - 10.0.2723.0 (X64)
Jul 9 2009 23:46:07
Copyright (c) 1988-2008 Microsoft Corporation
Developer Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2)
--Database Setup
SET COMPATIBILITY_LEVEL = 100
SET ANSI_NULL_DEFAULT ON
SET ANSI_NULLS OFF
SET ANSI_PADDING OFF
SET ANSI_WARNINGS OFF
SET ARITHABORT OFF
SET AUTO_CLOSE OFF
SET AUTO_CREATE_STATISTICS ON
SET AUTO_SHRINK OFF
SET AUTO_UPDATE_STATISTICS ON
SET CURSOR_CLOSE_ON_COMMIT OFF
SET CURSOR_DEFAULT GLOBAL
SET CONCAT_NULL_YIELDS_NULL OFF
SET NUMERIC_ROUNDABORT OFF
SET QUOTED_IDENTIFIER OFF
SET RECURSIVE_TRIGGERS OFF
SET DISABLE_BROKER
SET AUTO_UPDATE_STATISTICS_ASYNC OFF
SET DATE_CORRELATION_OPTIMIZATION OFF
SET TRUSTWORTHY OFF
SET ALLOW_SNAPSHOT_ISOLATION OFF
SET PARAMETERIZATION SIMPLE
SET READ_COMMITTED_SNAPSHOT OFF
SET HONOR_BROKER_PRIORITY OFF
SET READ_WRITE
SET RECOVERY SIMPLE
SET MULTI_USER
SET PAGE_VERIFY TORN_PAGE_DETECTION
SET DB_CHAINING OFF
(10600000 row(s) affected)
exp_row row_count
------- --------------------
A 3000000
B 3000000
C 3000000
D 1600000
(4 row(s) affected)
Operation Checkpoint Begin Checkpoint End
------------------------------- ------------------------ ------------------------
LOP_BEGIN_CKPT 2010/11/15 14:38:52:930 NULL
LOP_END_CKPT NULL 2010/11/15 14:38:59:530
(2 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(3000000 row(s) affected)
test Operation Context Log Record Fixed Length Log Record Length
---- ------------------------------- ------------------------------- ----------------------- -----------------
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
A LOP_MODIFY_ROW LCX_PFS 62 92
(10 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(3000000 row(s) affected)
test Operation Context Log Record Fixed Length Log Record Length
---- ------------------------------- ------------------------------- ----------------------- -----------------
B LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
B LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
B LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
B LOP_INSERT_ROWS LCX_CLUSTERED 62 1100
B LOP_MODIFY_ROW LCX_PFS 62 92
B LOP_MODIFY_ROW LCX_PFS 62 92
B LOP_MODIFY_ROW LCX_PFS 62 92
B LOP_MODIFY_ROW LCX_PFS 62 92
B LOP_MODIFY_ROW LCX_PFS 62 92
B LOP_MODIFY_ROW LCX_PFS 62 92
(10 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(3000000 row(s) affected)
test Operation Context Log Record Fixed Length Log Record Length
---- ------------------------------- ------------------------------- ----------------------- -----------------
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
C LOP_INSERT_ROWS LCX_CLUSTERED 62 1104
(10 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(1600000 row(s) affected)
test Operation Context Log Record Fixed Length Log Record Length
---- ------------------------------- ------------------------------- ----------------------- -----------------
D LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
D LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
D LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
D LOP_INSERT_ROWS LCX_CLUSTERED 62 1108
D LOP_INSERT_ROWS LCX_CLUSTERED 62 1100
D LOP_MODIFY_ROW LCX_PFS 62 92
D LOP_MODIFY_ROW LCX_PFS 62 92
D LOP_MODIFY_ROW LCX_PFS 62 92
D LOP_MODIFY_ROW LCX_PFS 62 92
D LOP_MODIFY_ROW LCX_PFS 62 92
(10 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
November 15, 2010 at 11:41 pm
Scott,
The good news is that you are seeing minimal logging, but read on.
Minimal logging means that SQL Server does not log the individual row changes; it logs changes to pages instead. The crucial point is that SQL Server performs many fewer logging operations when using minimal logging. Minimal logging does not necessarily mean that less log space will be used overall (it might use more).
When changing one row at a time, SQL Server has to build the log record for that change (including any compensating actions needed to undo it), write the log record, wait for the storage system to report that the log record is safely stored, and then make the data change. With minimal logging, SQL Server can make a whole page of changes at once.
The primary advantage of minimal logging then, is that the number of logged operations is roughly proportional to the number of pages written, rather than the number of rows. It is often more efficient to log a smaller number of (potentially slightly larger) records, than log a larger number of perhaps slightly smaller records.
So, under minimal logging, operations that change a large number of rows will complete faster than under full logging. Minimal logging therefore optimizes for speed - though it might end up using more log space, under some conditions. For a small number of rows, minimal logging might actually be slower - that is the reason that test D in my script does not use minimal logging.
Inserting into an empty clustered table allows more minimal-logging optimizations than when the same table already contains records. This is the reason that test A generates many fewer log operations, compared to test B. For an empty clustered table, the structure of the clustered index can be created with the pages. When data already exists, we can't just create the clustered structure as we go, we have to link it in to the existing structure.
I modified my script to show the differences between full and minimally logged behaviour:
Test A (empty clustered table)
Minimal Logging: 35 log records; 23,800 bytes used; 43,876 bytes reserved
Full Logging: 464 log records; 3,220,716 bytes used; 52,158 bytes reserved
Test B (non-empty clustered table)
Minimal Logging: 74 log records; 321,064 bytes used; 11,521,977 bytes reserved
Full Logging: 3069 log records; 3,689,584 bytes used; 970,684 bytes reserved
As you can see, minimal logging generates many fewer records, which use less log space. Notice though that more log space is reserved under minimal logging. This log reservation is used to guarantee there will be enough log space if the server needs to roll the changes back. As Sunil mentions on his blog, the algorithm that reserves space in the log is deliberately very conservative, and reserves space as if we were using full logging.
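Numbers like those above can be produced by aggregating the sys.fn_dblog output; here is a sketch of that kind of summary query (it assumes the dbo.ToLoad demo table from my earlier script):
-- Summarize log records, space used, and space reserved for the target index
SELECT log_records = COUNT_BIG(*),
       bytes_used = SUM(CONVERT(bigint, L.[Log Record Length])),
       bytes_reserved = SUM(CONVERT(bigint, L.[Log Reserve]))
FROM sys.fn_dblog(NULL, NULL) L
WHERE L.AllocUnitName = N'dbo.ToLoad.c';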
Paul