October 12, 2017 at 8:16 am
Hi guys,
I wonder why executing a TRUNCATE and an INSERT INTO command in the same batch is much slower than executing these commands separately?
Conditions:
recovery model: simple
table1 & table2 have the same structure & clustered index.
row count: 10,000,000
Data space: 1.8 GB
Using GO results in 50% less execution time. On my system: 34 sec instead of 65 sec.
TRUNCATE TABLE table1
--GO -- without this GO it is slower
INSERT INTO table1 WITH (TABLOCK)
SELECT * FROM table2
Why is this slower?
Is it possible to get the same performance in a stored procedure?
Thanks for your help.
October 12, 2017 at 9:11 am
Not sure, but my guess? The GO sends the batch, and SQL does the truncate, which defers some of the deallocations, then gets the insert and processes that right away. Without the GO, I suspect SQL tries to deallocate all the pages before it starts the insert. Just a guess.
October 12, 2017 at 9:20 am
mrpellepelle - Thursday, October 12, 2017 8:16 AM
Hi guys,
I wonder why executing a TRUNCATE and an INSERT INTO command in the same batch is much slower than executing these commands separately?
Conditions:
recovery model: simple
table1 & table2 have the same structure & clustered index.
row count: 10,000,000
Data space: 1.8 GB
Using GO results in 50% less execution time. On my system: 34 sec instead of 65 sec.
TRUNCATE TABLE table1
--GO -- without this GO it is slower
INSERT INTO table1 WITH (TABLOCK)
SELECT * FROM table2
Why is this slower?
Is it possible to get the same performance in a stored procedure?
Thanks for your help.
Interesting. If you compare total time elapsed (ie, include the time taken for the INSERT to complete), is the GO method still faster?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
October 12, 2017 at 11:10 am
There should be no difference. Is that the whole query being executed? If not, post ALL the code!
😎
October 12, 2017 at 11:27 am
How consistently are you able to generate those results?
October 13, 2017 at 1:46 am
Hi,
I did some more testing and checked the data file I/O behaviour:
Without GO: writes go to the LDF, with heavy writes to the MDF at the end.
With GO: writes go to the MDF, with almost no LDF writes.
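For reference, the per-file writes can be checked with something like this (a sketch using the sys.dm_io_virtual_file_stats DMV; run it before and after the batch and compare the counters):

```sql
-- Cumulative write stats per file of the current database
-- (type_desc distinguishes the data file, ROWS, from the log file, LOG)
SELECT mf.name,
       mf.type_desc,
       vfs.num_of_writes,
       vfs.num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;
```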
@steve-2 Jones
Does this support your assumption?
@ZZartin
I tested this behaviour about 20 times on 2 systems with the same hardware/software. I always see a time difference of about 50%.
@Eirikur Eiriksson
It is all the code. :unsure:
@Phil Parkin
I am not sure I get your point. I executed all the code with and without the GO command and checked the total execution time in SSMS.
October 13, 2017 at 5:43 am
mrpellepelle - Friday, October 13, 2017 1:46 AM
@Phil Parkin
I am not sure I get your point. I executed all the code with and without the GO command and checked the total execution time in SSMS.
Haha, now that I reread it, me neither. And I hadn't even been drinking. I think I had some notion about something executing asynchronously, but it doesn't stand up to close examination 🙂
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
October 13, 2017 at 9:12 am
Not sure. I think so, but not sure. Let me ask someone else that might better understand.
October 13, 2017 at 9:51 am
Just curious whether there's any difference in the execution plans generated with versus without the GO.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 13, 2017 at 10:13 am
I sent a note off to Paul Randal. He says there's no difference, and there shouldn't be much of one at all.
You've got something else going on: log growths, checkpoints, something. Is this db in simple mode?
October 13, 2017 at 12:26 pm
mrpellepelle - Thursday, October 12, 2017 8:16 AM
Hi guys,
I wonder why executing a TRUNCATE and an INSERT INTO command in the same batch is much slower than executing these commands separately?
Conditions:
recovery model: simple
table1 & table2 have the same structure & clustered index.
row count: 10,000,000
Data space: 1.8 GB
Using GO results in 50% less execution time. On my system: 34 sec instead of 65 sec.
TRUNCATE TABLE table1
--GO -- without this GO it is slower
INSERT INTO table1 WITH (TABLOCK)
SELECT * FROM table2
Why is this slower?
Is it possible to get the same performance in a stored procedure?
Thanks for your help.
Possibly flying in the face of what the experts and a rather serious white paper on the subject claim...
In the experiments I've been doing with achieving Minimally Logged INSERTs into an empty table, this seems to be an undocumented "problem" I've found with Minimal Logging, and it's easily fixed. My experiments looked at whether Minimal Logging would actually occur when a variable was in the code and its value wasn't known until run time. I also did experiments on the documented requirement that the INSERTs occur in the same order as the target Clustered Index. I don't believe the ORDER BY problem comes into play here, but I do believe you're having a similar problem with TRUNCATE as I was having with run-time-populated variables: the optimizer doesn't necessarily know that the table is being truncated and so assumes it's not. As with the run-time-populated-variables problem, I believe the fix to the "TRUNCATE in the same batch" problem may be to add OPTION(RECOMPILE) to the INSERT code.
And, no... I've not yet done a deep dive on the EXECUTION PLANs in any case. I only did repeatable tests to demonstrate the workaround of using OPTION(RECOMPILE) in association with the run-time variable problem.
I'll also state that since it's a requirement for "Minimally Logged Inserts to a Clustered Index" that the input be in the same order as the target Clustered Index, you shouldn't take a chance on whether or not you need to include an ORDER BY. I've taken to always including it because, I've found, if it's needed, it will be used; if it's not, it won't. Nothing like a usually free guarantee.
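To be clear, the pattern I'm suggesting looks something like this (a sketch using the table names from the original post; clustered_key_col is a placeholder for whatever the actual Clustered Index key is):

```sql
TRUNCATE TABLE dbo.table1;

-- OPTION(RECOMPILE) makes the optimizer re-evaluate this statement after
-- the TRUNCATE, so it can see the target as empty; the ORDER BY matches
-- the target Clustered Index key (clustered_key_col is a placeholder).
INSERT INTO dbo.table1 WITH (TABLOCK)
SELECT *
FROM dbo.table2
ORDER BY clustered_key_col
OPTION (RECOMPILE);
```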
If you have the time, would you post the DDL, constraints, defaults, and indexes for your table? I'd like to setup a test and the closer I can get to emulating your 10 million row table, the more definitive the test results will be. Thanks.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 16, 2017 at 3:22 am
When I add OPTION(RECOMPILE) to the INSERT statement, there is no time difference anymore.
GREAT, but why?
@jeff Moden
Anonymized, compressed DDL:
CREATE TABLE [dbo].[contract](
[contract_id] [INT] NOT NULL,
id1 [INT] NULL, id2 [INT] NULL, id3 [INT] NULL, id4 [INT] NULL,
id5 [INT] NULL, id6 [INT] NULL, id7 [INT] NULL, id8 [INT] NULL,
id9 [INT] NULL, id10 [SMALLINT] NULL, Id11 [INT] NULL, id12 [INT] NULL, id13 [INT] NULL, id14 [INT] NULL,
id15 [INT] NULL, id16 [INT] NULL, id17 [INT] NULL, id18 [INT] NULL,
id19 [INT] NULL, id20 [INT] NULL, id21 [INT] NULL, id22 [TINYINT] NOT NULL,
id23 [TINYINT] NOT NULL, id24 [TINYINT] NOT NULL, id25 [TINYINT] NOT NULL, id26 [TINYINT] NOT NULL,
id27 [TINYINT] NOT NULL, id28 [TINYINT] NOT NULL, id29 [TINYINT] NOT NULL, id30 [TINYINT] NOT NULL,
id31 [TINYINT] NOT NULL, id32 [INT] NOT NULL,id33 [INT] NULL,
CONSTRAINT [PK_contract] PRIMARY KEY CLUSTERED
(
[contract_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
@sgmunson Execution plan is the same: Clustered Index Scan -> Clustered Index Insert.
@steve-2 Jones As mentioned in my first post, the model is simple.
October 16, 2017 at 6:02 am
mrpellepelle - Monday, October 16, 2017 3:22 AM
When I add OPTION(RECOMPILE) to the INSERT statement, there is no time difference anymore.
GREAT, but why?
@jeff Moden
Anonymized, compressed DDL:
CREATE TABLE [dbo].[contract](
[contract_id] [INT] NOT NULL,
id1 [INT] NULL, id2 [INT] NULL, id3 [INT] NULL, id4 [INT] NULL,
id5 [INT] NULL, id6 [INT] NULL, id7 [INT] NULL, id8 [INT] NULL,
id9 [INT] NULL, id10 [SMALLINT] NULL, Id11 [INT] NULL, id12 [INT] NULL, id13 [INT] NULL, id14 [INT] NULL,
id15 [INT] NULL, id16 [INT] NULL, id17 [INT] NULL, id18 [INT] NULL,
id19 [INT] NULL, id20 [INT] NULL, id21 [INT] NULL, id22 [TINYINT] NOT NULL,
id23 [TINYINT] NOT NULL, id24 [TINYINT] NOT NULL, id25 [TINYINT] NOT NULL, id26 [TINYINT] NOT NULL,
id27 [TINYINT] NOT NULL, id28 [TINYINT] NOT NULL, id29 [TINYINT] NOT NULL, id30 [TINYINT] NOT NULL,
id31 [TINYINT] NOT NULL, id32 [INT] NOT NULL, id33 [INT] NULL,
CONSTRAINT [PK_contract] PRIMARY KEY CLUSTERED
(
[contract_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
@sgmunson Execution plan is the same: Clustered Index Scan -> Clustered Index Insert.
@steve-2 Jones As mentioned in my first post, the model is simple.
I already told you when I suggested it. 😉 It's an undocumented, sometimes necessary trick to get the optimizer to do Minimal Logging in the presence of a Clustered Index.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 18, 2017 at 3:13 am
Hi guys,
any other solutions / explanations for this behaviour?
I do not feel comfortable adding OPTION(RECOMPILE) to every INSERT we use to load our data warehouse environment.
I also tried TF 610, but it did not improve the timings.
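The TF 610 test looked roughly like this (a sketch; the trace flag is session-scoped here, and I believe its behaviour is on by default from SQL Server 2016 onward, which may be why it changed nothing):

```sql
DBCC TRACEON (610);  -- session-level: minimal logging into indexed tables (pre-2016 behaviour)

TRUNCATE TABLE table1;

INSERT INTO table1 WITH (TABLOCK)
SELECT * FROM table2;

DBCC TRACEOFF (610);
```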
October 18, 2017 at 5:39 am
mrpellepelle - Wednesday, October 18, 2017 3:13 AM
Hi guys,
any other solutions / explanations for this behaviour?
I do not feel comfortable adding OPTION(RECOMPILE) to every INSERT we use to load our data warehouse environment.
I also tried TF 610, but it did not improve the timings.
If it were for a GUI related proc that hit the database tens of thousands of times per hour, I could absolutely understand your angst about using OPTION(RECOMPILE). Of course, you wouldn't be looking for "Minimal Logging" then. For loading a DW in a batch mode, the Recompiles will be trivial compared to the time and resources they save and they're necessary, in this case, to trick the optimizer into realizing that it actually can do the "Minimal Logging".
Unless someone can come up with a non-code-based trick to allow the optimizer to make the correct decision during run time, you're pretty much stuck with either slow/resource-intensive code or the option to recompile. I recommend that you get comfortable with the latter. 😉
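And to answer the stored procedure part of the original question: OPTION(RECOMPILE) is a statement-level hint, so the same pattern works inside a proc. A sketch using the posted contract table (the proc and source table names are assumptions):

```sql
CREATE PROCEDURE dbo.Reload_Contract
AS
BEGIN
    SET NOCOUNT ON;

    TRUNCATE TABLE dbo.contract;

    -- Empty target + TABLOCK + SIMPLE recovery qualifies for Minimal Logging;
    -- OPTION(RECOMPILE) recompiles just this statement after the TRUNCATE,
    -- and the ORDER BY matches the Clustered Index key on contract_id.
    INSERT INTO dbo.contract WITH (TABLOCK)
    SELECT *
    FROM dbo.contract_source    -- placeholder source table
    ORDER BY contract_id
    OPTION (RECOMPILE);
END;
```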
--Jeff Moden
Change is inevitable... Change for the better is not.