How to Insert Million of records in less time ?

Question

Post reply

How to Insert Million of records in less time ?

MTY-1082557

SSCarpal Tunnel

Points: 4540
More actions
November 24, 2010 at 9:27 am

#144165

I need to insert between 1 Million to 4 million of rows into a table. Part of my process is pretty fast, but the insertion of this rows take aprox 6 minutes. I already did some modifications in order to improve the time.
I need an advice of what is the best way to do it. My destination table is already big (aprox 49,061,939 rows). This table has 3 indexes because they are used from another process in order to speed as well. The destination table basically looks like:
StbID int
VersionID int
SegId int
Value varchar(10)
The four fields are my primary key. What I'm inserting is exactly all those values (btw 1 to 4 million rows). So my statement looks like:
INSERT dbo.STB with(tablock) -- 5:20 minutes
SELECT * FROM #TempSTB WITH(NOLOCK)
Any advice?

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

So the problem here is that the server becomes unavailable for 4-6 minutes whenever you do this?

You could do it in batches of 30 - 50k rows which would lock the tables for only seconds at a time.

Then wait a couple seconds in between batches to let the server catch up.

MTY-1082557 SSCarpal Tunnel Points: 4540 More actions · Answer 2

So the problem here is that the server becomes unavailable for 4-6 minutes whenever you do this?

i dont really care that much if the server becomes unavailable for 4-6minutes, because they are internal processes. what I really need its to speed the insertion.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

How long does it take to run the select alone (without insert)?

5 mins for a couple million rows is not lightning fast but it's not crawling either by most standards.

MTY-1082557 SSCarpal Tunnel Points: 4540 More actions · Answer 4

actually, it takes only 22 seconds.. the problem is the insert 🙁

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 5

How long does it take to do a select into?

How long does it take to insert in the base table if you disable all NC indexes?

Does you have triggers, foreign keys, check constraints?

From http://www.sqlskills.com/BLOGS/PAUL/post/Survey-what-is-the-highest-wait-on-your-system.aspx

What is your biggest wait type?

WITH Waits AS

(SELECT

wait_type,

wait_time_ms / 1000.0 AS WaitS,

(wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,

signal_wait_time_ms / 1000.0 AS SignalS,

waiting_tasks_count AS WaitCount,

100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,

ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum

FROM sys.dm_os_wait_stats

WHERE wait_type NOT IN (

'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK',

'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE',

'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BROKER_TO_FLUSH',

'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE',

'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'BROKER_EVENTHANDLER',

'TRACEWRITE', 'FT_IFTSHC_MUTEX', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')

)

SELECT

W1.wait_type AS WaitType,

CAST (W1.WaitS AS DECIMAL(14, 2)) AS Wait_S,

CAST (W1.ResourceS AS DECIMAL(14, 2)) AS Resource_S,

CAST (W1.SignalS AS DECIMAL(14, 2)) AS Signal_S,

W1.WaitCount AS WaitCount,

CAST (W1.Percentage AS DECIMAL(4, 2)) AS Percentage

FROM Waits AS W1

INNER JOIN Waits AS W2

ON W2.RowNum <= W1.RowNum

GROUP BY W1.RowNum, W1.wait_type, W1.WaitS, W1.ResourceS, W1.SignalS, W1.WaitCount, W1.Percentage

HAVING SUM (W2.Percentage) - W1.Percentage < 95; -- percentage threshold

GO

SQLRNNR SSC Guru Points: 281334 More actions · Answer 6

Start with the answers to Ninja's last questions.

I am curious to know how well it performs without your query hints?

Have you constructed out the entire Insert / Select statement using actual column names in both parts in lieu of the * ?

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

MTY-1082557 SSCarpal Tunnel Points: 4540 More actions · Answer 7

How long does it take to do a select into?

it takes 2 seconds

How long does it take to insert in the base table if you disable all NC indexes?

It takes 4:20 min plus 32 seconds to rebuild indexes, almost 5 minutes. (not big difference)

Does you have triggers, foreign keys, check constraints?

no, I dont. I have 1 primary key and 2 Nc indexes

primary key:

StbID int

VersionID int

SegId int

Value varchar(10)

index one:

VersionId

index two:

StbId, SegID

there is not difference if I indicate the columns to be inserted (instead the *)

SQLMD Right there with Babe Points: 764 More actions · Answer 8

This probably isn't it, but what is the size of the database and/or log file? Is it growing the files every time you run this insert?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 9

MTY-1082557 (11/24/2010)
How long does it take to do a select into?
it takes 2 seconds
How long does it take to insert in the base table if you disable all NC indexes?
It takes 4:20 min plus 32 seconds to rebuild indexes, almost 5 minutes. (not big difference)
Does you have triggers, foreign keys, check constraints?
no, I dont. I have 1 primary key and 2 Nc indexes
primary key:
StbID int
VersionID int
SegId int
Value varchar(10)
index one:
VersionId
index two:
StbId, SegID
there is not difference if I indicate the columns to be inserted (instead the *)

What's the results of the wait types query I posted? The top 4-5 rows will give us an idea of what might be wrong on the server.

Other things that may slow you down... unlikely but possible suspects :

db file growth as already mentionned?

What's the autogrow setting for the datafile (% / mb and how much)

change tracking?

Row version store??

auto_clause or auto_shrink DB on?

Is tempdb short on ressources?

Do you have a big job or backup that is running at the same time you are testing this (yeah I know it's been 8 hours... but just in case).

Is this adhoc statement or SP? Have you tried flushing the proc cache?

Are the stats up to date?

Anti-virus scanning the db files?

Screensaver or anything else hitting the server's CPU at 100% (yes true story).

Did you show us the actual code or a modified version? Maybe we're not seeing something there.

When doing select into, are you perharps creating the table on a different filegroup on a different disk subsystem?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

MTY-1082557 (11/24/2010)
How long does it take to do a select into?
it takes 2 seconds
How long does it take to insert in the base table if you disable all NC indexes?
It takes 4:20 min plus 32 seconds to rebuild indexes, almost 5 minutes. (not big difference)
Does you have triggers, foreign keys, check constraints?
no, I dont. I have 1 primary key and 2 Nc indexes
primary key:
StbID int
VersionID int
SegId int
Value varchar(10)
index one:
VersionId
index two:
StbId, SegID
there is not difference if I indicate the columns to be inserted (instead the *)

Is the table heavily fragmented?

Does the new data insert at the end of the table / indexes or all over the place?

What's the fill factor for all indexes? and have you tried changing it to something less aggressive?

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 11

MTY-1082557 (11/24/2010)
I need to insert between 1 Million to 4 million of rows into a table. Part of my process is pretty fast, but the insertion of this rows take aprox 6 minutes. I already did some modifications in order to improve the time.
I need an advice of what is the best way to do it. My destination table is already big (aprox 49,061,939 rows). This table has 3 indexes because they are used from another process in order to speed as well. The destination table basically looks like:
StbID int
VersionID int
SegId int
Value varchar(10)
The four fields are my primary key. What I'm inserting is exactly all those values (btw 1 to 4 million rows). So my statement looks like:
INSERT dbo.STB with(tablock) -- 5:20 minutes
SELECT * FROM #TempSTB WITH(NOLOCK)
Any advice?

I just noticed that you have the TABLOCK hint on the target table. That will make the code wait until it can get the TABLOCK. The only way that it can get the TABLOCK is when no other users are using the table. It could wait for 2 weeks under the right conditions.

Obviously, my recomendation will be to remove the TABLOCK hint.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 12

1. i am second to ninja'a first suggestion to do it in batch format

2. how much data will be migrated ? huge data.....longer time.

3. also backup the log on small periods.

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

wu.wuqiang Ten Centuries Points: 1008 More actions · Answer 13

Remove indexes before insertion, then recreate indexes.

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 14

wu.wuqiang (11/26/2010)
Remove indexes before insertion, then recreate indexes.

That frequently does nothing for the amount of time it takes because reindexing can take a lot of time. In fact, if you look back in this thread, the OP tried that and it made no significant improvement.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)