April 28, 2011 at 1:21 pm
We are in the process of importing millions of rows of data into an existing database. To speed up the process I was going to switch the database from full recovery to simple mode and drop all the indexes. I would recreate the indexes after the import and switch back to Full mode. Is this a good idea and how do I script out the existing indexes for recreation?
April 29, 2011 at 2:08 am
Was my question that stupid? lol
April 29, 2011 at 2:39 am
smitty-1088185 (4/29/2011)
Was my question that stupid? lol
Of course not! I would recommend inserting such a huge record set in batches; otherwise your recovery model setting won't help you.
SQL Server will treat the whole insert as a single transaction, and that transaction remains active until the insertion completes, so even under the simple recovery model the log will keep growing and won't get truncated.
So it's better to divide the load into batches of, let's say, 100,000 rows, and measure the time and resources each one consumes; if everything seems to be under control, try another batch and continue like that.
If you expect the table to be very big, you can also look at partitioning, which could help boost your query performance.
April 29, 2011 at 5:47 am
Dropping all indexes may help with the inserts, but then again it might not. It really depends on whether any of them are necessary for lookup operations during the inserts. If not, then yes, dropping them and recreating them is the way to go. To script them out you can use the scripting utility built into Management Studio: right-click the database and select Tasks > Generate Scripts...
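If you'd rather take an inventory with T-SQL before trusting the wizard, the catalog views can at least list what you're about to drop. A sketch (assumes SQL Server 2005 or later; it shows nonclustered indexes and their key columns, not the full CREATE INDEX text):

```sql
-- Inventory nonclustered indexes and their key columns before dropping them.
SELECT  OBJECT_NAME(i.object_id) AS table_name,
        i.name                   AS index_name,
        c.name                   AS column_name,
        ic.key_ordinal
FROM    sys.indexes i
JOIN    sys.index_columns ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN    sys.columns c
        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE   i.type = 2               -- nonclustered only
  AND   i.is_primary_key = 0    -- leave the PK structure alone
ORDER BY table_name, index_name, ic.key_ordinal;
```

The Generate Scripts wizard is still the easier route for a complete, faithful script (included columns, filegroups, index options); this is just a quick safety net.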
You already know about simple recovery from the previous post.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 29, 2011 at 11:12 am
I would leave the database at full recovery and separate the insert into smaller batches, then perform a transaction log backup between the batches.
If you plan on going ahead with simple recovery, then make sure you perform a full backup before and after.
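That batch-plus-log-backup pattern might look roughly like this (a sketch only: MyDB, the backup path, and dbo.InsertNextBatch are placeholders for your own database, file, and batch logic):

```sql
-- Insert in batches, backing up the log after each one so log space can be reused.
DECLARE @i int = 1;
WHILE @i <= 10
BEGIN
    EXEC dbo.InsertNextBatch;           -- placeholder: loads the next slice of rows

    BACKUP LOG MyDB
        TO DISK = N'D:\Backups\MyDB_log.trn'
        WITH NOINIT;                    -- append each log backup to the same file

    SET @i += 1;
END
```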
-------------------------------------------------
Will C,
MCITP 2008 Database Admin, Developer
April 30, 2011 at 12:09 am
Did you drop the clustered index too?
You can script out all the indexes: right-click on the specific database, then check the attachment for further steps.
Regards,
Syed Jahanzaib Bin Hassan
MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
May 1, 2011 at 8:14 am
Very well said, Sachnam....
Follow his advice. You should never insert millions of records at a time. Insert them in small batches.
Go with the simple recovery model and small batches.
Thank You,
Best Regards,
SQLBuddy
May 2, 2011 at 4:33 am
WOW! That is great information from everyone. You guys rock!!! I will go with simple and batch jobs with backups in between. Now I have to figure out how to separate the data into smaller chunks 😉
May 2, 2011 at 2:13 pm
You don't specify where the data you're importing is coming from (flat file, another table, etc.), but you can use a WHILE loop to process your records in batches. You can try something like this:
SET NOCOUNT ON
GO
/* Declarations */
DECLARE
@BatchSize int, @PhyDB sysname, @Table varchar(25),
@Schema varchar(3), @sql nvarchar(500), @Error int,
@RowCount int, @LoopCount int
/* Set the variables */
SELECT
@PhyDB = 'MyDB', @Schema = 'dbo', @Table = 'MyTable',
@BatchSize = 10000, @RowCount = 0, @LoopCount = 0
/* Create the statement -- it should SELECT and INSERT one batch of @BatchSize rows */
SET @sql = 'SQL Statement to SELECT and INSERT your data '
WHILE @BatchSize <> 0
BEGIN
/* Insert one batch of data */
EXEC sp_executesql @sql
SELECT @Error = @@ERROR, @BatchSize = @@ROWCOUNT
IF @Error <> 0 BREAK /* stop if the batch failed */
SELECT @RowCount = @RowCount + @BatchSize, @LoopCount = @LoopCount + 1
IF @LoopCount = 10
BEGIN
CHECKPOINT
PRINT('CHECKPOINT REACHED (100,000 rows inserted)')
SET @LoopCount = 0
END
PRINT ('Records inserted: ' + CAST(@BatchSize as varchar(25)) + ', Total count: ' + CAST(@RowCount as varchar(25)))
END
SET NOCOUNT OFF
GO
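If the source is another table with an ascending key, one concrete way to fill in that placeholder statement (a sketch; dbo.SourceTable, dbo.TargetTable, and the Id column are hypothetical names):

```sql
-- Copy rows in fixed-size slices; each slice commits as its own transaction.
SET @sql = N'
    INSERT INTO dbo.TargetTable (Id, Col1, Col2)
    SELECT TOP (10000) s.Id, s.Col1, s.Col2
    FROM dbo.SourceTable AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.TargetTable AS t WHERE t.Id = s.Id)
    ORDER BY s.Id;'
```

The NOT EXISTS check makes the loop restartable, but it gets slower as the target grows; tracking the last copied Id in a variable and filtering on `s.Id > @LastId` scales better.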
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience.
May 3, 2011 at 2:46 pm
If you drop and recreate the indexes you might want to recompute statistics
May 3, 2011 at 4:36 pm
Syed Jahanzaib Bin hassan (4/30/2011)
Regards,
Syed Jahanzaib Bin Hassan
MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
http://www.aureus-salah.com
Syed,
You remarked about dropping the clustered index. I've seen the subject of inserting into a heap come up before and never seen a definitive answer.
Question, anyone:
Is bulk insertion into a heap more efficient than bulk insertion into a clustered index?
LC
May 3, 2011 at 5:29 pm
crainlee2 (5/3/2011)
Question, anyone:Is bulk insertion into a heap more efficient than bulk insertion into a clustered index?
Depends on a few things.
One is it will depend on if you're inserting to a clean table or not. In particular:
"If the table has a clustered index and is empty, both data and index pages are minimally logged. In contrast, if a table has a clustered index and is non-empty, data pages and index pages are both fully logged regardless of the recovery model."
Since this usually only applies to staging tables, it's worth knowing about, but it's probably not a factor in what you're dealing with.
Check out this link for the full ruleset:
http://msdn.microsoft.com/en-us/library/ms191244.aspx
Another thing that matters heavily is how your clustered index is organized. If you're inserting all over the place, whether a rebuild is better than just working through it will depend on data volume. If you're only inserting at the tail, it doesn't matter as much.
The big deal about the clustered index existence is page splitting during large inserts. The reason you can't find a definitive answer is "It Depends." 🙂
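For the empty-target case in that link, the minimally logged path is typically requested with a table lock. A sketch (assumes the simple or bulk-logged recovery model on SQL Server 2008 or later; table names are hypothetical):

```sql
-- INSERT...SELECT WITH (TABLOCK) into a heap (or an empty clustered table)
-- can qualify for minimal logging, keeping the log small during the load.
INSERT INTO dbo.TargetHeap WITH (TABLOCK) (Id, Col1)
SELECT Id, Col1
FROM dbo.SourceTable;
```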
EDIT: Bad brain, almost reintroduced a myth, sorry.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
May 4, 2011 at 10:38 am
Thanks, Craig.
Viewing 13 posts - 1 through 12 (of 12 total)