March 27, 2014 at 9:10 pm
Which way is better for inserting 1 million records into one table?
1) insert into destination
select * from source
2) BULK INSERT destination
FROM '\\computer\source.txt';
3) BCP
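In case it helps, here is option 1 written out as a runnable sketch (dbo.Source and dbo.Destination are placeholder names). With a TABLOCK hint, an unindexed heap as the target, and the SIMPLE or BULK_LOGGED recovery model, SQL Server 2008 and later can minimally log this insert:
-- Placeholder names. Minimal logging assumes a heap target, the TABLOCK
-- hint, and the SIMPLE or BULK_LOGGED recovery model (SQL Server 2008+).
INSERT INTO dbo.Destination WITH (TABLOCK)
SELECT *
FROM dbo.Source;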
March 27, 2014 at 9:28 pm
If the source data isn't too "wide", BCP would work quite well, but I'd recommend using SSIS so you can take more advantage of simultaneous/parallel threads.
Take a quick look through these great articles:
Optimizing Bulk Import Performance
http://msdn.microsoft.com/en-us/library/ms190421(v=sql.105).aspx
The Data Loading Performance Guide
http://technet.microsoft.com/en-us/library/dd425070(SQL.100).aspx
We Loaded 1TB in 30 Minutes with SSIS, and So Can You
http://msdn.microsoft.com/en-us/library/dd537533(v=sql.100).aspx
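If you do end up scripting it with BCP, a minimal command line would look something like this (the server, database, table, and file names are placeholders):
REM Placeholder names: MyServer, MyDatabase, dbo.Destination
bcp MyDatabase.dbo.Destination in \\computer\source.txt -S MyServer -T -c -b 50000
-T uses a trusted (Windows) connection, -c loads character data, and -b commits in batches of 50,000 rows so a failure doesn't roll back the whole load.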
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience.
March 27, 2014 at 10:56 pm
That article on loading 1TB in 30 minutes seems to be an impressive testament for SSIS, but they had 4 instances of SSIS running 56 streams to a 64-processor database using 56 different tables (later switched into a partitioned table) on a full hardware package (including a killer disk system with 165 spindles and 8GB fiber channels) that would have made the makers of Cray green with envy. From what I can see, SSIS didn't have anything to do with the performance. If they had executed 56 parallel streams using BULK INSERT, they likely would have gotten even better performance and they wouldn't have needed the extra 4 servers that SSIS lived on. 😉
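For example, one of those streams could be as simple as this (table and file names are placeholders). TABLOCK is what makes the difference: it requests a bulk-update (BU) lock, which lets multiple BULK INSERT sessions load the same unindexed heap at the same time and keeps the load minimally logged under the SIMPLE or BULK_LOGGED recovery model:
-- Placeholder table/file names. Run one of these per parallel stream;
-- the BU lock taken by TABLOCK allows the streams to run concurrently.
BULK INSERT dbo.Destination
FROM '\\computer\source1.txt'
WITH (TABLOCK, BATCHSIZE = 100000);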
--Jeff Moden
Change is inevitable... Change for the better is not.
March 28, 2014 at 1:06 am
Fair point 🙂
For ease of use, I'd go with BCP. Which would you recommend?
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience.
March 28, 2014 at 1:57 am
cooljagadeesh (3/27/2014)
Which way is better for inserting 1 million records into one table?
1) insert into destination
select * from source
2) BULK INSERT destination
FROM '\\computer\source.txt';
3) BCP
Quick questions: is this a one-off thing? Does it need to be managed, logged, reported? Are all files in the same format? Is speed more important than server load? :w00t: Or is there a file in the first place? Is it maybe a table-to-table load? 😎
The options you list imply both table and file sources.
March 28, 2014 at 10:55 am
MyDoggieJessie (3/28/2014)
Fair point 🙂
For ease of use, I'd go with BCP. Which would you recommend?
If there are no "double-hop" or server trust problems for BULK INSERT, I'd go with BULK INSERT over BCP. That way, it can all be done from T-SQL without someone getting bent out of shape about using xp_CmdShell or having to do things from a command prompt, etc. In the absence of xp_CmdShell, it's also a whole lot easier to make the source and destination dynamic than it is with BCP.
With that in mind, you can actually have a proc that does what they did in the 1TB/30-minute example insofar as parallel loads go. You can have a stored proc that creates multiple jobs and starts them all. Each job would poke a table with an "I'm done" marker that the main proc could check for completion. The jobs could be self-dropping or you could modify them from the main proc (a minimal sketch follows at the end of this post).
Of course, such parallel loads don't do much for performance if your destination table/partitions are all on the same disk(s), regardless of how many files/filegroups you have. At best, I've only seen a nearly insignificant gain in load performance that way (same set of spindles), and I've also seen MUCH worse performance because the poor ol' read/write heads spend so much time thrashing on the same spindles.
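To make the job-based approach concrete, here's a minimal sketch of one such job (all object and file names are hypothetical, and it assumes SQL Server Agent is running):
-- Hypothetical names throughout. One Agent job per parallel stream; each
-- job runs its own BULK INSERT and then pokes dbo.LoadStatus with an
-- "I'm done" row that the calling proc can poll.
DECLARE @cmd nvarchar(max) = N'
    BULK INSERT dbo.Destination FROM ''\\computer\source1.txt'' WITH (TABLOCK);
    INSERT INTO dbo.LoadStatus (StreamName, FinishedAt) VALUES (''Stream1'', GETDATE());';

EXEC msdb.dbo.sp_add_job       @job_name = N'Load_Stream1', @delete_level = 3; -- 3 = self-dropping
EXEC msdb.dbo.sp_add_jobstep   @job_name = N'Load_Stream1', @step_name = N'Load',
                               @subsystem = N'TSQL', @command = @cmd,
                               @database_name = N'MyDatabase';
EXEC msdb.dbo.sp_add_jobserver @job_name = N'Load_Stream1';
EXEC msdb.dbo.sp_start_job     @job_name = N'Load_Stream1';
The main proc would fire off one of those per file and then loop on WAITFOR DELAY, checking dbo.LoadStatus until every stream has reported in.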
--Jeff Moden
Change is inevitable... Change for the better is not.