August 20, 2010 at 7:02 am
I'm new to BULK INSERT but I need a better/faster way to extract several millions of rows of data from one db/table on server 'A' to a different db/table on server 'B'.
I currently run this operation using a package built in BIDS. It works, but it takes about 2.5 hours, and there are a couple of checks and balances I need to integrate into the package that I'm not sure how to do in BIDS.
The main thing is I don't want all of the data from the table on server 'A', only certain columns.
Can I use BULK INSERT to do this via SSMS? Is there a different option via SSMS?
My issue is I want to schedule this as a daily job, but I need to check the status of the table on server 'A' to see if it has been updated for the day. There is a different table on server 'A' that houses the status of the table I need to copy over to server 'B'.
Probably something simple, but I'm not good enough in BIDS to do this kind of check, but I can do it in SSMS.
August 20, 2010 at 1:02 pm
BULK INSERT is only used to insert data from a file into a table.
When copying data between two SQL Server tables you could either use SSIS or you could use linked servers.
Try first setting up a linked server so you can access server A from server B. Then you can simply run this on server B to create a table NewTable that is identical to SourceTable on server A:
SELECT *
INTO dbo.NewTable
FROM ServerA.DB.dbo.SourceTable
You can add any SQL processing you like - this is just a normal SQL query.
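Since the original poster only wants certain columns loaded into an existing table, here is a minimal sketch of that variant over the same linked server. The destination table, column names, and filter are hypothetical; substitute your own.

```sql
-- Sketch: copy only selected columns from server A into an existing
-- table on server B. All table/column names here are placeholders.
INSERT INTO dbo.DestTable (CustomerID, OrderDate, Amount)
SELECT CustomerID, OrderDate, Amount
FROM ServerA.DB.dbo.SourceTable
WHERE OrderDate >= '20100801';  -- optional filter, also hypothetical
```

The four-part name (`ServerA.DB.dbo.SourceTable`) is linked-server syntax: linked server, database, schema, table.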
August 20, 2010 at 1:08 pm
Yeah, I've created tables that way before, but I'm currently inserting the records into an existing table.
Would a create table statement be markedly faster than an insert into?
August 20, 2010 at 3:24 pm
The main problem with INSERT INTO is that it generates a lot of transaction log when inserting large amounts of data. SELECT INTO is minimally logged, which means it is faster and requires less log space.
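For completeness: on SQL Server 2008 and later, INSERT ... SELECT into a heap can also be minimally logged when the database is in the SIMPLE or BULK_LOGGED recovery model and you take a table lock. A sketch, with hypothetical table names:

```sql
-- Sketch: with SIMPLE or BULK_LOGGED recovery and a heap destination,
-- the TABLOCK hint lets INSERT ... SELECT qualify for minimal logging
-- (SQL Server 2008+). Names are placeholders.
INSERT INTO dbo.DestTable WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM ServerA.DB.dbo.SourceTable;
```

This gives you much of the speed of SELECT INTO while still loading an existing table, at the cost of locking the destination for the duration of the insert.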
August 20, 2010 at 3:38 pm
Although this probably does not apply to this particular exercise...
FYI,
SELECT INTO will cause a schema lock, preventing other DDL statements from executing against the table.
If you don't care about schema locks, SELECT INTO is better than INSERT INTO.
In my experience, high-volume data transfers over linked servers are slow.
SSIS is definitely the way to go.
August 20, 2010 at 4:11 pm
Yeah, I found out about the lock - the hard way.
Using SSIS, how can I check the value of one column in table 'x' and then, based on that value, either continue and process the package or end the package without doing anything further?
More detail:
There is a table 'a' that I want to extract the data from, but it gets updated by a different owner every day. The time varies, and sometimes due to other circumstances it can be delayed. But there is another table 'x' that contains the status of the job they run to update table 'a'.
I want to check table 'x' to see the status of the update job for table 'a'. If it has completed, then I want to process the package via an embedded query in the OLE DB Source. If table 'x' indicates table 'a' hasn't been updated, then I want to end the package.
August 20, 2010 at 4:28 pm
Connect the data flow source to where table 'a' and 'x' reside (hopefully they are on the same server). Write a select statement that joins the two tables and returns only the records with the right status. Copy this statement inside the data source. Set up the data flow destination to table 'b' (on different server and db).
August 20, 2010 at 4:32 pm
Oops, I think I misunderstood your explanation...table 'a' and 'x' do not join...:-P
You can add an IF statement before the SELECT that returns data from table 'a'. That way, 0 records are returned IF job status = <unfinished???>
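A minimal sketch of that guarded source query, assuming hypothetical status table/column names (`dbo.x`, `JobName`, `Status`, `RunDate`) that you would replace with the real ones:

```sql
-- Sketch: only return rows from table 'a' when the status table 'x'
-- says today's load has finished. All names are placeholders.
IF EXISTS (SELECT 1
           FROM dbo.x
           WHERE JobName = 'LoadTableA'
             AND Status  = 'Complete'
             AND RunDate = CONVERT(char(8), GETDATE(), 112))  -- today's run
    SELECT ColA, ColB
    FROM dbo.a;
-- If the status row is missing, the query returns 0 rows and the
-- data flow simply has nothing to move.
```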
Or you can add an Execute SQL Task to check the job status before the Data Flow Task, which contains the source and destination.
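For the Execute SQL Task approach, one way to do it (table/column names again hypothetical) is to return a single flag and map it to an SSIS variable:

```sql
-- Sketch: query for the Execute SQL Task. Returns 1 when the daily
-- update of table 'a' is done, 0 otherwise. Map the single-row result
-- to an SSIS variable such as User::IsUpdated. Names are placeholders.
SELECT CASE WHEN EXISTS (SELECT 1
                         FROM dbo.x
                         WHERE JobName = 'LoadTableA'
                           AND Status  = 'Complete')
            THEN 1 ELSE 0 END AS IsUpdated;
```

Then edit the precedence constraint between the Execute SQL Task and the Data Flow Task, set its evaluation operation to "Expression", and use an expression like `@[User::IsUpdated] == 1`. If the flag is 0, the Data Flow Task never runs and the package ends without doing anything further.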