November 3, 2003 at 3:17 pm
We have implemented local partitioned views for better manageability of our 1+ billion record table.
The table holds a couple of years of archived data, so we divided it into several tables, one per
quarter. Only the current quarter's table remains active for write operations; the others are read only.
For backup purposes we keep the archive tables in one database and the active table in another,
so that we only need regular backups on the active database. Once a quarter we move the table for the
just-concluded quarter to the archive database, do a full backup of the archive database, and then
drop the table from the active database.
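For illustration, a minimal sketch of the kind of local partitioned view involved; the table, column, and date-range names below are made up, not our actual schema:
-- Each quarter's member table carries a CHECK constraint on the partitioning column
-- (names and columns are placeholders; earlier quarters are created the same way)
CREATE TABLE Orders_2003Q4
(
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
        CHECK (OrderDate >= '20031001' AND OrderDate < '20040101'),
    Amount    money    NULL,
    CONSTRAINT PK_Orders_2003Q4 PRIMARY KEY (OrderID, OrderDate)
)
GO
-- The partitioned view is a UNION ALL over the member tables;
-- the CHECK constraints let the optimizer touch only the relevant quarter
CREATE VIEW Orders
AS
SELECT OrderID, OrderDate, Amount FROM Orders_2003Q3
UNION ALL
SELECT OrderID, OrderDate, Amount FROM Orders_2003Q4
GO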
With that background I have a couple of questions:
The quarterly table I mentioned grows to roughly 100 million records.
What is the best way to transfer this table between the databases? My concern is that if I do it in one shot
and the single transaction fails, a rollback of that magnitude is scary. I would
prefer a way to transfer in chunks (a few thousand records at a time) and commit after each, so my log and
recovery interval stay in check. The only way I see to do this now is a DTS Transformation, which looks
a lot like bcp out and bcp in. I notice DTS packages display messages like "1000 rows copied", etc.
Is there any way to do the same from T-SQL (copying, say, 1000 rows at a time), without going through a file in between?
Thanks.
November 3, 2003 at 4:43 pm
How about writing a SQL script that divides the data into smaller chunks by datetime, or by whatever you are using as your primary key, as long as it is an int you can increment on? Once you can divide the records that way, you could write a cursor or even a WHILE loop to step through the whole process.
example:
DECLARE @i int
SET @i = 0
WHILE @i < 1000000
BEGIN
    -- database/table names are placeholders for your active and archive databases
    INSERT INTO ArchiveDB.dbo.tablename
    SELECT * FROM ActiveDB.dbo.tablename
    WHERE id >= @i
      AND id < @i + 1000   -- half-open range so no row is copied twice
    SET @i = @i + 1000
END
mom
November 3, 2003 at 6:19 pm
Since you don't state what your PK is, this is a little hard to do. But I would create a script that sets ROWCOUNT to 1000 and then loops until the rowcount affected is 0...
DECLARE @iNumRows int
DECLARE @iStartID int
DECLARE @iRowsAffected int
DECLARE @iErr int
SET @iNumRows = 1000      -- The number you want per batch...
SET @iStartID = 0         -- Initial PK value...
SET @iRowsAffected = 1    -- Any non-zero number...
SET ROWCOUNT @iNumRows
WHILE @iRowsAffected >= 1
BEGIN
    BEGIN TRAN
    INSERT INTO ....
    SELECT ....
    WHERE PK >= @iStartID
    ORDER BY PK
    SELECT @iErr = @@ERROR, @iRowsAffected = @@ROWCOUNT
    IF @iErr <> 0
    BEGIN
        ROLLBACK TRAN     -- and handle the error as needed
        BREAK
    END
    COMMIT TRAN           -- You might even want to truncate the log here...
    SELECT @iStartID = MAX(PK) + 1
    FROM REMOTETABLE...   -- i.e., the table you are inserting into: highest key copied so far, plus one
END -- While loop
SET ROWCOUNT 0            -- Clean up after yourself...
Please note I have not tested this; I just typed it up here. I've done things like this in the past and it has worked great, though. You may also need to change @iStartID to whatever your initial PK value is, and make sure you do an ORDER BY on your PK field(s).
Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer
This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
November 4, 2003 at 9:58 am
Thanks to mom, gljjr, and erasmusj for your replies; they would definitely work. But consider the situation where I do not have a PK, or my primary key consists of multiple character-based columns. I am curious how DTS handles this, because when I run the DTS Import/Export Wizard against tables with no primary key or with multi-column keys, it still copies 1000 or so records at a time. My guess is that it is transferring the data to a file and loading it in chunks (some variation of bcp). Is that right? Or is it doing it in memory, copying rows out and loading them in? If it is doing it in memory, how does it keep a pointer to where it left off in the source (we are talking about tables with no PK)?
BTW, I ran a trace while executing the package and I see a construct insert bulk [PartDB].[dbo].[newtbl]("Id" int,
"State" varchar(50)) where it does the insert. I see no references to "insert bulk" in BOL. Does anyone know what this is?
At this time I am leaning towards using a DTS transformation, simply because it does the transfer in chunks like I wanted and saves me the coding, and calling the DTS package from the procedure that performs this task. Any reasons why I should not do this?
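If you do call DTS from a procedure, one way is simply to shell out to the dtsrun utility. A rough sketch, assuming a hypothetical package saved on the local server as MoveQuarterToArchive and a trusted connection (the executing login needs xp_cmdshell rights):
-- Run a saved DTS package from T-SQL via dtsrun
-- /S = server, /E = trusted connection, /N = package name (all names are placeholders)
DECLARE @cmd varchar(255)
SET @cmd = 'dtsrun /S "(local)" /E /N "MoveQuarterToArchive"'
EXEC master.dbo.xp_cmdshell @cmd
xp_cmdshell returns dtsrun's console output as a result set, so the calling procedure can capture it with INSERT ... EXEC if it needs to check for failure messages.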
November 4, 2003 at 5:09 pm
DTS would show you that it finishes 1000-row chunks at a time, but if it failed it would roll back everything, leaving you with 0 rows of data. My other thought was: is there any way to add an identity field just for the sake of keeping your place, or is there a date or time column, for that matter, that you could go by?
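For example, something like this rough sketch (names made up; note that populating an identity column on a table that size is itself a heavy operation):
-- Add a throwaway ordering column just for the transfer
ALTER TABLE ActiveDB.dbo.QuarterTable
    ADD TransferID int IDENTITY(1,1) NOT NULL
GO
-- ... run the chunked copy keyed on TransferID instead of the real PK ...
-- then drop the column once the move is done
ALTER TABLE ActiveDB.dbo.QuarterTable
    DROP COLUMN TransferID
GO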
mom
November 5, 2003 at 7:56 am
With the batch size setting I can limit the number of rows in each transaction, so if a batch fails only the inserts in that batch are rolled back. I do have a date field, the records are distributed fairly evenly across dates, and I could do the inserts by date (sketched below). But treat my question as more of an investigative one: how could the DTS transformation task be handling this, given that it works against any table, whether or not it has a PK or has multiple columns in the PK? And it does it in batches, with each batch a unit of transaction. That is what we should be able to do in T-SQL without having to alter the table structure, because I do not see DTS altering the table. Is DTS doing a bcp out and bulk copy in (in memory, since there is no file to be seen)? Is it using cursors, or some other internal, undocumented method not exposed through T-SQL? As I have already stated, when I capture the DTS run in a trace I see it running an "insert bulk" command that has no references anywhere.
Anyone from Microsoft in the forum?
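In the meantime, here is the rough shape of the date-driven, batch-committed copy mentioned above (the database, table, and column names and the quarter boundaries are all placeholders):
-- Copy one quarter a day at a time, committing per day
DECLARE @d datetime, @end datetime
SET @d   = '20030701'   -- first day of the quarter being archived
SET @end = '20031001'   -- first day of the next quarter
WHILE @d < @end
BEGIN
    BEGIN TRAN
    INSERT INTO ArchiveDB.dbo.QuarterTable
    SELECT *
    FROM ActiveDB.dbo.QuarterTable
    WHERE RecordDate >= @d
      AND RecordDate < DATEADD(dd, 1, @d)
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRAN   -- only the current day's batch is lost
        BREAK
    END
    COMMIT TRAN
    SET @d = DATEADD(dd, 1, @d)
END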
November 5, 2003 at 9:57 am
sxg6023,
I am really not sure what DTS does behind the scenes. Here is what I found in BOL:
BULK INSERT [ [ 'database_name'.] [ 'owner' ].] { 'table_name' FROM 'data_file' }
[ WITH
(
[ BATCHSIZE [ = batch_size ] ]
[ [ , ] CHECK_CONSTRAINTS ]
[ [ , ] CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ] ]
[ [ , ] DATAFILETYPE [ =
{ 'char' | 'native'| 'widechar' | 'widenative' } ] ]
[ [ , ] FIELDTERMINATOR [ = 'field_terminator' ] ]
[ [ , ] FIRSTROW [ = first_row ] ]
[ [ , ] FIRE_TRIGGERS ]
[ [ , ] FORMATFILE = 'format_file_path' ]
[ [ , ] KEEPIDENTITY ]
[ [ , ] KEEPNULLS ]
[ [ , ] KILOBYTES_PER_BATCH [ = kilobytes_per_batch ] ]
[ [ , ] LASTROW [ = last_row ] ]
[ [ , ] MAXERRORS [ = max_errors ] ]
[ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
[ [ , ] ROWS_PER_BATCH [ = rows_per_batch ] ]
[ [ , ] ROWTERMINATOR [ = 'row_terminator' ] ]
[ [ , ] TABLOCK ]
)
]
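BATCHSIZE looks like the relevant option if you do go through a file: each batch is committed as its own transaction, so a failure only loses the current batch. A made-up example (the path and table name are guesses):
-- Load a natively bcp'd file in 1000-row batches, one transaction per batch
BULK INSERT ArchiveDB.dbo.QuarterTable
FROM 'D:\transfer\quartertable.dat'
WITH
(
    DATAFILETYPE = 'native',
    BATCHSIZE = 1000,
    TABLOCK
)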
mom