October 24, 2016 at 8:09 am
Super_Grover (10/24/2016)
Okay we're trying to solve this but I'll mark Phil's post as solution.Thanks again!
If you find that you cannot run this as a distributed transaction, it should be possible to tweak the code.
Won't be quite as robust, but should get the job done.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2016 at 9:01 am
Since you've posted to a SQL 2012 forum, you might want to use a feature that was introduced in SQL 2012 specifically for paging: OFFSET/FETCH.
DECLARE @pg INT = 1,
@pg_size INT = 1000
@rows INT;
SET @rows = @pg_size;
WHILE @rows = @pg_size
BEGIN
INSERT INTO [P2WIDB002].[CS_ConversieAanlevering_VWB].[dbo].[MULTIMED_INHOUD]
SELECT * FROM [NLDSLASSQLTST].[CS_ConversieAanlevering].[dbo].[MULTIMED_INHOUD]
ORDER BY doc_number
OFFSET @pg_size * (@pg - 1) ROWS
FETCH NEXT @pg_size ROWS ONLY;
SELECT @rows = @@ROWCOUNT, @pg += 1;
END
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 24, 2016 at 2:53 pm
Another consideration is whether the process/progress could/should be interrupted mid-stream. Some additional flexibility might come from using a real table instead of temp where you could add a field for datetime when the records were transferred. Then you could pull batches of records from the set with no transferred datetime. This way, you could split the operation up over more than one day or if a batch got interrupted, delete them from the destination and refill. If the server catches on fire, you could stop your transfers and deal with it.
October 24, 2016 at 3:14 pm
Another thing to strongly consider when transferring such large amounts of data is "Minimal Logging" and the proper use of Trace Flag 610, which has been available since 2008 R2. It can save not only a relatively huge amount of time, but a huge amount of log file space and, sometimes, MDF space depending on when you build the Clustered Index).
--Jeff Moden
Change is inevitable... Change for the better is not.
October 25, 2016 at 2:17 am
You should use bcp
October 25, 2016 at 4:48 am
wtren (10/25/2016)
You should use bcp
Please justify this comment.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 25, 2016 at 5:47 am
Bcp can control batch size directly, write very fast, have very little disturb and burden to product database, after that you can copy data to the destine and do your data loading slowly, and no need to read data from source again if you have any trouble, it's a different strategy.
October 25, 2016 at 5:59 am
I was going to recommend SSIS for the same reasons, but it seemed like the conversation was focused on the solution we were discussing. It also sounds like a one - off, so I'm not sure how much of a learning curve is acceptable or the time frame.
October 25, 2016 at 6:15 am
yes, it's split the process into two step, and do it one by one, as I say, it's just a different strategy to do a work, depend on your requirement.
October 25, 2016 at 7:41 am
If you willing to go the CLR route, and the LOB is compressible, you can use compression functionality from System.IO.Compression to compress into a staging table locally, push smaller staging table across network, then inflate at the destination.
October 25, 2016 at 8:00 am
Super_Grover (10/23/2016)
Hi all,I must transfer a big table (231k rows not that big, but 735GB big because of blob) to another location by INSERT INTO SELECT FROM...
It appears that first the whole source table is being loaded before any record is actually inserted into the destination table. But there's not enough diskspace because tempdb just grows and grows.
...
If you're just selecting from one table into another table, then I'm thinking it shouldn't dump into tempdb, unless you're doing an ORDER BY.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
October 25, 2016 at 11:51 pm
Phil Parkin (10/25/2016)
wtren (10/25/2016)
You should use bcpPlease justify this comment.
Although it was a bit short, it's not a bad idea. I've done similar before. Instead of messing around with loops and all manner of other tricks, you BCP the data out as native format (very fast) and then BCP or BULK INSERT it back in where you want it. If you hold your mouth just right while you're doing it (especially with TF 610), you can even do it in a minimally logged fashion with the Clustered Index already in place.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 26, 2016 at 12:13 am
Yes, very fast, for multiple tables, you can also open several windows and download different table simultaneously.
October 27, 2016 at 10:23 am
wtren (10/26/2016)
Yes, very fast, for multiple tables, you can also open several windows and download different table simultaneously.
Except, in most SAN environments, that will actually slow things down because you can't control which spindles will come into play and head-thrashing frequently becomes a performance problem for such loads. Even if parallel loads work very fast for today, they might not tomorrow.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 27, 2016 at 11:49 pm
SAN normally provide cache to ease it, and SAN is always random io pattern.
Viewing 15 posts - 16 through 30 (of 43 total)
You must be logged in to reply to this topic. Login to reply