April 24, 2009 at 8:08 am
Hello, I have written a loop to transfer data from one table to another. In the code below, what can I do to have it move on to the next 100000 rows on each pass, so that it does not insert duplicate rows into the target table?
DECLARE
@rows INT
BEGIN TRY
WHILE 1 = 1
BEGIN
insert into outlet_sku_xref_test
([retailer_id]
,[sku_id]
,[outlet_id]
,[threshold]
,[qty]
,[sku_price]
,[tax_rate]
,[sell_thru]
,[inv_status_1]
,[inv_status_2]
,[inv_status_3]
,[inv_status_4]
,[inv_status_5]
,[rec_update_date]
,[rec_create_date]
,[rec_update_id])
select top (100000)
[retailer_id]
,[sku_id]
,[outlet_id]
,[threshold]
,[qty]
,[sku_price]
,[tax_rate]
,[sell_thru]
,[inv_status_1]
,[inv_status_2]
,[inv_status_3]
,[inv_status_4]
,[inv_status_5]
,[rec_update_date]
,[rec_create_date]
,[rec_update_id]
from outlet_sku_xref
IF @@RowCount = 0 BREAK
END
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() ErrorNBR, ERROR_SEVERITY() Severity,
ERROR_LINE() ErrorLine, ERROR_MESSAGE() Msg
PRINT 'The Insert Did Not Complete Successfully...'
END CATCH
GO
Best Regards,
~David
April 24, 2009 at 8:16 am
Is there a column that you can use for something like "Where ID between 1000 * (@Iteration-1) + 1 and 1000 * @Iteration"? @Iteration = 1 for the first time through the loop, and increments by 1 each pass through the loop. Thus, it will start with 1-1000, then 1001-2000, and so on.
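A minimal sketch of that pattern, using the 100,000-row batch size from the original post, and assuming (purely for illustration, since neither is given in this thread) a gap-free integer column called id on the source table and identical column layouts in both tables:
DECLARE @Iteration INT
SET @Iteration = 1

WHILE 1 = 1
BEGIN
    -- column lists omitted for brevity; keep the explicit lists from the original post
    INSERT INTO outlet_sku_xref_test
    SELECT *
    FROM outlet_sku_xref
    WHERE id BETWEEN 100000 * (@Iteration - 1) + 1 AND 100000 * @Iteration

    IF @@ROWCOUNT = 0 BREAK
    SET @Iteration = @Iteration + 1
END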
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 24, 2009 at 8:21 am
I have written a loop to transfer data from one table to another. In the code below, what can I do to have it move on to the next 100000 rows on each pass, so that it does not insert duplicate rows into the target table?
The issue with this is that a set has no inherent ordering, so you cannot really speak of "the first 100,000 rows" and "the next 100,000 rows". What you could do is apply a WHERE clause, for example ... WHERE retailer_id BETWEEN 100001 AND 200000, provided the values are numbered without gaps. Otherwise ROW_NUMBER() might help, but you will have to check whether performance is still acceptable, since you would then be reading the full table content just to assign a row number to every row.
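A rough sketch of the ROW_NUMBER() route, assuming the composite key is (retailer_id, sku_id, outlet_id) and that the target table has the same columns, in the same order, as the original post's INSERT list:
DECLARE @Iteration INT
SET @Iteration = 1

WHILE 1 = 1
BEGIN
    ;WITH numbered AS
    (
        SELECT *,
               ROW_NUMBER() OVER (ORDER BY retailer_id, sku_id, outlet_id) AS rn
        FROM outlet_sku_xref
    )
    INSERT INTO outlet_sku_xref_test
    SELECT retailer_id, sku_id, outlet_id, threshold, qty, sku_price, tax_rate,
           sell_thru, inv_status_1, inv_status_2, inv_status_3, inv_status_4,
           inv_status_5, rec_update_date, rec_create_date, rec_update_id
    FROM numbered
    WHERE rn BETWEEN 100000 * (@Iteration - 1) + 1 AND 100000 * @Iteration

    IF @@ROWCOUNT = 0 BREAK
    SET @Iteration = @Iteration + 1
END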
brgds
Philipp Post
April 24, 2009 at 8:23 am
There is only a composite unique key.
Best Regards,
~David
April 24, 2009 at 8:25 am
Can you insert the whole table into a temp table, with an ID on it, and then transfer to the other table with the temp table as the source, in batches at that step?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 24, 2009 at 9:02 am
You have a unique index, so you could use it.
INSERT YOURTABLE( blah,blah2,blah3 etc)
SELECT blah,blah2,blah3 etc
FROM YOUROTHERTABLE T1
LEFT OUTER JOIN YOURTABLE T2
ON T1.unique_index_part1=T2.unique_index_part1
AND T1.unique_index_part2=T2.unique_index_part2
AND T1.unique_index_part3=T2.unique_index_part3
WHERE T2.some_column_that_does_not_allow_nulls IS NULL
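Applied to the tables in this thread, and assuming the composite unique key is (retailer_id, sku_id, outlet_id) and matching column layouts (column lists again omitted for brevity), the body of the original WHILE loop might look something like this in place of the plain SELECT TOP:
INSERT INTO outlet_sku_xref_test
SELECT TOP (100000) src.*
FROM outlet_sku_xref AS src
LEFT OUTER JOIN outlet_sku_xref_test AS tgt
    ON  src.retailer_id = tgt.retailer_id
    AND src.sku_id      = tgt.sku_id
    AND src.outlet_id   = tgt.outlet_id
WHERE tgt.retailer_id IS NULL   -- part of the key, so NULL here means "not copied yet"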
MM
April 26, 2009 at 2:48 pm
[font="Tahoma"]Using a left join is one way, but the code will progressively slow down (er, that doesn't mean it's invalid, and sometimes it's still the best performing solution because other options generally involve additional I/O.)
Another option is to create a temporary table that lists the key values and has an identity key. You can track the number of the identity key to start from with a simple counter variable, and join the temporary table to the source table to get the range of keys.
A third option is to use an order by in your insert, and start the loop by getting the last values for the composite key fields in the destination table. You then have a where clause in the insert statement to skip past those keys. That generally takes the form: (A0 > A1 or (A0 = A1 and B0 > B1)).
[/font]
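A sketch of that third option, assuming the leading key columns are retailer_id and sku_id with integer types (extend the same pattern for the remaining key column), column lists omitted:
DECLARE @last_retailer_id INT, @last_sku_id INT

-- last key already copied; both stay NULL on the very first pass
SELECT @last_retailer_id = MAX(retailer_id) FROM outlet_sku_xref_test
SELECT @last_sku_id = MAX(sku_id)
FROM outlet_sku_xref_test
WHERE retailer_id = @last_retailer_id

INSERT INTO outlet_sku_xref_test
SELECT TOP (100000) *
FROM outlet_sku_xref
WHERE @last_retailer_id IS NULL
   OR retailer_id > @last_retailer_id
   OR (retailer_id = @last_retailer_id AND sku_id > @last_sku_id)
ORDER BY retailer_id, sku_id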
April 27, 2009 at 7:06 am
Actually, in SQL 2005, another solution that sometimes works is to use an Output clause to insert into a temp table, then use that to manage the loop. That works best if you have well-ordered data.
For example, with an ID column, you can Output Into a temp table, select the max value from that and assign it to a variable, truncate the temp table, and insert top 10000 where ID > the variable, then repeat as needed.
If the key columns involved have an order to them of some sort, then that method can be really, really fast.
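A sketch of that approach (illustrative only, since it assumes both tables carry an integer ID column, which they don't in this thread), column lists omitted:
CREATE TABLE #last_batch (id INT)
DECLARE @last_id INT
SET @last_id = 0

WHILE 1 = 1
BEGIN
    INSERT INTO outlet_sku_xref_test
    OUTPUT inserted.id INTO #last_batch (id)
    SELECT TOP (10000) *
    FROM outlet_sku_xref
    WHERE id > @last_id
    ORDER BY id

    IF @@ROWCOUNT = 0 BREAK

    -- remember where this batch ended, then reset for the next pass
    SELECT @last_id = MAX(id) FROM #last_batch
    TRUNCATE TABLE #last_batch
END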
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 27, 2009 at 8:52 am
Follow-up to GSquared's first idea:
You could add an Identity field to a temp table loaded with your source data. Then load your destination table from that.
-- create temp source table
SELECT * INTO #tmpSku FROM outlet_sku_xref
-- add unique ID field
Alter table #tmpSku add sku_row_id integer IDENTITY (1,1)
insert into outlet_sku_xref_test
select
....
from #tmpSku
where sku_row_id between 1 and 10000
insert into outlet_sku_xref_test
select
....
from #tmpSku
where sku_row_id between 10001 and 20000
...and so on
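The same idea, generalized into a loop so the INSERT doesn't have to be repeated once per range (a sketch; the SELECT list matches the column order of the original post):
DECLARE @from INT, @batch INT
SET @from = 1
SET @batch = 10000

WHILE 1 = 1
BEGIN
    INSERT INTO outlet_sku_xref_test
    SELECT retailer_id, sku_id, outlet_id, threshold, qty, sku_price, tax_rate,
           sell_thru, inv_status_1, inv_status_2, inv_status_3, inv_status_4,
           inv_status_5, rec_update_date, rec_create_date, rec_update_id
    FROM #tmpSku
    WHERE sku_row_id BETWEEN @from AND @from + @batch - 1

    IF @@ROWCOUNT = 0 BREAK
    SET @from = @from + @batch
END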
April 27, 2009 at 9:31 am
I must be missing something...would it not work if you just removed the loop and inserted the records?
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 28, 2009 at 9:04 am
slange (4/27/2009)
I must be missing something...would it not work if you just removed the loop and inserted the records?
There are times when it's more efficient to do batches like this, usually if you're transferring huge amounts of data.
One that I ran into recently was a table with over 100-million rows of data, and I needed to archive about 75% of it into another table. Partitioning was not an option. Just trying to run it all as a single insert followed by a delete would literally have locked up the server for days (low end server). Breaking it up into small batches, however, worked beautifully. Still took a long time, but the regular processes on the server could continue while it chugged away at the archive process.
Tried various row-counts on the insert, found that it got the most done per hour at about 7k rows per batch, with the least interference with the regular jobs on there.
Used Insert...Output Into...Select to archive, then used the contents of the temp table I was outputting into for a delete command, all wrapped in a transaction. Looped that with While @@Trancount > 0, and it just ran and ran and got the job done efficiently.
SQL can handle those huge operations, but many servers aren't up to it.
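Roughly the shape of that archive loop, with made-up table names (dbo.BigTable / dbo.ArchiveTable), an assumed single-column key id, and a placeholder archiving predicate:
CREATE TABLE #moved (id INT PRIMARY KEY)

WHILE 1 = 1
BEGIN
    BEGIN TRANSACTION

    -- copy one batch into the archive and capture the keys that moved
    INSERT INTO dbo.ArchiveTable
    OUTPUT inserted.id INTO #moved (id)
    SELECT TOP (7000) *
    FROM dbo.BigTable
    WHERE archive_date < '20090101'   -- whatever the archiving rule is

    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRANSACTION
        BREAK
    END

    -- remove exactly the rows that were archived in this batch
    DELETE b
    FROM dbo.BigTable AS b
    INNER JOIN #moved AS m ON m.id = b.id

    COMMIT TRANSACTION
    TRUNCATE TABLE #moved
END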
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 28, 2009 at 3:35 pm
GSquared (4/28/2009)
slange (4/27/2009)
I must be missing something...would it not work if you just removed the loop and inserted the records?
There are times when it's more efficient to do batches like this, usually if you're transferring huge amounts of data.
[font="Tahoma"]As a follow up: when you do one big insert, SQL Server does it all as a single transaction. Which means you can blow out the transaction logs and potentially tempdb as well as it rebuilds indexes. While SQL Server is rebuilding indexes and the like, all sorts of locking and resource contention is caused, and for a large update that can have significant impact on performance for anything else running.
If you break it into smaller "chunks" or "batches", then you don't run the risk of SQL Server running out of capacity or turning your transaction logs into mega-gigabyte behemoths.
Another valid approach for this sort of operation is to pull the data out into a file and use SSIS or BCP or BULK INSERT to load it and avoid most of the logging. You still get the issue with index rebuilds, but there are options to commit every certain number of rows.
[/font]
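For instance, BULK INSERT can commit in batches as it loads (a sketch; the file path and format options here are placeholders, and the file would have been exported first with bcp, SSIS, or similar):
BULK INSERT dbo.outlet_sku_xref_test
FROM 'C:\exports\outlet_sku_xref.dat'
WITH (
    BATCHSIZE = 100000,       -- commit every 100,000 rows
    TABLOCK,                  -- helps enable minimal logging under the right conditions
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n'
)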
April 29, 2009 at 7:13 am
Yeah, I've used the export...import method too. That works well in some situations. I've found it quite advantageous when the data has to move across a slow network. FTPing a text file can be a lot faster than any sort of linked server type solution.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon