Loop to insert in batches

Question

Loop to insert in batches

Steve-335899

SSChasing Mays

Points: 639
More actions
November 27, 2006 at 8:09 pm

#170503

I am hoping someone can help me with this issue that I having. I am trying to create a table that I need for testing purposes. What I need to do is insert Table A into Table B 10 times. My issue is that Table A is 50 mil records so I want to do it in batches on an ID in the table.
What I want to do is a loop, that performs in batches, and inserts 10 times. (I have created a loop that can do this using a cartesian join to a table with 10 records in it, which creates my exploded table. Yes very hard on the server and cache).
For example, this might help what I am trying to get across:
Table A has 1000 records. Within this 1000 records are 10 unique ids. I need a loop that can pick out an ID and all its records and insert it into Table B ten times.
CAn anyone help. I'm sorry this is so vague but I'm at home trying to figure out where I might have gone wrong to begin with.
Thanks,
Steve

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 1

Steve,

I gotta ask... you're saying that you basically want to copy all of table A into table B 10 times (regardless of the loop you requested)??? To what end? Maybe there's something else we can do without ultimately creating table B as a 500 million row table....

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve Jones - SSC Editor SSC Guru Points: 736455 More actions · Answer 2

Do the new ids need to be unique?

Is this more complicated than:

insert b select * from A

...

?

Steve-335899 SSChasing Mays Points: 639 More actions · Answer 3

Thanks guys. I sat down, and went throught it logically and was able to build the two loops that i needed to complete this. There was some logic I needed to apply and have built into the loops and be able to utilise the cache.

I just panicked last night instead of sitting down and modeling it out.

Thanks

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Can we see the final result.. maybe we can improve it?!

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 5

Yeah... and how many rows did you ultimately end up with?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve-335899 SSChasing Mays Points: 639 More actions · Answer 6

Here is the code. It works fast....50mil start, 500mil final.

Declare@StartPointint,

@minstoreint,

@MaxStoreint,

@IncrimentSeedint,

@Countint

SET@StartPoint=0

SET@minstore=1

SET@MaxStore=500

SET@Count=(Select Max(PlaceID)

From Schema.SchemaTestStage)

WHILE@minstore <= @Count

BEGIN

SET@IncrimentSeed=0

WHILE@IncrimentSeed<10

BEGIN

BEGIN TRANSACTION

INSERT INTOSchema.SchemaTest

(

PlaceID,

MONTH_YEAR,

Number,

TOTAL,

UNITS,

COST,

NEW_TOTAL,

NEW_UNITS,

NEW_COST,

[Our Number],

SUPPLIERCD,

SUPPLIERID,

PrescriberID

)

SELECTPlaceID,

MONTH_YEAR,

Number,

TOTAL,

UNITS,

COST,

NEW_TOTAL,

NEW_UNITS/10,

NEW_COST,

[Our Number],

SUPPLIERCD,

SUPPLIERID,

Cast(PlaceID as varchar) + Cast(Right(ROWID, 1) as varchar)

FROMSchema.SchemaTestStage

WHEREPlaceID Between @minstore and @MaxStore

COMMIT TRANSACTION

SET @IncrimentSeed = @IncrimentSeed + 1

END

SET@StartPoint=@MaxStore

SET@minstore=@StartPoint + 1

SET@MaxStore=@StartPoint + 500

END

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 7

Thanks for the posted code.... still, why do you need to manufacture 10 rows for each row in the source table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve-335899 SSChasing Mays Points: 639 More actions · Answer 8

I need to test a process which involves run time. Once the actual "real" table is produced, we will be running reports based on certain metrics. The long and short of it is 6400 runs, based on a table that is 500 mil records. We need a baseline for time to produce.

The looping structure can be used with a join on another data table to produce the result set of the real table.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 9

I'm not QA expert but doesn't having 10 times a small set of data vs having 1 real size data affect the index distribution in such a way it could alter the results of the test?

Steve-335899 SSChasing Mays Points: 639 More actions · Answer 10

This purpose was not a result set test as it was a resource and system time test. The end result metrics will have to be QA'd, and that table will be the 500mil table.

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 11

Still, Remi is correct... it will affect even a system time test because of the distribution unless you know for sure that the distibution of the actual data will be very similar. It was a good idea you had, though... most don't take the time to test on anything more than a couple of dozen rows.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve-335899 SSChasing Mays Points: 639 More actions · Answer 12

I agree. The metrics and distribution were tested on the smaller table to ensure that when we exploded it 10 fold, which is now 100 after last night, would remain the same in baseline. Not enough people do these types of tests to ensure that sytem resources are being allocated properly. A lot of programmers take for granted the SANs and mainframes and the power they house.

This was run using about 10 different ranges until the optimal one was found. Testing this has been an eye opener for some.

Thanks a lot for the input here. I am just starting in the sytems architechture game and optimisation of these systems is in turn a DBA's best friend.

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 13

Spot on especially about testing (or the general lack, there of)... thanks for the feedback.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)