September 29, 2010 at 6:02 pm
Hi all, is there a smarter and quicker way to generate data based on an int column?
column1,column2,column3,column4,column5
234,ieifjrjfm,rwthtr,wh4hhtrh,6
235,ieifjrjfm,rwthtr,wh4hhtrh,2
236,ieifjrjfm,rwthtr,wh4hhtrh,3
237,ieifjrjfm,rwthtr,wh4hhtrh,1
238,ieifjrjfm,rwthtr,wh4hhtrh,8
I need to loop through this dataset and duplicate each row into another table according to column5: 234,ieifjrjfm,rwthtr,wh4hhtrh 6 times | 235,ieifjrjfm,rwthtr,wh4hhtrh 2 times | 236,ieifjrjfm,rwthtr,wh4hhtrh 3 times, etc.
I'm doing a double loop, which works but is very slow: about 1 hour to generate ~5 million rows.
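Roughly, the double loop looks like this (a simplified sketch; SourceTable and TargetTable are placeholder names, my real code differs):
DECLARE @Col1 INT, @Qty INT, @i INT;
 SELECT @Col1 = MIN(column1) FROM SourceTable; -- outer loop: one pass per source row
 WHILE @Col1 IS NOT NULL
 BEGIN
        SELECT @Qty = column5 FROM SourceTable WHERE column1 = @Col1;
        SET @i = 1;
        WHILE @i <= @Qty -- inner loop: one insert per unit of quantity
        BEGIN
               INSERT INTO TargetTable (column1, column2, column3, column4)
               SELECT column1, column2, column3, column4
                 FROM SourceTable
                WHERE column1 = @Col1;
               SET @i = @i + 1;
        END;
        SELECT @Col1 = MIN(column1) FROM SourceTable WHERE column1 > @Col1;
 END;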
September 29, 2010 at 6:39 pm
This should handle 5 million rows in just a couple of minutes...
--===== Set up a test table (THIS IS NOT A PART OF THE SOLUTION).
     -- The cross join just guarantees more than enough rows for TOP (1000).
 SELECT TOP (1000)
        column1 = IDENTITY(INT,1,1),
        column2 = 'ieifjrjfm',
        column3 = 'rwthtr',
        column4 = 'wh4hhtrh',
        column5 = ABS(CHECKSUM(NEWID()))%10+1 -- random quantity from 1 to 10
   INTO #YourFirstTable
   FROM master.sys.all_columns ac1
  CROSS JOIN master.sys.all_columns ac2
;
--===== Show the test data (THIS IS NOT A PART OF THE SOLUTION)
SELECT TOP 100 *
FROM #YourFirstTable
ORDER BY column1
;
--===== Now, spawn the new rows according to the quantity in column 5.
-- Just change this to an insert and you're golden.
 SELECT yft.column1, yft.column2, yft.column3, yft.column4
   FROM #YourFirstTable yft
   JOIN dbo.Tally t
     ON t.N <= yft.column5
;
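In case it helps, that insert might look something like this (the target table name #YourSecondTable is just a placeholder; it assumes the table already exists with matching columns):
--===== Same spawn, written as an insert (#YourSecondTable is hypothetical).
 INSERT INTO #YourSecondTable (column1, column2, column3, column4)
 SELECT yft.column1, yft.column2, yft.column3, yft.column4
   FROM #YourFirstTable yft
   JOIN dbo.Tally t
     ON t.N <= yft.column5
;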
Now... two things. First, take a look at the article at the first link in my signature line below. With 481 points of your own, it's time you learned how to create a post that people will take more immediate interest in. 😉
Second, if you don't already know what a Tally table is, it's time to learn about that wonderful tool. Please read the following article to learn how it can replace certain WHILE loops, as I just did in the code above.
http://www.sqlservercentral.com/articles/T-SQL/62867/
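If you don't have a Tally table handy yet, a minimal one along the lines of that article might look like this (11,000 rows is an arbitrary size; make sure it covers your largest column5 value):
--===== Build a small permanent Tally table (a sketch; size it for your data).
 SELECT TOP 11000
        N = IDENTITY(INT,1,1)
   INTO dbo.Tally
   FROM master.sys.all_columns ac1
  CROSS JOIN master.sys.all_columns ac2
;
--===== A clustered PK on N keeps the joins fast.
  ALTER TABLE dbo.Tally
    ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)
;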
--Jeff Moden
Change is inevitable... Change for the better is not.
September 29, 2010 at 6:50 pm
Oh yeah... almost forgot... make sure that the largest "N" in the Tally table is at least as big as the largest value of column5, or at least run a test to check, because the code I wrote won't work correctly if "N" is smaller and it won't warn you. It's too busy working to warn you. 😉
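A quick test along these lines (just a sketch) returns a row only when the Tally table is too small:
--===== Sanity check: a row comes back only if MAX(column5) exceeds MAX(N).
 SELECT MaxQty = MAX(yft.column5)
   FROM #YourFirstTable yft
 HAVING MAX(yft.column5) > (SELECT MAX(t.N) FROM dbo.Tally t)
;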
--Jeff Moden
Change is inevitable... Change for the better is not.
September 29, 2010 at 8:55 pm
Wow! Using this Tally approach, the same amount & type of records ran in ~10 mins.
I included the quantity field in an existing non-clustered index, which should speed things up.
Would it be quicker if this Tally table were permanent and not in tempdb?
I already figured out the max restriction on the qty field, thx anyway 😉
I would imagine that if the source table had very simple fields and not too many of them, it would run even quicker; my table has xml, date, etc.
I have read many books and attended courses and have never come across this Tally table approach; you should be commended.
Thanks.
September 30, 2010 at 11:27 pm
msarikas (9/29/2010)
Wow! Using this Tally approach, the same amount & type of records ran in ~10 mins. I included the quantity field in an existing non-clustered index, which should speed things up.
Would it be quicker if this Tally table were permanent and not in tempdb?
I already figured out the max restriction on the qty field, thx anyway 😉
I would imagine that if the source table had very simple fields and not too many of them, it would run even quicker; my table has xml, date, etc.
I have read many books and attended courses and have never come across this Tally table approach; you should be commended.
Thanks.
Thanks for the kudos, but the Tally table isn't my original idea. It was used way back in the '60s and it's frequently called a "Numbers" table now. Shoot... even Celko (the guy above) uses such a thing, except (IIRC) he prefers to call his a "Sequence" table, which is also an appropriate name for it.
And, I have been commended for it... you just did and you made my day. Thank you very much. 🙂
I have to agree with Celko, though. This is an extreme amount of denormalization. Why do you need to do this? I mean, why can't you just store one row with a quantity for each item?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 30, 2010 at 11:32 pm
msarikas (9/29/2010)
Wow! Using this Tally approach, the same amount & type of records ran in ~10 mins. I included the quantity field in an existing non-clustered index, which should speed things up.
Would it be quicker if this Tally table were permanent and not in tempdb?
I already figured out the max restriction on the qty field, thx anyway 😉
I would imagine that if the source table had very simple fields and not too many of them, it would run even quicker; my table has xml, date, etc.
I have read many books and attended courses and have never come across this Tally table approach; you should be commended.
Thanks.
Ah... sorry. Didn't answer the other questions. I usually create a permanent Tally table in a Util database and create a synonym to it in the other databases. Speed won't change noticeably whether the table is in TempDB, the local DB, or a Util DB with synonyms.
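For example, the synonym might be created like this ("Util" and "YourDB" are placeholder database names here):
--===== Point dbo.Tally in the local database at the permanent table in Util.
 USE YourDB;
 CREATE SYNONYM dbo.Tally FOR Util.dbo.Tally;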
You are correct... the fewer the number of columns, the faster things will be. That's usually true with any query, though.
Since you're doing a table scan, the addition of the quantity field to a non-clustered index likely won't do anything except slow down inserts.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 30, 2010 at 11:59 pm
I'm with you, Jeff, that in the DB world you need to store everything at the lowest grain (e.g. dates, products, etc.); it will make your DBA life easier and the system less prone to major changes.
Originally the data was housed with a quantity field as per the user specs; now we have to expand that qty field to extract each instance of the record. It's a case of changing minds plus users designing the system = disaster! Welcome to my world :hehe:
October 1, 2010 at 7:54 am
msarikas (9/30/2010)
I'm with you, Jeff, that in the DB world you need to store everything at the lowest grain (e.g. dates, products, etc.); it will make your DBA life easier and the system less prone to major changes. Originally the data was housed with a quantity field as per the user specs; now we have to expand that qty field to extract each instance of the record. It's a case of changing minds plus users designing the system = disaster! Welcome to my world :hehe:
Thanks for the feedback and I have an appreciation for that. Someone reads something somewhere on the internet or in a book and they suddenly become an evangelist for what may (or may not) be a bad method.
The expansion of rows like this isn't an uncommon request. If it's at all possible, I'd really like to know the business reason behind it. What utility does it provide?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2010 at 12:14 am
Seriously... I'd like to know just from an educational standpoint. The only reason anyone has ever told me why they wanted to do this was to supposedly make a FIFO inventory system a bit "easier" to manage. I'd like to know if there are other reasons.
Thanks in advance for any information you can share on the subject.
--Jeff Moden
Change is inevitable... Change for the better is not.