generate rows based on integer

  • hi all, is there are more smart and quick way to generate data based on int column:

    column1,column2,column3,column4,column5

    234,ieifjrjfm,rwthtr,wh4hhtrh,6

    235,ieifjrjfm,rwthtr,wh4hhtrh,2

    236,ieifjrjfm,rwthtr,wh4hhtrh,3

    237,ieifjrjfm,rwthtr,wh4hhtrh,1

    238,ieifjrjfm,rwthtr,wh4hhtrh,8

    Loop through this dataset and duplicate row - 234,ieifjrjfm,rwthtr,wh4hhtrh 6 times in another table | 235,ieifjrjfm,rwthtr,wh4hhtrh 2 times | 236,ieifjrjfm,rwthtr,wh4hhtrh 3 times etc. ??

    I'm doing a double loop which works but very slow, 1 hour to generate ~5 million rows.

  • This should handle 5 million rows in just a couple of minutes...

    --===== Setup a test table (THIS IS NOT A PART OF THE SOLUTION)

    SELECT TOP (1000)

    column1 = IDENTITY(INT,1,1),

    column2 = 'ieifjrjfm',

    column3 = 'rwthtr',

    column4 = 'wh4hhtrh',

    column5 = ABS(CHECKSUM(NEWID()))%10+1

    INTO #YourFirstTable

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    --===== Show the test data (THIS IS NOT A PART OF THE SOLUTION)

    SELECT TOP 100 *

    FROM #YourFirstTable

    ORDER BY column1

    ;

    --===== Now, spawn the new rows according to the quantity in column 5.

    -- Just change this to an insert and you're golden.

    SELECT column1,column2,column3,column4

    FROM #YourFirstTable yft

    JOIN dbo.Tally t

    ON t.N <= yft.column5

    ;

    Now... two things. First, take a look at the article at the first link in my signature line below. With 481 points of your own, it's time you learned how to create a post that people will take more immediate interest in. 😉

    Second, if you don't already know what a Tally table is, it's time to learn about that wonderful tool. Please read the following article to learn how it can replace certain While loops like I just did in the code above.

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh yeah... almost forgot... make sure that "N" in the Tally table is bigger than the largest value of column 5 or at least do a test to see because the code I wrote won't work correctly if "N" is smaller and it won't warn you. It's too busy working to warn you. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • wow!, using this tally approach for the same amount & type of records ran in ~10mins.

    I included the qauntity field in an existing non-clustered index which should speed up things.

    Would it be quicker if this tally table was permanent and not in tempdb?

    I already figured out the max restriction on qty field, thx anyway 😉

    I would imagine if the source table has very simple & not too many fields , it would run even quicker, my table has xml, date, etc etc.

    I have ready many books and attended courses and have never come across this tally table approach, you should be commended.

    Thanks.

  • msarikas (9/29/2010)


    wow!, using this tally approach for the same amount & type of records ran in ~10mins.

    I included the qauntity field in an existing non-clustered index which should speed up things.

    Would it be quicker if this tally table was permanent and not in tempdb?

    I already figured out the max restriction on qty field, thx anyway 😉

    I would imagine if the source table has very simple & not too many fields , it would run even quicker, my table has xml, date, etc etc.

    I have ready many books and attended courses and have never come across this tally table approach, you should be commended.

    Thanks.

    Thanks for the kudo's but the Tally Table isn't my original idea. It was used way back in the 60's and it's frequently called a "Numbers" table now. Shoot... even Celko (the guy above) uses such a thing except (IIRC) he prefers to call his a "sequence" table which is also an appropriate name for it.

    And, I have been commended for it... you just did and you made my day. Thank you very much. 🙂

    I have to agree with Celko though. This is an extreme amount of denormalization. Why do you need to do this? I mean, why can't you just store one row with 1 quantity for each item?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • msarikas (9/29/2010)


    wow!, using this tally approach for the same amount & type of records ran in ~10mins.

    I included the qauntity field in an existing non-clustered index which should speed up things.

    Would it be quicker if this tally table was permanent and not in tempdb?

    I already figured out the max restriction on qty field, thx anyway 😉

    I would imagine if the source table has very simple & not too many fields , it would run even quicker, my table has xml, date, etc etc.

    I have ready many books and attended courses and have never come across this tally table approach, you should be commended.

    Thanks.

    Ah... sorry. Didn't answer the other questions. I usually create a permanent Tally table in a Util database and write a synonym in the other databases to it. Speed won't change noticibly whether the table is in TempDB, local DB, or a Util DB with synonyms.

    You are correct... the fewer the number of columns, the faster things will be. That's usually true with any query, though.

    Since you're doing a table scan, the addition of the quantity field to a clustered index likely won't do anything except slow down inserts.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • im with u jeff that in db world you need to store everything at the lowest grain e.g. date, products etc. will make your dba life easier and the system less prone to major changes.

    Originally the data was housed per quantity as per user specs, now we have to expand this qty field to extract each instance of this record. Its a case of mind changing plus users designing system = disaster! welcome to my world :hehe:

  • msarikas (9/30/2010)


    im with u jeff that in db world you need to store everything at the lowest grain e.g. date, products etc. will make your dba life easier and the system less prone to major changes.

    Originally the data was housed per quantity as per user specs, now we have to expand this qty field to extract each instance of this record. Its a case of mind changing plus users designing system = disaster! welcome to my world :hehe:

    Thanks for the feedback and I have an appreciation for that. Someone reads something somewhere on the internet or in a book and they suddenly become an evangelist for what may (or may not) be a bad method.

    The expansion of rows like this isn't an uncommon request. If it's at all possible, I'd really like to know the business reason they have behind this. What utility does it provide?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Seriously... I'd like to know just from an educational standpoint. The only reason anyone has ever told me why they wanted to do this was to supposedly make a FIFO inventory system a bit "easier" to manage. I'd like to know if there are other reasons.

    Thanks in advance for any information you can share on the subject.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply