INSERT A Record Based on Count

  • I have the following data,

    CustomerID EngID EngCount

    1 A11 2

    5 Z12 1

    10 X15 3

    I need to display this data as,

    CustomerID EngID EngCount

    1 A11 1

    1 A11 1

    5 Z12 1

    10 X15 1

    10 X15 1

    10 X15 1

    Based on the EngCount, I need to INSERT the records as above.


    Kindest Regards,

  • This is another classic example of how a "tally" or "numbers" table can be used. We'll see if Jeff's around 😀

    First, if you haven't create a numbers/tally table you'll need to do so.

    SELECT TOP 10000 --equates to more than 30 years of dates

    IDENTITY(INT,1,1) AS N

    INTO dbo.MyNumbers

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.MyNumbers

    ADD CONSTRAINT PK_MyNumbers_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.MyNumbers TO PUBLIC

    Now cross join it to your query, but limit the return set by the EngCount column ....

    DECLARE @table TABLE (CustomerID INT, EngID CHAR(3), EngCount INT)

    INSERT @table

    SELECT 1,'A11',2 UNION

    SELECT 5,'Z12',1 UNION

    SELECT 10,'X15',3

    SELECT

    *

    FROM

    @table t1

    CROSS JOIN tally t2

    WHERE

    t2.N <= t1.EngCount

    See Jeff, I have been listening ... LOL

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Great solution Jason. 😉

    The original specification required that the data be in a not incrmental format. For example, if A11 has 2 then it should be displayed as

    1 a11 1

    1 a11 1

    not:

    1 a11 1

    1 a11 2

    So, the following modifications can be applied to your script.

    SELECT

    CustomerID, EngID, 1 AS [EngCount]

    FROM

    @table t1

    CROSS JOIN mynumbers t2

    WHERE

    t2.N <= t1.EngCount

  • Adam Haines (11/8/2007)


    Great solution Jason. 😉

    The original specification required that the data be in a not incrmental format. For example, if A11 has 2 then it should be displayed as

    1 a11 1

    1 a11 1

    not:

    1 a11 1

    1 a11 2

    you are right Adam, however you don't need to do all of the CASE and N-N+1 thing. Each row will just be a 1. 😀

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • AH, you fixed your code right as I made my last post.... LOL

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Yeah, I stopped and thought about it and a light went off :hehe:

  • Nice 🙂

    Don't forget, though, in SQL2005 we don't need to use permanent tables and IDENTITY columns to generate our numbers for us. Use ROW_NUMBER() function and a derived table on the fly:

    Select t.CustomerID, t.EngID, 1 As EngCount

    From @Table As t

    Inner Join

    (

    Select top 100 ROW_NUMBER() Over(Order By c1.id) As RowNumber

    From master.dbo.syscolumns As c1

    Cross Join master.dbo.syscolumns As c2

    ) dt

    On (t.EngCount >= dt.RowNumber)

    Order by t.CustomerID, t.EngID

    I'm sure there's also a way to do this using a recursive CTE, but I'm about 5 cups of coffee short of that solution 😛

  • Hey, all it took was 1 cup of coffee:

    [font="Courier New"]With CTE_Expand

    As

    (

    Select CustomerID, EngID, 1 As EngNumber

    From @Table

    Union All

    Select c.CustomerID, c.EngID, c.EngNumber + 1

    From CTE_Expand As c

    Inner Join @Table As t

    On (c.CustomerID = t.CustomerID And

    c.EngID = t.EngID And

    c.EngNumber < t.EngCount)

    )

    Select CustomerID, EngID, 1 As EngCount

    From CTE_Expand

    Order By CustomerID, EngID[/font]

  • Nice!:cool:

  • Wow, that's pretty nice PW!

    Trigger, can you run both of these against your data and get the execution stats. My test shows PW's is a HUGE amount faster.

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Jason,

    Which script from PW did you compare to yours, the CTE or the subquery one?

  • It wasn't an entirely fair comparison, because I hard-coded the ROW_NUMBER() solution to use TOP 100.

    I'd expect the CTE to win out for smallish values of "EngCount" because it only expands out as high as necessary based on the max value in the original dataset.

    If the actual dataset has records where EngCount is orders of magnitude higher, then the CTE might be very sluggish in comparison to joining a derived table.

  • Adam Haines (11/8/2007)


    Jason,

    Which script from PW did you compare to yours, the CTE or the subquery one?

    the CTE

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Jason Selburg (11/8/2007)


    Wow, that's pretty nice PW!

    Trigger, can you run both of these against your data and get the execution stats. My test shows PW's is a HUGE amount faster.

    First, nicely done to all... good to see people thinking and taking performance into consideration...

    Jason, nicely done with the Tally table and the tests of the other methods... big questions for me ('cause I still don't have 2k5 to test with) are things like how many rows did you test against, what was the perfomance gain, and what does your test code look like?

    Thanks...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Actually, after you asked, I did a bit more testing and I was completely backwards in my statement.

    So if anyone wants to double check, please do.....

    I built a testData table that holds two, three, four, five and six digit random alpha-numeric strings and random numbers for the power of ten, hundred and thousand. I currently have 1mil rows.

    SELECT TOP 1000000

    IDENTITY(INT,1,1) AS nDex

    ,LEFT(CAST(NEWID() AS VARCHAR(128)), 2) AS twoDigit

    ,LEFT(CAST(NEWID() AS VARCHAR(128)), 3) AS threeDigit

    ,LEFT(CAST(NEWID() AS VARCHAR(128)), 4) AS fourDigit

    ,LEFT(CAST(NEWID() AS VARCHAR(128)), 5) AS fiveDigit

    ,LEFT(CAST(NEWID() AS VARCHAR(128)), 6) AS sixDigit

    ,(1 + ABS(CHECKSUM(NEWID())) % 10) AS tenPower

    ,(10 + ABS(CHECKSUM(NEWID())) % 90) AS hundredPower

    ,(100 + ABS(CHECKSUM(NEWID())) % 900) AS thousandPower

    INTO

    dbo.testData

    FROM

    Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    SAMPLE DATA

    nDextwoDigitthreeDigitfourDigitfiveDigitsixDigittenPowerhundredPowerthousandPower

    1 7F 527 1456 BBA13 7099C9 7 99 378

    2 48 832 652E AE393 D4A4E8 8 77 667

    3 AC 666 743E 15189 99DB9C 4 23 428

    4 A8 DCB 114A 311FD 4BAB45 9 46 716

    5 A4 0B8 A9B6 54E7B B81C33 2 66 858

    6 35 61B DF41 A4A77 44FA6B 8 17 832

    7 E3 2AE 7353 65C92 C7CDF4 7 93 757

    8 91 D04 C5B9 DE5C5 0AD2C6 10 52 357

    9 E6 5D0 ED1F 02AA4 8ED6E8 7 73 890

    10 85 F49 601C 596EA 77A733 6 26 125

    Now I modified the field names to fit my testData table. I'm using the nDex field as my customerID and the tenPower as the EngCount.

    Using the testing code below I got a 3% to 97% comparison.

    SELECT TOP 100000

    *

    FROM

    dbo.testData t1

    CROSS JOIN tally t2

    WHERE

    t2.N <= t1.tenPower;

    -- VS.

    WITH

    CTE_Expand

    AS (SELECT TOP 100000 nDex, twodigit, 1 AS tenPower

    FROM dbo.testData

    UNION ALL

    SELECT c.nDex, c.twodigit, c.tenPower + 1

    FROM

    CTE_Expand AS c

    INNER JOIN dbo.testData AS t

    ON c.nDex = t.nDex AND c.tenPower < t.tenPower)

    SELECT nDex, twodigit, 1 AS EngCount

    FROM CTE_Expand

    ORDER BY nDex, twoDigit

    Where the first code ( Mine 😀 ) has a subtree cost of 2.21972 and the CTE version is 68.377.

    It's my understanding that the subtree cost is roughly equivalent to CPU resources required. Correct?

    So in the end, the Tally Table Cross Join solution is more efficient and if anyone disagrees, I'd love to be proven wrong and learn some more .... :hehe:

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg

Viewing 15 posts - 1 through 15 (of 60 total)

You must be logged in to reply to this topic. Login to reply