query to repeat insert same rows

  • i want to make a query using cte to insert multiple rows with same field.

    i have a result query with information how many rows should be inserted like this.

    KeyHD CountInserted

    1 2

    2 5

    3 7

    it means, KeyHD number 1 insert 2 rows with same field, KeyHD number 2 insert 5 rows, etc..

    i can make the query using cursor and looping it.. but i think it can be solved using cte recursive. i'm not familiar with it..

    Please help...

    Thank you..

  • First set up your data

    CREATE TABLE #Required (

    KeyHD INT --PRIMARY KEY CLUSTERED

    , CountInserted INT

    );

    INSERT INTO #Required (KeyHD, CountInserted)

    VALUES (1, 2), (2, 5), (3, 7);

    Now, using a CTE

    WITH

    L0 AS (SELECT c FROM (VALUES(1),(1)) AS D(c)),

    L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

    L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

    L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

    L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

    L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B)

    SELECT r.KeyHD, 'My Value'

    FROM #Required AS r

    CROSS APPLY (

    SELECT TOP(r.CountInserted) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n

    FROM L5

    ) AS n

    You can also convert the CTE into a virtual Tally Table, which is then re-usable

    CREATE FUNCTION dbo.GetNums(@low AS BIGINT, @high AS BIGINT) RETURNS TABLE

    AS

    RETURN

    WITH

    L0 AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)),

    L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

    L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

    L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

    L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

    L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),

    Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum

    FROM L5)

    SELECT TOP(@high - @low + 1) @low + rownum - 1 AS n

    FROM Nums

    ORDER BY rownum;

    GO

    Now, using the virtual tally table

    SELECT r.KeyHD, 'Some Value'

    FROM #Required AS r

    CROSS APPLY dbo.GetNums(1, CountInserted) AS n;

  • There's a quick 'n' dirty way of doing this which resembles the technique used by Des, whose set-up script I have copied (thanks!):

    IF OBJECT_ID('tempdb..#Required', 'U') IS NOT NULL

    DROP TABLE #Required;

    CREATE TABLE #Required

    (

    KeyHD INT --PRIMARY KEY CLUSTERED

    ,CountInserted INT

    );

    INSERT #Required

    (KeyHD, CountInserted)

    VALUES (1, 2),

    (2, 5),

    (3, 7);

    SELECT r.KeyHD

    FROM #Required r

    CROSS APPLY (SELECT TOP (r.CountInserted)

    1 n

    FROM sys.columns

    ) x;

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • I know I'm splitting hairs but there are quite some differences between the different methods of producing inline tally or number sequences. The method DesNorton posted is quite efficient but it can be improved by almost 50% by simplifying the construct. It may not seem much in terms of actual cost where the StatementSubTreeCost is only 0.00121224 but bringing that number down to 0.000633653 is still substantial.

    Here are two methods which can produce 10^9 rows at half the cost of the code posted previously and 10% of the cost of using sys.columns

    😎

    Method #1, this is slightly less efficient but simpler to write

    DECLARE @SAMPLE_SIZE BIGINT = 1000;

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    SELECT

    NM.N

    FROM NUMS NM;

    Method #2, most efficient method I've found but has a lengthy value session

    DECLARE @SAMPLE_SIZE BIGINT = 1000;

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N

    FROM T T1,T T2,T T3,T T4,T T5)

    SELECT

    NM.N

    FROM NUMS NM;

  • Eirikur Eiriksson (10/1/2016)


    I know I'm splitting hairs but there are quite some differences between the different methods of producing inline tally or number sequences. The method DesNorton posted is quite efficient but it can be improved by almost 50% by simplifying the construct. It may not seem much in terms of actual cost where the StatementSubTreeCost is only 0.00121224 but bringing that number down to 0.000633653 is still substantial.

    Here are two methods which can produce 10^9 rows at half the cost of the code posted previously and 10% of the cost of using sys.columns

    😎

    Method #1, this is slightly less efficient but simpler to write

    DECLARE @SAMPLE_SIZE BIGINT = 1000;

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    SELECT

    NM.N

    FROM NUMS NM;

    Method #2, most efficient method I've found but has a lengthy value session

    DECLARE @SAMPLE_SIZE BIGINT = 1000;

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)

    ,(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N

    FROM T T1,T T2,T T3,T T4,T T5)

    SELECT

    NM.N

    FROM NUMS NM;

    Thanks for this Eirikur. It's always good to find ways of improving performance.

    What I would like to understand, is

    1 - what makes Method#2 perform better than Method#1, and what makes these methods perform better than the one that I posted?

    2 - is there a performance difference between your "(ORDER BY @@VERSION)" and my "(ORDER BY (SELECT NULL))"?

    3 - is there a performance difference between your old join format ("FROM T T1,T T2, ...") and the new join format ("SELECT 1 FROM L0 AS A CROSS JOIN L0 AS B CROSS JOIN ...")?

  • DesNorton (10/3/2016)


    Thanks for this Eirikur. It's always good to find ways of improving performance.

    What I would like to understand, is

    1 - what makes Method#2 perform better than Method#1, and what makes these methods perform better than the one that I posted?

    2 - is there a performance difference between your "(ORDER BY @@VERSION)" and my "(ORDER BY (SELECT NULL))"?

    3 - is there a performance difference between your old join format ("FROM T T1,T T2, ...") and the new join format ("SELECT 1 FROM L0 AS A CROSS JOIN L0 AS B CROSS JOIN ...")?

    1) Method #2 has fewer join operators than the #1, there are more constant scans operators but the cost of those is less than the joins.

    2) No difference between @@VERSION and (SELECT NULL), simply tells the server to ignore any specific order and apply the enumeration to the order of appearance.

    3) No, the onld style join is effectively a short hand for the CROSS JOIN syntax, the server will replace the commas with CROSS JOIN clauses.

    The main thing are the number of more costly operators in the plan, the cardinality estimate offset of each of those and alignement of the row origination in the flow. If all rows originate from the initial CTE of few rows, then all the join operators above that source have to be called propotionally to the desired cardinality of the output.

    😎

  • You can also improve your function by adding the WITH SCHEMABINDING option. Even though you're not using any physical tables, SQL checks whether the underlying tables have changed if you don't use this option. Using the WITH SCHEMABINDING option allows SQL to skip this check, because the underlying tables CAN'T be changed in a way that would affect the function.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (10/3/2016)


    You can also improve your function by adding the WITH SCHEMABINDING option. Even though you're not using any physical tables, SQL checks whether the underlying tables have changed if you don't use this option. Using the WITH SCHEMABINDING option allows SQL to skip this check, because the underlying tables CAN'T be changed in a way that would affect the function.

    Drew

    Good point Drew and often overlooked.

    😎

  • Thank you Eirikur and Drew.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply