Insert until record count reaches limit

  • Hi Friends,

    I have a scenario where records come in from one table and have to be inserted into another. The incoming count is usually in the millions. I have been asked to write T-SQL or create an SSIS package so that once the insert count reaches 1 million, the inserts stop and move on to the next table until that one also reaches a million, then a third table, and so on.

    I also have to check whether the first table already has rows; for example, if it already holds 10K records, I only insert (1 million - 10K) = 990,000.

    All tables have the same structure; only the name changes: Table1, Table2, etc.

    For example, if the incoming rows number 10 million:

    Table1 - 1 million
    Table2 - 1 million
    Table3 - 1 million
    ...
    Table10 - 1 million

    The remaining rows must be inserted into the next destination table in the same way.

    Is this possible? I know this may sound crazy, but I'm curious: if it is possible, I will spend time on it; otherwise I will say no to the client. 🙂

    Any suggestions would be appreciated.

    Thanks,
    Charmer

  • You could use a CTE and ROW_NUMBER and insert all rows that have a row number of 1,000,000 or less.

    WITH x AS (
        SELECT MyValue,
               ROW_NUMBER() OVER (ORDER BY MyValue) AS rw
        FROM #MySelectTable
    )
    INSERT INTO #MyInsertTable
    SELECT MyValue
    FROM x
    WHERE rw <= 1000000;

    To keep the rows already in the table and insert new rows without exceeding the million limit, you could add a predicate requiring the row number to be greater than the count of rows already in the table you are inserting into.

    There is no checking here to ensure that duplicate rows aren't inserted.

    WITH x AS (
        SELECT MyValue,
               ROW_NUMBER() OVER (ORDER BY MyValue) AS rw
        FROM #MySelectTable
    )
    INSERT INTO #MyInsertTable
    SELECT MyValue
    FROM x
    WHERE rw > (SELECT COUNT(*) FROM #MyInsertTable)
      AND rw <= 1000000;
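
    Extending that to the ten-table scenario in the original post, a loop over the destinations with dynamic SQL could top each table up to the million-row cap before moving on. The sketch below is untested; dbo.Table1 through dbo.Table10, MyValue, and #MySelectTable are just the placeholder names used above, not a real schema.

    -- Stage the incoming rows once, with a stable row number to walk through.
    IF OBJECT_ID('tempdb..#Incoming') IS NOT NULL DROP TABLE #Incoming;
    SELECT MyValue,
           ROW_NUMBER() OVER (ORDER BY MyValue) AS rw
    INTO #Incoming
    FROM #MySelectTable;

    DECLARE @i int = 1, @free int, @sql nvarchar(max),
            @done int = 0,  -- rows already routed to some destination table
            @total int = (SELECT COUNT(*) FROM #Incoming);

    WHILE @i <= 10 AND @done < @total
    BEGIN
        -- How many rows this table can still take before hitting 1,000,000.
        SET @sql = N'SELECT @free = 1000000 - COUNT(*) FROM dbo.Table'
                 + CAST(@i AS nvarchar(2)) + N';';
        EXEC sp_executesql @sql, N'@free int OUTPUT', @free = @free OUTPUT;

        IF @free > 0
        BEGIN
            SET @sql = N'INSERT INTO dbo.Table' + CAST(@i AS nvarchar(2))
                     + N' (MyValue) SELECT MyValue FROM #Incoming'
                     + N' WHERE rw > @done AND rw <= @done + @free;';
            EXEC sp_executesql @sql, N'@done int, @free int',
                               @done = @done, @free = @free;
            -- May overshoot @total on the last table; the WHERE clause caps it.
            SET @done = @done + @free;
        END

        SET @i = @i + 1;
    END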

  • Charmer (3/22/2016)

    What are they expecting to do with this?

    It would be far easier to insert everything in a single table and use partitioned views to show one million at a time.
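
    For reference, a classic partitioned view is just a UNION ALL over member tables whose CHECK constraints let the optimizer skip members a query can't touch. The tables, columns, and date ranges below are purely illustrative, not from this thread:

    -- Each member table carries a CHECK constraint on the partitioning column.
    CREATE TABLE dbo.Sales2015 (
        SaleDate date  NOT NULL CHECK (SaleDate >= '20150101' AND SaleDate < '20160101'),
        Amount   money NOT NULL
    );
    CREATE TABLE dbo.Sales2016 (
        SaleDate date  NOT NULL CHECK (SaleDate >= '20160101' AND SaleDate < '20170101'),
        Amount   money NOT NULL
    );
    GO
    CREATE VIEW dbo.Sales AS
        SELECT SaleDate, Amount FROM dbo.Sales2015
        UNION ALL
        SELECT SaleDate, Amount FROM dbo.Sales2016;
    GO
    -- Thanks to the constraints, this only touches dbo.Sales2016.
    SELECT SUM(Amount)
    FROM dbo.Sales
    WHERE SaleDate >= '20160301' AND SaleDate < '20160401';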

    Luis C.
  • Hi Luis,

    I am not sure, but I think they are going to reference these tables in an SSRS report. I heard they are using SharePoint and that report rendering time should not exceed 3 minutes.

    So I suspect they think that a single table holding millions of records might lead to performance issues. That is what I have heard so far.

    Can you give me your suggestions?

    Thanks,
    Charmer

  • If the data is being reported on, then isn't it more important to ensure it is correct? Could the data be consolidated during the load process? That would reduce the number of rows and help ensure the reporting is correct. A sketch follows below.
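
    As a sketch of that consolidation idea, the load itself could aggregate the detail down to one row per reporting key; SomeKey, MyValue, and dbo.ReportSummary are invented names for illustration:

    -- Hypothetical: collapse detail rows to one summary row per key at load time.
    INSERT INTO dbo.ReportSummary (SomeKey, RowCnt, Total)
    SELECT SomeKey, COUNT(*), SUM(MyValue)
    FROM #MySelectTable
    GROUP BY SomeKey;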

  • Charmer (3/23/2016)

    Using multiple tables with arbitrary divisions might actually slow things down. If you need to query 2 or 3 tables for the report, you'll need to traverse the same number of b-trees. With a single large table and a good clustered index, the performance will be great. It wouldn't matter whether they have one million rows or one billion, as long as they consistently read the minimum number of pages, which is achieved by the correct design of the clustered index.

    They're trying to achieve manual table partitioning, but table partitioning isn't there to improve performance.
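
    As a minimal sketch of the clustered-index point above, assuming the report always filters on a date range, clustering on that column keeps a query to one b-tree seek plus a scan of only the qualifying pages, whatever the total row count. The table and column names are invented:

    -- Hypothetical reporting table; the report is assumed to filter on ReportDate.
    CREATE TABLE dbo.ReportData (
        ReportDate date  NOT NULL,
        CustomerId int   NOT NULL,
        Amount     money NOT NULL
    );

    CREATE CLUSTERED INDEX CIX_ReportData_ReportDate
        ON dbo.ReportData (ReportDate, CustomerId);

    -- Reads only the pages for its date range, however big the table is.
    SELECT CustomerId, SUM(Amount) AS Total
    FROM dbo.ReportData
    WHERE ReportDate >= '20160301' AND ReportDate < '20160401'
    GROUP BY CustomerId;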

    Luis C.
