Uploading data into MULTIPLE related tables without SSIS

  • A typical ETL scenario is uploading data into a database from a single text file. During the upload, data cleaning and pre-processing are necessary. The text file contains fewer than 100,000 rows.

    The text processor creates a temporary table and dumps the text file into it as-is. The next step is inserting the data into approximately 10 tables, each of which has a surrogate identity key.

    My question is: is there an efficient way to populate each table? Since the database is normalized, the traditional approach would be:

    1) insert row into table A and get the newly inserted identity PK.

    2) insert row into table B and get the newly inserted identity PK.

    3) insert row into table C with keys in 1 and 2.

    4) and so forth..

    Even though cursors were avoided, this row-by-row operation is highly inefficient. Are there any solutions or recommendations to address this problem?
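    For concreteness, the row-by-row pattern described above looks roughly like this (table and column names are illustrative, not from an actual schema; each insert round-trips to capture the new key with SCOPE_IDENTITY()):

    ```sql
    -- Sketch of the row-by-row pattern: one round trip per table, per row.
    DECLARE @idA INT, @idB INT;

    INSERT INTO TableA (Col1, Col2) VALUES ('a1', 'a2');
    SET @idA = SCOPE_IDENTITY();   -- 1) capture TableA's new identity

    INSERT INTO TableB (Col1, Col2) VALUES ('b1', 'b2');
    SET @idB = SCOPE_IDENTITY();   -- 2) capture TableB's new identity

    INSERT INTO TableC (Col1, ColFKtoTableA, ColFKtoTableB)
    VALUES ('c1', @idA, @idB);     -- 3) wire up both foreign keys
    ```

    Repeated for every source row, this is exactly the per-row overhead the question is trying to avoid.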

    Thanks

  • This was removed by the editor as SPAM

  • Thanks for your feedback stewartc-708166.

    I believe that the problem with the suggested approach is that there is no way to relate TableX back to #Temp.

    To explain myself more clearly, I will illustrate my problem with an example.

    TempTable - with a surrogate ID

    TempTable will be truncated after the import operation completes.

    PK ColA1 ColA2 ColA3 ColA4 ColA5 ColB1 ColB2 ColB3 ColB4 ColC1 ColC2 ColC3

    Table A

    PK Col1 Col2 Col3 Col4 Col5

    Table B

    PK Col1 Col2 Col3 Col4 Col5

    Table C

    PK Col1 Col2 Col3 Col4 ColFKtoTableA ColFKtoTableB

    Columns TempTable - ColA1, ColA2, ColA3, ColA4 map to TableA. These fields are processed and could be the result of a lookup, string manipulation, or some other type of processing.

    Columns TempTable - ColB1, ColB2, ColB3, ColB4 map to TableB.

    Columns TempTable - ColC1, ColC2, ColC3 map to TableC.

    Alternatively, it would be interesting to learn how to implement the above scenario with SSIS.
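    One way to keep the link between TempTable rows and the newly generated identities without any schema changes is MERGE with an OUTPUT clause (a sketch, assuming SQL Server 2008 or later; unlike INSERT ... OUTPUT, a MERGE's OUTPUT clause may reference source columns, which is what makes the mapping possible; the names follow the example above and the mapping table is an assumption):

    ```sql
    -- Map TempTable.PK to TableA's new identity values, set-based.
    DECLARE @MapA TABLE (TempPK INT, TableAPK INT);

    MERGE TableA AS tgt
    USING (SELECT PK, ColA1, ColA2, ColA3, ColA4, ColA5 FROM TempTable) AS src
       ON 1 = 0                        -- never matches, so every row inserts
    WHEN NOT MATCHED THEN
        INSERT (Col1, Col2, Col3, Col4, Col5)
        VALUES (src.ColA1, src.ColA2, src.ColA3, src.ColA4, src.ColA5)
    OUTPUT src.PK, INSERTED.PK INTO @MapA (TempPK, TableAPK);
    ```

    Repeating the same pattern for TableB yields a second mapping table, and TableC can then be populated with a single INSERT ... SELECT joining TempTable to both mappings.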

  • If you have a PK in your temp table, why not just use that PK as your identifier in all your subsequent tables?

    --------------------------------------------------------------------------
    When you realize you've dug yourself into a hole....Step 1...stop digging.

  • I would prefer not to include an association between the tables and the temp table.

    I would like to avoid schema changes, and the temp table is only a temporarily persisted table; it is truncated after a successful run. I am aware that the truncate operation preserves identity keys.

  • It's not an association to the temp table. Just use the temp table's key value as the key value in the destination tables: instead of making the destination PK an identity, use the temp table's key. If you don't want to change the column from an identity, you can perform an identity insert. If, however, you have values in those tables that you cannot change because this is production, then this idea is not a good one.
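    The identity-insert idea above can be sketched as follows (a sketch only, using the example schema from earlier in the thread; as noted, it is only safe when the destination key range is empty or fully under your control):

    ```sql
    -- Reuse TempTable's PK directly as the destination key, set-based.
    SET IDENTITY_INSERT TableA ON;

    INSERT INTO TableA (PK, Col1, Col2, Col3, Col4, Col5)
    SELECT PK, ColA1, ColA2, ColA3, ColA4, ColA5
    FROM   TempTable;

    SET IDENTITY_INSERT TableA OFF;

    -- After doing the same for TableB, TableC's foreign keys are just
    -- the same PK value; no lookups needed.
    INSERT INTO TableC (Col1, Col2, Col3, ColFKtoTableA, ColFKtoTableB)
    SELECT ColC1, ColC2, ColC3, PK, PK
    FROM   TempTable;
    ```

    Note that SET IDENTITY_INSERT can be ON for only one table at a time per session, so each destination table is handled in its own step.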

    --------------------------------------------------------------------------
    When you realize you've dug yourself into a hole....Step 1...stop digging.

  • Thanks, Grasshopper, for the input. Indeed, the system is already in production, hence the need to opt for an alternative solution...

  • This was removed by the editor as SPAM

  • Yes, my mistake. Truncate vs. delete. Guess you are right 🙂

    Thanks

  • This was removed by the editor as SPAM

  • I believe that this solution will work out well.

    It is assumed that the columns listed in the predicate are unique.

    ie.

    [Table A].Col1 = ColA1

    AND [Table A].Col2 = ColA2

    AND [Table A].Col3 = ColA3

    AND [Table A].Col4 = ColA4

    AND [Table A].Col5 = ColA5
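    Put together, the join-back approach sketched by that predicate looks like this (illustrative only, reusing the example schema from earlier in the thread and assuming the listed columns really are unique):

    ```sql
    -- Plain set-based insert; identities are generated by TableA.
    INSERT INTO TableA (Col1, Col2, Col3, Col4, Col5)
    SELECT ColA1, ColA2, ColA3, ColA4, ColA5
    FROM   TempTable;

    -- Recover the generated keys by joining back on the unique columns,
    -- then populate TableC in one statement.
    INSERT INTO TableC (Col1, Col2, Col3, ColFKtoTableA, ColFKtoTableB)
    SELECT t.ColC1, t.ColC2, t.ColC3, a.PK, b.PK
    FROM   TempTable t
    JOIN   TableA a ON  a.Col1 = t.ColA1 AND a.Col2 = t.ColA2
                    AND a.Col3 = t.ColA3 AND a.Col4 = t.ColA4
                    AND a.Col5 = t.ColA5
    JOIN   TableB b ON  b.Col1 = t.ColB1 AND b.Col2 = t.ColB2
                    AND b.Col3 = t.ColB3 AND b.Col4 = t.ColB4;
    ```

    If any of those columns are not actually unique, the join will multiply rows, so the uniqueness assumption is the critical precondition here.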

Viewing 11 posts - 1 through 11 (of 11 total)
