SSIS and Date Ranges

  • Trying to narrow the problem down, but a rough outline of the overall process is:

    1. Load from staging table
    2. Populate a couple of calculated columns
    3. Add a bunch of reference keys from tables populated earlier in the process (using Lookup)
    4. Data Conversion on one column -- DT_STR -> DT_WSTR
    5. Add a HashVal using 3rd party Multiple Hash component (used in other Data Flows in the process without issue)
    6. Insert / update as needed
  • just curious how the updates are being handled?

  • Lookup to determine new / existing, conditional split for changed / unchanged, and and OLE DB Command for the actual update, like all the other data flows in the ETL. (which aren't slow)

  • Don't use OLEDB Command for updates ... unless your number of updates is tiny. It's dog slow.

    Instead, direct all of your update rows to (yet) another staging table (as INSERTs), and at the end of your dataflow, issue a single UPDATE from staging to target.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Well, this is a big one (for this data set), at ~200-450K rows.  So, maybe not _tiny_, but certainly not huge, compared to a lot of others out there.

  • cmullican wrote:

    Well, this is a big one (for this data set), at ~200-450K rows.  So, maybe not _tiny_, but certainly not huge, compared to a lot of others out there.

    The OLEDB Command does its updates one row at a time. 500K rows will take a significantly longer time to process in this manner than in the way I described, I'd put money on it.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply