Import specific rows from Excel

  • Hello,
    I'll get right to it... I'm trying to schedule a package to pull data from a shared Excel source (which is being added to each weekday...maybe 10 records a day or so )  and into my SQL table.  Basically, I need to import only the new records.  Right now, I'm using a Conditional Split and bringing in today's date minus 1.  This actually works very well, but the problem is that on Monday, all Fridays, Saturdays, and Sundays data is dumped into the Excel file, so, if I use the Today minus one, I'll miss out on 2 days of data.  Perhaps there's a better method to use than the conditional split... you guys tell me.... I'd covet any help.  I hope I laid my issue out clearly.
    thank you!

  • John524 - Tuesday, December 4, 2018 4:16 PM

    Hello,
    I'll get right to it... I'm trying to schedule a package to pull data from a shared Excel source (which is being added to each weekday...maybe 10 records a day or so )  and into my SQL table.  Basically, I need to import only the new records.  Right now, I'm using a Conditional Split and bringing in today's date minus 1.  This actually works very well, but the problem is that on Monday, all Fridays, Saturdays, and Sundays data is dumped into the Excel file, so, if I use the Today minus one, I'll miss out on 2 days of data.  Perhaps there's a better method to use than the conditional split... you guys tell me.... I'd covet any help.  I hope I laid my issue out clearly.
    thank you!

    You realise that this is just going to get slower and slower over time, I am sure.

    I'd do this differently from you.

    First, build another table which is almost identical to your target ... let's call it wrk.t1.
    Your load process then becomes

    1. Truncate table wrk.t1
    2. Import all source rows to wrk.t1
    3. Perform an INSERT ... SELECT WHERE NOT EXISTS() from wrk.t1 to your target table.

    Assuming you have a proper unique column (or combination of columns) to match on, this method will make sure that you have all of the data, every time.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • thank you Phil.  I'm brand new to SSIS, so this is exactly what I needed to hear.  This makes good sense

  • Phil Parkin - Tuesday, December 4, 2018 5:24 PM

    John524 - Tuesday, December 4, 2018 4:16 PM

    Hello,
    I'll get right to it... I'm trying to schedule a package to pull data from a shared Excel source (which is being added to each weekday...maybe 10 records a day or so )  and into my SQL table.  Basically, I need to import only the new records.  Right now, I'm using a Conditional Split and bringing in today's date minus 1.  This actually works very well, but the problem is that on Monday, all Fridays, Saturdays, and Sundays data is dumped into the Excel file, so, if I use the Today minus one, I'll miss out on 2 days of data.  Perhaps there's a better method to use than the conditional split... you guys tell me.... I'd covet any help.  I hope I laid my issue out clearly.
    thank you!

    You realise that this is just going to get slower and slower over time, I am sure.

    I'd do this differently from you.

    First, build another table which is almost identical to your target ... let's call it wrk.t1.
    Your load process then becomes

    1. Truncate table wrk.t1
    2. Import all source rows to wrk.t1
    3. Perform an INSERT ... SELECT WHERE NOT EXISTS() from wrk.t1 to your target table.

    Assuming you have a proper unique column (or combination of columns) to match on, this method will make sure that you have all of the data, every time.

    If you're going to take it that far, just import the whole spreadsheet and say "done" so that you don't ever have to worry about things ever gettng out of sync.  That'll also allow you to do the 2 table green apple quickstep so that the total downtime for your users is the amount of time it takes to do a "drop'n'swap" of a synonym to repoint to the latest active table once it has been loaded.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply