Incremental Load

  • Hi All,

    Need Approach to load Incremental Data.

    Requirement:

    Source - Oracle

    Target - Sql Server.

    Source table contains 46 columns and i want to load as it is.

    My requirement is that, Package will execute every hour and fetch newly inserted recodrs only. I am using lookup and merge join but its taking too much time.

    Please guide for any other approach.

    Thanks,

    Abhas.

  • Is there a 'DateCreated' column in your source data?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • When you have a lookup, why should you use the merge join?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Phil Parkin (3/23/2015)


    Is there a 'DateCreated' column in your source data?

    or is there an incremental primary key in the Oracle source that you can just select fromOracle where the source is higher than the highest key in the currently Local migrated data?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (3/23/2015)


    Phil Parkin (3/23/2015)


    Is there a 'DateCreated' column in your source data?

    or is there an incremental primary key in the Oracle source that you can just select fromOracle where the source is higher than the highest key in the currently Local migrated data?

    Good point, that is a better option if it's available.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • DATE_MODIFIED column is there.

  • abhas (3/23/2015)


    DATE_MODIFIED column is there.

    If DATE_MODIFIED is populated when the row is initially inserted, and it's updated consistently by the application, and it's indexed, then that's ideal. You may find that a SSIS T-SQL task that leverages MERGE statement performs better than the SSIS Lookup or Merge tasks.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (3/23/2015)


    abhas (3/23/2015)


    DATE_MODIFIED column is there.

    If DATE_MODIFIED is populated when the row is initially inserted, and it's updated consistently by the application, and it's indexed, then that's ideal. You may find that a SSIS T-SQL task that leverages MERGE statement performs better than the SSIS Lookup or Merge tasks.

    MERGE means you first have to stage the data at SQL Server side.

    Also:

    Use Caution with SQL Server's MERGE Statement[/url]

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Eric M Russell (3/23/2015)


    abhas (3/23/2015)


    DATE_MODIFIED column is there.

    If DATE_MODIFIED is populated when the row is initially inserted, and it's updated consistently by the application, and it's indexed, then that's ideal. You may find that a SSIS T-SQL task that leverages MERGE statement performs better than the SSIS Lookup or Merge tasks.

    Actually, as only INSERTS are required, it is rather less than ideal, as existing rows will have to be ignored.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Koen Verbeeck (3/23/2015)


    Eric M Russell (3/23/2015)


    abhas (3/23/2015)


    DATE_MODIFIED column is there.

    If DATE_MODIFIED is populated when the row is initially inserted, and it's updated consistently by the application, and it's indexed, then that's ideal. You may find that a SSIS T-SQL task that leverages MERGE statement performs better than the SSIS Lookup or Merge tasks.

    MERGE means you first have to stage the data at SQL Server side.

    Also:

    Use Caution with SQL Server's MERGE Statement[/url]

    Not necessarily; MERGE can insert, update, or delete rows. However, a straight bulk load the best way to initially stage a million+ row target table. Follow that with a scheduled MERGE.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Phil Parkin (3/23/2015)


    Eric M Russell (3/23/2015)


    abhas (3/23/2015)


    DATE_MODIFIED column is there.

    If DATE_MODIFIED is populated when the row is initially inserted, and it's updated consistently by the application, and it's indexed, then that's ideal. You may find that a SSIS T-SQL task that leverages MERGE statement performs better than the SSIS Lookup or Merge tasks.

    Actually, as only INSERTS are required, it is rather less than ideal, as existing rows will have to be ignored.

    If it's just an accumulating insert, then a simple INSERT will do. However, in the real world this typically ends up being a type 1, 2, or 3 slowly changing insert / update.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks Guys.

    I want to go with MERGE.

    I am thinking first to find max DATE_MODIFIES from target and trying to fetch data greater than DATE_MODIFIES from source and inserting into staging. After that using MERGE i am inserting records.

    Now with a oracle source i am getting below error.

    [SRC From Sql developer [1]] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.

    An OLE DB record is available. Source: "OraOLEDB" Hresult: 0x80004005 Description: "".

    An OLE DB record is available. Source: "OraOLEDB" Hresult: 0x80004005 Description: "ORA-01722: invalid number".

    Does anybody has faced this type of issue.

    If I am using data access mode = command then it is working fine but , while passing as oracle source, parameters are not allowed in this access mode, so I am using data access mode as variable, that time i am getting above error.

    Thanks

    Abhas.

  • Isn't there a CDC (capture data change) mechanism for Oracle to SQL Server? I've only used SQL Server to SQL Server CDCs thus but it might be worth a look.

  • Another thing you need to notice how you want to cater the already present record which are there on your destination table but got modified in the Oracle source, if you want to insert those records directly than that's not a big task as you have to play with the Modified Date and which all records have modified date greater than the MAX(load dt) in your system pull out those records and insert them else you need to filter out if those are new or already existing

  • in the real world this typically ends up being a type 1, 2, or 3 slowly changing insert / update

    How does it end up being a type 3? If you mean an SCD 3, that's the alternate reality scenario. This usually requires special handling.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply