July 22, 2009 at 10:48 am
the data is to be inserted/updated to a table from a flat file.
I am using a Dataflow task. For each row read from the file, based on some validations I need to insert the data or update the existing data.
Thes validations include
1. Whether the row already exist. If it exists what is the vlaue of a paritucular column. If the validations are valid then update the row from File. If not skip it. Basically it is just not just exists condition,It involves couple more sql statements before update.
2. If the row doesn't exist then insert it.
July 22, 2009 at 10:51 am
So do you have a specific question on a problem or challenge that you are facing in your data flow? Or do you just want general guidence on how to do this?
For general guidence, I'd use a Lookup transformation to determine if the row exists and configure the error output to redirect non-existant rows to a OLE DB Destination to insert the new rows. Pretty simple actually.
July 22, 2009 at 10:55 am
I would probably load the file to a staging table and then perform the insert/update in a stored proc.
The stored proc would have 2 statements. In the stored proc I would have one update statement with the staging table joined to the master table with an inner join. Then an Insert statement with the staging table joined to the master with an outer join and a where master = NULL.
July 22, 2009 at 11:02 am
Thanks for quick reply. I am working on one SSIS package right now and have the situation where the insert is very simpele , that is if it doesn't exist insert it.
The update is not just based on if row exitst in table condition. It may need couple more sql queries to confirm before update.So what can I use to make this happen
July 22, 2009 at 11:03 am
allen davidson (7/22/2009)
I would probably load the file to a staging table and then perform the insert/update in a stored proc.The stored proc would have 2 statements. In the stored proc I would have one update statement with the staging table joined to the master table with an inner join. Then an Insert statement with the staging table joined to the master with an outer join and a where master = NULL.
Why dilute your ETL solution with staging tables for such a simple task? Sure, it works, but in the process, you've made your ETL solution much less portable. You can't just point it to a new DB without having to install stored procedures and burden your DB with performing the ETL work. Let the SSIS service perform the heavly lifting within it's memory buffers and leave the database alone!
SSIS is an ETL tool much different from DTS. You perform your validations and transformations before loading your data into the DB.
Staging tables = old school DTS.
Data Flow transformations = SSIS ETL.
July 22, 2009 at 11:04 am
bhanukbm (7/22/2009)
Thanks for quick reply. I am working on one SSIS package right now and have the situation where the insert is very simpele , that is if it doesn't exist insert it.The update is not just based on if row exitst in table condition. It may need couple more sql queries to confirm before update.So what can I use to make this happen
It's hard to give good guidence with such a vauge explanation of what you are trying to do. Why don't you give us a list of the logical tasks that you need to perform?
July 22, 2009 at 11:23 am
Thanks John, I hadn't really considered that.
July 22, 2009 at 11:39 am
No problem. Using staging tables and procedures is not necessarily a bad thing, but it too often the first choice for SSIS developers. SSIS was designed so that all of the ETL work can be done within SSIS using raw files and data flows. Sometimes it is easier to use the staging table approach and for us with ample DTS experience, it's a hard habit to break, but you just have other considerations for portability and DB contention to deal with every time you decide to interleave DB centric operations into your SSIS work.
July 22, 2009 at 11:44 am
Here are the logical step1.
1.Source is Tax File
2. Destination Tax_tb
From source get the row.
If the row exists in Tax_tb
find the differnece between tax column values from File to that of table.
July 22, 2009 at 11:48 am
Sorry before I finished my logical steps I pressed post..
But here are the steps.
Here are the logical step1.
1.Source is Tax File
2. Destination Tax_tb
From source get the row.
If the row exists in Tax_tb
find the differnece between tax column values from File to that of existing row value.
If this value is within the limit. then update other wise skip it. The limit is obtained from another database table.
If the vlaue is out of limit then don't update it just skip it.
July 22, 2009 at 12:18 pm
Hmm - staging table?
It does open up all of the functionality of SQL. My 2p. I often see requirements that are not fully scoped out or that are likely to change. If I have the data in the DB I know I can probably get to a solution.
July 22, 2009 at 12:38 pm
July 22, 2009 at 12:59 pm
From source file :The row for a tax record has Tax_val =100 for city xyz.
In destination table: For the same city xyz from table Tax_val = 250.
For thae same city get the allowed difference from antother database table. In this example suppose it it 95
So (250-100 ) is greater than 95 so don't update it.
I am not sure how to do the above with a dataflow task.
July 22, 2009 at 1:19 pm
OK, here's what I'd do:
Create a Data Flow with the following contents:
1. Flat File Source adapter to get your file info into the data flow.
2. Lookup Transform. Use the City value in your data flow to get the range values from your other table. Configure the Lookup Transform to add the range values into your data flow.
3. Conditional Split. Use the expression builder to evaluate the comparison between the Tax_val and range values. You'll end up w/ 2 outputs. One that contains the rows you want to work with and the other that contains the rows you don't need.
4. Lookup Transform #2. Send the rows that you want to work with from the output from step 3 above into this Lookup. Use it to look up your row to determine if the row needs inserted or updated.
5. OLE DB Destination. Send the Red arrow from your Lookup above into this destination. You'll be prompted to change the error output config. Set the error output to redirect the rows. Configure your destination to insert the rows into your table.
6. OLE DB Command. Send the Green arrow from your Lookup above into this transformation. Write a SQL Statement to UPDATE your rows...UPDATE YourTable SET YourColumn=? WHERE......use the column mappings tab to map your ? parameter to a data flow column.
That should do what you're looking to do. Post back any comments/questions/concerns in this thread.
July 22, 2009 at 2:17 pm
Thaks again for your time.
In the above example the value to compare with range value is the diffference between Source tax_val and Destination tax_val.
that is the only Source_tax_val = 100, destination tax_val = 250. and the range val is 95.
For the matching row.
If (source_tax_val_from_file - Destination_tax_val_in_table) > Range_val don't update
else
update.
I am not sure this can be acheived from the solution explianed above.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply