November 1, 2012 at 10:06 am
Looking for general advice. I've done some digging, but all I found were posts from 2009 and earlier about script tasks.
I have a file I need to parse through and test certain values. If they're okay, go on to the next test; otherwise, drop out and fail.
I know I want to use a ForEach loop to iterate through, but how do I hook up the file as the source of the ForEach loop? I'm assuming once I hook that up, I just set each value to a variable and use those variables in my loop to test.
It's just hooking up the file to the foreach loop that is stumping me and I have a bad feeling it's something obvious.
Thanks!
November 1, 2012 at 10:33 am
I think I might be able to answer my own question.
1) I have a Data Flow Task that takes all the file data and puts it into a Recordset Destination. I assign that to a variable, use a Foreach Loop with the ADO enumerator and that variable as the ADO object source variable, and map each field in the file to its own variable.
2) I check specific variables in the foreach loop against tables in my db to see if they're viable. If not, I jump to the next row.
3) I still need to figure out how to collect the failures and report them; my rough idea is sketched below.
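For #3, my rough idea (the table and column names below are just placeholders I'm playing with) is to log each failing row from inside the loop with an Execute SQL Task, something like:

-- Placeholder failure table; the real columns would match the file layout
CREATE TABLE dbo.ImportFailures
(
    FailureID   int IDENTITY(1,1) PRIMARY KEY,
    SourceKey   varchar(50)  NULL,  -- key value from the failing row
    FailedCheck varchar(100) NULL,  -- which test the row failed
    LoggedAt    datetime NOT NULL DEFAULT (GETDATE())
);

-- Statement for the Execute SQL Task inside the loop; the ? markers are
-- OLE DB parameters mapped to the loop's SSIS variables
INSERT INTO dbo.ImportFailures (SourceKey, FailedCheck)
VALUES (?, ?);

Then a plain SELECT against that table at the end of the package would give me the failure report.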
November 1, 2012 at 11:09 am
The Foreach Loop is used for things like looping through a set of files in a folder, or through a list of records from a table, usually for some sort of dynamic loading or exporting. You process a data file via the Data Flow Task.
Set up a Connection Manager to the data file, then add a Data Flow Task to the Control Flow designer. Double-click the Data Flow Task and add a data source pointing to your connection manager. That will read in the file; you then add components within the Data Flow Task to process the records in the file.
You can use a Lookup if you need to take one or more values from the source file, compare them to values in a table to bring back another value from that table, and direct the matching and non-matching output to different destinations (like an error table). You can also add a Script Component and apply logic in VB.NET/C# as each record is read in. Or you can do simpler Derived Column transformations using SSIS expressions, such as NULL handling or data cleansing. You can use an OLE DB Command to send column values from each row to a stored procedure. Or you can just do a basic load and map the incoming columns to a data flow destination (which will also require a connection manager).
There are many other data flow components; I'd recommend reading up on the SSIS Stairway series here or poking around BOL for descriptions of how you might use them.
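For comparison only, if you staged the file to a table first, the match/no-match split the Lookup gives you could be written roughly like this in T-SQL (the staging, reference, and error table names here are made up):

-- Rows whose code exists in the reference table (the Lookup's "match" output)
SELECT s.*
FROM dbo.StagingFile AS s
JOIN dbo.RefCodes    AS r ON r.Code = s.Code;

-- Rows with no match (the "no match" output, e.g. bound for an error table)
INSERT INTO dbo.ImportErrors (Code, ErrorReason)
SELECT s.Code, 'No matching code in RefCodes'
FROM dbo.StagingFile AS s
LEFT JOIN dbo.RefCodes AS r ON r.Code = s.Code
WHERE r.Code IS NULL;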
MWise
November 1, 2012 at 12:46 pm
So I don't need a foreach loop to read all the rows in the file? It'll do that inside the data flow once I connect to the specific file?
November 1, 2012 at 2:39 pm
November 1, 2012 at 8:50 pm
Matthew Cushing (11/1/2012)
Looking for general advice. I've done some digging, but all I found were posts from 2009 and earlier about script tasks. I have a file I need to parse through and test certain values. If they're okay, go on to the next test; otherwise, drop out and fail.
I know I want to use a ForEach loop to iterate through, but how do I hook up the file as the source of the ForEach loop? I'm assuming once I hook that up, I just set each value to a variable and use those variables in my loop to test.
It's just hooking up the file to the foreach loop that is stumping me and I have a bad feeling it's something obvious.
Thanks!
Gosh... why not just load the whole file up, test the values using set-based techniques, and mark which rows are good or bad according to your tests?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 2, 2012 at 7:36 am
Gosh? I'm just looking for some general advice on how to get back into SSIS after being away for a while. Thanks for the support.
November 2, 2012 at 3:29 pm
Matthew Cushing (11/2/2012)
Gosh? I'm just looking for some general advice on how to get back into SSIS after being away for a while. Thanks for the support.
Not sure what was wrong with the word "gosh" here, but it really was meant as a helpful suggestion. Instead of thinking about what you want to do with a row, think about what you want to do with a column (or columns). I'm not sure what kind of checks you intend to make, but such checks can usually be done en masse by updating a "status" or "keeper" column with an UPDATE (or two) using a couple of CASE statements based on criteria that do the checks you're looking for.
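Just to show the shape of it (the staging table and column names below are invented, since I don't know your file layout), once the file is in a staging table, a single pass can mark every row:

-- Staging table assumed to already hold the whole file, plus a Status column
UPDATE s
   SET Status = CASE
                    WHEN s.Amount IS NULL OR s.Amount < 0 THEN 'Fail: bad amount'
                    WHEN NOT EXISTS (SELECT 1 FROM dbo.Customers c
                                     WHERE c.CustCode = s.CustCode) THEN 'Fail: unknown customer'
                    ELSE 'OK'
                END
  FROM dbo.Staging s;

-- Then report the failures in one shot
SELECT Status, COUNT(*) AS FailedRows
FROM dbo.Staging
WHERE Status <> 'OK'
GROUP BY Status;

One UPDATE touches every row exactly once, which is generally much cheaper than testing rows one at a time in a loop.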
--Jeff Moden
Change is inevitable... Change for the better is not.
November 2, 2012 at 4:12 pm
Matt,
What kind of value tests are you looking to do against the rows? As mentioned above, you can use Lookups to test for things like foreign key violations during the datastream, and you can use Derived Columns and Conditional Splits (sometimes in tandem, if necessary) to handle most of your tests and split the stream into success/review/fail streams as needed.
Additionally, there's always the Script Component transformation for more complex checks during the datastream, which lets you choose directly which output stream an inbound row is delivered to. You'll need an asynchronous output for that one, but it's not much harder than a synchronous transformation.
The thing you'll want to avoid during this process, however, is anything that requires the stream to 'pause', as that'll kill any optimization you get out of using the datastream instead of dumping to a staging table and running bulk T-SQL against it. Sorts and Aggregates are the usual culprits for something like this, and Fuzzy Grouping is another one. Basically, anything that needs to grab a group of rows before it can do something. Avoid those like the plague; they're usually best performed in T-SQL, unless you're multicasting for final counts or similar.
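As a rough example of what I mean by doing that kind of work in T-SQL instead (the staging table and columns here are made up), a duplicate check that would need a blocking Aggregate in the datastream becomes a single GROUP BY once the rows are staged:

-- Needs the whole rowset, so it would block the data flow,
-- but it's a cheap set-based query against a staging table
SELECT CustCode, InvoiceNo, COUNT(*) AS Dupes
FROM dbo.Staging
GROUP BY CustCode, InvoiceNo
HAVING COUNT(*) > 1;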
On a side note, Jeff wasn't being sarcastic or trying to imply anything other than what he directly said. He's incredibly efficient in T-SQL, and try as I might I can't convince him of the value of SSIS in datastream optimizations... mostly because he can usually counter with an equivalent-speed method in T-SQL. :w00t: I do occasionally get him, but it comes down to 'It Depends' scenarios.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
November 2, 2012 at 8:17 pm
Sorry, my bad. I'm in a bad place; two friends lost everything to Sandy, and I took it the wrong way. My apologies.
November 3, 2012 at 7:39 am
Matthew Cushing (11/2/2012)
Sorry, my bad. I'm in a bad place; two friends lost everything to Sandy, and I took it the wrong way. My apologies.
Thanks for that Matthew. I definitely understand that. My heart goes out to all those folks and to folks like you that have friends and relatives in that area.
--Jeff Moden
Change is inevitable... Change for the better is not.