Importing vertical data using the flat file task?

  • TonyP (3/19/2008)


    Well, it is not exactly inserting each application file into many tables. It is more like taking information out of the application file and sending it to a variety of different tables.

    Same thing said just a little differently. You won't have a problem either way. The key is like someone (Matt?) suggested earlier... get the data into a "staging table" where we can work on/distribute the data to a variety of different tables.
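The staging-table idea can be sketched outside SSIS. Below is a minimal Python/sqlite3 illustration (all table and column names are invented for the example): every raw line lands in one staging table, and set-based `INSERT ... SELECT` statements then distribute rows to the real tables.

```python
import sqlite3

# Hypothetical sketch (table and column names invented): land every raw
# application-file line in one staging table, then distribute it to the
# destination tables with set-based INSERT ... SELECT statements.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staging (AppNum TEXT, RawLine TEXT)")

raw_lines = ["Veh1_Make, Chevy, 0", "Drv1_Name, Smith, 0"]  # sample rows
conn.executemany("INSERT INTO Staging VALUES (?, ?)",
                 [("APP001", line) for line in raw_lines])

# Distribute: one set-based pass routes vehicle rows to a Vehicle table
conn.execute("CREATE TABLE Vehicle (AppNum TEXT, RawLine TEXT)")
conn.execute("""INSERT INTO Vehicle
                SELECT AppNum, RawLine FROM Staging
                WHERE RawLine LIKE 'Veh%'""")
vehicle_count = conn.execute("SELECT COUNT(*) FROM Vehicle").fetchone()[0]
```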

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • All right- if you're hard-core about wanting to use SSIS:

1. You're going to need to read up on file-system tasks and Foreach Loop tasks (so that you can "loop" on each file it finds). Once you do that, you can focus on the data flow task that imports the file.
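The loop-over-files pattern can be mimicked in a few lines of Python (file names and the per-file import step are placeholders, not part of the original post):

```python
import tempfile
from pathlib import Path

def import_one(path):
    # Placeholder for the per-file data-flow task (the flat-file import)
    return path.name

def import_all(folder):
    """Mimic an SSIS Foreach File enumerator: hand each file found to the import step."""
    return [import_one(p) for p in sorted(Path(folder).glob("*.txt"))]

# Demo with two hypothetical application files in a scratch folder
demo = Path(tempfile.mkdtemp())
(demo / "app001.txt").write_text("Veh1_Make, Chevy, 0\n")
(demo / "app002.txt").write_text("Veh1_Make, Nissan, 0\n")
imported = import_all(demo)
```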

    2. The data flow I come up with looks something like this:

- 1 flat file input. Make it delimited, with a delimiter you're NOT using (say, pipe), creating something that imports each row as a single column

    - split the app number off (using a conditional split) from the other rows

- for the app number branch, use SUBSTRING and FINDSTRING to pull out the app number and the code after it (whatever that is)

    - for the everything else branch: use the string functions to split each row into 5 fields: table, rowID, fieldname, value, code

    Example

    Veh1_Make, Chevy, 0 --> VEH....1....Make....chevy....0

    Veh2_Make, Nissan, 0 --> VEH....2....Make....nissan....0

Veh1_Year, 2004, 0 --> VEH....1....Year....2004....0

Veh2_Year, 2003, 0 --> VEH....2....Year....2003....0

- merge task to put the AppNum and its code back in wherever they need to be (AppNum everywhere; code, I don't know).

- add a PIVOT task. Pivot the field/value columns based on table, using the rowIDs to match them up, grouping by AppNum.

    - output to however many tables you end up with.
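The split-then-pivot steps above can be simulated in plain Python (the parsing regex and dictionary layout are my own stand-ins for the derived-column and PIVOT tasks, not SSIS itself):

```python
import re
from collections import defaultdict

lines = [
    "Veh1_Make, Chevy, 0",
    "Veh2_Make, Nissan, 0",
    "Veh1_Year, 2004, 0",
    "Veh2_Year, 2003, 0",
]

# Derived-column step: split each row into table, rowID, fieldname, value, code
split_rows = []
for line in lines:
    name, value, code = [p.strip() for p in line.split(",")]
    m = re.match(r"([A-Za-z]+)(\d+)_(\w+)", name)   # e.g. Veh / 1 / Make
    table, row_id, field = m.group(1).upper(), int(m.group(2)), m.group(3)
    split_rows.append((table, row_id, field, value, code))

# Pivot step: one output row per (table, rowID), fields turned into columns
pivoted = defaultdict(dict)
for table, row_id, field, value, code in split_rows:
    pivoted[(table, row_id)][field] = value
```

Each key of `pivoted` then maps to one denormalized row ready for its destination table.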

    That's going to be a BIG-looking SSIS.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • That's going to be a BIG-looking SSIS

    I figured that much out... I remember how big something like this got in DTS... that's why I was taking the T-SQL road... at least it would be fast 😛

    --Jeff Moden



  • Jeff Moden (3/19/2008)


    That's going to be a BIG-looking SSIS

    I figured that much out... I remember how big something like this got in DTS... that's why I was taking the T-SQL road... at least it would be fast 😛

    Well - don't sell SSIS short yet. It makes for an impressive John Madden-"like" description of a process, but it does have some fairly serious giddy up and go to it. Just because there's a lot of steps, all of which are one or more boxes on the process display, doesn't mean it can't fly through them.

I asked Andy L when he first posted his article about it, and from what I can tell, he's right. It does seem to move through quite a bit faster than ol' DTS. How MUCH faster is a good question - not sure I have a solid handle on how to really "time" the sucker down to the ms. But I've imported 2.5 million rows (denormalized) and renormalized them within 2 minutes. A similar DTS took somewhere around twice that. Not too shabby.

    With something like this process - it's going to be hard to tell, since the true perf test is how fast can it process through a directory with 100,000 of these files in it.


  • I'll review these suggestions and get back with you guys later.

  • Jeff is very right of course.

    Some comments that might help to wrap your head around SSIS and get going:

- From a design perspective, expect to create lots of steps and pieces. Don't try to stuff all operations into one little step.

    - consider creating a "dispatcher step" to route the different records to different edit checks. This'll help modularize your logic.

- Stuff the results into memory variables (you may end up creating a lot); then, when the "key changes", trigger a write or transfer to a recordset. Then you'll be able to logically output to wherever you want.
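The dispatcher idea can be sketched as a simple routing table (the record types and edit-check functions here are invented placeholders, standing in for a Conditional Split feeding separate branches):

```python
# Hypothetical dispatcher: route each record type to its own edit check,
# the way a Conditional Split feeds separate branches.
def check_vehicle(rec):
    return rec["value"] != ""          # placeholder vehicle edit check

def check_driver(rec):
    return rec["value"] != ""          # placeholder driver edit check

DISPATCH = {"VEH": check_vehicle, "DRV": check_driver}

records = [{"table": "VEH", "value": "Chevy"},
           {"table": "DRV", "value": "Smith"}]
results = [DISPATCH[r["table"]](r) for r in records]
```

Adding a new record type then means adding one function and one dictionary entry, which keeps the logic modular.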

Insurance applications typically contain lots of data (been there), so creating all of the individual pieces may seem tedious. However, once done, I expect you'll find it flies, and will be relatively easy to maintain. (And you'll be able to repurpose it easily, if you get it right.)

    Good luck.
