December 2, 2010 at 10:56 am
Hey all,
I am running into an issue with a file and I was hoping to get some help.
I have a mess of files that are pipe delimited, which I'm importing in a ForEachLoop Container:
Hope|you|can|help
This|has|me|stumped
|And|this|is|it
|Having|trouble|with|pipes
As you can see, lines 3 and 4 begin with a pipe which is killing my import. Can you help me to remove this character upon import?
Thanks
December 2, 2010 at 11:48 am
Something like this?
SELECT CASE WHEN LEFT(YourCol,1) ='|' THEN STUFF(YourCol,1,1,'') ELSE YourCol END
December 2, 2010 at 12:10 pm
Thanks, but not exactly. I need it to happen upon import: before it tries to import the file, I need to verify that the format is right. Essentially, I want to strip off a leading | as the data comes into SQL.
December 2, 2010 at 12:27 pm
In that case, either an IIF statement or REGEX should help. (Google regex or IIF in SSIS for details.)
You'd basically use a script component after your data import to remove the "|", if needed.
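To sketch what I mean (rough C# only, and I'm assuming the row arrives in a single string column I've named RawLine - SSIS 2008 script components can be C# or VB):
// Rough sketch of an SSIS Script Component (transformation) in C#.
// Assumes one input column, RawLine, holding the whole row as text.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Knock off a single leading pipe so the downstream split lines up.
    if (!Row.RawLine_IsNull && Row.RawLine.StartsWith("|"))
    {
        Row.RawLine = Row.RawLine.Substring(1);
    }
}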
December 2, 2010 at 12:27 pm
samartel (12/2/2010)
Thanks, but not exactly. I need it to happen upon import: before it tries to import the file, I need to verify that the format is right. Essentially, I want to strip off a leading | as the data comes into SQL.
Ah, the source flat file is getting all wonky on you. I see your problem.
This is not going to be a pretty solution, but it can be effective. What you do is create a source with your file that doesn't detect the column splits, just treats each row as one column. Perform the solution above on this data while pumping it to a new text file.
Then, in a new dataflow set to occur afterwards, do your regular import mechanism.
You will have to modify the base file first, and it will slow down your process. Be it a .bat file, Word VBA, or SSIS tricks, something's going to have to happen to the file first. The metadata assigned to the source file can't be adjusted the way you're looking for.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
December 2, 2010 at 12:29 pm
LutzM (12/2/2010)
In that case, either an IIF statement or REGEX should help. (Google regex or IIF in SSIS for details.) You'd basically use a script component after your data import to remove the "|", if needed.
Needs to happen before the import/source file kicks off, Lutz, or half the rows end up in misplaced columns.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
December 2, 2010 at 12:33 pm
Craig Farrell (12/2/2010)
LutzM (12/2/2010)
In that case, either an IIF statement or REGEX should help. (Google regex or IIF in SSIS for details.) You'd basically use a script component after your data import to remove the "|", if needed.
Needs to happen before the import/source file kicks off, Lutz, or half the rows end up in misplaced columns.
The string needs to come in as a single column, I agree. But a script component could fix the initial-pipe problem and also do the parsing --> output columns in a single pass.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 2, 2010 at 12:35 pm
Are you absolutely certain the pipes at the beginning of the row don't mean the first column is empty/null? I've seen delimiters used that way before; if that's the case here, the leading pipe belongs in the file.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 2, 2010 at 12:36 pm
Craig Farrell (12/2/2010)
LutzM (12/2/2010)
In that case, either an IIF statement or REGEX should help. (Google regex or IIF in SSIS for details.) You'd basically use a script component after your data import to remove the "|", if needed.
Needs to happen before the import/source file kicks off, Lutz, or half the rows end up in misplaced columns.
Agreed. I was thinking along the same lines you described in your previous post (treat it as one column); I just left it out of my explanation. :blush:
I don't think it's required to pump the result to a new file. It should be enough to process the output of that script component.
December 2, 2010 at 12:38 pm
Phil Parkin (12/2/2010)
Craig Farrell (12/2/2010)
LutzM (12/2/2010)
In that case, either an IIF statement or REGEX should help. (Google regex or IIF in SSIS for details.) You'd basically use a script component after your data import to remove the "|", if needed.
Needs to happen before the import/source file kicks off, Lutz, or half the rows end up in misplaced columns.
The string needs to come in as a single column, I agree. But a script component could fix the initial-pipe problem and also do the parsing --> output columns in a single pass.
That's a new one on me, Phil. How would you approach that? AFAIK, the VB Script component can't output a data structure. Alright, dumb question: how are you going to do the resultant split after you've done that? What structure are you feeding it into?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
December 2, 2010 at 12:55 pm
Wow... I go grab a coffee and my post has a bunch of solutions... lol
I think what I'm going to do is actually redirect the file as an error. As GSquared said, I can't be certain that it isn't there on purpose.
I'll leave the rest up to the people responsible for the data.
Thanks
December 2, 2010 at 1:33 pm
samartel (12/2/2010)
Wow... I go grab a coffee and my post has a bunch of solutions... lol I think what I'm going to do is actually redirect the file as an error. As GSquared said, I can't be certain that it isn't there on purpose.
I'll leave the rest up to the people responsible for the data.
Thanks
:crying: We were SOOOO close to a really neat solution. And now you decide to do it the easy way. :crying: 😉
Honestly, I hope Phil will still present his solution, since it might help others facing a similar issue, or, if it turns out that the input file needs to be modified, it will help you when faced with such a requirement.
December 2, 2010 at 1:49 pm
LOL... don't get me wrong, I like the idea. I'm actually testing the script component now to see if it will remove that pipe on the records that are affected. The only drawback is performance; I will lose a ton of it if I do it that way. I've actually changed my tune twice and will merely reject the rows and redirect them to a log. That way they can be sent to the people who make the decisions, and if they feel that the pipe should be removed, then it's an easy SQL statement and an export to text to re-import.
Thanks to all for the help... and please post your solution. I will post mine as well if I find that the script component doesn't kill performance as I anticipate it will.
December 3, 2010 at 1:19 am
Sounds like you're well on the way to creating a script component to do the job.
I will write down my thoughts, just in case they include anything you haven't considered.
The way I would do this is to have the single-column flat-file source feeding into the script component as a transformation, so the script component will have only one input.
Then you need to code it to work asynchronously to generate the multiple output columns, removing that troublesome pipe on the way.
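A rough C# outline of that, just to illustrate (the output column names and count are invented, and the output's SynchronousInputID is set to None to make it asynchronous):
// Rough outline of the asynchronous script component in C#.
// Input0 has one column, RawLine; Output0 has invented columns Col1..Col4.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    if (Row.RawLine_IsNull) return;

    string line = Row.RawLine;

    // Remove the troublesome leading pipe on the way through.
    if (line.StartsWith("|"))
        line = line.Substring(1);

    // Split on the pipe delimiter and write one row to the output buffer.
    string[] parts = line.Split('|');

    Output0Buffer.AddRow();
    if (parts.Length > 0) Output0Buffer.Col1 = parts[0]; else Output0Buffer.Col1_IsNull = true;
    if (parts.Length > 1) Output0Buffer.Col2 = parts[1]; else Output0Buffer.Col2_IsNull = true;
    if (parts.Length > 2) Output0Buffer.Col3 = parts[2]; else Output0Buffer.Col3_IsNull = true;
    if (parts.Length > 3) Output0Buffer.Col4 = parts[3]; else Output0Buffer.Col4_IsNull = true;
}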
Another possibility is to do some pre-processing on the file, at binary level - a global replace of [CR][LF]| with [CR][LF] - and then to feed that straight into a standard data flow with no fancy script required. It might be faster ...
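For instance, something like this little C# snippet run before the data flow would do it (the path is made up, and it assumes the file is small enough to read into memory in one go):
// Sketch of the binary-level pre-processing idea in C#.
// Path and file name are hypothetical; adjust encoding/terminators to suit.
using System.IO;

class StripLeadingPipes
{
    static void Main()
    {
        string path = @"C:\Import\YourFile.txt";
        string text = File.ReadAllText(path);

        // A pipe at the very start of the file...
        if (text.StartsWith("|"))
            text = text.Substring(1);

        // ...and any pipe immediately following [CR][LF].
        text = text.Replace("\r\n|", "\r\n");

        File.WriteAllText(path, text);
    }
}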
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin