Flat File Import, skip empty rows

  • Hi all,

    I have the following issue in SSIS.

    I specify a CSV flat file import in my connection manager. This works fine, except that the flat file does every now and then contain an extra CR/LF,

    as per below:

    -------------------------------

    user1,Windows XP,Service Pack 2,2007/09/25,08:45:39,DC1,GB,10.5.230.99,IBM,8184NGQ

    user12,Windows XP,Service Pack 2,2007/09/25,08:45:51,DC1,GB,10.5.230.202,IBM,THINKPAD R40

    user3,Windows XP,Service Pack 2,2007/09/25,08:46:00,DC1,GB,10.5.236.51,IBM,2724-QG1,

    user55,Windows XP,Service Pack 2,2007/09/25,09:46:01,DC3,IT,10.5.62.158,DELL,Latitude D630

    user 103,Windows XP,Service Pack 2,2007/09/25,08:46:11,DC1,GB,10.5..34.158,IBM,8184NGQ

    user21,Windows XP,Service Pack 2,2007/09/25,08:46:07,DC1,GB,10.5.226.47,IBM,2724-QG1

    user11,Windows XP,Service Pack 2,2007/09/25,08:46:07,DC1,GB,10.5.234.114,IBM,8184-NGQ,

    user321,Windows XP,Service Pack 2,2007/09/25,08:46:05,DC1,GB,10.5.234.167,IBM,THINKCENTRE S50

    ----------------

    What happens is that the import recognizes the format, delimiters, etc., but adds the extra CR/LF in front of the data of the rows that do contain valid data.

    I want to leave the source file untouched, but transform/correct this while processing the data in SSIS.

    Do I have to use a script (ouch, I'm not good at VB programming) to correct this stream, or is there a smarter way to tell the flat file import to just skip these blank rows?

  • I think you can use an (L)TRIM function on the columns after the import.

    And regarding the skipping of blank records: you could delete the blank records afterwards.

  • Thanks Abceeetje,

    but LTRIM removes leading spaces, not special characters.

    I've tried a derived column, but

    REPLACE(column1, CHAR(10) + CHAR(13), "")

    didn't work, as the expression interpreter doesn't understand (support) CHAR.

    So my surplus CR/LFs are still being added as weird characters in front of the items in column 1.

    I need something else.

  • Hmm, my usual approach would be: open the CSV in Excel, sort ascending on column A, save, and go.

    But to leave the file as-is... Hmm, I didn't find a way in SSIS...

    Maybe someone else has a thought?

  • I'm fairly new to SSIS, but what about using a script task, replacing vbCrLf where it is repeated?

  • But I believe he didn't want to script?

  • Sorry, didn't see the last comment under the dialog box.

    Aside from scripting, however, I'm not sure what other options you have here. Maybe one of the gurus on this site will know.

    I'm sure you can find some scripting examples online somewhere if you have to go down that path.

    Good luck!

  • Try the following; note that the SSIS expression language uses C-style escape sequences rather than a CHAR() function:

    REPLACE([Column 0], "\r\n", "")
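    Bear in mind this only strips CR/LF characters that end up embedded in a column's data; it won't remove records that are entirely empty, so those may still need to be filtered out downstream.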

  • Certainly not the most efficient way to accomplish what you need, but possibly the easiest if you want to avoid scripting...

    Create a package with a data flow task, then an Execute SQL task, then two more data flow tasks:

    1. DF task 1

    a. Flat file source with ragged right format on the connection

    b. Import everything as one huge column (varchar(4000) or whatever)

    c. SQL Server or OLE DB destination; import to a table

    2. Execute SQL task

    a. Connect to the SQL database and run "DELETE FROM Table1 WHERE LTRIM(RTRIM(Col1)) = ''"

    3. DF task 2

    a. OLE DB source to flat file destination

    b. Extract your data back out to a new flat file; your empty rows should be gone now

    4. DF task 3

    a. Flat file source with comma-separated format on the connection

    b. Import to your real table.
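    The round trip works because, with no column delimiter on the first pass, each physical line (including an empty one) arrives as a single string, making blank records trivial to detect and delete; the second pass can then parse the commas normally against a file that no longer contains empty rows.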

  • Actually, to scrub the records, it might be simpler to:

    Task 1

    Import the whole record (no delimiters)

    Split the record stream

    Output to a temp file

    Task 2

    Import the temp file

    etc.

    Less I/O and CPU, and easier to maintain. I had this solution too, but I think he wanted to avoid the extra step and fix the records on input.

  • I did put my not-really-efficient disclaimer on my post, so yes, Dave's right: less I/O and CPU if you keep it all in SSIS and temp files until you get your data scrubbed. But I think we're barking up the same tree. Basic steps: on the first pass don't use delimiters and treat each record as one full column, then get rid of the empty rows (try using an expression in a conditional split transform, as sketched below), send the good data down one path and the empty rows down a trash path, and then push the good data in.

    I attached a jpg showing what I think I/Dave mean...
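    To make that concrete, here's a minimal sketch of the split condition, assuming the single wide column kept SSIS's default name of [Column 0] (substitute your actual column name):

    LTRIM(RTRIM([Column 0])) == ""

    Rows matching this condition go down the trash path; everything else continues down the good-data path.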

  • Thanks for all the tips, guys. I'd been playing with double delimiters (CRLFCRLF) etc. but was still missing rows like this. Naively, I thought the failed rows (i.e. where there's only one CRLF at the end) would be redirected to the failure output, but they were just skipped over.

    The tip with the single undelimited column and the conditional split turned out to be successful:

    Step 1

    Define the flat file, but with no delimiter.

    Step 2

    Create the conditional split: if the column (which I named 'filtered') is not empty, send it to a stream called 'non blanks'; otherwise redirect it to a stream called 'blanks'. The condition is essentially the expression sketched above.

    Step 3

    Redirect the 'non blanks' stream to a flat file export.

    Step 4

    Launch a second package to import the exported flat file, but now with the proper row and column delimiters.

    works like a charm

    Appreciated your time and effort here

    K

  • Ah, it's easier than it seemed at first 😉
