Edit Text in file before working with it

  • Hi All,

    For reasons outlined in my previous topic, http://www.sqlservercentral.com/Forums/Topic829742-148-1.aspx, I need to find a way to automatically edit an XML file to replace ampersand characters with either & or simply "and".

    I currently open the file in notepad, and do a find and replace. However I would really like to be able to do automate this process.

    Does anyone have any ideas about how this could be done within SSIS?

  • Script task will do it in about 5 lines of code ...

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thanks for the swift reply.

    Really? That sounds ideal.

    My XML Files get read in a Foreach loop... so would the script task be able to use the same variable to open the file as the one that is passed to the XML Source component?

    Is there any chance that you could assist with the script task please?

  • Is the help I am asking for inappropriate for this section? I wonder if it should have gone into the Newbie section?

  • [font="Comic Sans MS"]

    I am sure there are many experts here who would be able to help you with script task - meanwhile you may wish to look here .. there are good examples of script tasks you can you.

    http://msdn.microsoft.com/en-us/library/ms135944(SQL.90).aspx

    [/font]

    [font="Comic Sans MS"]--
    Sabya[/font]

  • Paul_Harvey (12/8/2009)


    Hi All,

    For reasons outlined in my previous topic, http://www.sqlservercentral.com/Forums/Topic829742-148-1.aspx, I need to find a way to automatically edit an XML file to replace ampersand characters with either & or simply "and".

    I currently open the file in notepad, and do a find and replace. However I would really like to be able to do automate this process.

    Does anyone have any ideas about how this could be done within SSIS?

    You can use this script as reference how to setup find-and-replace.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Maybe have a look at this thread too ... It sure helped me at the time.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thanks for the input. The scripts look interesting, as does the article in the T-SQL method.

    Both methods appear to rely on the text being in the database first. Unfortunately I cannot get this far, as the XML Source component fails straight away, due to the XML not being valid. MS Access also will not touch the file.

    Also, the file can be quite large in size (25 to 30 MB). I wonder if I can rename the file to text.. then import as a text field, then export back to a text file. Do you think that would work?

    Both m

  • When using the script task (not a script component), you don't have to pull the file into the database to be able to work with it. Using a script, you could write out a clean version of the file with the text replaced.

    Bear with me because the following is done without the benefit of an IDE, but this should help you get started:

    ' The following 2 lines assume you've set up SSIS variables containing the dirty file and the clean file

    Dim reader as System.IO.StreamReader = new System.IO.StreamReader(dts.Variables("OldFile").Value)

    Dim writer as System.IO.StreamWriter = new System.IO.StreamWriter(dts.Variables("NewFile").Value)

    while not reader.EndOfStream

    Dim theLine As String = reader.ReadLine()

    ' Do your replacements here

    theLine = theLine.Replace("oldValue", "newValue")

    theLine = theLine.Replace("oldValue1", "newValue1")

    ' etc, etc, etc.

    ' Write the cleansed output to the new file

    writer.WriteLine(theLine)

    end while

    writer.Flush()

    writer.Close()

    hth,

    Tim

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Hi Tim,

    Thanks for the advice. I have inserted your script task, and amended the text to call my variable, as both the read and write.

    Unfortunately it returns this error:

    Error 30518: Overload resolution failed because no accessible 'New' can be called with these arguments:

    'Public Sub New(path As String)': Option Strict On disallows implicit conversions from 'Object' to 'String'.

    'Public Sub New(stream As System.IO.Stream)': Option Strict On disallows implicit conversions from 'Object' to 'System.IO.Stream'.

    Line 23 Column 48 through 111

    Is this something that can be resolved without taking up too much of your time?

    I have in the meantime got the functionality I required by calling a VB Script from a batch file executed within the SSIS package. This works, but I am not keen in the need to store the VB Script file, and batch file along with the package. They are bound to become separated at some point!

    Help much appreciated,

    Paul

  • Looks like I left off a string conversion. Add the ToString() method to both of the .Value objects, as such:

    dts.Variables("OldFile").Value

    becomes

    dts.Variables("OldFile").Value.ToString()

    ... and so forth.

    hth,

    Tim

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Thanks Tim. I will give it a try. I will be back to that particular project on Tuesday, so will update the thread then.

  • This worked brilliantly. I completed the changes, and did some testing and all looked well, but it wasn't until some live data was sent through that contained the ampersand that I have updated this thread.

    Thanks again Tim. Great work.

  • Great, I'm glad it worked out for you. Thanks for the follow-up.

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • I'm attempting to implement a find and replace in SSIS 2008 but I believe the code that was provided in this thread is VB not C#. I'm receiving the error : "Cannot convert type 'string' to 'System.IO.StreamReader'"

    System.IO.StreamReader reader;

    reader = (System.IO.StreamReader) Dts.Variables["User::FileSource"].Value.ToString();

    System.IO.StreamWriter writer;

    writer = (System.IO.StreamWriter) Dts.Variables["User::FileDestination"].Value.ToString();

    Without the cast to a string there are no errors with the C# code but then I'm receiving the following error when executing the script task.

    Error: 0x1 at Script Task: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.InvalidCastException: Unable to cast object of type 'System.String' to type 'System.IO.StreamReader'.

    Any assistance is greatly appreciated.

    Thank you!

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply