Script Component writing UNC path very slow...

  • I created an SSIS package that pulls data and partitions them based on State and then Partition Number.

    The partitioning works great, the trouble I have is the performance of writing directly to the files on the destination server which is internal.

    The script component uses the parameters and opens up a StreamWriter for each state and channel. At the row-level a counter per set of states and paritions is kept to distribute the records evenly. In this case there are 6 files. The files are written directly to their destination server (\\destserv\stategroup1\file1.dat etc.)

    The server the SSIS is running on has a 1Gb card but is only using 1% of network utilization

    On the desination server side it's using 12% of a 100Mb card.

    If I copy a file generated from the source server to the destination I get 70% utilization on the destination server and 7% on the source server.

    Is there a way to increase performace here? I'm next going to test a write out to the source server and file copy to the destination. The previous test with file copy was from a file generated last month using another process.

    Also, I thought this was the appropriate forum to post this in... if it should go in the Performance Tuning my apologies in advance.

  • Just curious, why are you using a script task for this? I aggree that you should attempt to run a test where you write the files locally and then move them to the destination server, but I have a feeling that you'll find the bottleneck in your script task.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • The number of files we send varies. I found myself creating several flat file destinations. I was thinking to programatically write these files out directly to the destination.

    Before going this route I had 9 destination flat files. I used a conditional split to sort out the states which was fine, but to split the next level just seemed to give up the flexibility I was looking for.

    So instead I decided to write the files by script component instead because the flexibility was great, although it appears I suffer in performance.

    Writing to local files is definitely helping. During the first 30 minutes the package has written just over 4.5GBs.

    It'll probably be better to create the partitioned sets in tables (or a table) and cycle through and export.

    Either way, it was fun getting this to work in a Script Task.

  • You could create one file destination and use the expression builder to create the file names so your process is dynamic enough to create as many (or few) files as needed based on your partitioning logic.

    As far as getting the partitions into a data flow, here's what I'd do:

    1. Control Flow - execute SQL task: call a SP that retrieves just the distinct states and partition #s from the database.

    2. For each loop container - use this to assign the loop through the recordset from #1 and assign state, partition numbers into variables.

    3. Data flow within loop - your Source query can use the variables in #2 to get the data for that partition. You'll then write that data to your file destination and use whatever identifying values (such as the variable values) in your expression builder to define the file name, path, etc.

    Make sense?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • You have mentioned "Script Task" and "Script Component" as if they are the interchangeable - they are not. Just for clarification, I'm assuming that you mean Script Task here?

    If you were using a Script Component for this, I could understand why performance might be lacking ...

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Have you tried using BufferedStream ?

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • I'm using Script Component for this.

    John,

    Yep I see where you're going with that. That is the route I'm planning on taking if this doesn't pan out. Appreciate the help.

    Thanks for the suggestion of the bufferedstream. I'm not sure how to implement it exactly.

    A few excerpts from the script regarding the way I write to a file:

    I declare a 2D array of streamwriters under the public class:

    Public textWriter(10, 10) As StreamWriter

    I loop through the states and partition counts and assign files to the streamwriters:

    For i = 1 To Me.Variables.numdays

    For j = chanbot To chantop

    textWriter(i, j) = New StreamWriter(basepath + "day" + CStr(i) + "idwcdp" + CStr(j) + ".txt")

    Next

    chan(i) = chanbot

    Next

    At the row level I check the value of state and write to the appropriate file. The partition (chan) for that day is cycled to write out the data evenly.

    If .state statelim1 and .state<=statelim2) then

    day = 2

    Else

    day = 3

    End If

    textWriter(day, chan(day)).WriteLine(Row.lname)

    textWriter(day, chan(day)).WriteLine(Row.fname)

    textWriter(day, chan(day)).WriteLine(Row.address)

    .... etc etc

    If chan(day) = chantop Then

    chan(day) = chanbot

    Else

    chan(day) = chan(day) + 1

    End If

    End With

    From trying to quickly implement the BufferedStream it appears it doesn't like StreamWriter as an object type:

    Value of type system.io.streamwriter cannot be converterd to system.io.stream

    The error was from attempting this in the Pre-Execute Class:

    bufStream(i, j) = New BufferedStream(textWriter(i, j))

    Along with this beforehand:

    Public bufStream(10, 10) As BufferedStream

  • Actually you can use FileStream, which is also buffered. Here is how you can construct it:

    textWriter(i, j) = New StreamWriter(New FileStream(basepath + "day" + CStr(i) + "idwcdp" + CStr(j) + ".txt", FileMode.CreateNew))

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Phil Parkin (6/17/2009)


    You have mentioned "Script Task" and "Script Component" as if they are the interchangeable - they are not. Just for clarification, I'm assuming that you mean Script Task here?

    If you were using a Script Component for this, I could understand why performance might be lacking ...

    Phil, you've broght up an interesting point here. I try to avoid script tasks/components unless absolutely necessary and even then I try to use it for variable manipulation, error processing, and non-data centric operations. Can you elaborate on the differences between the script task and component? I'm assuming that the component processes rows 1-by-1 similar to the OLE DB Command transformation. Is this what you are referring to?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John Rowan (6/18/2009)


    Phil Parkin (6/17/2009)


    You have mentioned "Script Task" and "Script Component" as if they are the interchangeable - they are not. Just for clarification, I'm assuming that you mean Script Task here?

    If you were using a Script Component for this, I could understand why performance might be lacking ...

    Phil, you've broght up an interesting point here. I try to avoid script tasks/components unless absolutely necessary and even then I try to use it for variable manipulation, error processing, and non-data centric operations. Can you elaborate on the differences between the script task and component? I'm assuming that the component processes rows 1-by-1 similar to the OLE DB Command transformation. Is this what you are referring to?

    Actually the script tasks / component are very fast. They are in fact full-blown .NET classes. The performance problem is not caused by the used technology.

    The difference between script task and script component is that:

    - script task is piece of functionality used in the SSIS package control flow.

    - script component is piece of functionality used in the SSIS data flow.

    This is it. The same terminology is applied to the other parts of the SSIS framework. You have FTP Task and then you have Lookup Component.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • I implemented your suggestion CozyRoc and didn't see a performance gain. I also tried increasing bufferrow and buffersize properties for the dataflow which didn't yield antying.

    I also tried to use FileStream on it's own with a function I found in an example. Critical changes are:

    Public textWriter(10, 10) As FileStream

    textWriter(i, j) = New FileStream(basepath + "day" + CStr(i) + "idwcdp" + CStr(j) + ".txt", FileMode.Create)

    AddText(textWriter(day, chan(day)), .lname & .fname & .address & .city & .state & .zip _

    & .zip4 & .phone & .mtgdt & .mor & .mor & .blank & .npid & .ssn)

    Private Shared Sub AddText(ByVal fs As FileStream, ByVal value As String)

    Dim info As Byte() = New UTF8Encoding(True).GetBytes(value)

    fs.Write(info, 0, info.Length)

    End Sub

    Again, no performance increase, roughly 1 million records written every 1 minute and 48 seconds. Same as before if not slower.

    Thanks again for the suggestions... I learned a couple of things here. If you have other suggestions I welcome them. If not I'll go the other route of writing out one file at a time and handle the partitioning in an SQL Task.

  • I don't know what is your network hard drive, but if it is not designed to handle parallel processing, then this might be your bottleneck. I have seen SAN type of devices performing really well, but are quite expensive.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • I suppose so... thanks again for your suggestions.

  • pcapazzi (6/18/2009)


    I suppose so... thanks again for your suggestions.

    Is it possible to run performance counters on the server and see where all the time is spent?

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • A Script Task is the way to go here in my opinion.

    File I/O is the bottleneck so you want to limit the number of writes you do. I recommend caching the data you want to write to the file for a bit and then flushing it all to file.

    One way to do this is to write a class that handles this for you. So basically the class holds every line you add in memory for a while until the limit is hit, at which point it flushes out to file.

    A friend did something similar before and also created the files locally before copying them to the destination. This was the most efficient he was able to get it.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply