July 13, 2009 at 6:14 am
Hi Experts,
we are using a Script Component as a data source in the data flow.
Our package runs and runs until virtual memory is exhausted and the system effectively deadlocks. Our guess is that a Script Component used as a source blocks the data flow until ALL data has been read from the source, so it is NOT handing data off in buffers of 10,000 rows the way the OLE DB Source does. Is this correct?
If it is blocking the data flow until all rows are in memory, is there a way to make the Script Component process the data in those 10,000-row buffers instead?
Many thanks in advance!
Cheers,
Georg
July 13, 2009 at 7:36 am
Georg,
This could be due in part to how your script is written. Can you post some of the code (or at least pseudocode) from your script component source?
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
July 13, 2009 at 8:28 am
Hi Tim,
thanks for your reply, and sorry for the lack of detail in my post. Here is the code of my Script Component:
' Microsoft SQL Server Integration Services user script component
' This is your new script component in Microsoft Visual Basic .NET
' ScriptMain is the entry point class for script components
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub CreateNewOutputRows()
        Dim OleAdapter As New Data.OleDb.OleDbDataAdapter
        Dim DataTable As New Data.DataTable
        Dim DataRow As Data.DataRow

        ' Copy the entire recordset held in the VARPRESULT package
        ' variable into an in-memory DataTable.
        OleAdapter.Fill(DataTable, Me.Variables.VARPRESULT)

        ' Push one output row per DataTable row into the pipeline.
        For Each DataRow In DataTable.Rows
            Output0Buffer.AddRow()
            Output0Buffer.RECVALUE = DataRow(0).ToString
        Next
    End Sub
End Class
As far as I understand it, the DataTable is populated with the result set that is stored in the variable VARPRESULT. Then the DataTable is iterated row by row, and for each row one value is sent to the output buffer.
Your reply suggested that I could solve the 10,000-rows-per-buffer issue in my code. Could you please point me in the right direction?
Thanks in advance!
Best regards,
Georg
July 13, 2009 at 8:44 am
Georg,
First, I don't see any connection information or the query string in your code. I assume you removed those for security reasons?
As for the code, I would suggest replacing the Fill() method with an OleDbDataReader. The OleDbDataReader is a fast, forward-only reader that streams rows rather than loading the whole result set at once, which may help reduce your memory utilization.
The following page gives a good example of using an OleDbDataReader:
http://msdn.microsoft.com/en-us/library/system.data.oledb.oledbdatareader(VS.71,loband).aspx
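To illustrate the idea, here is a rough, untested sketch of what CreateNewOutputRows could look like with a reader, using your RECVALUE output column. The connection string and query are placeholders you would replace with your own:

Public Overrides Sub CreateNewOutputRows()
    ' Placeholder connection string and query -- substitute your own.
    Dim connString As String = "Provider=SQLOLEDB;Data Source=MyServer;" & _
        "Initial Catalog=MyDb;Integrated Security=SSPI;"
    Dim query As String = "SELECT RecValue FROM dbo.MySourceTable"

    Using conn As New Data.OleDb.OleDbConnection(connString)
        conn.Open()
        Using cmd As New Data.OleDb.OleDbCommand(query, conn)
            ' ExecuteReader returns a forward-only stream; only the
            ' current row is held in memory at any time.
            Using reader As Data.OleDb.OleDbDataReader = cmd.ExecuteReader()
                While reader.Read()
                    Output0Buffer.AddRow()
                    Output0Buffer.RECVALUE = reader(0).ToString()
                End While
            End Using
        End Using
    End Using
End Sub

SSIS will still hand rows downstream in buffers (10,000 rows by default, per DefaultBufferMaxRows), but the script itself never holds more than the current row.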
Let me know how this works out.
hth,
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
July 24, 2009 at 6:41 am
Hey Tim,
Sorry, I was busy the last few days. I tried your suggestion, but it is actually not what I was aiming at. Your solution works fine for tables, but my source is a variable of type Object (actually an ADO.NET recordset) that I want to read from. I tried it with a sample of 10,000 rows in this variable and it works fine (execution time of about 2 minutes). My live data is about 2 million rows, and after a 20-hour run I aborted the package, frustrated. Is there a way to tell the Script Component to handle the recordset object like a "normal" source (such as a table in a database)? For a table, SSIS handles data in buffers of at most 10,000 rows. Is there a way to code this in the Script Component?
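Would something along these lines work? Reading the recordset forward-only, row by row, instead of filling a DataTable first. This is just a rough, untested sketch, assuming VARPRESULT really holds a classic ADO Recordset (which is what a Recordset Destination produces) and that Option Strict is off so late binding compiles:

Public Overrides Sub CreateNewOutputRows()
    ' Late-bound access to the ADO recordset stored in the Object
    ' variable; avoids copying everything into a DataTable first.
    Dim rs As Object = Me.Variables.VARPRESULT

    ' BOF and EOF are both True only when the recordset is empty.
    If Not (CBool(rs.BOF) And CBool(rs.EOF)) Then
        rs.MoveFirst()
        While Not CBool(rs.EOF)
            Output0Buffer.AddRow()
            Output0Buffer.RECVALUE = rs.Fields(0).Value.ToString()
            rs.MoveNext()
        End While
    End If
End Sub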
Cheers,
Georg