SSIS Data Load Question.

  • I get all my data in flat files, and the last row in each flat file is a count of the total rows in the file.

    Question 1. In SSIS, when I use a flat file source, how do I skip the last row of the data file while loading the data into the table?

    Question 2. Is there a way I can capture that last row, which has the row count as its only column, in a variable so I can validate that all the rows above it were loaded into my table? Would appreciate your response.

  • 1. The flat file source will not do this automatically for you.

    2. Maybe.

    The real question is whether you can tell the last record from all the other records by something OTHER than the fact that it is the last record.

    CEWII

  • I cannot tell which one is the last record, as it varies by file and by month. So one of the challenges is to identify the last record.

  • Yes, that is your big challenge. I don't know of any way to catch the last row; even using a script component I don't know of a way to tell. Also, if that last row has a different structure it will probably break the flat file source.

    So that leaves an unpleasant method: you can create a Script Source, read the file yourself, and write each record into the buffer.

    Here is an example that takes variables that were passed into the data-flow script source and creates a buffer row from them:

    ' Microsoft SQL Server Integration Services user script component
    ' This is your new script component in Microsoft Visual Basic .NET
    ' ScriptMain is the entry point class for script components

    Imports System
    Imports System.Data
    Imports System.Math
    Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
    Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

    Public Class ScriptMain
        Inherits UserComponent

        Public Overrides Sub CreateNewOutputRows()
            ' Add rows by calling the AddRow method on the output buffer,
            ' e.g., MyOutputBuffer.AddRow() if your output was named "My Output"
            With DiskUsageVariablesBuffer
                .AddRow()
                .Description = Variables.Description
                .DeviceID = Variables.DeviceID
                .FreeSpace = Variables.FreeSpace
                .Size = Variables.Size
                .ServerName = Variables.SystemName
            End With
        End Sub
    End Class

    The output from the task was called DiskUsageVariables, so you are adding records to the DiskUsageVariablesBuffer.

    I hope this gives you some ideas.

    CEWII

  • If the flat file source can read all rows, even the last one, you can try to filter the last row out by using a conditional split.

    The condition to find the last row, well, that is up to you, since I don't know your file structure.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • That has already been suggested and discussed.

    CEWII

  • Elliott W (3/3/2010)


    That has already been suggested and discussed.

    CEWII

    I'm confused. I saw nothing about a conditional split.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Must have been a different post. Either way, identifying the last record is a big problem, especially if its structure is different.

    CEWII

  • The file structure has 10 comma-delimited columns, but the last row has only one number, so the schema of the last row differs from that of all the valid rows. Yes, the challenge is identifying the last row for a conditional split, unless there is a way to do this that I am not aware of.

    Alternately, I thought I could do the following:

    1. Use the Row Count component with a flat file source to save the total number of rows in a variable, TotalRows.

    2. Then use the Bulk Insert task to insert (TotalRows - 1) rows.

    Do you think that will work?
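    The validation idea behind those two steps can be sketched outside SSIS. This is plain Python for illustration only, not SSIS code; the helper name and the assumption that the trailer is a single number on the last line are mine:

```python
# Sketch: the last line of the file is expected to hold a count of the
# data rows above it. `lines` stands in for the flat file's lines.
def validate_rows(lines):
    rows = [ln for ln in lines if ln.strip()]
    declared = int(rows[-1].split(",")[0])  # trailer: a single number
    data = rows[:-1]                        # the rows that should be loaded
    return declared == len(data)
```

    In the package itself, `declared` would come from the captured trailer row and `len(data)` from the Row Count variable.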

    So let me see if I understand this: the last row has a number, and all the other "fields" are null/blank?

    If that is true AND that NEVER happens on any other record, then you have your criterion for the conditional split. The big problem was finding a condition that is true for the last record but not true for ANY other record. I hope this is it.
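    That criterion can be written down as a predicate. This is a Python sketch of the test, not the SSIS expression language; in the package you would express the same check over your ten columns in the conditional split:

```python
# Sketch of the last-row criterion: the first field is a number and
# every other field is blank. True only for the trailer row.
def is_trailer(fields):
    first_is_number = fields[0].strip().isdigit()
    rest_are_blank = all(f.strip() == "" for f in fields[1:])
    return first_is_number and rest_are_blank
```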

    CEWII

  • Yes, that is correct: the last row has only one column, which is the row count. So do you think my row count approach will work?

  • I do. I'm looking at it to see how you are going to fail it if the numbers are different.

    You can capture the record and get the number, but you need a Row Count component right before the destination. I say there because it is the last place a record passes through before being inserted. The problem is that if you compare the record count at the moment you get the last record, it will probably be different. Records flow through the pipeline, so the number of records read from the source is usually different from the number seen at the bottom of the pipeline, at least until the flow is finished. You might have to add some wait logic to hold off the comparison for a bit: if the counts don't match, wait three minutes and try again, for a couple of retries. If they still don't match, return failure from the script.
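    That retry logic might look something like this, sketched in Python rather than an SSIS script task; `get_loaded_count` is a hypothetical callable (say, a COUNT(*) against the target table), and the retry/wait values are just the ones suggested above:

```python
import time

# Sketch: compare the trailer value against the loaded-row count,
# waiting between attempts because rows may still be in the pipeline.
def counts_match(expected, get_loaded_count, retries=3, wait_seconds=180):
    for attempt in range(retries):
        if get_loaded_count() == expected:
            return True
        if attempt < retries - 1:
            time.sleep(wait_seconds)  # hold off, then re-check
    return False  # still no match: fail the package
```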

    CEWII

  • Maybe this is an option:

    Right before you read your flat file, edit it with a script task: go to the last row, place something before the number, and add some commas.

    For example:

    before

    xxx,yyy,zzz,fff

    frrrrrr,gggg,hhhh,u

    2 (this is the last row, indicating that there are 2 rows of data; the data consists of 4 columns)

    after

    xxx,yyy,zzz,fff

    frrrrrr,gggg,hhhh,u

    <last_row>,2,<dummy_data>,<dummy_data>,

    Now the last row also has 4 columns and it can be read by the flat file source.

    After the source, place a conditional split that searches for <last_row> and place the value of the second column in a variable (with a script transform for example).
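    The preprocessing step can be sketched like this. It is plain Python for illustration (the function name, the `<last_row>` tag, and the padding scheme are the assumptions from the example above, not an SSIS API):

```python
# Sketch: pad the trailer line so it has the same number of columns as
# the data rows, and tag it so a conditional split can find it.
def pad_trailer(lines, n_columns):
    body, trailer = lines[:-1], lines[-1].strip()
    padded = "<last_row>," + trailer + "," * (n_columns - 2)
    return body + [padded]
```

    After this rewrite the flat file source reads every row with a consistent schema, and the split condition is simply "first column == `<last_row>`".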

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP
