Calculate checkSum for a flat file connection

Question

Calculate checkSum for a flat file connection

happy55

Ten Centuries

Points: 1243
More actions
August 13, 2015 at 8:41 am

#301848

My requirement is to create a text file with header ,detail and trailer records.
The trailer record should contain total number of records in the file and a checksum field.
If I have the file with all detail records ,how can I calculate adler 32 checksum and append it to the trailer or any other suggestions to calculate adler 32 checksum for the whole file .
Thanks,

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 1

If you're creating the text file in a data flow, the columns are fixed and you won't be able to define a different format for the trailer record (going to the same flat file destination). The simplest approach is probably to write the file (with the header record) in a data flow task, then use a script task to open the file, read it and calculate the checksum, and append the trailer record to the file.

If you want to do it in one pass for maximum efficiency, you could use a script transform in the data flow to manually create the text row from the input columns, and calculate the checksum as you go. This would be an asynchronous script component with a single string column for the whole row. You'll have to create the header row manually, the flat file destination won't be able to create a header row (presumable with column names) because it has no idea what the columns are. Don't forget to include the end-of-line characters that will be added to each row in the checksum calculation. When the input runs out of rows, generate the trailer row and add it to the output.

If you're going to write that much code in a data flow script transform component, and the data flow otherwise is a simple Source -> Destination, then you might just skip the data flow use a script task. You only have to add a little code to open the data source and output file, then append the formatted text for each row into a StringBuilder object instead of adding rows to a Buffer object. You don't say how large the files can get, so you might want to write out and clear the StringBuilder when it gets to some size or rowcount limit. One advantage is that the StringBuilder will have the exact file contents, including end-of-line characters, which might simplify your checksum calculations.

happy55 Ten Centuries Points: 1243 More actions · Answer 2

Thank you so much for your inputs. I have one confusion . You mean I write just one script task in the whole SSIS package and in this C# script task I generate the header ,write the file with header ,then generate the detail records and format them as per the fixed width requirements then write the file with detail records and then generate trailer and write the file again, So there will be just one script task in the whole package ?

Or you mean have one data flow for the header and detail records and then a script task for the trailer.

Also I don't know how to calculate checksum ,any help is much appreciated.

Thanks,

happy55 Ten Centuries Points: 1243 More actions · Answer 3

happy55

Ten Centuries

Points: 1243

August 17, 2015 at 9:55 am

#1820487

Can anyone help on this ?

Thanks,

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 4

You don't spell out the format of the header record. I assume it is column names, in which case a regular data flow task with a flat file output could generate it. The trailer record with the linecount and checksum is obviously a completely different column structure. You can't create this record in a standard data flow with the detail records. Also, I assume the checksum applies to the header and detail rows, but not the trailer.

So the first option is to create the file with a standard data flow task with all columns defined. Then you need a separate script task that reads the file back in to calculate the checksum, and appends the trailer row to the file. This is probably the easiest way, but requires two passes through the data.

If you want to do it in one pass for maximum efficiency, the next option is to modify the data flow task to use an asynchronous script transform component with a single string output column. It has to generate the header record and all the detail records, calculating the checksum as it goes. Then it adds the trailer.

The third option is to eliminate the dataflow task and just use one script task. There is a little extra code required to open the input data source and the output file, but it simplifies the checksum calculation. And yes, this means the entire package could be one script task.

As far as calculating the checksum, I found this on Wikipedia: https://en.wikipedia.org/wiki/Adler-32