Delete a header row from a text file?

  • I have a text file I'm trying to import into a table (through a scheduled DTS package).  The file uses fixed-width columns and CR/LF to separate rows of data.  The problem I'm having is with a header row that uses different spacing: it is only 10 characters long, whereas all other rows are 61.  As a result, if I set the DTS transform task to skip the first row, it skips both the header row and the first row of actual data.  If I set it to skip no rows (thinking that I could add an SQL command to delete the first row after the import), then DTS sizes every row at only 10 characters, which makes a real mess.  I could clean the file up in Excel first, but for various reasons that is not an option in our production environment.

    Could anyone please give me a shove in the right direction on how to delete this header row through VBScript or some kind of T-SQL manipulation (or any other approach)?  Thanks in advance for any advice and ideas.
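
    (To make the shape concrete: the layout described would look something like the lines below — a 10-character header followed by 61-character data rows, with letters standing in for the real columns, which aren't shown in the post.)

    HEADERLINE
    AAAAAAAAAA BBBBBBBBBBBBBBBBBBBB CCCCCCCCCCCCCCC DDDDDDDDDDDDD
    AAAAAAAAAA BBBBBBBBBBBBBBBBBBBB CCCCCCCCCCCCCCC DDDDDDDDDDDDD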

  • Save the following to DelHeaderLine.wsf.

    Usage sample:

    DelHeaderLine source destination

    It copies the source file to the destination, omitting the first line.

    <job id="DelHeaderLine">
      <runtime>
        <description>Delete text file header line</description>
        <named name="SFile"
            helpstring="Source file"
            type="string"
            required="true"
        />
        <named name="DFile"
            helpstring="Destination file"
            type="string"
            required="true"
        />
        <example>DelHeaderLine.wsf /SFile:151.txt /DFile:aa.txt</example>
      </runtime>
      <script language="VBScript">
        Dim argsNamed, argsUnnamed
        Dim objFSO, sName, dName
        Dim sNameOpen, dNameOpen

        Set argsNamed = WScript.Arguments.Named
        Set argsUnnamed = WScript.Arguments.Unnamed

        'Arguments must be all named or all unnamed, not a mix
        If (argsNamed.Count <> 0) And (argsUnnamed.Count <> 0) Then
          WScript.Arguments.ShowUsage
          WScript.Quit
        End If

        If argsNamed.Count <> 0 Then
          If Not (argsNamed.Exists("SFile") And argsNamed.Exists("DFile")) Then
            WScript.Arguments.ShowUsage
            WScript.Quit
          End If
          sName = argsNamed("SFile")
          dName = argsNamed("DFile")
        Else
          If argsUnnamed.Count < 2 Then 'no (or too few) arguments supplied
            WScript.Arguments.ShowUsage
            WScript.Quit
          End If
          sName = WScript.Arguments(0)
          dName = WScript.Arguments(1)
        End If

        Set objFSO = CreateObject("Scripting.FileSystemObject")

        'The source file must exist; refuse to overwrite the destination
        If Not objFSO.FileExists(sName) Then
          WScript.Echo sName & " does not exist"
          WScript.Quit
        End If
        If objFSO.FileExists(dName) Then
          WScript.Echo "Delete " & dName & " and run it again"
          WScript.Quit
        End If

        'The source file should not be empty
        Set sNameOpen = objFSO.GetFile(sName)
        If sNameOpen.Size = 0 Then
          WScript.Echo sName & " is empty."
          WScript.Quit
        End If

        'Open source for reading, destination for writing (system default encoding)
        Set sNameOpen = objFSO.OpenTextFile(sName, 1, False, -2)
        Set dNameOpen = objFSO.OpenTextFile(dName, 2, True, -2)

        Dim s
        s = sNameOpen.ReadLine 'skip the first (header) line
        Do While Not sNameOpen.AtEndOfStream
          s = sNameOpen.ReadLine
          dNameOpen.WriteLine s
        Loop

        sNameOpen.Close
        dNameOpen.Close
        WScript.Echo "Operation successful!"
      </script>
    </job>
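
    To run this ahead of the transform in the scheduled package (e.g. from a DTS Execute Process task or a command-line step; the paths here are hypothetical):

    cscript //nologo C:\scripts\DelHeaderLine.wsf /SFile:C:\data\feed.txt /DFile:C:\data\feed_clean.txt

    The transform task can then be pointed at the cleaned destination file and set to skip no rows.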

     

  • BULK INSERT with a FIRSTROW argument of 2 would work; unlike DTS, it does not infer the row layout from the characteristics of the first line of data.

    E.g.:

    BULK INSERT table_name FROM 'file_path' WITH (FIRSTROW = 2)

  • Thanks very much for the script.  We've been trying it out in test this morning and it looks like it will do exactly what we needed. 

    I found out someone else had also tried the bulk insert route before me.  They couldn't get past the fixed column widths: there are no other column delimiters involved (such as the usual tabs, commas, etc.), so they could never figure out a way to get the data to insert correctly.

    Thanks for the help.  Both replies were much appreciated.

  • I should have noted that the BULK INSERT approach involves a "raw data" staging table with a single character field whose length equals the fixed record length. The staged data can then be parsed into fields using string-manipulation SQL functions.

    E.g.:

    create table raw_data_table(Record char(61))

    -- Import the data, one line at a time, skipping the first incomplete record
    Bulk Insert raw_data_table From 'file_path' With (FirstRow = 2, Tablock)

    -- Parse the imported records
    Select Substring(Record, 1, field_1_length) as Field1, Substring(Record, i, j) as Field2, ...
    Into parsed_data_table
    From raw_data_table

    Note: If preserving the incoming file's record order is important and the order cannot be determined from the data itself, then raw_data_table can include an identity field.
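
    As a concrete sketch, suppose (purely hypothetically, since the real layout isn't given) that the 61-character record holds a 10-character ID, a 30-character name, and a 21-character amount. The parse would then be:

    -- Hypothetical field positions: ID 1-10, Name 11-40, Amount 41-61
    create table raw_data_table(Record char(61))

    -- 'C:\data\feed.txt' is a made-up path
    Bulk Insert raw_data_table From 'C:\data\feed.txt' With (FirstRow = 2, Tablock)

    Select Substring(Record, 1, 10)  as ID,
           Substring(Record, 11, 30) as Name,
           Substring(Record, 41, 21) as Amount
    Into   parsed_data_table
    From   raw_data_table

    For the identity variant, keep in mind that BULK INSERT expects a field in the file for every column of the target table, so the load should go through a view (or a format file) that exposes only the Record column; SQL Server then assigns the identity values as the rows are inserted.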
