March 8, 2004 at 3:11 pm
I have a text file I'm trying to import into a table (through a scheduled DTS package). The file uses fixed-width columns and CR/LF to separate rows of data. The problem I'm having is with a header row that uses different spacing. It is only 10 characters long, whereas all other rows are 61. As a result, if I set the DTS transform task to skip the first row, it skips the header row and the first row of actual data. If I set it to not skip any rows (thinking that I could add in an SQL command to delete the first row after the import), then it makes all the rows only 10 characters long, which is a real mess. I can clean it up through Excel first, but for various reasons that is not an option for our production environment.
Could anyone please give me a shove in the right direction as to how to delete this header row through VB script or some kind of T-SQL manipulation (or any other possibilities)? Thanks in advance for any advice and ideas.
March 8, 2004 at 7:30 pm
Save the following as DelHeaderLine.wsf.
Usage sample:
DelHeaderLine source destination
It copies the source file to the destination without the first line.
<job id="DelHeaderLine">
<runtime>
<description>Delete text file header line</description>
<named name="SFile"
helpstring="Source file"
type="string"
required="true"
/>
<named name="DFile"
helpstring="Destination file"
type="string"
required="true"
/>
<example>Delheaderline.wsf /SFile:151.txt /DFile:aa.txt</example>
</runtime>
<script language="VBScript">
Dim argsNamed, argsUnnamed
Dim objFSO, sName, dName
Dim sNameOpen, dNameOpen
set argsNamed = WScript.Arguments.Named
Set argsUnnamed = WScript.Arguments.Unnamed
'Arguments must be either all named or all unnamed, and two are required
If ((argsNamed.Count <> 0) And (argsUnnamed.Count <> 0)) Or (WScript.Arguments.Count < 2) Then
WScript.Arguments.ShowUsage
WScript.Quit
End If
If argsNamed.Count <> 0 Then
If Not (argsNamed.Exists("SFile") And argsNamed.Exists("DFile")) Then
WScript.Arguments.ShowUsage
WScript.Quit
End If
sName = argsNamed("SFile")
dName = argsNamed("DFile")
Else
sName = argsUnnamed(0)
dName = argsUnnamed(1)
End If
Set objFSO = CreateObject("Scripting.FileSystemObject")
'Check exists
If Not objFSO.FileExists(sName) Then
WScript.Echo sName & " does not exist"
WScript.Quit
End If
If objFSO.FileExists(dName) Then
WScript.Echo "Delete " & dName & " and run it again"
WScript.Quit
End If
'source file should not be empty
set sNameOpen = objFSO.GetFile(sName)
if sNameOpen.Size = 0 then
WScript.Echo sName & " is empty."
WScript.Quit
end if
'Open source for reading (1) and destination for writing (2);
'-2 (TristateUseDefault) opens the files with the system default encoding
Set sNameOpen = objFSO.OpenTextFile(sName, 1, False, -2)
Set dNameOpen = objFSO.OpenTextFile(dName, 2, True, -2)
Dim s
s = sNameOpen.ReadLine 'skip the first line
Do While Not sNameOpen.AtEndOfStream
s = sNameOpen.ReadLine
dNameOpen.WriteLine s
Loop
sNameOpen.Close
dNameOpen.Close
WScript.Echo "Operation successful!!!"
</script>
</job>
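For anyone who wants the same behavior outside Windows Script Host, the skip-the-first-line copy can be sketched in a few lines of Python. This is a hypothetical equivalent, not part of the script above, and the function name is my own:

```python
# Hypothetical Python equivalent of the DelHeaderLine.wsf script above:
# copy a text file to a destination path, dropping the first line.
def del_header_line(src_path, dst_path):
    """Copy src_path to dst_path, skipping the first line."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        next(src, None)  # read and discard the header line
        for line in src:
            dst.write(line)
```

Called as, say, `del_header_line("151.txt", "aa.txt")`, it mirrors `Delheaderline.wsf /SFile:151.txt /DFile:aa.txt`.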
March 9, 2004 at 8:17 am
BULK INSERT with a FIRSTROW argument of 2 would work, as it does not make assumptions about row layout based on the characteristics of the first line of data the way DTS does.
E.g.:
BULK INSERT table_name FROM 'file_path' WITH (FIRSTROW = 2)
March 9, 2004 at 9:50 am
Thanks very much for the script. We've been trying it out in test this morning and it looks like it will do exactly what we needed.
I found out someone else had also tried the bulk insert route before me. They couldn't get past the fixed column widths. There are no other column delimiters involved (such as the usual tabs, commas, etc.), so they could never figure out a way to get the data to insert correctly.
Thanks for the help. Both replies were very appreciated.
March 9, 2004 at 10:44 am
I should have noted that using the Bulk Insert approach involves the use of a "raw data" staging table with a single character field of length equal to the fixed record length. The staged data can then be parsed into fields using string manipulation SQL functions.
E.g.:
CREATE TABLE raw_data_table (Record char(61))
-- Import the data, one line per record, skipping the short header row
BULK INSERT raw_data_table FROM 'file_path' WITH (FIRSTROW = 2, TABLOCK)
-- Parse the imported records
SELECT SUBSTRING(Record, 1, field_1_length) AS Field1, SUBSTRING(Record, i, j) AS Field2, ...
INTO parsed_data_table
FROM raw_data_table
Note: If preserving the incoming file's record order is important and the order cannot be determined from the data itself, then the raw_data_table can include an identity field.
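The SUBSTRING-based parse above can also be sketched in Python, slicing each fixed-width record into fields by offset. The field widths here are hypothetical; the thread never states the actual layout of the 61-character records:

```python
# Sketch of the fixed-width parse: split one record string into
# fields of the given widths, left to right (widths are examples).
def parse_record(record, widths):
    """Split a fixed-width record into fields of the given widths."""
    fields, pos = [], 0
    for w in widths:
        fields.append(record[pos:pos + w])
        pos += w
    return fields
```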