September 7, 2005 at 9:09 am
I have a file that is a mixture of text and binary data. I can import the file, but SQL Server appears to read the binary data incorrectly.
For example, using a 3rd Party binary file reader, I get '0x31D40000'. Yet with SQL Server, I get '0x31C82020'.
When these binary values are converted to integers, the 3rd party value is 54321 whilst the SQL Server value is 51249. I would expect them to be the same, but they obviously are not.
Is there a reason why SQL Server cannot read the binary data correctly?
I am importing the file into a temporary table and then using CONVERT(VARBINARY, SUBSTRING(MyField, 1, 4)) to return the binary value which I can then change into an integer.
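For reference, here is roughly the conversion on its own, run against a local variable rather than the real table (the 0x31D40000 literal is the value from the 3rd party reader, and the byte reversal at the end assumes the file stores its integers least-significant byte first, which I have not confirmed):
DECLARE @Bytes VARBINARY(4)
SET @Bytes = 0x31D40000
-- CAST reads the four bytes most-significant byte first, so this does not give 54321
SELECT CAST(@Bytes AS INT)
-- Reversing the byte order first gives 0x0000D431, which is 54321
SELECT CAST(SUBSTRING(@Bytes, 4, 1) + SUBSTRING(@Bytes, 3, 1)
          + SUBSTRING(@Bytes, 2, 1) + SUBSTRING(@Bytes, 1, 1) AS INT)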
I cannot post the actual information in the file as it is personally identifiable.
Many thanks.
September 7, 2005 at 10:42 am
How are you importing the file?
September 7, 2005 at 10:51 am
CREATE TABLE ##Tmp1
(
MyText TEXT
)

DECLARE @Cmd VARCHAR(100)
-- -c loads in character mode; the -r row terminator is a string that never appears
-- in the file, so the whole file ends up in a single row/field
SET @Cmd = 'BCP ##Tmp1 IN "file details" -T -c -r "##~~##~~##"'
EXEC master.dbo.xp_cmdshell @Cmd, NO_OUTPUT
This is the only way I could find to get the entire file into a single field so that I could parse it.
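As a rough sanity check after the load (just a sketch, and it assumes I know the file size on disk to compare against), I can confirm the whole file arrived in the one field:
-- one row, one field holding the whole file; the byte count should match the file size on disk
SELECT DATALENGTH(MyText) AS BytesLoaded
FROM ##Tmp1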
The file is fixed width and the records are hierarchical, so it has to be processed logically, i.e. the structure is:
File Header
Record Header
Record
Record
Record Header
Record
Record
Record
etc
The intention is to split the file into separate tables, but before doing that I need to be sure the binary values can be converted to integers correctly. Once the file is split, the converted values will act as validation, since things like the total number of records are stored in those binary values.
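The splitting itself will be along these lines (only a sketch - the ##Rows staging table is made up for illustration and the record length is hard-coded):
-- break the single field up into fixed-width records
DECLARE @RowLen INT, @Pos INT, @Len INT
SET @RowLen = 100   -- fixed record length in the file
SET @Pos = 1
SELECT @Len = DATALENGTH(MyText) FROM ##Tmp1

CREATE TABLE ##Rows (RowNo INT IDENTITY(1, 1), RowText VARCHAR(100))

WHILE @Pos <= @Len
BEGIN
    INSERT INTO ##Rows (RowText)
    SELECT SUBSTRING(MyText, @Pos, @RowLen) FROM ##Tmp1
    SET @Pos = @Pos + @RowLen
END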
September 7, 2005 at 10:57 am
Hmmm. The default column terminator is a tab character, which is ASCII (9). I wonder if that's causing you any problems, assuming there may be embedded tab characters in the binary data. Are you getting the same number of rows imported as you expected?
(Presuming the example you provided wasn't literal, that is.)
Also, could there be some case-shifting going on?
September 8, 2005 at 2:58 am
The file is created using a program that was written in Delphi and below is an example of the first record in the file.
H1Ô 0504151427505479150405© º í
The rows are fixed at 100 characters in length (the last 57 characters are padding for the header record) and the header record is identified by 'H' as the first character.
The next 4 characters (2-5) are the binary value for an ID that is unique. The number 1 and the Ô are followed by two zeros (or nulls). The problem seems to be that when you look at these values (1, Ô, null, null) in a 3rd party binary file viewer, the value of Ô is different from how SQL Server interprets it.
This is the case for all the binary values in the file. If we use the 3rd party binary viewer to convert the binary values to numbers, we get the correct row counts, which verifies that the file structure and import routine are fine and that the correct number of rows are imported. But when SQL Server converts the binary values, we get the wrong figures.
This is doubly frustrating because the unique ID is required to link back to the source data to verify the correct data has been exported.
Do you think this could be something to do with non-printable characters being handled incorrectly within Windows, or with SUBSTRING treating the binary data incorrectly?
I don't think it is case-shifting, as the problem only happens with the binary data, which is all upper case.
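To try to pin down where the difference creeps in, the plan is to look at each of the four ID bytes as SQL Server sees them and compare them with the binary viewer (only a diagnostic sketch; it assumes the header record starts at position 1 of the imported field):
SELECT ASCII(SUBSTRING(MyText, 2, 1)) AS Byte1,  -- viewer says 49  (0x31, the '1')
       ASCII(SUBSTRING(MyText, 3, 1)) AS Byte2,  -- viewer says 212 (0xD4, the 'Ô')
       ASCII(SUBSTRING(MyText, 4, 1)) AS Byte3,  -- viewer says 0
       ASCII(SUBSTRING(MyText, 5, 1)) AS Byte4   -- viewer says 0
FROM ##Tmp1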