September 7, 2010 at 5:35 pm
Good morning
I am currently importing a CSV file using a bulk insert.
I discovered that my decimal temperature values were being rounded in a strange way.
For example, 27.15 in the CSV becomes 27.14 in SQL Server, and 26.7 becomes 26.69.
On the other hand, 16.05 is imported correctly, as is 16.5, for example.
I cannot really predict when it will happen.
In the Schema.ini used to run the bulk insert, I declared these columns as Double, and in SQL Server I declared them as decimal(5,2). I tried using numeric, and also decimal(5,3), and it doesn't work (26.7 becomes 26.699).
Could any of you help with this? I don't know how to make it work.
Thanks in advance for your help
Sylvain
September 8, 2010 at 1:46 am
Is the cell formatted? If so, remove any formatting and try again.
Failing to plan is Planning to fail
September 8, 2010 at 3:21 pm
Thanks for your answer.
What do you mean by the cell format?
In SQL Server the field is defined as decimal(5,2); I tried numeric(5,2) and still get the same result.
In my CSV file opened with Notepad, for example, I do have 26.7 and not the 26.69 obtained in SQL Server.
For my bulk insert I am using a schema.ini where this field is defined as Double with NumberDigits=2.
I really don't see what is going wrong...
Thanks for your help
Sylvain
September 8, 2010 at 3:30 pm
DOUBLE is a floating-point data type, which is approximate.
http://msdn.microsoft.com/en-us/library/ayazw934.aspx
If you declare it as numeric or decimal, it may work more precisely.
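You can see the approximation for yourself in SSMS; a minimal check (the second query assumes the import layer truncates rather than rounds, which would explain 26.7 arriving as 26.69):

-- 26.7 has no exact binary floating-point representation;
-- widening the cast shows the value that is actually stored.
SELECT CAST(CAST(26.7 AS FLOAT) AS DECIMAL(20, 17)); -- 26.69999999999999929
-- Truncating that value at two decimal places reproduces the imported 26.69:
SELECT ROUND(CAST(26.7 AS FLOAT), 2, 1); -- third argument <> 0 means truncate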
September 9, 2010 at 12:15 am
Thanks David,
I just tried changing the data type for the temperature in my schema.ini to Float:
Col4=TEMP_INT Float, and I get the same rounding uncertainty.
It seems that I cannot use any other type for the decimal values:
http://msdn.microsoft.com/en-us/library/ms709353%28VS.85%29.aspx
Could the problem come from somewhere else?
Thanks again
Sylvain
September 9, 2010 at 12:28 am
Can you provide the DDL (CREATE TABLE statement) for the target table, a sample of the data you are attempting to import, and the expected results once the import is complete?
September 9, 2010 at 4:47 am
Open your CSV file in a text editor and look at the rows you are importing. Do they have two decimal places or more? The first rows in the CSV have an impact on the way the numbers are converted on import. Try manipulating the numbers in Excel, exporting to a new CSV, and reimporting the file to see if the numbers still come in wrong.
September 9, 2010 at 3:09 pm
Thanks all for your replies
This is the SQL code for my table in SQL Server:
CREATE TABLE [ARCHIVAL].[ARC_DATA](
    [arc_data_id] [int] IDENTITY(1,1) NOT NULL,
    [arc_tag_id] [int] NOT NULL,
    [date_UTC] [datetime] NULL,
    [date_local] [datetime] NULL,
    [depth] [decimal](6, 2) NULL,
    [temp_int] [numeric](5, 2) NULL,
    [temp_ext] [numeric](5, 2) NULL,
    [light_surface] [decimal](7, 2) NULL,
    [light_depth] [decimal](7, 2) NULL,
    CONSTRAINT [PK_ARC_DATA] PRIMARY KEY CLUSTERED
    (
        [arc_data_id] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
This is a snapshot of my CSV opened with Notepad++:
Date UTC,Local date/time,Depth,Recorder Temperature,Stalk Temperature,Light Level
10/05/2008 17:48:00,10/05/2008 7:48:00,44,27.15,26.7,144
10/05/2008 17:48:30,10/05/2008 7:48:30,69.5,27.15,26.65,134
10/05/2008 17:49:00,10/05/2008 7:49:00,99.5,27.15,26.3,123
10/05/2008 17:49:30,10/05/2008 7:49:30,115.5,26.9,21.15,112
This is now what I get in my SQL Server table:
arc_data_id  arc_tag_id  date_UTC                 date_local               depth   temp_int  temp_ext  light_surface  light_depth
25571479     202         2008-05-10 17:48:00.000  2008-05-10 07:48:00.000  44.00   27.14     26.69     NULL           144.00
25571480     202         2008-05-10 17:48:30.000  2008-05-10 07:48:30.000  69.50   27.14     26.64     NULL           134.00
25571481     202         2008-05-10 17:49:00.000  2008-05-10 07:49:00.000  99.50   27.14     26.30     NULL           123.00
25571482     202         2008-05-10 17:49:30.000  2008-05-10 07:49:30.000  115.50  26.89     21.14     NULL           112.00
As you can see, 27.15 becomes 27.14...
My schema.ini for the bulk insert is as follows:
ColNameHeader=True
Format=CSVDelimited
DecimalSymbol=.
NumberDigits=2
DateTimeFormat=dd/mm/yyyy hh:nn:ss
Col1=DATE_UTC DateTime
Col2=DATE_LOCAL DateTime
Col3=DEPTH float
Col4=TEMP_INT float
Col5=TEMP_EXT float
Col6=LIGHT_DEPTH float
To do the bulk insert, I wrote a small application in VB.Net, nothing fancy...
I have already tried switching the data type in the schema.ini between Double and Float, and using the decimal data type in my SQL Server table, but no change...
If anyone has an idea... I'm starting to get desperate.
Thanks in advance
Sylvain
September 9, 2010 at 4:38 pm
I have come up with a SQL version using OPENROWSET BULK which seems to work OK. You could also use BULK INSERT via a staging table if you don't want to use a format file (a sketch of that is at the end of this post). I'm not sure how you're populating arc_tag_id, so I've hard-coded it as 202 for the time being.
INSERT ARCHIVAL.ARC_DATA
(
arc_tag_id,
date_UTC,
date_local,
depth,
temp_int,
temp_ext,
light_depth
)
SELECT
202 AS arc_tag_id,
CONVERT(DATETIME, date_UTC, 103),
CONVERT(DATETIME, date_local, 103),
CONVERT(DECIMAL(6,2), depth),
CONVERT(NUMERIC(5,2), temp_int),
CONVERT(NUMERIC(5,2), temp_ext),
CONVERT(DECIMAL(7,2), light_depth)
FROM OPENROWSET (BULK 'C:\Arc.txt', FORMATFILE = 'C:\Arc.fmt', FIRSTROW = 2) AS Z
Arc.fmt
7.0
6
1   SQLCHAR   0   0   ","      1   date_UTC
2   SQLCHAR   0   0   ","      2   date_local
3   SQLCHAR   0   0   ","      3   depth
4   SQLCHAR   0   0   ","      4   temp_int
5   SQLCHAR   0   0   ","      5   temp_ext
6   SQLCHAR   0   0   "\r\n"   6   light_depth
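For reference, the staging-table route mentioned above might look something like this (just a sketch; the staging table name is made up and the varchar lengths are guesses):

-- Staging table mirrors the CSV layout, everything as character data
CREATE TABLE ARCHIVAL.ARC_DATA_STAGE (
    date_UTC    varchar(20) NULL,
    date_local  varchar(20) NULL,
    depth       varchar(10) NULL,
    temp_int    varchar(10) NULL,
    temp_ext    varchar(10) NULL,
    light_depth varchar(10) NULL
);

BULK INSERT ARCHIVAL.ARC_DATA_STAGE
FROM 'C:\Arc.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- Then run the same INSERT ... SELECT with the CONVERTs as above,
-- reading FROM ARCHIVAL.ARC_DATA_STAGE instead of OPENROWSET.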
September 9, 2010 at 5:46 pm
Thanks Steve,
Just tried your code in my VB.Net application:
cnSQL.ConnectionString = "Data Source=TETAUTAI;Initial Catalog=TUNA_DBS;UID=user;PWD=pwd;"
cnSQL.Open()
cmdSQL.Connection = cnSQL
cmdSQL.CommandText = "INSERT INTO archival.ARC_DATA(arc_tag_id,date_UTC,date_local,depth,temp_int,temp_ext,light_depth)" _
& " SELECT 204 AS arc_tag_id,CONVERT(DATETIME, date_UTC, 103),CONVERT(DATETIME, date_local, 103)" _
& " ,CONVERT(DECIMAL(6,2), depth),CONVERT(NUMERIC(5,2), temp_int),CONVERT(NUMERIC(5,2), temp_ext)," _
& " CONVERT(DECIMAL(7,2), light_depth)" _
& " FROM OPENROWSET (BULK 'F:\import\490597_local_time.csv', FORMATFILE = 'F:\import\Arc.fmt', FIRSTROW = 2) AS Z;"
cmdSQL.ExecuteNonQuery()
cnSQL.Close()
And it works!! All the decimal values are transferred correctly.
But I remembered why I didn't follow this BULK solution in the first place: the BULK statement seems to read only files stored on the database server itself.
In my situation, I am working from my local desktop, accessing CSV files on my office network and loading them into a remote SQL Server database.
Can I overcome this limitation of BULK?
I have already given "user" the bulkadmin role, but I still have trouble reading files not located on the DB server; I get errors like "Cannot bulk load because the file could not be opened".
Thanks again
Sylvain
September 9, 2010 at 9:13 pm
You might try using the UNC path of the file's location. It does mean you need to share the folder, or use the hidden admin share if there is one.
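For example (the share and folder names here are hypothetical, and the SQL Server service account needs read access to the share):

SELECT *
FROM OPENROWSET (BULK '\\YourDesktop\Import\490597_local_time.csv',
                 FORMATFILE = '\\YourDesktop\Import\Arc.fmt', FIRSTROW = 2) AS Z;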
September 10, 2010 at 4:33 pm
springrider (9/9/2010)
Just tried your code in my VB.Net application... And it works!! All the decimal values are transferred correctly.
But I remembered why I didn't follow this BULK solution in the first place: the BULK statement seems to read only files stored on the database server itself. Can I overcome this limitation of BULK? I get errors like "Cannot bulk load because the file could not be opened".
Lynn Pettis (9/9/2010)
You might try using the UNC path of the file's location. It does mean you need to share the folder, or use the hidden admin share if there is one.
Thanks for the feedback, folks. Were you able to set up a share on your local machine to accommodate the OPENROWSET BULK, or was this not an acceptable solution for you?
September 13, 2010 at 5:08 pm
Hello all,
Thanks for your feedback.
The OPENROWSET approach will not work for me due to limited permissions on my SQL Server.
I am pretty sure the issue is linked to the fact that the DataColumns holding the numeric values in my DataTable are defined as Double, while the target fields in SQL Server are decimal.
My idea now is to convert the Double DataColumns in my DataTable to Decimal.
These Double DataColumns are defined by my schema.ini.
Do any of you see a way of keeping the schema.ini (important for me because it defines the data format) while changing the data type of my numeric columns to Decimal?
From what I have seen, you cannot change the data type after the column has been populated. Is it possible to apply the schema.ini to the DataTable before filling it with data? That would let me change the DataColumn types to Decimal first.
Thanks for your help
Sylvain
September 13, 2010 at 5:34 pm
Have you tried converting to a string first, then to decimal?
September 13, 2010 at 11:47 pm
OK, I finally found a solution and wanted to share it in case it is useful to someone else.
The problem with the bulk import is the difference in types between Double in the schema.ini and decimal in the target table, and schema.ini does not offer any choice other than Double for my numeric fields.
So I filled my DataTable using the schema.ini, and then created new DataColumns with Decimal data types that are copies of the original Double columns.
Then I used these new columns as the ones to be imported into SQL Server. The types of my DataTable columns now match my SQL Server table columns exactly, and there is no loss of precision. :-)
Dim myCSVAdapter As New OleDb.OleDbDataAdapter
myCSVAdapter.Fill(dt, myRSCSV)
'My DataTable is filled using the schema.ini, so the numeric columns come in as Double
'(myRSCSV is the recordset selecting from the CSV)
Dim dcDepth As New DataColumn()
dcDepth.DataType = System.Type.GetType("System.Decimal")
dcDepth.ColumnName = "DEPTH_DEC"
dcDepth.Expression = "DEPTH" 'expression column: a computed copy of the Double column, coerced to Decimal
dt.Columns.Add(dcDepth)
This new Decimal column is a clone of the original Double column, and it is the one I use for the bulk insert.
Thanks a lot for your help, it was really useful.
Cheers
Sylvain