January 3, 2019 at 6:17 am
Input CSV file:
"Id","Name","Sal"
"1","vikas,j","10000.5"
"2","James","5000.2"
"3","V
J","4000.3"
create table dbo.demo
(
Id char(1),
Name varchar(50),
Sal float
)
bcp.fmt file:
12.0
4
1 SQLCHAR 0 0 "\"" 0 FIRST_QUOTE ""
2 SQLCHAR 0 3 "\"," 1 Id ""
3 SQLCHAR 0 55 ",\"" 2 Name ""
4 SQLCHAR 0 15 "\"\r\n" 3 Sal ""
BULK INSERT dbo.demo from 'D:\Input.csv' with(DATAFILETYPE='char',fieldterminator= '","' ,Firstrow = 2, Rows_per_batch = 100000 , rowterminator='\n',FORMATFILE = 'D:\bcp.fmt')
Error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal)
Can anyone tell me what am I missing here, from what I understand I am doing something wrong in format file while passing Terminator value?
January 3, 2019 at 9:01 am
The problem is how you're treating the leading quote. You can't treat it as a delimiter. It has to be treated as a guaranteed single character (a fixed field length of 1). Also, the delimiters between the fields are actually "," (quotes included) and so must the delimiters be in the BCP format file. Since a " must be escaped when used as a part of a delimiter in BCP Format Files and remembering that a delimiter in a BCP Format File marks the END of a field and the whole delimiter must be encapsulated in quotes, the delimiter between the fields must be marked as "\",\"".
And, with the understanding that I've not tested this specific example, please see the following....
12.0
4
1 SQLCHAR 0 1 "" 0 FIRST_QUOTE ""
2 SQLCHAR 0 3 "\",\"" 1 Id ""
3 SQLCHAR 0 55 "\",\"" 2 Name ""
4 SQLCHAR 0 15 "\"\r\n" 3 Sal ""
I also recommend that since your data (and the first and last quotes are actually a part of the data) can be variable width, that you NOT specify the exact column width you're looking for. Let the datatype of the columns in the target table do that for you so that you don't need to maintain the BCP format file for simple expansions of columns. And, yeah... the oversized 9999 bytes is very intentional so that I get error feedback if the field in the file contain more data than the table columns do.
This is what I 'd end up with for a BCP format file this this file...
12.0
4
1 SQLCHAR 0 1 "" 0 FIRST_QUOTE ""
2 SQLCHAR 0 9999 "\",\"" 1 Id ""
3 SQLCHAR 0 9999 "\",\"" 2 Name ""
4 SQLCHAR 0 9999 "\"\r\n" 3 Sal ""
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2019 at 11:14 pm
Thanks, Jeff I tried your suggested format file
Bulk insert generated the error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal). But in actual it had inserted 2 rows in the table. However 3rd row where name columns value goes to the next line that 1 row was not inserted. Please note this is how I get value in Input CSV and that cannot be changed.
"3","V
J","4000.3"
So Ideally expected output in the table for the above row is :
Id Name Sal
3 V<space>J 4000.3
In SQL 2017 this can be achieved without using format file. But can't this be done in SQL 2014 even when I am using format file?
January 4, 2019 at 4:49 am
the issue is that the last line fails to meet the correct delimiter "\"\r\n" and as such the value contains a quote
try changing the table to
create table dbo.demo
(
Id char(1),
Name varchar(50),
Sal varchar(100)
)
and you will see what I mean
BCP prior to 2017 will not work correctly no matter what is done with the delimiters - there will always be some issues that need to be coded around.
this will work.
if object_id('demo_staging') is not null
drop table demo_staging;
create table dbo.demo_staging
(
Id char(1),
Name varchar(50),
Sal varchar(100),
id2 int identity(1,1)
)
BULK INSERT dbo.demo_staging from 'c:\temp\test.csv' with(DATAFILETYPE='char'
,Firstrow = 2, Rows_per_batch = 100000
-- not using these as they are defined within the format file
-- , rowterminator='\n'
--,fieldterminator= '","'
,FORMATFILE = 'c:\temp\test.fmt')
select *
from dbo.demo_staging
-- as we know the last value is "incorrect" clean it
update t2
set t2.sal = left(t2.sal, len(t2.sal) -1)
from dbo.demo_staging t2
where t2.id2 = (select max(id2) from dbo.demo_staging)
select *
from dbo.demo_staging
-- now it can be inserted into a table with the correct datatypes
if object_id('demo') is not null
drop table demo;
create table dbo.demo
(
Id char(1),
Name varchar(50),
Sal float
)
insert into dbo.demo
select id
, name
, convert(float, sal)
from dbo.demo_staging
select *
from dbo.demo
if object_id('demo_staging') is not null
drop table demo_staging;
if object_id('demo') is not null
drop table demo;
regarding "3 V<space>J 4000.3"
this will not be correct - the space will either be \n or \r\n - if you need it to be space you need to replace it yourself
January 4, 2019 at 8:48 am
vikasjagadale8 - Thursday, January 3, 2019 11:14 PMThanks, Jeff I tried your suggested format fileBulk insert generated the error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal). But in actual it had inserted 2 rows in the table. However 3rd row where name columns value goes to the next line that 1 row was not inserted. Please note this is how I get value in Input CSV and that cannot be changed.
"3","V
J","4000.3"So Ideally expected output in the table for the above row is :
Id Name Sal
3 V<space>J 4000.3In SQL 2017 this can be achieved without using format file. But can't this be done in SQL 2014 even when I am using format file?
Can you attach the actual test file (please, no PII or other sensitive information) you're using for this? I'd like to examine it for control characters. If YOU made the test file, I think I know what the problem is but would like to confirm it before making a suggestion.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 4, 2019 at 9:00 am
Jeff Moden - Friday, January 4, 2019 8:48 AMvikasjagadale8 - Thursday, January 3, 2019 11:14 PMThanks, Jeff I tried your suggested format fileBulk insert generated the error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal). But in actual it had inserted 2 rows in the table. However 3rd row where name columns value goes to the next line that 1 row was not inserted. Please note this is how I get value in Input CSV and that cannot be changed.
"3","V
J","4000.3"So Ideally expected output in the table for the above row is :
Id Name Sal
3 V<space>J 4000.3In SQL 2017 this can be achieved without using format file. But can't this be done in SQL 2014 even when I am using format file?
Can you attach the actual test file (please, no PII or other sensitive information) you're using for this? I'd like to examine it for control characters. If YOU made the test file, I think I know what the problem is but would like to confirm it before making a suggestion.
Sure! But unfortunately, since it's a CSV file I cannot Upload that file. So I am uploading the images of the CSV file. When opened in Notepad++ so you can see field terminator(Comma) and End of Line (CRLF) and a second image of actual CSV so that you can see how excel by default wraps that text
January 4, 2019 at 9:25 am
vikasjagadale8 - Friday, January 4, 2019 9:00 AMJeff Moden - Friday, January 4, 2019 8:48 AMvikasjagadale8 - Thursday, January 3, 2019 11:14 PMThanks, Jeff I tried your suggested format fileBulk insert generated the error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal). But in actual it had inserted 2 rows in the table. However 3rd row where name columns value goes to the next line that 1 row was not inserted. Please note this is how I get value in Input CSV and that cannot be changed.
"3","V
J","4000.3"So Ideally expected output in the table for the above row is :
Id Name Sal
3 V<space>J 4000.3In SQL 2017 this can be achieved without using format file. But can't this be done in SQL 2014 even when I am using format file?
Can you attach the actual test file (please, no PII or other sensitive information) you're using for this? I'd like to examine it for control characters. If YOU made the test file, I think I know what the problem is but would like to confirm it before making a suggestion.
Sure! But unfortunately, since it's a CSV file I cannot Upload that file. So I am uploading the images of the CSV file. When opened in Notepad++ so you can see field terminator(Comma) and End of Line (CRLF) and a second image of actual CSV so that you can see how excel by default wraps that text
I can't use that. I need the file for other reasons, as well. Yep... I could type it out. Just don't want to do that. Just change the extension of the CSV file to TXT and that'll work.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 4, 2019 at 9:31 am
Jeff Moden - Friday, January 4, 2019 9:25 AMvikasjagadale8 - Friday, January 4, 2019 9:00 AMJeff Moden - Friday, January 4, 2019 8:48 AMvikasjagadale8 - Thursday, January 3, 2019 11:14 PMThanks, Jeff I tried your suggested format fileBulk insert generated the error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 4 (Sal). But in actual it had inserted 2 rows in the table. However 3rd row where name columns value goes to the next line that 1 row was not inserted. Please note this is how I get value in Input CSV and that cannot be changed.
"3","V
J","4000.3"So Ideally expected output in the table for the above row is :
Id Name Sal
3 V<space>J 4000.3In SQL 2017 this can be achieved without using format file. But can't this be done in SQL 2014 even when I am using format file?
Can you attach the actual test file (please, no PII or other sensitive information) you're using for this? I'd like to examine it for control characters. If YOU made the test file, I think I know what the problem is but would like to confirm it before making a suggestion.
Sure! But unfortunately, since it's a CSV file I cannot Upload that file. So I am uploading the images of the CSV file. When opened in Notepad++ so you can see field terminator(Comma) and End of Line (CRLF) and a second image of actual CSV so that you can see how excel by default wraps that text
I can't use that. I need the file for other reasons, as well. Yep... I could type it out. Just don't want to do that. Just change the extension of the CSV file to TXT and that'll work.
January 4, 2019 at 9:50 am
I was able to get this to return the correct data. Format File and data file are attached
Here's the sql
SELECT *
FROM OPENROWSET (
BULK 'c:\work\data\demo1.txt',
Firstrow = 2,
FORMATFILE = 'c:\work\data\demofmt.txt'
) AS t1;
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
January 4, 2019 at 10:12 am
Mike01 - Friday, January 4, 2019 9:50 AMI was able to get this to return the correct data. Format File and data file are attachedHere's the sql
SELECT *
FROM OPENROWSET (
BULK 'c:\work\data\demo1.txt',
Firstrow = 2,
FORMATFILE = 'c:\work\data\demofmt.txt'
) AS t1;
Thanks, Mike, that worked. But still what I couldn't understand is how come just changing the format to .txt worked. And then what's the problem with CSV file ? Because ideally, I am going to get CSV file for importing data. I can change the format to txt but want to do that only as a last option.
I think I got it how it worked when I checked your demo.txt file after the last row there is an additional line at the end (basically you have pressed enter after the last row). Now it worked even for CSV format. Please correct me If I am wrong.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply