Stripping out double quotes (") in bulk insert

Question

Stripping out double quotes (") in bulk insert

Viewing 15 posts - 31 through 45 (of 58 total)

You must be logged in to reply to this topic. Login to reply

Adrian Nichols-360275 SSCertifiable Points: 7694 More actions · Answer 1

E,

Which bit doesn't work? The stripping of spaces?

You could always run UPDATE statements to trim trailing spaces... It's not an elegant solution but would provide A solution.

UPDATE JobTable SET field = RTRIM(field)

:ermm:

Ade

A Freudian Slip is when you say one thing and mean your mother.
For detail-enriched answers, ask detail-enriched questions...[/url]

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 2

Nahh,

It's the bulk insert that doesn't work... That's to do with the format file and the structure of the data.

If the delimitor used for this kind of data is "," and if that ignores all spaces to the right of each field then there must be something wrong with the data.

Is that right?

Here's the error:

Msg 4863, Level 16, State 1, Line 1

Bulk load data conversion error (truncation) for row 2, column 4 (AddressLinks).

Msg 4863, Level 16, State 1, Line 1

Bulk load data conversion error (truncation) for row 3, column 4 (AddressLinks).

***In responce to what I said above***:

IMAGINE that '_' is a space (the forum get rid of them so i can't present the data properly...)

This is how the data looks:

1374691,1,0,Y_________,Y_________,Y_________ ,E________ ,......

WHERE '_' IS ACTUALLY A SPACE!!!

The bulk insert import the 1st 4 fields but errors as soon as it reaches fields with the spaces.

The columns with just 1 letter are char(1) and can not accomodate the spaces.

And since the bulk insert doesn't seem to ignore them it fails. Is there a way I can strip the spaces DURING the Bulk insert...?

Another way I have thought is to import the data in a table with massive columns. Then strip the spaces and import the data into the wanted table.

The data MUST have the required structure for further processing, so it MUST fit into that table correctly...!

This is going to be a frequent process so it will be automated and therefore I'd rather if i don't have to add all that process...!!!

This is fun isn't it...? lol

David Burrows SSC Guru Points: 65143 More actions · Answer 3

BULK INSERT / BCP will not truncate/remove trailing spaces on input data, and since you already stated that the data contains trailing spaces and is longer than the column definitions then you will get this error.

Try creating a staging table with larger columns, BULK INSERT into that table and then insert into the actual table using RTRIM to remove trailing spaces as already suggested.

Ooops just read the last bit of last post suggesting this :blush:

Other than that you will have to preprocess the data before bulk insert, not something you want to do as indicated in your last post.

Far away is close at hand in the images of elsewhere.
Anon.

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 4

That's fine, mate.

At least now I know that's the only way.

Thank you very much everybofy for your help.

i really appreciate it.

I will keep you posted of any other issues I will come up with 😛

Keep it up!

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 5

Sure... using slightly different method...

--Create a linked server.

EXEC sp_addlinkedserver txtsrv, 'Jet 4.0',

'Microsoft.Jet.OLEDB.4.0',

'c:\temp', --<<LOOK!!!! CHANGE TO DIRECTORY WHERE FILES ARE AT!

NULL,

'Text'

GO

--Set up login mappings.

EXEC sp_addlinkedsrvlogin txtsrv, FALSE, NULL, Admin, NULL

GO

--List the tables in the linked server.

EXEC sp_tables_ex txtsrv

GO

--Query one of the tables: Import#txt

--by using a four-part name.

SELECT *

FROM txtsrv...[Import#txt]

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 6

That was a little too much for my knowledge... 🙁

How about a dummies guide...?

I'm using MS SQL 2005.

Assuming that my datafile is in:

\\sigfiler1\Data$\EuroDirect Departments\Production\Neptune\CallTraceFormatFiles

and my format file is in:

\\sigfiler1\Data$\EuroDirect Departments\Production\Neptune\CallTraceFormatFiles\csv

What do i need to do exactly...?

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 7

I dont think i have permissions to perform this action...

sp_addlinkedserver

Emmanouil

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 8

The problem is with the data and BCP/Bulk Insert... it doesn't like it when the format changes like with some rows having a quoted first field and some not (because of the embedded comma, I suspect).

The linked server requires no format file and will read just about anything... but more slowly. If you don't have the privs, you DBA could make the linked server. I tried it on the 5 lines of data you provided and it did the trick nicely.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 9

Hey jeff,

Do you know of any way to count how many columns a file has (other than the manual way...)?

The files are usually comma delimited, tab delimited, or fixed length.

I'm looking for something to add on my Bulk insert which return the number of columns of the data file to be inserted.

Any bright ideas?

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 10

Import 1 line and count the LEN difference between the line with delimiters and the line where the delimiters have been replaced with nothing and add 1 to that.

For fixed field, no guarantees because the data may have spaces in it... can't rely on spaces being a delimiter there.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 11

Im thinking the following and then compare that to the initial length...

Is that what you meant?

replace(RowData,',',0) from #RowTable where ID = 1

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 12

Do you know of a way to import just one line of the file?

the followinf for example imports the whole data in one row per line...

exec xp_cmdshell 'type "\\sigfiler1\Data$\EuroDirect Departments\Production\Neptune\CallTraceFormatFiles\datacsv56.csv" '

I need something like that but for just the 1st line...

Any ideas?

E.

Adrian Nichols-360275 SSCertifiable Points: 7694 More actions · Answer 13

You can use the Type command through the xp_cmdshell in conjunction with SET ROWCOUNT

SET ROWCOUNT 1

CREATE TABLE #temp (line VARCHAR(4000))

INSERT INTO #temp

EXEC master.dbo.xp_cmdShell 'Type \\myPC\mydirectory\myFile.csv'

SET ROWCOUNT 0

This will import the top line in to your temp table. To get the fields you can then, as Jeff rightly said:

SELECT (LEN(line) - LEN(REPLACE(line, ',', ''))) + 1

FROM #temp

Ade

A Freudian Slip is when you say one thing and mean your mother.
For detail-enriched answers, ask detail-enriched questions...[/url]

lexus1234 SSC Enthusiast Points: 107 More actions · Answer 14

Hi,

I just finished an assignment where I had to import gigabytes of mainframe/AS400 data into SQL Server 2005. After days of work I discovered that:

1. Don't use Excel - it will try to fix columns it thinks are numbers/dates and the results may not be what you expect. The new Excel (2007) has 1,000,000 rows and 16,000 columns (in 'x' type spreadsheets) but in large files it can get very slow and it will massage your data.

2. Use MS Access. Import the data into Access where you can tell it to use " as a delimiter, check/fix all the column names, turn off all the indexes (even though SQL will ignore the indexes it can take a long time) and check what type of field Access thinks each column is. Save the resulting MDB file and then then import from the Access table within the MDB. By the way old versions of Access had a limit of 2GB per table so I had to break up some of the tables. I don't know that the current limit, if any, is. I imported all the columns as text and converted them on the SQL Server side to date/numbers. You will have to take care of nulls after the import if you want to simplify things when the data is used.

What you have to watch out for is a double quote embedded in the string/text columns. Yes - users will enter a double quote (paired or single) to highlight text or will use a double quote instead of a single quote. Single quotes can also do unexpected things when passing the string to another field via code - the inserts/updates done through code will not work as you expect Using parameters to update the columns will usually solve such issues. I used reqex on the PC side to remove embedded double quotes.

Emmanouil Karaiskakis Ten Centuries Points: 1013 More actions · Answer 15

I'm trying to automate the process mate.I'm using Bulk Insert and it's pretty quick assuming the data is consistent. Although MS Access will be used for complicated and bad files.