Bulk Inserting Comma separated Quotation Delimited text file

Question

Bulk Inserting Comma separated Quotation Delimited text file

Frank Cazabon

Hall of Fame

Points: 3467
More actions
January 13, 2013 at 4:48 pm

#287721

I am receiving a text file with comma separated data with quotation marks around certain fields that I need to bulk insert into SQL Server 2008 R2. An example of the data is:
"BUNGALOW 101,RIDGEWAY DR","POINT FORTIN",,,"TT","868-389-9769",,"ZPXB303191",1
As you can see there is a comma in the first field.
Using this Bulk Insert command causes that first field to be split into two columns. I know BOL says this is what is supposed to happen and to choose the separators carefully, but I have no control over the text file which is created by a 20 year old system which cannot be changed.
Any ideas how I can import this?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86948 More actions · Answer 1

I'm not sure if this will help you, but the following code will replace the comma embedded between quotation marks (as long as it doesn't appear in the first position) with a semicolon.

;WITH MyData (ID, strcol) AS (

SELECT 1, '"BUNGALOW 101,RIDGEWAY DR","POINT FORTIN",,,"TT","868-389-9769",,"ZPXB303191",1'),

NewData (ID, ItemNumber, strcol) AS (

SELECT ID, ItemNumber

,strcol=CASE WHEN CHARINDEX(',', Item) > 1 THEN REPLACE(Item, ',', ';') ELSE Item END

FROM MyData

CROSS APPLY PatternSplitCM(strcol, '["]'))

SELECT ID, (

SELECT strcol + ''

FROM NewData b

WHERE a.ID = b.ID

ORDER BY ItemNumber

FOR XML PATH(''))

FROM NewData a

GROUP BY ID

The PatternSplitCM FUNCTION can be found in the 4th article linked into my signature line.

You would need to decide what character was appropriate to use in place of semicolon, then REPLACE it back to comma after BULK INSERT completes its task (if that is possible).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

jerry-621596 SSCertifiable Points: 6096 More actions · Answer 2

SSIS would handle this very nicely and be easily repeatable.

Jeff Moden SSC Guru Points: 1004312 More actions · Answer 3

jerry-621596 (1/13/2013)
SSIS would handle this very nicely and be easily repeatable.

Agreed. ACE drivers with OPENROWSET would also work well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Frank Cazabon Hall of Fame Points: 3467 More actions · Answer 4

Frank Cazabon

Hall of Fame

Points: 3467

January 14, 2013 at 4:47 am

#1576832

Thanks Dwain,

I'll try that out

Frank Cazabon Hall of Fame Points: 3467 More actions · Answer 5

Sorry, I should have said this was SQL Server 2008 R2 Express. I don't think SSIS is available.

Frank Cazabon Hall of Fame Points: 3467 More actions · Answer 6

Jeff Moden (1/13/2013)
jerry-621596 (1/13/2013)
SSIS would handle this very nicely and be easily repeatable.
Agreed. ACE drivers with OPENROWSET would also work well.

Thanks, I'll research that.