September 27, 2010 at 7:40 pm
I have a source text file that for one type of record (inactive) has 25 fields, but for another type of record (active) has up to 36 fields.
I cannot seem to find a way eiether using bcp, BULK INSERT with or without a FILEFORMAT xml file to actually import this.
Any ideas on how I may be able to do this? The only way I can think of doing this is to load the whole record into a single field and parse it once it is in the database....
Thanks in advance.
Craig
September 27, 2010 at 8:20 pm
Look up on Books Online (free 'Help' that comes with the SQL Server) for OPENROWSET.. that will help you resolve this issue..
September 28, 2010 at 12:24 am
Thanks for your response. I have been hunting high and low, and through what I am assuming is the page you have referenced and there does not seem to be any mention of a situation where you have a source file that looks like the following:
ID|Name|Active|Price
00001|Test Product #1|Y|1.50
00002|Test Product #2|N
00003|Test Product #3|Y|23.45
00004|Some other name|N
with data.xml as follows:
<?xml version="1.0"?>
<BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="4"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="50"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="1"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="" MAX_LENGTH="10"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ID" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="2" NAME="Name" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="Active" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="Price" xsi:type="SQLMONEY"/>
</ROW>
</BCPFORMAT>
the following command
select * from openrowset(bulk 'c:\users\craig\desktop\test.dat', FIRSTROW=2, FORMATFILE='c:\users\craig\desktop\test.xml') as t
produces an error
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 3 (Active).
September 28, 2010 at 12:28 am
And to qualify - the following data file
ID|Name|Active|Price
00001|Test Product #1|Y|1.50
00002|Test Product #2|N|1
00003|Test Product #3|Y|23.45
00004|Some other name|N|1
works as expected!
Thanks....
September 28, 2010 at 12:33 am
craig 33233 (9/27/2010)
I have a source text file that for one type of record (inactive) has 25 fields, but for another type of record (active) has up to 36 fields
Ah, this seems tricky... if the number of fields per row is consistent in each files, with each file having varied number of fields, then OPENROWSET will work.. BUT , when there is inconsistent number of fields in each row for each files, then i cant think of a way to do it 🙁 ..
Lets for the bigwigs of SSC to wield their shields and come up with an approach..
A BULK INSERT of the all the rows as a single blob and then using any delimited-splitter will give you what you want, but that will take some coding...
September 28, 2010 at 5:57 am
ColdCoffee (9/28/2010)
craig 33233 (9/27/2010)
I have a source text file that for one type of record (inactive) has 25 fields, but for another type of record (active) has up to 36 fieldsAh, this seems tricky... if the number of fields per row is consistent in each files, with each file having varied number of fields, then OPENROWSET will work.. BUT , when there is inconsistent number of fields in each row for each files, then i cant think of a way to do it 🙁 ..
Lets for the bigwigs of SSC to wield their shields and come up with an approach..
A BULK INSERT of the all the rows as a single blob and then using any delimited-splitter will give you what you want, but that will take some coding...
Using a blob field would be quite inefficient here and would carry the extra overhead of having to split on the row terminators which is unnecessary.
Rather you could create a simple format file called BulkFile.fmt which has a single SQLCHAR 8000 column called BulkColumn. Alternatively if you don't want to use a format file, you could BULK INSERT into a single column staging table and modify the code accordingly.
;WITH cteTally (N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM master..syscolumns
)
,
cte AS
(
SELECT BulkColumn FROM OPENROWSET (BULK 'C:\users\craig\desktop\YourFile.txt', FORMATFILE = 'C:\users\craig\desktop\BulkFile.fmt', FIRSTROW = 2) AS Z
)
SELECT [1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31],[32],[33],[34],[35],[36]
FROM cte
CROSS APPLY
(
SELECT [1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31],[32],[33],[34],[35],[36]
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)),
SUBSTRING(BulkColumn + '|', N, CHARINDEX('|', BulkColumn + '|', N) - N)
FROM cteTally
WHERE N < LEN(BulkColumn) + 2 AND SUBSTRING('|' + BulkColumn + '|', N, 1) = '|'
) AS Z (ROW, VALUE)
PIVOT(MAX(Value) FOR ROW IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31],[32],[33],[34],[35],[36])) AS pvt
) AS Y
September 28, 2010 at 11:14 am
Thanks for the advice, but knowing that Microsoft thought it *inconceivable* (flashbacks to the Princess Bride!) that this could ever happen and that I'm not missing something I've decided to go with a different solution which may hopefully be easier to maintain.
I've explicitly defined the fields that I know will always be there in the xml, and then clumped the (optional) remainder into a single field (up to row terminator) and parse it out manually in the upsert (merge) if there is something there...
Cheers.
Craig
September 28, 2010 at 1:24 pm
OK then, another idea perhaps. How about this monster CROSS APPLY version based on some cool CROSS APPLY tricks 🙂 from Brad Schulz
IF NOT OBJECT_ID('tempdb.dbo.#stage', 'U') IS NULL DROP TABLE #stage
CREATE TABLE #stage (BulkColumn varchar(8000))
BULK INSERT #stage FROM 'C:\users\craig\desktop\YourFile.txt'
WITH
(
FIRSTROW = 2, TABLOCK
)
;WITH cte (BulkColumn) AS
(
SELECT BulkColumn + REPLICATE('|', 36) FROM #stage
)
SELECT
col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20, col21, col22, col23, col24, col25, col26, col27, col28, col29, col30, col31, col32, col33, col34, col35, col36
FROM cte
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn)) AS Z1 (p1)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z1.p1 + 1)) AS Z2 (p2)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z2.p2 + 1)) AS Z3 (p3)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z3.p3 + 1)) AS Z4 (p4)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z4.p4 + 1)) AS Z5 (p5)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z5.p5 + 1)) AS Z6 (p6)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z6.p6 + 1)) AS Z7 (p7)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z7.p7 + 1)) AS Z8 (p8)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z8.p8 + 1)) AS Z9 (p9)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z9.p9 + 1)) AS Z10 (p10)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z10.p10 + 1)) AS Z11 (p11)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z11.p11 + 1)) AS Z12 (p12)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z12.p12 + 1)) AS Z13 (p13)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z13.p13 + 1)) AS Z14 (p14)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z14.p14 + 1)) AS Z15 (p15)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z15.p15 + 1)) AS Z16 (p16)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z16.p16 + 1)) AS Z17 (p17)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z17.p17 + 1)) AS Z18 (p18)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z18.p18 + 1)) AS Z19 (p19)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z19.p19 + 1)) AS Z20 (p20)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z20.p20 + 1)) AS Z21 (p21)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z21.p21 + 1)) AS Z22 (p22)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z22.p22 + 1)) AS Z23 (p23)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z23.p23 + 1)) AS Z24 (p24)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z24.p24 + 1)) AS Z25 (p25)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z25.p25 + 1)) AS Z26 (p26)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z26.p26 + 1)) AS Z27 (p27)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z27.p27 + 1)) AS Z28 (p28)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z28.p28 + 1)) AS Z29 (p29)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z29.p29 + 1)) AS Z30 (p30)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z30.p30 + 1)) AS Z31 (p31)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z31.p31 + 1)) AS Z32 (p32)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z32.p32 + 1)) AS Z33 (p33)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z33.p33 + 1)) AS Z34 (p34)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z34.p34 + 1)) AS Z35 (p35)
CROSS APPLY (SELECT CHARINDEX('|', cte.BulkColumn, Z35.p35 + 1)) AS Z36 (p36)
CROSS APPLY (SELECT SUBSTRING(cte.BulkColumn, 1, Z1.p1 - 1)
,SUBSTRING(cte.BulkColumn, Z1.p1 + 1, Z2.p2 - Z1.p1 - 1)
,SUBSTRING(cte.BulkColumn, Z2.p2 + 1, Z3.p3 - Z2.p2 - 1)
,SUBSTRING(cte.BulkColumn, Z3.p3 + 1, Z4.p4 - Z3.p3 - 1)
,SUBSTRING(cte.BulkColumn, Z4.p4 + 1, Z5.p5 - Z4.p4 - 1)
,SUBSTRING(cte.BulkColumn, Z5.p5 + 1, Z6.p6 - Z5.p5 - 1)
,SUBSTRING(cte.BulkColumn, Z6.p6 + 1, Z7.p7 - Z6.p6 - 1)
,SUBSTRING(cte.BulkColumn, Z7.p7 + 1, Z8.p8 - Z7.p7 - 1)
,SUBSTRING(cte.BulkColumn, Z8.p8 + 1, Z9.p9 - Z8.p8 - 1)
,SUBSTRING(cte.BulkColumn, Z9.p9 + 1, Z10.p10 - Z9.p9 - 1)
,SUBSTRING(cte.BulkColumn, Z10.p10 + 1, Z11.p11 - Z10.p10 - 1)
,SUBSTRING(cte.BulkColumn, Z11.p11 + 1, Z12.p12 - Z11.p11 - 1)
,SUBSTRING(cte.BulkColumn, Z12.p12 + 1, Z13.p13 - Z12.p12 - 1)
,SUBSTRING(cte.BulkColumn, Z13.p13 + 1, Z14.p14 - Z13.p13 - 1)
,SUBSTRING(cte.BulkColumn, Z14.p14 + 1, Z15.p15 - Z14.p14 - 1)
,SUBSTRING(cte.BulkColumn, Z15.p15 + 1, Z16.p16 - Z15.p15 - 1)
,SUBSTRING(cte.BulkColumn, Z16.p16 + 1, Z17.p17 - Z16.p16 - 1)
,SUBSTRING(cte.BulkColumn, Z17.p17 + 1, Z18.p18 - Z17.p17 - 1)
,SUBSTRING(cte.BulkColumn, Z18.p18 + 1, Z19.p19 - Z18.p18 - 1)
,SUBSTRING(cte.BulkColumn, Z19.p19 + 1, Z20.p20 - Z19.p19 - 1)
,SUBSTRING(cte.BulkColumn, Z20.p20 + 1, Z21.p21 - Z20.p20 - 1)
,SUBSTRING(cte.BulkColumn, Z21.p21 + 1, Z22.p22 - Z21.p21 - 1)
,SUBSTRING(cte.BulkColumn, Z22.p22 + 1, Z23.p23 - Z22.p22 - 1)
,SUBSTRING(cte.BulkColumn, Z23.p23 + 1, Z24.p24 - Z23.p23 - 1)
,SUBSTRING(cte.BulkColumn, Z24.p24 + 1, Z25.p25 - Z24.p24 - 1)
,SUBSTRING(cte.BulkColumn, Z25.p25 + 1, Z26.p26 - Z25.p25 - 1)
,SUBSTRING(cte.BulkColumn, Z26.p26 + 1, Z27.p27 - Z26.p26 - 1)
,SUBSTRING(cte.BulkColumn, Z27.p27 + 1, Z28.p28 - Z27.p27 - 1)
,SUBSTRING(cte.BulkColumn, Z28.p28 + 1, Z29.p29 - Z28.p28 - 1)
,SUBSTRING(cte.BulkColumn, Z29.p29 + 1, Z30.p30 - Z29.p29 - 1)
,SUBSTRING(cte.BulkColumn, Z30.p30 + 1, Z31.p31 - Z30.p30 - 1)
,SUBSTRING(cte.BulkColumn, Z31.p31 + 1, Z32.p32 - Z31.p31 - 1)
,SUBSTRING(cte.BulkColumn, Z32.p32 + 1, Z33.p33 - Z32.p32 - 1)
,SUBSTRING(cte.BulkColumn, Z33.p33 + 1, Z34.p34 - Z33.p33 - 1)
,SUBSTRING(cte.BulkColumn, Z34.p34 + 1, Z35.p35 - Z34.p34 - 1)
,SUBSTRING(cte.BulkColumn, Z35.p35 + 1, Z36.p36 - Z35.p35 - 1)
) AS ZSplit (col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20, col21, col22, col23, col24, col25, col26, col27, col28, col29, col30, col31, col32, col33, col34, col35, col36)
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply