Dynamically Import Multiple XML files (SQL2K5)

  • Hello All:

    I'm just now dipping my toe into the waters of XML. For the time being I've been playing with bulkload going through some of the the examples in http://msdn.microsoft.com/en-us/library/ms171806.aspx I tried out some of the sample code and it works fine. However when I try to apply it to the xml files I have, I don't get so far. I can verify the xml files, but then I get errors. Starting with "schema: relationship expected on General".

    Most of all I'm not sure if I am taking the right approach. Should I be using bulkload or another approach. This process needs to be automated as much as possible and I ultimately want the data on SQL2K5. The files definitely come into excel 2007 with ease. Access 2007 is another story. Basically, I keep searching for the best approach and end up going in different directions.

    Ultimately, I will have several XML files in a directory that I would like to loop through and append the data to one table. Below is an example of what the xml files look like(Several with hundreds of records). The only data I really need is the CaseDetail attributes.

    <?xml version="1.0" encoding="UTF-8"?>

    <summary version="3">

    <General>

    <ComparisonDate>2010-04-08T14:29:30Z</ComparisonDate>

    <ComparisonEndDate>2010-04-08T15:07:03Z</ComparisonEndDate>

    <Baseline>C:\<!--commented out--></Baseline>

    <Testset>C:\<!--commented out--></Testset>

    <Results>C:\<!--commented out--></Results>

    <NrCasesCompared>239</NrCasesCompared>

    <NrWarningCases>0</NrWarningCases>

    <NrErrorCases>0</NrErrorCases>

    <UserName><!--commented out--></UserName>

    <RunPurpose><!--commented out--></RunPurpose>

    <CalcSysVersion><!--commented out--></CalcSysVersion>

    </General>

    <CaseDetail caseName="107-001.xml" result="0" status="4"/>

    <CaseDetail caseName="107-002.xml" result="0" status="4"/>

    <CaseDetail caseName="107-003.xml" result="0" status="4"/>

    <CaseDetail caseName="107-004.xml" result="0" status="4"/>

    <CaseDetail caseName="107-005.xml" result="0" status="4"/>

    <CaseDetail caseName="107-006.xml" result="0" status="4"/>

    <CaseDetail caseName="107-007.xml" result="0" status="4"/>

    <CaseDetail caseName="107-008.xml" result="0" status="4"/>

    <CaseDetail caseName="107-009.xml" result="0" status="4"/>

    <CaseDetail caseName="107-010.xml" result="0" status="4"/>

    </summary>

    Overall I'm finding it difficult to get good examples on how to approach this. I'm hoping that someone has had a similar problem to solve or can at least shove me in the right direction.

    Pleasee let me know if I can provide additonal information.

    Thanks in advance for any help you can provide.

  • One of the most interesting methods to connect to a list of files in a folder with (almost) pure SQL has been posted by Lowell here .

    Once you have that list of XML files you could import it into a table with an ID and a column of XML data type (maybe you can query the xml files directly but I never tried). Next step would be to add an XML index (if needed).

    Use XQuery to shred the data into your table.

    A useful link regarding XML stuff might be Jacob Sebastian's XQuery collection



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Lutz:

    Wow! This is some idea, Thanks. Do you think this approach can be used with xml files. I must admit I'm a little intimidated by the whole process. Since Xquery and using a a folder as linked server is new to me :w00t:

    I will give it a shot. Thanks.

    I'm also open to any other potential solutions.

  • I have a similar situation and put together this solution, I think it might help you out. You would just need to modify the layout of the xml in my example and you should be able to pull the data straight into your SQL tables.

    CREATE PROCEDURE [dbo].[ErrorResponse]

    @FileName varchar(255)

    AS

    --DECLARE @FileName varchar(255)

    DECLARE @ExecCmd VARCHAR(255)

    DECLARE @FileContents VARCHAR(max)

    DECLARE @TempXml TABLE(ThisLine varchar(255))

    DECLARE @ResponseHandle INT

    --SET @FileName = 'c:\transfer\errors.xml'

    -- Create a table out of the datafile

    SET @ExecCmd = 'type ' + @FileName

    SET @FileContents = ''

    INSERT INTO @TempXML EXEC master.dbo.xp_cmdshell @ExecCmd

    SELECT @FileContents = @FileContents + ISNULL(Thisline, '') FROM @TempXml

    SELECT CONVERT(xml, @FileContents) as FileContents

    -- Create a file handle to point to the xml file

    EXEC sp_Xml_prepareDocument @ResponseHandle OUTPUT, @FileContents

    -- Create a temp results table

    SELECT * INTO #TempResults

    FROM OPENXML(@ResponseHandle, '/data/return/errors/error')

    WITH

    (

    PartnerIdint'//data/transmitting_party',

    ErrorCountint '//data/return/error_count',

    InputIdint'input_id',

    ErrorIdint 'error_id',

    ErrorMsgvarchar(255) 'error_message'

    )

    --Clear the file handle (it's a resource hog)

    EXEC sp_Xml_RemoveDocument @ResponseHandle

    INSERT INTO dbo.ResponseErrors

    (ResponsefileName, PartnerId, ErrorCount, InputId, ErrorId, ErrorMsg)

    SELECT @FileName, PartnerId, ErrorCount, InputId, ErrorId, ErrorMsg

    FROM #TempResults

    WHERE ErrorCount <> 0

    DROP TABLE #TempResults

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply