Load data from XML files

  • Hi,

    I've just inherited quite a few (1000s) XML files that I'd like to be able to analyze with my existing SQL database.

    I think it would probably be best to be able to load these files into my existing SQL Server database, but don't really know what a good process is for this? Is there an import/export wizard type process that could accomplish this without too much trouble?

    Thanks!

  • XML in sql server is not at all difficult part. Have some googling on 'XML in sql server'.

    R u using SS 2k5? there is a xml data type to store xml data as is in the database. though SS2k doesnt hv xml data type, but u can still store xml data and run query on it.

    Hope below links will be useful to you.

    http://msdn.microsoft.com/en-us/library/ms190936(SQL.90).aspx

    http://www.15seconds.com/issue/050803.htm

    http://www.sitepoint.com/article/data-as-xml-sql-server/

    http://www.developer.com/db/article.php/3531196

    http://windowsitpro.com/Articles/Index.cfm?ArticleID=45131&DisplayTab=Article

    🙂

    "Don't limit your challenges, challenge your limits"

  • Yes, it's quiet simple to handle XML in SQL Server since 2k5. If you are using SQL Server 2k the XML functionality is quiet limited. Here a little example:

    /*

    sample xml:

    <root>

    <elem>value1</elem>

    <elem>value2</elem>

    </root>

    */

    DECLARE @xml XML

    SELECT @xml = BulkColumn

    FROM OPENROWSET

    (

    BULK 'C:\Users\Flo\Temp\Test\test.xml', SINGLE_BLOB

    ) T

    SELECT @xml

    SELECT T.C.value('text()[1]', 'nvarchar(50)')

    FROM @xml.nodes('//root/elem') T(C)

    BUT

    It depends on your requirement if this is the right way. You say you have 1000s of files and have to validate the content against the database. If you have to do complex transformations you may consider to use SSIS for a previous conversion.

    Greets

    Flo

  • Sorry about the delay in this - I got distracted with a more urgent need and have only now had time to come back to this.

    For loading the XML files into my SQL database, I have used some of your hints and am able to load and parse the XML as I would like to... however, I can only seem to do it one at a time so far. And considering that I have 10000+ of these XML files, I need to get this to a streamlined process.

    I have an SQL query that loads the XML file as datatype XML, then parses it and loads it into the appropriate column. What I'm really looking for is a way to automate the insertion of the filename (which is on line 6 of the SQL query), and to have this iterate through many files one at a time.

    Any ideas that anyone has are much appreciated! Thanks!

    -------------------------------------

    Here's the SQL query:

    -- Create empty temp table

    CREATE TABLE #WorkingTable (data XML)

    -- Insert the XML data

    INSERT INTO #WorkingTable

    SELECT *

    FROM OPENROWSET (BULK 'C:\SampleXMLFiles\A_File1.xml', SINGLE_BLOB) AS data

    SELECT * FROM #WorkingTable

    DECLARE @XML AS XML

    , @hDoc AS INT

    , @Symbol AS VARCHAR(10)

    , @MyDate AS DATETIME

    -- Assigns XML data to variable

    SELECT @XML = data FROM #WorkingTable

    -- Parses XML using XML parser

    EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

    -- Organize the XML data into format desired, using cross join

    -- to have Symbol and MyDate on every row

    SELECT *

    INTO #TempTable

    FROM

    OPENXML(@hDoc, '/C', 1)

    WITH (Symbol VARCHAR(10) '@S'

    , MyDate DATETIME '@D' ) AS IDS

    CROSS JOIN

    OPENXML(@hDoc, '/C/T', 1)

    WITH (Number INT '@N'

    , Time TIME '@T'

    , P DECIMAL(12,4) '@P'

    , B DECIMAL(12,4) '@B'

    , A DECIMAL(12,4) '@A'

    , Shs INT '@S' ) AS Trades

    EXEC sp_xml_removedocument @hDoc

    -- Insert into database table in columns desired, using join to add another identifier, SymbolID

    INSERT INTO TestParseXMLData (SymbolID, Symbol, MyDate, N, Time, P, B, A, Shs)

    SELECT S.SymbolID, T.Symbol, T.MyDate, T.Number, T.Time, T.P, T.B, T.A, T.Shs

    FROM #TempTable T

    LEFT JOIN tblSymbol S ON S.Symbol = T.Symbol

    -- Display what's loaded into the SQL table

    SELECT * FROM TestParseXMLData

  • Let me be clear...

    The example code as above is for single xml file A_File1.xml, and you have 10000+ files such like, right? And you want to insert all files with the same routine, right?

    If so, then you can create stored procedure with for loop and having file name like A_File1.xml, A_File2.xml, A_File3.xml... or something similar........

    "Don't limit your challenges, challenge your limits"

  • search for OPENXML... it will probably be your final solution...

  • Instead analyzing the XML files and creating the scheme, If you convert teh XML file CSV file and load it in SQL Server and then analysize the data.

    To Convert XML to CSV:

    http://www.codeproject.com/KB/vbscript/xml2csv.aspx

    You can load CSV file into SQL Server using BULK INSERT.

  • Jonathan Mallia (4/8/2009)


    search for OPENXML... it will probably be your final solution...

    Hi Jonathan

    OPENXML is marked as deprecated in SQL Server 2005. It will be removed in future version. You should use the new XML functions of XML data type.

    Greets

    Flo

  • If you can write some .net code (C#/VB). Then you can loop through the folder containing the XML files and use the dataset.ReadXml(filename) to get it to a dataset.

    Then its easier for you to create the tables & data from the dataset.

  • Jonathan Mallia (4/8/2009)


    search for OPENXML... it will probably be your final solution...

    Don't use OPENXML! Use XQuery.

    See: http://www.sqlservercentral.com/blogs/michael_coles/archive/2008/01/20/stop-using-openxml-please.aspx


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • There were somany suggestion to the question :). Which one you are planning to choose!!!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply