March 27, 2009 at 3:49 pm
Hi,
I've just inherited quite a few (1000s) XML files that I'd like to be able to analyze with my existing SQL database.
I think it would probably be best to be able to load these files into my existing SQL Server database, but don't really know what a good process is for this? Is there an import/export wizard type process that could accomplish this without too much trouble?
Thanks!
March 27, 2009 at 11:49 pm
XML in sql server is not at all difficult part. Have some googling on 'XML in sql server'.
R u using SS 2k5? there is a xml data type to store xml data as is in the database. though SS2k doesnt hv xml data type, but u can still store xml data and run query on it.
Hope below links will be useful to you.
http://msdn.microsoft.com/en-us/library/ms190936(SQL.90).aspx
http://www.15seconds.com/issue/050803.htm
http://www.sitepoint.com/article/data-as-xml-sql-server/
http://www.developer.com/db/article.php/3531196
http://windowsitpro.com/Articles/Index.cfm?ArticleID=45131&DisplayTab=Article
🙂
"Don't limit your challenges, challenge your limits"
March 28, 2009 at 5:59 am
Yes, it's quiet simple to handle XML in SQL Server since 2k5. If you are using SQL Server 2k the XML functionality is quiet limited. Here a little example:
/*
sample xml:
<root>
<elem>value1</elem>
<elem>value2</elem>
</root>
*/
DECLARE @xml XML
SELECT @xml = BulkColumn
FROM OPENROWSET
(
BULK 'C:\Users\Flo\Temp\Test\test.xml', SINGLE_BLOB
) T
SELECT @xml
SELECT T.C.value('text()[1]', 'nvarchar(50)')
FROM @xml.nodes('//root/elem') T(C)
BUT
It depends on your requirement if this is the right way. You say you have 1000s of files and have to validate the content against the database. If you have to do complex transformations you may consider to use SSIS for a previous conversion.
Greets
Flo
April 6, 2009 at 12:20 pm
Sorry about the delay in this - I got distracted with a more urgent need and have only now had time to come back to this.
For loading the XML files into my SQL database, I have used some of your hints and am able to load and parse the XML as I would like to... however, I can only seem to do it one at a time so far. And considering that I have 10000+ of these XML files, I need to get this to a streamlined process.
I have an SQL query that loads the XML file as datatype XML, then parses it and loads it into the appropriate column. What I'm really looking for is a way to automate the insertion of the filename (which is on line 6 of the SQL query), and to have this iterate through many files one at a time.
Any ideas that anyone has are much appreciated! Thanks!
-------------------------------------
Here's the SQL query:
-- Create empty temp table
CREATE TABLE #WorkingTable (data XML)
-- Insert the XML data
INSERT INTO #WorkingTable
SELECT *
FROM OPENROWSET (BULK 'C:\SampleXMLFiles\A_File1.xml', SINGLE_BLOB) AS data
SELECT * FROM #WorkingTable
DECLARE @XML AS XML
, @hDoc AS INT
, @Symbol AS VARCHAR(10)
, @MyDate AS DATETIME
-- Assigns XML data to variable
SELECT @XML = data FROM #WorkingTable
-- Parses XML using XML parser
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML
-- Organize the XML data into format desired, using cross join
-- to have Symbol and MyDate on every row
SELECT *
INTO #TempTable
FROM
OPENXML(@hDoc, '/C', 1)
WITH (Symbol VARCHAR(10) '@S'
, MyDate DATETIME '@D' ) AS IDS
CROSS JOIN
OPENXML(@hDoc, '/C/T', 1)
WITH (Number INT '@N'
, Time TIME '@T'
, P DECIMAL(12,4) '@P'
, B DECIMAL(12,4) '@B'
, A DECIMAL(12,4) '@A'
, Shs INT '@S' ) AS Trades
EXEC sp_xml_removedocument @hDoc
-- Insert into database table in columns desired, using join to add another identifier, SymbolID
INSERT INTO TestParseXMLData (SymbolID, Symbol, MyDate, N, Time, P, B, A, Shs)
SELECT S.SymbolID, T.Symbol, T.MyDate, T.Number, T.Time, T.P, T.B, T.A, T.Shs
FROM #TempTable T
LEFT JOIN tblSymbol S ON S.Symbol = T.Symbol
-- Display what's loaded into the SQL table
SELECT * FROM TestParseXMLData
April 6, 2009 at 10:44 pm
Let me be clear...
The example code as above is for single xml file A_File1.xml, and you have 10000+ files such like, right? And you want to insert all files with the same routine, right?
If so, then you can create stored procedure with for loop and having file name like A_File1.xml, A_File2.xml, A_File3.xml... or something similar........
"Don't limit your challenges, challenge your limits"
April 8, 2009 at 5:10 am
search for OPENXML... it will probably be your final solution...
April 8, 2009 at 8:24 am
Instead analyzing the XML files and creating the scheme, If you convert teh XML file CSV file and load it in SQL Server and then analysize the data.
To Convert XML to CSV:
http://www.codeproject.com/KB/vbscript/xml2csv.aspx
You can load CSV file into SQL Server using BULK INSERT.
April 8, 2009 at 8:28 am
Jonathan Mallia (4/8/2009)
search for OPENXML... it will probably be your final solution...
Hi Jonathan
OPENXML is marked as deprecated in SQL Server 2005. It will be removed in future version. You should use the new XML functions of XML data type.
Greets
Flo
April 9, 2009 at 8:03 am
If you can write some .net code (C#/VB). Then you can loop through the folder containing the XML files and use the dataset.ReadXml(filename) to get it to a dataset.
Then its easier for you to create the tables & data from the dataset.
April 10, 2009 at 8:41 am
Jonathan Mallia (4/8/2009)
search for OPENXML... it will probably be your final solution...
Don't use OPENXML! Use XQuery.
April 10, 2009 at 9:51 am
There were somany suggestion to the question :). Which one you are planning to choose!!!
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply