What''s the best way to generate an XML data file from a table?

  • Hello all,

    I'm trying to generate an XML archive of old history data to speed up report rendering (report engine transforms all data to XML before render) -- the report engine will UNION the archived XML with the current year T-SQL query, so I just need to figure out the best way to create the most efficient XML data file possible using T-SQL. The following query returns ~1.35M rows in ~1.56GB XML file:

    SELECT AccountNumber,TenantNumber,SequenceNumber,RecordType,Date,Reference,Code,ServiceCode,RateCode,

           MeterNumber,Amount,BudgetBillAmount,Reading,DemandReading,Usage,DemandUsage,ServiceSequence,ReasonCode

    FROM UBAccountHistory

    WHERE Date < '01/01/2007'

    FOR XML AUTO, ELEMENTS XSINIL

    But it returns a file that includes the namespace in every "UBAccountHistory" tag...like so:

    <UBAccountHistory xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><AccountNumber>9999999</AccountNumber>

     <TenantNumber>98</TenantNumber>

     <SequenceNumber>56</SequenceNumber>

     <RecordType>5</RecordType>

     <Date>2007-01-12T00:00:00</Date>

     <Reference>0</Reference>

     <Code>0</Code>

     <ServiceCode>WA</ServiceCode>

     <RateCode>W41</RateCode>

     <MeterNumber>1234567</MeterNumber>

     <Amount>0.00</Amount>

     <BudgetBillAmount>0.00</BudgetBillAmount>

     <Reading>0</Reading>

     <DemandReading>0.0000</DemandReading>

     <Usage>0</Usage>

     <DemandUsage>0.0000</DemandUsage>

     <ServiceSequence>0</ServiceSequence>

     <ReasonCode>0</ReasonCode>

    </UBAccountHistory>

    It returns one of these per row, and obviously this is a format that is invalid XML, as i've got to add a parent wrapper and move the namespace to it/remove it from each record. Is there a way to generate the file using the:

    <UBAccountHistory AccountNumber="99999" TenantNumber="90" .... />

    format? This will chop the filesize and improve the report performance significantly.

    Thanks in advance!

    Mike

  • You probably want to read up on using XML PATH.  This allow you to specify a root node, and will give you a little more control over how the XMl will look.

    Even without that - if you simply remove the ELEMENTS predicate, the fields will be passed as attributes instead. (meaning - they'll flatten out into a <UBAccountHistory AccountNumber="99999" TenantNumber="90" .... /> layout)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I do believe it is because you are declaring the Elements keyword in your For XML Clause.

    from BOL

    ELEMENTS

    Specifies that the columns are returned as subelements. Otherwise, they are mapped to XML attributes. This option is supported in RAW, AUTO and PATH modes only. For more information, see RAW Mode Enhancements.

    use AdventureWorks

    select ContactID, FirstName, LastName, Title

    from Person.Contact

    for XML Auto

    Results

    <

    Person.Contact ContactID="1" FirstName="Gustavo" LastName="Achong" Title="Mr." />

    <Person.Contact ContactID="2" FirstName="Catherine" LastName="Abel" Title="Ms." />

    select ContactID, FirstName, LastName, Title

    from Person.Contact

    for XML Auto, ELEMENTS

    Results

    <Person.Contact>

      <ContactID>1</ContactID>

      <FirstName>Gustavo</FirstName>

      <LastName>Achong</LastName>

      <Title>Mr.</Title>

    </Person.Contact>

    <Person.Contact>

      <ContactID>2</ContactID>

      <FirstName>Catherine</FirstName>

      <LastName>Abel</LastName>

      <Title>Ms.</Title>

    </Person.Contact>

    <Person.Contact>

    Something Like

    select ContactID as '@ContactID', FirstName as '@FirstName', LastName as '@LastName', Title as '@Title'

    from Person.Contact

    for XML Path, root ('Contacts')

    Results

    <Contacts>

      <row ContactID="1" FirstName="Gustavo" LastName="Achong" Title="Mr." />

      <row ContactID="2" FirstName="Catherine" LastName="Abel" Title="Ms." />

    </Contacts>

  • Thanks for the replies! That looks like almost exactly what I need...one issue i'm having now...

    When I run the query and I save the results to a file (call it test.xml) -- the file shows random CR+NL characters in the MIDDLE of some of the rows!

    I imagine this could have something to do with trying to save directly from the results pane, but maybe not. Is there a way to build a file directly as XML without having to save it and change the extension?

    I am trying again now using the "results to file" but the default file extension for this is .rpt -- which is obviously not what i'm looking for here.

    Thanks again!

    Mike

  • anybody knows how do to handle extended ASCII (ë, é ...) with this method to create XML?

     

    thx in advance,

     

    wim

  • Zippy -

    look up the BCP utility.  It's a command-line util which will allow you to save the contents of a SQL execution.  There are a bunch of options you might or might not want to set.  Just set the field and row delimiters to something XML doesn't care about (like " ").

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Well, I tried bcp, but couldn't get it to generate valid XML. I did RTFM, but I guess I was missing something (i'm guessing within that format mess) -- HOWEVER, I did get it to work!

    First, I created a script file:

    filename is "xmlBuild.sql"

    :XML ON

    SELECT AccountNumber,TenantNumber,SequenceNumber,RecordType,Date,Reference,Code,ServiceCode,RateCode,

           MeterNumber,Amount,BudgetBillAmount,Reading,DemandReading,Usage,DemandUsage,ServiceSequence,ReasonCode

    FROM Logics.dbo.UBAccountHistory as dt

    WHERE Date > '01/01/2007'

    FOR XML AUTO, ROOT('dtAnalysisGrid')

    Then I used SQLCMD with some options to run it...

    sqlcmd -Smyserver -dmyDB -E -i"xmlbuild.sql" -r1 -h-1 -o"results.xml"

    Now my only problem was when trying to view it, I ran out of disk space

    But it is loaded successfully in to my report! PERFECT!

    Thanks again for all of everyone's help!

    Mike

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply