Another XML explicit problem

  • I'm sure the answer to this is out there and I've probably read over it but not able to make sense of it.

    Time to just ask the pros what is probably a really simple question.

    We are building an ETL project that is going to take hundreds of legacy data fields and place them into XML fields in a new SQL server database. At the same time we are trying to eliminate duplicate records from the database.

    So the table format is like:

    IDField varchar(64),

    EffectiveDate datetime,

    XData xml

    The IDField and EffectiveDate fields are already populated so we are just trying to generate XML objects within SQL to update the XData field for each record.

    I have been working with this query trying to generate just one XML record

    select 1 as Tag

    , null as Parent

    , IDField as [endo!1!IDField]

    , effectiveDate as [endo!1!EffectiveDate]

    , null as [form!2!FormApplies!Element]

    , null as [form!2!Form_ED_date!Element]

    , null as [form!2!Form_Last_Byte!Element]

    , null as [form!2!Form_Nbr_Fields!Element]

    , null as [form!2!Form_Prem!Element]

    , null as [form!2!Form_State!Element]

    , null as [form!2!Form_Var_Area!Element]

    , null as [form!2!Form_Variable!Element]

    , null as [form!2!Form_Var_Length!Element]

    , null as [form!2!Form_Var_Occurs!Element]

    , null as [form!2!Form_Version!Element]

    from Stage_Umbrellas

    where IDField = '002~ ~WPX ~390146~0~2002062~005' and effectiveDate = '4/3/2002'

    UNION ALL

    select 2

    , 1

    , IDField

    , effectiveDate

    , ltrim(rtrim(Form_Applies)) form_applies

    , ltrim(rtrim(Form_Ed_Date)) form_ed_date

    , ltrim(rtrim(form_last_byte)) form_last_byte

    , ltrim(rtrim(form_nbr_fields)) form_nbr_fields

    , ltrim(rtrim(form_prem)) form_prem

    , ltrim(rtrim(form_state)) form_state

    , ltrim(rtrim(form_var_area)) form_var_area

    , ltrim(rtrim(form_variable)) form_variable

    , ltrim(rtrim(form_var_length)) form_var_length

    , ltrim(rtrim(form_var_occurs)) form_var_occurs

    , ltrim(rtrim(form_version)) form_version

    from Stage_Umbrellas umbrel

    where IDField = '002~ ~WPX ~390146~0~2002062~005' and effectiveDate = '4/3/2002'

    order by [endo!1!IDField], [endo!1!EffectiveDate]

    for XML EXPLICIT

    The second half of the query works fine but the first part generates:

    <endo IDField="002~ ~WPX ~390146~0~2002062~005 " EffectiveDate="2002-04-03T00:00:00" />

    <endo IDField="002~ ~WPX ~390146~0~2002062~005 " EffectiveDate="2002-04-03T00:00:00" />

    <endo IDField="002~ ~WPX ~390146~0~2002062~005 " EffectiveDate="2002-04-03T00:00:00" />

    <endo IDField="002~ ~WPX ~390146~0~2002062~005 " EffectiveDate="2002-04-03T00:00:00">

    How do I get the query to only return 1 root element for the parent as in:

    <endo IDField="002~ ~WPX ~390146~0~2002062~005 " EffectiveDate="2002-04-03T00:00:00">

    Thank you in advance for your help

  • Quick suggestion, use FOR XML PATH rather than EXPLICIT

    😎

  • kenksoftware (12/28/2015)


    We are building an ETL project that is going to take hundreds of legacy data fields and place them into XML fields in a new SQL server database

    Have you folks actually and seriously studied and done a real life proof of principle and performance tests of the down-line impact of what's going to happen after you do such a thing?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • we are not even sure if these fields are ever going to be used in the future and there are hundreds of them which is why I'm stuffing them into XML fields. This project is already a year or so behind, so its quicker to just add one XML with 40 fields in it than to add 40 fields to the target table and then update all of the SSIS packages. So no, I'm not doing any long-term study here, I'm just being paid to complete a process they already paid a consulting agency for 2 years ago.

  • kenksoftware (12/28/2015)


    we are not even sure if these fields are ever going to be used in the future and there are hundreds of them which is why I'm stuffing them into XML fields. This project is already a year or so behind, so its quicker to just add one XML with 40 fields in it than to add 40 fields to the target table and then update all of the SSIS packages. So no, I'm not doing any long-term study here, I'm just being paid to complete a process they already paid a consulting agency for 2 years ago.

    Whether you're a consult or an employee, have you informed the powers that be how this will seriously increase disk space requirements, how that will affect backup and restore requirements (tape size and time), and how it will affect index maintenance? And look at all the regression testing you're going to have to do after you "upgrade all of the SSIS packages".

    My apologies for being explicit but this conversion is one of the worst ideas ever.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think Jeff has a pint point here with the anti XML and SSIS garlic, keep it around your neck and you will stay healthy:-D

    😎

    Something tells me that things have gone very wrong in this particular project, 12+ months overrun, missing deadline Douglas Adams style etc.. Must say that my thought is that you have to take the path of least losses and fewest bridges burned from this one, the sooner the better from what I understand.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply