Node.value and insert the result to an existing table

  • I do not want the operation to be terminated because a row in the middle was not been able to be inserted. but that row goes to a log table. I have a unique constraint on email and we can not disable it.

    insert into SP.UserTrial

    select * from

    (SELECT TempXML.Node.value('(email)[1]','nvarchar(50)') as Email,' ' as Pass,

    TempXML.Node.value('(name/n/given)[1]', 'nVARCHAR(50)') as FirstName,

    TempXML.Node.value('(name/n/family)[1]', 'VARCHAR(50)') as LastName,

    TempXML.Node.value('(name/fn)[1]','nvarchar(50)') as DisplayName,' ' as [Profile],

    TempXML.Node.value('(email)[1]','nvarchar(50)') as DisplayEmail,' ' as CellPhone, 1 as UpdatedBy,GETDATE() as UpdateDate, 0 as deleted

    FROM @PersonXML.nodes('/enterprise/person') TempXML (Node)) as a

    How can I do that

    Thanks

  • Store the shredded xml data into a relational (temp) staging table.

    Use this table and check for existing emails within your insert statement.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g

    Thanks

  • Would you please provide some sample data to play with?

    Basically, a sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • sohairzaki (5/8/2010)


    Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g

    Thanks

    heh... as a side bar, if performance were really important, you wouldn't be accepting XML data. 😀 Instead, you'd hit the vendor up for a nice fast tab delimited file. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • <?xml version="1.0" encoding="UTF-8"?>

    <enterprise>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>John Jack</fn>

    <n>

    <family>John</family>

    <given>Jack</given>

    </n>

    </name>

    <email>JohnJack@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Mary Luiz</fn>

    <n>

    <family>Luiz</family>

    <given>Mary</given>

    </n>

    </name>

    <email>MayLuiz@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Hani Safwat</fn>

    <n>

    <family>Safwat</family>

    <given>Hani</given>

    </n>

    </name>

    <email>HaniSafwat@xyz.com</email>

    </person>

    Thanks

    I have a duplicate email intentially.

  • sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.

  • Thanks

    here is my target table structure

    UserIDintUnchecked

    Emailnvarchar(50)Unchecked

    Passnvarchar(20)Unchecked

    FirstNamenvarchar(20)Checked

    LastNamenvarchar(40)Checked

    DisplayNamenvarchar(50)Checked

    Profilenvarchar(MAX)Checked

    DisplayEmailnvarchar(50)Checked

    CellPhonenvarchar(20)Checked

    UpdatedByintChecked

    UpdateDatedatetimeUnchecked

    DeletedbitUnchecked

    where email is unique

    Thanks again

  • Ok, here's how I'd do it:

    DECLARE @tbl TABLE

    (

    UserID INT ,

    Email NVARCHAR(50) ,

    Pass NVARCHAR(20) ,

    FirstName NVARCHAR(20) ,

    LastName NVARCHAR(40) ,

    DisplayName NVARCHAR(50) ,

    PROFILE NVARCHAR(MAX) ,

    DisplayEmail NVARCHAR(50) ,

    CellPhone NVARCHAR(20) ,

    UpdatedBy INT ,

    UpdateDate DATETIME ,

    Deleted BIT

    )

    DECLARE @xml XML

    SET @xml ='<?xml version="1.0" encoding="UTF-8"?>

    <enterprise>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>John Jack</fn>

    <n>

    <family>John</family>

    <given>Jack</given>

    </n>

    </name>

    <email>JohnJack@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Mary Luiz</fn>

    <n>

    <family>Luiz</family>

    <given>Mary</given>

    </n>

    </name>

    <email>MayLuiz@xyz.com</email>

    </person>

    </enterprise>

    '

    SELECT

    c.value('email[1]','varchar(50)') AS email,

    v.value('fn[1]','varchar(20)') AS DisplayName,

    x.value('given[1]','varchar(40)') AS FirstName,

    x.value('family[1]','varchar(50)') AS LastName

    INTO #mails

    FROM @xml.nodes('enterprise/person') T(c)

    CROSS APPLY

    T.c.nodes('name') U(v)

    CROSS APPLY

    U.v.nodes('n') W(x)

    CREATE CLUSTERED INDEX cx_#mails ON #mails(email)

    INSERT INTO @tbl (Email, FirstName, LastName, DisplayName)

    SELECT email, FirstName, LastName, DisplayName

    FROM #mails

    WHERE NOT EXISTS (SELECT 1 FROM @tbl t WHERE t.Email = #mails.email)

    GROUP BY email, DisplayName, FirstName, LastName

    SELECT *

    FROM @tbl

    DROP TABLE #mails

    A few other issues you might want to check:

    If the email column does really allow only 50 chars you might want to expand it.

    If DisplayName usually is FirstName + LastName you should change the columns so they will add up (20+40=60).

    If the content of PROFILE cannot exceed 4000 char, you should prefer NVARCHAR(4000) over (MAX).

    And finally, you need to cover the following scenarios:

    a) different data in the xml file for a given email address (e.g. differen CellPhone)

    b) How to deal with emails, that are stored in the final table but marked as deleted a.s.o.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply