May 7, 2010 at 8:20 pm
I do not want the operation to be terminated because a row in the middle was not been able to be inserted. but that row goes to a log table. I have a unique constraint on email and we can not disable it.
insert into SP.UserTrial
select * from
(SELECT TempXML.Node.value('(email)[1]','nvarchar(50)') as Email,' ' as Pass,
TempXML.Node.value('(name/n/given)[1]', 'nVARCHAR(50)') as FirstName,
TempXML.Node.value('(name/n/family)[1]', 'VARCHAR(50)') as LastName,
TempXML.Node.value('(name/fn)[1]','nvarchar(50)') as DisplayName,' ' as [Profile],
TempXML.Node.value('(email)[1]','nvarchar(50)') as DisplayEmail,' ' as CellPhone, 1 as UpdatedBy,GETDATE() as UpdateDate, 0 as deleted
FROM @PersonXML.nodes('/enterprise/person') TempXML (Node)) as a
How can I do that
Thanks
May 8, 2010 at 5:12 am
Store the shredded xml data into a relational (temp) staging table.
Use this table and check for existing emails within your insert statement.
May 8, 2010 at 11:50 am
Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g
Thanks
May 8, 2010 at 12:58 pm
Would you please provide some sample data to play with?
Basically, a sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.
May 8, 2010 at 5:19 pm
sohairzaki (5/8/2010)
Can you explain more how to shredded down. Performance is very important because the xml files are of size 1gThanks
heh... as a side bar, if performance were really important, you wouldn't be accepting XML data. 😀 Instead, you'd hit the vendor up for a nice fast tab delimited file. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
May 9, 2010 at 7:55 pm
<?xml version="1.0" encoding="UTF-8"?>
<enterprise>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>John Jack</fn>
<n>
<family>John</family>
<given>Jack</given>
</n>
</name>
<email>JohnJack@xyz.com</email>
</person>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>Mary Luiz</fn>
<n>
<family>Luiz</family>
<given>Mary</given>
</n>
</name>
<email>MayLuiz@xyz.com</email>
</person>
<person>
<name>
<fn>Hani Safwat</fn>
<n>
<family>Safwat</family>
<given>Hani</given>
</n>
</name>
<email>HaniSafwat@xyz.com</email>
</person>
Thanks
I have a duplicate email intentially.
May 10, 2010 at 12:50 am
sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.
May 10, 2010 at 10:58 am
Thanks
here is my target table structure
UserIDintUnchecked
Emailnvarchar(50)Unchecked
Passnvarchar(20)Unchecked
FirstNamenvarchar(20)Checked
LastNamenvarchar(40)Checked
DisplayNamenvarchar(50)Checked
Profilenvarchar(MAX)Checked
DisplayEmailnvarchar(50)Checked
CellPhonenvarchar(20)Checked
UpdatedByintChecked
UpdateDatedatetimeUnchecked
DeletedbitUnchecked
where email is unique
Thanks again
May 10, 2010 at 1:19 pm
Ok, here's how I'd do it:
DECLARE @tbl TABLE
(
UserID INT ,
Email NVARCHAR(50) ,
Pass NVARCHAR(20) ,
FirstName NVARCHAR(20) ,
LastName NVARCHAR(40) ,
DisplayName NVARCHAR(50) ,
PROFILE NVARCHAR(MAX) ,
DisplayEmail NVARCHAR(50) ,
CellPhone NVARCHAR(20) ,
UpdatedBy INT ,
UpdateDate DATETIME ,
Deleted BIT
)
DECLARE @xml XML
SET @xml ='<?xml version="1.0" encoding="UTF-8"?>
<enterprise>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>John Jack</fn>
<n>
<family>John</family>
<given>Jack</given>
</n>
</name>
<email>JohnJack@xyz.com</email>
</person>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>Mary Luiz</fn>
<n>
<family>Luiz</family>
<given>Mary</given>
</n>
</name>
<email>MayLuiz@xyz.com</email>
</person>
</enterprise>
'
SELECT
c.value('email[1]','varchar(50)') AS email,
v.value('fn[1]','varchar(20)') AS DisplayName,
x.value('given[1]','varchar(40)') AS FirstName,
x.value('family[1]','varchar(50)') AS LastName
INTO #mails
FROM @xml.nodes('enterprise/person') T(c)
CROSS APPLY
T.c.nodes('name') U(v)
CROSS APPLY
U.v.nodes('n') W(x)
CREATE CLUSTERED INDEX cx_#mails ON #mails(email)
INSERT INTO @tbl (Email, FirstName, LastName, DisplayName)
SELECT email, FirstName, LastName, DisplayName
FROM #mails
WHERE NOT EXISTS (SELECT 1 FROM @tbl t WHERE t.Email = #mails.email)
GROUP BY email, DisplayName, FirstName, LastName
SELECT *
FROM @tbl
DROP TABLE #mails
A few other issues you might want to check:
If the email column does really allow only 50 chars you might want to expand it.
If DisplayName usually is FirstName + LastName you should change the columns so they will add up (20+40=60).
If the content of PROFILE cannot exceed 4000 char, you should prefer NVARCHAR(4000) over (MAX).
And finally, you need to cover the following scenarios:
a) different data in the xml file for a given email address (e.g. differen CellPhone)
b) How to deal with emails, that are stored in the final table but marked as deleted a.s.o.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply