August 1, 2014 at 3:18 am
I am importing xml multiple times a day from a vendor. However when SSIS created the ID's for nested XML data it is not unique. So importing the first time and I get 3-4 records it looks fine. However subsequent imports all use the same ID's so it isn't unique, how do I go about changing this as I cant find anything about it.
I can provide more info if needed, and searching on google bring lots of results but no clear answers.
Any help is appreciated!
August 1, 2014 at 4:08 am
What method are you using to import the xml (within SSIS)? How are you calculating and assigning these IDs?
August 1, 2014 at 4:11 am
I dont generate them. I am using the XML source and it generates all the child ID's automatically. However running the same import again (say 400 records) it reuses the exact same ID's over again.
August 1, 2014 at 5:41 am
This was removed by the editor as SPAM
August 1, 2014 at 6:34 am
Hi, thanks for trying to help but that link is just a basic guide on importing. It has no information on the auto generation of ID's for nested data etc.
August 1, 2014 at 7:30 am
Just throwing in my 2 cents, my preferred way is to use SSIS to import the XML into a staging table and shred it with XQuery into the destination tables.
😎
August 1, 2014 at 7:42 am
The ids created by the XML Source in Dataflow aren't guaranteed to be unique between calls for different xml files. How could they be?
Add a fileId column to your tables, make it a primary key or unique constraint with the other column and include it in the pipeline with a Derived Column task. Here's an example from my SSIS package that imports deadlock xml. The deadlock_Id column is created by the XML Source and the fileId added in via Derived Column:
CREATE TABLE dbo.deadlock (
fileIdINTNOT NULL,
deadlock_IdNUMERIC(20, 0) NOT NULL,
victimNVARCHAR(255) NULL,
CONSTRAINT PK_deadlock PRIMARY KEY ( fileId, deadlock_Id )
) ON [PRIMARY]
GO
August 1, 2014 at 3:07 pm
dan.james (8/1/2014)
I dont generate them. I am using the XML source and it generates all the child ID's automatically. However running the same import again (say 400 records) it reuses the exact same ID's over again.
Use a derived column with a datetime stamp as a secondary column for your key. Alternatives include GUIDs or detecting the next secondary key out of the table with a MAX()+1 or something similar. In pretty much all cases, you end up with a dual (or more) column key.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply