Importing XML data contained in a table's column to another table

  • Hello every body!

    Iā€™m an SSIS newbie :-), I'm trying to retreive data from XML files that are stored in a column in a table and then store it in a specific table witch i've already created. So can SSIS 2005 be of any help for me?

  • You should write Xpath /xquery to extract data from the xml contained column and then insert into another table.

    I am not sure what exactly you requirement is, but you can do the above mentioned in a strored procedure.

  • I'll second RV.

    Using XQuery inside a stored procedure is most probably the easier way.

    If you need assistance please post sample data and expected result in a ready to use format as described in the first link in my signature.

    Or you can search this site for "XML Workshop Jacob Sebastian" and you'll find a series of very helpful articles.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thank you RV and Imu for the advices! I'm sorry I couldn't answer quickly due to somme connection problems šŸ™‚ .

    I'm not asking you here to do my job šŸ™‚ but I'm really stuck on this issue! And I just want to give you an idea about what I'm doing (as suggested Imu :D).

    I'm working on a Data Warhouse project in the otomotive field (vehicule technical controle). My aim is to make the data in my Operational Source System (OSS) (DB name="EOSTransfert") more tangible and useful for decision makers.

    So, I'm creating a Data Mart that retrieves information from the culumn 'RContenu' (XML type (contains the technical results of a vehicul technical controle)) in the table "Controles" from the OSS, and then store it in my multidimensional DB called "TransfertDW". Here is my dimensional model (data names is in french! but I can translate if there's any need) :

    Fact table:

    FaitControlesTechniques

    IdTemps(FK,Timestamp)

    NumAgrement(FK,nvarcha(50))

    AgentCIN(FK,nvarcha(50))

    CodeTypeVisite(FK,nchar(10))

    NumPV(int)

    NumberOfControles(int)(it's a measure)

    Dimension tables:

    DimTemps (for time storing:not sure of the atributes type yet!)

    DimAgentVisiteur (AgentCIN (PK)... contains information about the worker who did the technical controle)

    DimCentreVisiteTechnique(NumAgrement (PK)...contains information about the center in wich the technical controle was made)

    DimTypeVisiteTechnique (CodeTypeVisite (PK),LibelleTypeVisite: technical controle code and name)

    Here an XML sample file and the source code to create the entire "TransfertDW" (dimensional model) and the "EOSTransfert" data bases:

    XML Sample:

    <envoi xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" AgrementCVT="xxxxx" NumeroOrdre="0" DateHeureEnvoi="2009-02-09T07:53:57.043" LigneNumero="1">

    <Controle NumeroPV="0000" AgentCIN="00000">

    <TypeControle>VT</TypeControle>

    <PreviousNegativeControl xsi:nil="true" />

    <DateHeureDebutControle>2009-02-09T07:42:51.577</DateHeureDebutControle>

    <DateHeureFinControle>2009-02-09T07:44:30.843</DateHeureFinControle>

    <PVDateHeure>2009-02-09T07:53:23.607</PVDateHeure>

    <LotNumero>xxxxx</LotNumero>

    <ControleResult>VOK</ControleResult>

    <NextControleType>VT</NextControleType>

    <VisiteExpirationDate>2010-02-09</VisiteExpirationDate>

    <VehiculeKms>120000</VehiculeKms>

    <MatriculeAssocieZone xsi:nil="true" />

    <MatriculeAssocieLetter>19</MatriculeAssocieLetter>

    <MatriculeAssocieSerial xsi:nil="true" />

    <vehicule MatriculeZone="1" MatriculeLetter="A" MatriculeSerial="0000" ChassisType="0000" ChassisSerial="xxxxx">

    <ImmatriculationDate>1992-03-10</ImmatriculationDate>

    <MiseEnCirculationDate>1981-12-04</MiseEnCirculationDate>

    <ImprimeSN>0000</ImprimeSN>

    <MutationDate>2003-05-09</MutationDate>

    <NombreCylindre>4</NombreCylindre>

    <Marque>xxxxx</Marque>

    <Genre>11</Genre>

    <GenreReel>11</GenreReel>

    <Carburant>DI</Carburant>

    <PuissanceFiscale>8</PuissanceFiscale>

    <PTAC xsi:nil="true" />

    <PTC xsi:nil="true" />

    <PollutionControlType>CPOG</PollutionControlType>

    <Categorie>2</Categorie>

    <ProprietaireID xsi:nil="true" />

    <Nom>xxxxx</Nom>

    <Prenom>xxxxx</Prenom>

    </vehicule>

    <pneus count="4">

    <pneu code="1" Marque="xxxxx" refSection="70" refSerie="24" refStruct="R" refDiametre="70" refCharge="20" refSerial="0245" />

    <pneu code="2" Marque="xxxxx" refSection="41" refSerie="2" refStruct="R" refDiametre="70" refCharge="20" refSerial="02455" />

    </pneus>

    <mesures>

    <essieux count="1">

    <essieu code="0">

    <EffiG xsi:nil="true" />

    <EffiD xsi:nil="true" />

    <FreinSVCDesequi xsi:nil="true" />

    <SuspensionDissymetry xsi:nil="true" />

    </essieu>

    </essieux>

    <Pollution1>0.00</Pollution1>

    <Pollution2>0.00</Pollution2>

    <Pollution3>0.00</Pollution3>

    <PollutionCalc>2.00</PollutionCalc>

    <PolCORalentiAccelere xsi:nil="true" />

    <PolLambda xsi:nil="true" />

    </mesures>

    <constats>

    <constat code="0.1.1.1.1." res="0" />

    <constat code="0.1.1.2.1." res="0" />

    <constat code="0.1.1.3.1." res="0" />

    </constats>

    </Controle>

    </envoi>

    "TransfertDW" data base:

    SET DATEFORMAT DMY

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimTemps]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[DimTemps](

    [IdTemps] [timestamp] NOT NULL,

    CONSTRAINT [PK_DimTemps] PRIMARY KEY CLUSTERED

    (

    [IdTemps] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimAgentVisiteur]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[DimAgentVisiteur](

    [AgentCIN] [nvarchar](50) NOT NULL,

    CONSTRAINT [PK_DimAgentVisiteur] PRIMARY KEY CLUSTERED

    (

    [AgentCIN] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimCentreVisiteTechnique]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[DimCentreVisiteTechnique](

    [NumAgrement] [nvarchar](50) NOT NULL,

    [Code] [nvarchar](50) NOT NULL,

    [Nom] [nvarchar](50) NOT NULL,

    [Responsable] [nvarchar](50) NULL,

    [Telephone] [nvarchar](50) NULL,

    [Fax] [nvarchar](50) NULL,

    CONSTRAINT [PK_DimCentreVisiteTechnique] PRIMARY KEY NONCLUSTERED

    (

    [NumAgrement] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimTypeVisiteTechnique]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[DimTypeVisiteTechnique](

    [CodeTypeVisite] [nchar](10) NOT NULL,

    [LibelleTypeVisite] [char](50) NULL,

    CONSTRAINT [PK_DimTypeVisiteTechnique] PRIMARY KEY CLUSTERED

    (

    [CodeTypeVisite] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[FaitControlesTechniques](

    [IdTemps] [timestamp] NOT NULL,

    [NumAgrement] [nvarchar](50) NOT NULL,

    [AgentCIN] [nvarchar](50) NOT NULL,

    [CodeTypeVisite] [nchar](10) NOT NULL,

    [NombreVisiteTechnique] [int] NOT NULL,

    CONSTRAINT [PK_FaitControlesTechniques] PRIMARY KEY CLUSTERED

    (

    [IdTemps] ASC,

    [NumAgrement] ASC,

    [AgentCIN] ASC,

    [CodeTypeVisite] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    GO

    IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimAgentVisiteur]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))

    ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimAgentVisiteur] FOREIGN KEY([AgentCIN])

    REFERENCES [dbo].[DimAgentVisiteur] ([AgentCIN])

    GO

    IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimCentreVisiteTechnique]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))

    ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimCentreVisiteTechnique] FOREIGN KEY([NumAgrement])

    REFERENCES [dbo].[DimCentreVisiteTechnique] ([NumAgrement])

    GO

    IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimTemps]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))

    ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimTemps] FOREIGN KEY([IdTemps])

    REFERENCES [dbo].[DimTemps] ([IdTemps])

    GO

    IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimTypeVisiteTechnique]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))

    ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimTypeVisiteTechnique] FOREIGN KEY([CodeTypeVisite])

    REFERENCES [dbo].[DimTypeVisiteTechnique] ([CodeTypeVisite])

    "EOSTransfert" data base:

    SET DATEFORMAT DMY

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[CONTROLES]') AND type in (N'U'))

    BEGIN

    CREATE TABLE [dbo].[CONTROLES](

    [CDateHeure] [datetime] NOT NULL,

    [VImmatriculation] [nvarchar](50) NOT NULL,

    [RContenu] [xml] NOT NULL,

    [EContenu] [xml] NULL,

    [EDateHeure] [datetime] NULL,

    [EHash] [nvarchar](50) NULL,

    [EGuid] [nvarchar](50) NULL,

    CONSTRAINT [PK_CONTROLES] PRIMARY KEY CLUSTERED

    (

    [CDateHeure] ASC,

    [VImmatriculation] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    END

    -My problem is how can i retrieve data from my XML files and store it in my Data Mart and eventually automate that?

    -Also I need help with my time dimension:how can I agregate it into more significant date attributes?

    I would really apreciate your help šŸ™‚ !

    Thank you in advance for your time and your interest!

  • Hi,

    here's a rather large sample of various methods using XQuery.

    Unfortunately, you didn't provide a "translation list", meaning what nodes you'd like to see in what column. Therefore, I had to make something up...

    I hope I covered all methods you need to shred your xml data. If not, come back and we'll be happy to help you.

    Side note: the most important thing is the usage of WITH XMLNAMESPACES as soon as you're faced with namespaces inside the xml data.

    INSERT INTO CONTROLES(CDateHeure,VImmatriculation,RContenu)

    SELECT GETDATE(),1,'<envoi xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" AgrementCVT="xxxxx" NumeroOrdre="0" DateHeureEnvoi="2009-02-09T07:53:57.043" LigneNumero="1">

    <Controle NumeroPV="0000" AgentCIN="00000">

    <TypeControle>VT</TypeControle>

    <PreviousNegativeControl xsi:nil="true" />

    <DateHeureDebutControle>2009-02-09T07:42:51.577</DateHeureDebutControle>

    <DateHeureFinControle>2009-02-09T07:44:30.843</DateHeureFinControle>

    <PVDateHeure>2009-02-09T07:53:23.607</PVDateHeure>

    <LotNumero>xxxxx</LotNumero>

    <ControleResult>VOK</ControleResult>

    <NextControleType>VT</NextControleType>

    <VisiteExpirationDate>2010-02-09</VisiteExpirationDate>

    <VehiculeKms>120000</VehiculeKms>

    <MatriculeAssocieZone xsi:nil="true" />

    <MatriculeAssocieLetter>19</MatriculeAssocieLetter>

    <MatriculeAssocieSerial xsi:nil="true" />

    <vehicule MatriculeZone="1" MatriculeLetter="A" MatriculeSerial="0000" ChassisType="0000" ChassisSerial="xxxxx">

    <ImmatriculationDate>1992-03-10</ImmatriculationDate>

    <MiseEnCirculationDate>1981-12-04</MiseEnCirculationDate>

    <ImprimeSN>0000</ImprimeSN>

    <MutationDate>2003-05-09</MutationDate>

    <NombreCylindre>4</NombreCylindre>

    <Marque>xxxxx</Marque>

    <Genre>11</Genre>

    <GenreReel>11</GenreReel>

    <Carburant>DI</Carburant>

    <PuissanceFiscale>8</PuissanceFiscale>

    <PTAC xsi:nil="true" />

    <PTC xsi:nil="true" />

    <PollutionControlType>CPOG</PollutionControlType>

    <Categorie>2</Categorie>

    <ProprietaireID xsi:nil="true" />

    <Nom>xxxxx</Nom>

    <Prenom>xxxxx</Prenom>

    </vehicule>

    <pneus count="4">

    <pneu code="1" Marque="xxxxx" refSection="70" refSerie="24" refStruct="R" refDiametre="70" refCharge="20" refSerial="0245" />

    <pneu code="2" Marque="xxxxx" refSection="41" refSerie="2" refStruct="R" refDiametre="70" refCharge="20" refSerial="02455" />

    </pneus>

    <mesures>

    <essieux count="1">

    <essieu code="0">

    <EffiG xsi:nil="true" />

    <EffiD xsi:nil="true" />

    <FreinSVCDesequi xsi:nil="true" />

    <SuspensionDissymetry xsi:nil="true" />

    </essieu>

    </essieux>

    <Pollution1>0.00</Pollution1>

    <Pollution2>0.00</Pollution2>

    <Pollution3>0.00</Pollution3>

    <PollutionCalc>2.00</PollutionCalc>

    <PolCORalentiAccelere xsi:nil="true" />

    <PolLambda xsi:nil="true" />

    </mesures>

    <constats>

    <constat code="0.1.1.1.1." res="0" />

    <constat code="0.1.1.2.1." res="0" />

    <constat code="0.1.1.3.1." res="0" />

    </constats>

    </Controle>

    </envoi>'

    SET DATEFORMAT ymd

    -- fill DimAgentVisiteur

    ;WITH xmlnamespaces (

    'http://www.w3.org/2001/XMLSchema-instance' AS xsi,

    'http://www.w3.org/2001/XMLSchema' AS xsd

    )

    --INSERT INTO DimAgentVisiteur

    SELECT

    c.value('@AgentCIN[1]','varchar(30)') AgentCIN

    FROM CONTROLES

    CROSS APPLY RContenu.nodes('envoi/Controle') T(c)

    --select specific values from mesures node

    ;WITH xmlnamespaces (

    'http://www.w3.org/2001/XMLSchema-instance' AS xsi,

    'http://www.w3.org/2001/XMLSchema' AS xsd

    )

    --INSERT INTO targetTable(colList)

    SELECT

    VImmatriculation,

    c.value('PolLambda[1]','varchar(30)') PolLambda,

    c.value('PollutionCalc[1]','varchar(30)') PollutionCalc

    FROM CONTROLES

    CROSS APPLY RContenu.nodes('envoi/Controle//mesures') T(c)

    -- select list of values from more than one node

    ;WITH xmlnamespaces (

    'http://www.w3.org/2001/XMLSchema-instance' AS xsi,

    'http://www.w3.org/2001/XMLSchema' AS xsd

    )

    --INSERT INTO targetTable(colList)

    SELECT

    VImmatriculation,

    Node.value('(/constat/@code)[1]','char(10)') AS code

    FROM

    (SELECT VImmatriculation, X.Y.query('.') AS Node

    FROM CONTROLES

    CROSS APPLY RContenu.nodes('envoi/Controle/constats/constat') X(Y)

    ) Z

    -- select data from several node levels

    ;WITH xmlnamespaces (

    'http://www.w3.org/2001/XMLSchema-instance' AS xsi,

    'http://www.w3.org/2001/XMLSchema' AS xsd

    )

    --INSERT INTO targetTable(colList)

    SELECT

    VImmatriculation,

    c.value('@AgentCIN[1]','varchar(30)') AgentCIN,

    c.value('DateHeureDebutControle[1]','datetime') DateHeureDebutControle,

    v.value('@MatriculeZone[1]','varchar(30)') MatriculeZone,

    v.value('ImmatriculationDate[1]','datetime') ImmatriculationDate,

    x.value('PolLambda[1]','varchar(30)') PolLambda,

    x.value('PollutionCalc[1]','varchar(30)') PollutionCalc

    FROM CONTROLES

    CROSS APPLY RContenu.nodes('envoi/Controle') T(c)

    CROSS APPLY t.c.nodes('vehicule') U(v)

    CROSS APPLY t.c.nodes('mesures') W(x)



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Lutz,

    Its a great reply

  • RV-813525 (4/7/2010)


    Lutz,

    Its a great reply

    Thank you, RV!

    I hope one or the other method is useful, not only for the OP.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hi everybody,

    Thank you very much Lutz I've been looking for this along time ago :-), even if I didn't understand the whole methodes but don't worry I'm using Jacob Sebastian's tutorials as you suggested to help me understand using XML in SQL Server. And whenever I face a problem in the future in my project I will post it immediately on the forum :-D.

    Best regards,

    Othmane

  • Glad I could help! šŸ˜€



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply