April 6, 2010 at 12:00 pm
Hello every body!
Iām an SSIS newbie :-), I'm trying to retreive data from XML files that are stored in a column in a table and then store it in a specific table witch i've already created. So can SSIS 2005 be of any help for me?
April 6, 2010 at 12:05 pm
You should write Xpath /xquery to extract data from the xml contained column and then insert into another table.
I am not sure what exactly you requirement is, but you can do the above mentioned in a strored procedure.
April 6, 2010 at 12:11 pm
I'll second RV.
Using XQuery inside a stored procedure is most probably the easier way.
If you need assistance please post sample data and expected result in a ready to use format as described in the first link in my signature.
Or you can search this site for "XML Workshop Jacob Sebastian" and you'll find a series of very helpful articles.
April 7, 2010 at 8:56 am
Thank you RV and Imu for the advices! I'm sorry I couldn't answer quickly due to somme connection problems š .
I'm not asking you here to do my job š but I'm really stuck on this issue! And I just want to give you an idea about what I'm doing (as suggested Imu :D).
I'm working on a Data Warhouse project in the otomotive field (vehicule technical controle). My aim is to make the data in my Operational Source System (OSS) (DB name="EOSTransfert") more tangible and useful for decision makers.
So, I'm creating a Data Mart that retrieves information from the culumn 'RContenu' (XML type (contains the technical results of a vehicul technical controle)) in the table "Controles" from the OSS, and then store it in my multidimensional DB called "TransfertDW". Here is my dimensional model (data names is in french! but I can translate if there's any need) :
Fact table:
FaitControlesTechniques
IdTemps(FK,Timestamp)
NumAgrement(FK,nvarcha(50))
AgentCIN(FK,nvarcha(50))
CodeTypeVisite(FK,nchar(10))
NumPV(int)
NumberOfControles(int)(it's a measure)
Dimension tables:
DimTemps (for time storing:not sure of the atributes type yet!)
DimAgentVisiteur (AgentCIN (PK)... contains information about the worker who did the technical controle)
DimCentreVisiteTechnique(NumAgrement (PK)...contains information about the center in wich the technical controle was made)
DimTypeVisiteTechnique (CodeTypeVisite (PK),LibelleTypeVisite: technical controle code and name)
Here an XML sample file and the source code to create the entire "TransfertDW" (dimensional model) and the "EOSTransfert" data bases:
XML Sample:
<envoi xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" AgrementCVT="xxxxx" NumeroOrdre="0" DateHeureEnvoi="2009-02-09T07:53:57.043" LigneNumero="1">
<Controle NumeroPV="0000" AgentCIN="00000">
<TypeControle>VT</TypeControle>
<PreviousNegativeControl xsi:nil="true" />
<DateHeureDebutControle>2009-02-09T07:42:51.577</DateHeureDebutControle>
<DateHeureFinControle>2009-02-09T07:44:30.843</DateHeureFinControle>
<PVDateHeure>2009-02-09T07:53:23.607</PVDateHeure>
<LotNumero>xxxxx</LotNumero>
<ControleResult>VOK</ControleResult>
<NextControleType>VT</NextControleType>
<VisiteExpirationDate>2010-02-09</VisiteExpirationDate>
<VehiculeKms>120000</VehiculeKms>
<MatriculeAssocieZone xsi:nil="true" />
<MatriculeAssocieLetter>19</MatriculeAssocieLetter>
<MatriculeAssocieSerial xsi:nil="true" />
<vehicule MatriculeZone="1" MatriculeLetter="A" MatriculeSerial="0000" ChassisType="0000" ChassisSerial="xxxxx">
<ImmatriculationDate>1992-03-10</ImmatriculationDate>
<MiseEnCirculationDate>1981-12-04</MiseEnCirculationDate>
<ImprimeSN>0000</ImprimeSN>
<MutationDate>2003-05-09</MutationDate>
<NombreCylindre>4</NombreCylindre>
<Marque>xxxxx</Marque>
<Genre>11</Genre>
<GenreReel>11</GenreReel>
<Carburant>DI</Carburant>
<PuissanceFiscale>8</PuissanceFiscale>
<PTAC xsi:nil="true" />
<PTC xsi:nil="true" />
<PollutionControlType>CPOG</PollutionControlType>
<Categorie>2</Categorie>
<ProprietaireID xsi:nil="true" />
<Nom>xxxxx</Nom>
<Prenom>xxxxx</Prenom>
</vehicule>
<pneus count="4">
<pneu code="1" Marque="xxxxx" refSection="70" refSerie="24" refStruct="R" refDiametre="70" refCharge="20" refSerial="0245" />
<pneu code="2" Marque="xxxxx" refSection="41" refSerie="2" refStruct="R" refDiametre="70" refCharge="20" refSerial="02455" />
</pneus>
<mesures>
<essieux count="1">
<essieu code="0">
<EffiG xsi:nil="true" />
<EffiD xsi:nil="true" />
<FreinSVCDesequi xsi:nil="true" />
<SuspensionDissymetry xsi:nil="true" />
</essieu>
</essieux>
<Pollution1>0.00</Pollution1>
<Pollution2>0.00</Pollution2>
<Pollution3>0.00</Pollution3>
<PollutionCalc>2.00</PollutionCalc>
<PolCORalentiAccelere xsi:nil="true" />
<PolLambda xsi:nil="true" />
</mesures>
<constats>
<constat code="0.1.1.1.1." res="0" />
<constat code="0.1.1.2.1." res="0" />
<constat code="0.1.1.3.1." res="0" />
</constats>
</Controle>
</envoi>
"TransfertDW" data base:
SET DATEFORMAT DMY
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimTemps]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[DimTemps](
[IdTemps] [timestamp] NOT NULL,
CONSTRAINT [PK_DimTemps] PRIMARY KEY CLUSTERED
(
[IdTemps] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimAgentVisiteur]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[DimAgentVisiteur](
[AgentCIN] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_DimAgentVisiteur] PRIMARY KEY CLUSTERED
(
[AgentCIN] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimCentreVisiteTechnique]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[DimCentreVisiteTechnique](
[NumAgrement] [nvarchar](50) NOT NULL,
[Code] [nvarchar](50) NOT NULL,
[Nom] [nvarchar](50) NOT NULL,
[Responsable] [nvarchar](50) NULL,
[Telephone] [nvarchar](50) NULL,
[Fax] [nvarchar](50) NULL,
CONSTRAINT [PK_DimCentreVisiteTechnique] PRIMARY KEY NONCLUSTERED
(
[NumAgrement] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DimTypeVisiteTechnique]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[DimTypeVisiteTechnique](
[CodeTypeVisite] [nchar](10) NOT NULL,
[LibelleTypeVisite] [char](50) NULL,
CONSTRAINT [PK_DimTypeVisiteTechnique] PRIMARY KEY CLUSTERED
(
[CodeTypeVisite] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[FaitControlesTechniques](
[IdTemps] [timestamp] NOT NULL,
[NumAgrement] [nvarchar](50) NOT NULL,
[AgentCIN] [nvarchar](50) NOT NULL,
[CodeTypeVisite] [nchar](10) NOT NULL,
[NombreVisiteTechnique] [int] NOT NULL,
CONSTRAINT [PK_FaitControlesTechniques] PRIMARY KEY CLUSTERED
(
[IdTemps] ASC,
[NumAgrement] ASC,
[AgentCIN] ASC,
[CodeTypeVisite] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO
IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimAgentVisiteur]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))
ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimAgentVisiteur] FOREIGN KEY([AgentCIN])
REFERENCES [dbo].[DimAgentVisiteur] ([AgentCIN])
GO
IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimCentreVisiteTechnique]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))
ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimCentreVisiteTechnique] FOREIGN KEY([NumAgrement])
REFERENCES [dbo].[DimCentreVisiteTechnique] ([NumAgrement])
GO
IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimTemps]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))
ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimTemps] FOREIGN KEY([IdTemps])
REFERENCES [dbo].[DimTemps] ([IdTemps])
GO
IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_FaitControlesTechniques_DimTypeVisiteTechnique]') AND parent_object_id = OBJECT_ID(N'[dbo].[FaitControlesTechniques]'))
ALTER TABLE [dbo].[FaitControlesTechniques] WITH CHECK ADD CONSTRAINT [FK_FaitControlesTechniques_DimTypeVisiteTechnique] FOREIGN KEY([CodeTypeVisite])
REFERENCES [dbo].[DimTypeVisiteTechnique] ([CodeTypeVisite])
"EOSTransfert" data base:
SET DATEFORMAT DMY
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[CONTROLES]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[CONTROLES](
[CDateHeure] [datetime] NOT NULL,
[VImmatriculation] [nvarchar](50) NOT NULL,
[RContenu] [xml] NOT NULL,
[EContenu] [xml] NULL,
[EDateHeure] [datetime] NULL,
[EHash] [nvarchar](50) NULL,
[EGuid] [nvarchar](50) NULL,
CONSTRAINT [PK_CONTROLES] PRIMARY KEY CLUSTERED
(
[CDateHeure] ASC,
[VImmatriculation] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
-My problem is how can i retrieve data from my XML files and store it in my Data Mart and eventually automate that?
-Also I need help with my time dimension:how can I agregate it into more significant date attributes?
I would really apreciate your help š !
Thank you in advance for your time and your interest!
April 7, 2010 at 11:59 am
Hi,
here's a rather large sample of various methods using XQuery.
Unfortunately, you didn't provide a "translation list", meaning what nodes you'd like to see in what column. Therefore, I had to make something up...
I hope I covered all methods you need to shred your xml data. If not, come back and we'll be happy to help you.
Side note: the most important thing is the usage of WITH XMLNAMESPACES as soon as you're faced with namespaces inside the xml data.
INSERT INTO CONTROLES(CDateHeure,VImmatriculation,RContenu)
SELECT GETDATE(),1,'<envoi xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" AgrementCVT="xxxxx" NumeroOrdre="0" DateHeureEnvoi="2009-02-09T07:53:57.043" LigneNumero="1">
<Controle NumeroPV="0000" AgentCIN="00000">
<TypeControle>VT</TypeControle>
<PreviousNegativeControl xsi:nil="true" />
<DateHeureDebutControle>2009-02-09T07:42:51.577</DateHeureDebutControle>
<DateHeureFinControle>2009-02-09T07:44:30.843</DateHeureFinControle>
<PVDateHeure>2009-02-09T07:53:23.607</PVDateHeure>
<LotNumero>xxxxx</LotNumero>
<ControleResult>VOK</ControleResult>
<NextControleType>VT</NextControleType>
<VisiteExpirationDate>2010-02-09</VisiteExpirationDate>
<VehiculeKms>120000</VehiculeKms>
<MatriculeAssocieZone xsi:nil="true" />
<MatriculeAssocieLetter>19</MatriculeAssocieLetter>
<MatriculeAssocieSerial xsi:nil="true" />
<vehicule MatriculeZone="1" MatriculeLetter="A" MatriculeSerial="0000" ChassisType="0000" ChassisSerial="xxxxx">
<ImmatriculationDate>1992-03-10</ImmatriculationDate>
<MiseEnCirculationDate>1981-12-04</MiseEnCirculationDate>
<ImprimeSN>0000</ImprimeSN>
<MutationDate>2003-05-09</MutationDate>
<NombreCylindre>4</NombreCylindre>
<Marque>xxxxx</Marque>
<Genre>11</Genre>
<GenreReel>11</GenreReel>
<Carburant>DI</Carburant>
<PuissanceFiscale>8</PuissanceFiscale>
<PTAC xsi:nil="true" />
<PTC xsi:nil="true" />
<PollutionControlType>CPOG</PollutionControlType>
<Categorie>2</Categorie>
<ProprietaireID xsi:nil="true" />
<Nom>xxxxx</Nom>
<Prenom>xxxxx</Prenom>
</vehicule>
<pneus count="4">
<pneu code="1" Marque="xxxxx" refSection="70" refSerie="24" refStruct="R" refDiametre="70" refCharge="20" refSerial="0245" />
<pneu code="2" Marque="xxxxx" refSection="41" refSerie="2" refStruct="R" refDiametre="70" refCharge="20" refSerial="02455" />
</pneus>
<mesures>
<essieux count="1">
<essieu code="0">
<EffiG xsi:nil="true" />
<EffiD xsi:nil="true" />
<FreinSVCDesequi xsi:nil="true" />
<SuspensionDissymetry xsi:nil="true" />
</essieu>
</essieux>
<Pollution1>0.00</Pollution1>
<Pollution2>0.00</Pollution2>
<Pollution3>0.00</Pollution3>
<PollutionCalc>2.00</PollutionCalc>
<PolCORalentiAccelere xsi:nil="true" />
<PolLambda xsi:nil="true" />
</mesures>
<constats>
<constat code="0.1.1.1.1." res="0" />
<constat code="0.1.1.2.1." res="0" />
<constat code="0.1.1.3.1." res="0" />
</constats>
</Controle>
</envoi>'
SET DATEFORMAT ymd
-- fill DimAgentVisiteur
;WITH xmlnamespaces (
'http://www.w3.org/2001/XMLSchema-instance' AS xsi,
'http://www.w3.org/2001/XMLSchema' AS xsd
)
--INSERT INTO DimAgentVisiteur
SELECT
c.value('@AgentCIN[1]','varchar(30)') AgentCIN
FROM CONTROLES
CROSS APPLY RContenu.nodes('envoi/Controle') T(c)
--select specific values from mesures node
;WITH xmlnamespaces (
'http://www.w3.org/2001/XMLSchema-instance' AS xsi,
'http://www.w3.org/2001/XMLSchema' AS xsd
)
--INSERT INTO targetTable(colList)
SELECT
VImmatriculation,
c.value('PolLambda[1]','varchar(30)') PolLambda,
c.value('PollutionCalc[1]','varchar(30)') PollutionCalc
FROM CONTROLES
CROSS APPLY RContenu.nodes('envoi/Controle//mesures') T(c)
-- select list of values from more than one node
;WITH xmlnamespaces (
'http://www.w3.org/2001/XMLSchema-instance' AS xsi,
'http://www.w3.org/2001/XMLSchema' AS xsd
)
--INSERT INTO targetTable(colList)
SELECT
VImmatriculation,
Node.value('(/constat/@code)[1]','char(10)') AS code
FROM
(SELECT VImmatriculation, X.Y.query('.') AS Node
FROM CONTROLES
CROSS APPLY RContenu.nodes('envoi/Controle/constats/constat') X(Y)
) Z
-- select data from several node levels
;WITH xmlnamespaces (
'http://www.w3.org/2001/XMLSchema-instance' AS xsi,
'http://www.w3.org/2001/XMLSchema' AS xsd
)
--INSERT INTO targetTable(colList)
SELECT
VImmatriculation,
c.value('@AgentCIN[1]','varchar(30)') AgentCIN,
c.value('DateHeureDebutControle[1]','datetime') DateHeureDebutControle,
v.value('@MatriculeZone[1]','varchar(30)') MatriculeZone,
v.value('ImmatriculationDate[1]','datetime') ImmatriculationDate,
x.value('PolLambda[1]','varchar(30)') PolLambda,
x.value('PollutionCalc[1]','varchar(30)') PollutionCalc
FROM CONTROLES
CROSS APPLY RContenu.nodes('envoi/Controle') T(c)
CROSS APPLY t.c.nodes('vehicule') U(v)
CROSS APPLY t.c.nodes('mesures') W(x)
April 7, 2010 at 12:15 pm
Lutz,
Its a great reply
April 7, 2010 at 12:24 pm
RV-813525 (4/7/2010)
Lutz,Its a great reply
Thank you, RV!
I hope one or the other method is useful, not only for the OP.
April 9, 2010 at 3:52 am
Hi everybody,
Thank you very much Lutz I've been looking for this along time ago :-), even if I didn't understand the whole methodes but don't worry I'm using Jacob Sebastian's tutorials as you suggested to help me understand using XML in SQL Server. And whenever I face a problem in the future in my project I will post it immediately on the forum :-D.
Best regards,
Othmane
April 9, 2010 at 11:16 am
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply