Parsing XML data into multiple Datasets

Question

Post reply

Parsing XML data into multiple Datasets

SQLServerQuery

Old Hand

Points: 333

January 5, 2023 at 9:13 pm

Go to Answer

#4134231

Hi,

Thanks in advance for your help. I’m trying to download some data from a website and import it into a SQL Database Here is the website:

Disclosure Table – The Takeover Panel

The website posts the data in multiple formats; however I think the most useful format they provide (for this use case) is the XML version (Found below):

https://www.thetakeoverpanel.org.uk/new/disclosureTable/v3/disclosuretable.xml

The XLM file has multiple sections which I’d like to parse into multiple datasets (Table variables or Temp tables):

[additions] [deletions], [amendments] and [maintable]

DECLARE @Additions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)

DECLARE @Deletions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)

DECLARE @MainTable AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)

Each dataset has the same format.

I’ve been able to extract individual nodes (ISIN) in this example:

SELECT
T.C.value('.', 'varchar(100)') AS [maintable_ISIN]
FROM
@myDoc.nodes('(/disclosure_table/maintable/case/offeree/twoten_information/twoten_line/ISIN)') as T(C)

But I’d like to extract the whole thing directly into their relevant datasets so that they look like the attached screenshot (Example_Results.png).

Can anyone help me extract the data into separate datasets?

Thanks

Attachments:

You must be logged in to view attached files.

SQLServerQuery

Old Hand

Points: 333

January 6, 2023 at 4:07 pm

Answer

#4134592

Posted this question elsewhere and got a great answer:

https://learn.microsoft.com/en-us/answers/questions/1153653/parsing-xml-data-into-multiple-datasets.html#comment-1154683

     SELECT
       ROW_NUMBER()OVER(ORDER BY Caseid.value('@caseid[1]', 'int')) AS RowID,
       Caseid.value('@caseid[1]', 'int') AS caseid,
       offeree.value('@name[1]', 'NVARCHAR (200)') AS Offeree_Name,
       offeree.value('@offer_period_commenced[1]', 'VARCHAR(20)') AS Offer_Period_Commenced,
       twoten_information.value('name[1]', 'VARCHAR(20)') AS Name,
       twoten_information.value('ISIN[1]', 'VARCHAR(20)') AS ISIN,
       twoten_information.value('NSI[1]', 'VARCHAR(20)') AS NSI,
       offeror.value('@name[1]', 'NVARCHAR (200)') AS Offeror_Name,
       offeror.value('@rule_26_deadline[1]', 'NVARCHAR (200)') AS Offeror_rule_26_deadline,
       offeror.value('@offeror_identified[1]', 'NVARCHAR (200)') AS Offeror_identified,
       offeror.value('empty_2.10[1]', 'NVARCHAR (200)') AS [Offeror_empty_2.10]
     FROM @myDoc.nodes('/disclosure_table/maintable/case') AS XMLtable1(Caseid)
     CROSS APPLY Caseid.nodes('offeree') XMLtable2(offeree)
     CROSS APPLY Caseid.nodes('offeror') XMLtable3(offeror)
     CROSS APPLY offeree.nodes('twoten_information/twoten_line') XMLtable4(twoten_information)

"To improve your answer:

There is no need to use [1] for the attributes. Each attribute is unique in the context of its element.

While addressing elements, it is better to use the following XPath expression:

'(ISIN/text())[1]' instead of the 'ISIN[1]'

It will produce a dramatic performance improvement.£

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 1

Aye... thanks for posting the link for that. Much appreciated.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)