Hi,
Thanks in advance for your help. I’m trying to download some data from a website and import it into a SQL Database Here is the website:
Disclosure Table – The Takeover Panel
The website posts the data in multiple formats; however I think the most useful format they provide (for this use case) is the XML version (Found below):
https://www.thetakeoverpanel.org.uk/new/disclosureTable/v3/disclosuretable.xml
The XLM file has multiple sections which I’d like to parse into multiple datasets (Table variables or Temp tables):
[additions] [deletions], [amendments] and [maintable]
DECLARE @Additions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
DECLARE @Deletions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
DECLARE @MainTable AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
Each dataset has the same format.
I’ve been able to extract individual nodes (ISIN) in this example:
SELECT
T.C.value('.', 'varchar(100)') AS [maintable_ISIN]
FROM
@myDoc.nodes('(/disclosure_table/maintable/case/offeree/twoten_information/twoten_line/ISIN)') as T(C)
But I’d like to extract the whole thing directly into their relevant datasets so that they look like the attached screenshot (Example_Results.png).
Can anyone help me extract the data into separate datasets?
Thanks
Posted this question elsewhere and got a great answer:
SELECT
ROW_NUMBER()OVER(ORDER BY Caseid.value('@caseid[1]', 'int')) AS RowID,
Caseid.value('@caseid[1]', 'int') AS caseid,
offeree.value('@name[1]', 'NVARCHAR (200)') AS Offeree_Name,
offeree.value('@offer_period_commenced[1]', 'VARCHAR(20)') AS Offer_Period_Commenced,
twoten_information.value('name[1]', 'VARCHAR(20)') AS Name,
twoten_information.value('ISIN[1]', 'VARCHAR(20)') AS ISIN,
twoten_information.value('NSI[1]', 'VARCHAR(20)') AS NSI,
offeror.value('@name[1]', 'NVARCHAR (200)') AS Offeror_Name,
offeror.value('@rule_26_deadline[1]', 'NVARCHAR (200)') AS Offeror_rule_26_deadline,
offeror.value('@offeror_identified[1]', 'NVARCHAR (200)') AS Offeror_identified,
offeror.value('empty_2.10[1]', 'NVARCHAR (200)') AS [Offeror_empty_2.10]
FROM @myDoc.nodes('/disclosure_table/maintable/case') AS XMLtable1(Caseid)
CROSS APPLY Caseid.nodes('offeree') XMLtable2(offeree)
CROSS APPLY Caseid.nodes('offeror') XMLtable3(offeror)
CROSS APPLY offeree.nodes('twoten_information/twoten_line') XMLtable4(twoten_information)
"To improve your answer:
There is no need to use [1] for the attributes. Each attribute is unique in the context of its element.
While addressing elements, it is better to use the following XPath expression:
'(ISIN/text())[1]' instead of the 'ISIN[1]'
It will produce a dramatic performance improvement.£
January 6, 2023 at 5:54 pm
Aye... thanks for posting the link for that. Much appreciated.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply