January 16, 2012 at 3:48 am
I have a seemingly very tricky problem here.
I have a column in a table which has what looks like XML but it actually isn't. It is simply varchar column (string) with data looking like XML.
What I have to do is extract various parts of this column.
The parts I have to extract will be at different positions in the string.
For instance if I want to extract Mr from between the <TITLE> tags then it could be at a different starting position each time - the length of the string is consistent.
I have tried CHARINDEX and can find the starting point eg 127, 137, 200, 198 etc., though this is of no use as the start is different each time, I have tried a REVERSE to the end tag </TITLE> though this still fails. If I can get to the different starting point each time then I can SUBSTRING the amount I need, for instance Mr will be two characters etc.,
The string stored in SQL is show below.
Any help with floating position extract text would be brilliant.
Kind regards
Paul
<Booking Reference="XXXXX"> <Purchases> <Train Reference="00000000"> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <Outbound> <RailLeg OriginStation="MANCHESTER OXFORD ROAD" DepartDateTime="2011-11-25T07:15:00" DestinationStation="LONDON CHARING CROSS" ArrivalDateTime="2011-11-25T10:25:00" Direction="Outbound" Currency="GBP" FareClass="Standard" FareCost="35.50" FareDescription="Standard Advance Single" FareRef="16" SequenceNumber="0" /> </Outbound> <Inbound> <RailLeg OriginStation="LONDON CHARING CROSS" DepartDateTime="2011-11-26T07:02:00" DestinationStation="MANCHESTER OXFORD ROAD" ArrivalDateTime="2011-11-26T09:56:00" Direction="Inbound" Currency="GBP" FareClass="Standard" FareCost="0.00" FareDescription="Standard Advance Single" FareRef="66" SequenceNumber="5" /> </Inbound> </Train> <Flight Reference="AVJAOS"> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <Outbound> <AirLeg OriginAirportName="Manchester, MAN" DepartDateTime="2011-11-18T08:10:00" DestinationAirportName="London Heathrow, LHR" ArrivalDateTime="2011-11-18T09:15:00" Direction="Outbound" FlightNumber="BD581" ServiceClass="Economy" BookingClass="Economy" Currency="GBP" FareCost="43.86" TaxCost="0.00" SequenceNumber="0" /> </Outbound> <Inbound> <AirLeg OriginAirportName="London Heathrow, LHR" DepartDateTime="2011-11-20T07:35:00" DestinationAirportName="Manchester, MAN" ArrivalDateTime="2011-11-20T08:40:00" Direction="Inbound" FlightNumber="AAAAA" ServiceClass="Economy" BookingClass="Economy" Currency="GBP" FareCost="0.00" TaxCost="0.00" SequenceNumber="0" /> </Inbound> </Flight> <Hotel Reference="18760971" Name="Hendon Hall"> <Address>Off Parson Street Hendon London NW4 1HF</Address> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <HotelRooms> <HotelRoom RateType="RACK" RateCost="132.00" /> </HotelRooms> </Hotel> </Purchases> <Booker SVCID="100" Title="Mr." Forename="Test" Surname="Testington" EmailAddress="cgtms@expotel.com" /> <Traveller SVCID="100" Title="Mr." Forename="Test" Surname="Testington" EmailAddress="cgtms@expotel.com" /> </Booking>
January 16, 2012 at 4:15 am
PATINDEX should do the trick:
DECLARE @SampleData TABLE (
data nvarchar(max)
)
INSERT INTO @SampleData VALUES (N'<Booking Reference="XXXXX"> <Purchases> <Train Reference="00000000"> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <Outbound> <RailLeg OriginStation="MANCHESTER OXFORD ROAD" DepartDateTime="2011-11-25T07:15:00" DestinationStation="LONDON CHARING CROSS" ArrivalDateTime="2011-11-25T10:25:00" Direction="Outbound" Currency="GBP" FareClass="Standard" FareCost="35.50" FareDescription="Standard Advance Single" FareRef="16" SequenceNumber="0" /> </Outbound> <Inbound> <RailLeg OriginStation="LONDON CHARING CROSS" DepartDateTime="2011-11-26T07:02:00" DestinationStation="MANCHESTER OXFORD ROAD" ArrivalDateTime="2011-11-26T09:56:00" Direction="Inbound" Currency="GBP" FareClass="Standard" FareCost="0.00" FareDescription="Standard Advance Single" FareRef="66" SequenceNumber="5" /> </Inbound> </Train> <Flight Reference="AVJAOS"> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <Outbound> <AirLeg OriginAirportName="Manchester, MAN" DepartDateTime="2011-11-18T08:10:00" DestinationAirportName="London Heathrow, LHR" ArrivalDateTime="2011-11-18T09:15:00" Direction="Outbound" FlightNumber="BD581" ServiceClass="Economy" BookingClass="Economy" Currency="GBP" FareCost="43.86" TaxCost="0.00" SequenceNumber="0" /> </Outbound> <Inbound> <AirLeg OriginAirportName="London Heathrow, LHR" DepartDateTime="2011-11-20T07:35:00" DestinationAirportName="Manchester, MAN" ArrivalDateTime="2011-11-20T08:40:00" Direction="Inbound" FlightNumber="AAAAA" ServiceClass="Economy" BookingClass="Economy" Currency="GBP" FareCost="0.00" TaxCost="0.00" SequenceNumber="0" /> </Inbound> </Flight> <Hotel Reference="18760971" Name="Hendon Hall"> <Address>Off Parson Street Hendon London NW4 1HF</Address> <Travellers> <Adult> <Title>Mr.</Title> <FirstName>Test</FirstName> <LastName>Testington</LastName> </Adult> </Travellers> <HotelRooms> <HotelRoom RateType="RACK" RateCost="132.00" /> </HotelRooms> </Hotel> </Purchases> <Booker SVCID="100" Title="Mr." Forename="Test" Surname="Testington" EmailAddress="cgtms@expotel.com" /> <Traveller SVCID="100" Title="Mr." Forename="Test" Surname="Testington" EmailAddress="cgtms@expotel.com" /> </Booking>')
SELECT tagContent = SUBSTRING(data, titleTagOpen + tagLength, titleTagClose - titleTagOpen - tagLength)
FROM (
SELECT PATINDEX('%<TITLE>%', data) AS titleTagOpen,
PATINDEX('%</TITLE>%', data) AS titleTagClose,
LEN('<TITLE>') AS tagLength,
data
FROM @SampleData
) AS src
-- Gianluca Sartori
January 16, 2012 at 4:17 am
If you have the same structure, but different length for each value, then you can convert it to XML and then use Xquery to get the data that you need. By the way if this is the case, then maybe you should also store it as XML instead of a string.
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
January 16, 2012 at 4:21 am
Wow thank you so much for the prompt and excellent reply. It works a treat. Now I can dismantle my horrible SSIS solution.
Thanks once again
Paul
January 16, 2012 at 10:29 pm
Adi Cohn-120898 (1/16/2012)
If you have the same structure, but different length for each value, then you can convert it to XML and then use Xquery to get the data that you need. By the way if this is the case, then maybe you should also store it as XML instead of a string.Adi
Heh... my recommendation is to parse it all out and store it as normalized data unstead of the blob known as XML. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 17, 2012 at 2:26 am
Jeff Moden (1/16/2012)
Adi Cohn-120898 (1/16/2012)
If you have the same structure, but different length for each value, then you can convert it to XML and then use Xquery to get the data that you need. By the way if this is the case, then maybe you should also store it as XML instead of a string.Adi
Heh... my recommendation is to parse it all out and store it as normalized data unstead of the blob known as XML. 😉
+10000
But I guess it's not an option any more since we got to this point.
-- Gianluca Sartori
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply