Extracting a string from between two delimiting characters
The problem
A table contains a column (particulars) consisting of a semi-colon-delimited string of text, for which the text between the first and second semi-colon needs to be extracted. The extracted text will be no longer than 99 characters.
The position of the first and second semi-colons is variable and the second semi-colon may not be present.
Extract the text between the semi-colons.
Sample data
Table : Sample_table |
Column : Particulars |
LDR ; LUC20031026901 ; Vehicle |
LDR ; Consolidated |
LDR ; SUB35030172701 ; Building |
LDRR ; LIQ200310121 ; Liquor |
Solution
SELECT Particulars, CASE -- when no second semi-colon, WHEN CHARINDEX(';',(SUBSTRING(Particulars,CHARINDEX(';',Particulars,1)+1,99))) = 0 -- then extract the full string from first semi-colon to the max length of 99 THEN LTRIM(RTRIM(SUBSTRING(Particulars,CHARINDEX(';',Particulars,1)+1,99))) -- else extract the full string from the first semi-colon up to second semi-colon ELSE LTRIM(RTRIM(SUSTRING( /* <text>> */particulars, /* <start > */CHARINDEX(';',Particulars,1) + 1, /* <length> */CHARINDEX(';', SUBSTRING(Particulars, CHARINDEX(';', Particulars, 1) + 1, 99)) - 1 ) ) ) END AS Result_string FROM Sample_table
Resulting data
Particulars | Result_string |
LDR ; LUC20031026901 ; Vehicle | LUC20031026901 |
LDR ; Consolidated | Consolidated |
LDR ; SUB35030172701 ; Building | SUB35030172701 |
LDR ; LIQ200310121 ; Liquor | LIQ200310121 |
Method in Detail
SUBSTRING is used to extract the actual string value required.
SUBSTRING(<text>, <start>, <length>) <text> = Particulars
CHARINDEX is used to identify the position of the semi-colons within the string.
CHARINDEX(, , )
The start position is easy, being the position of the first semi-colon plus 1 character.
= CHARINDEX(';', Particulars, 1) + 1
The string length is the length of the text from <Start_pos> to the second semi-colon. This requires a second SUBSTRING to get a revised string of all text following the first semi-colon. A CHARINDEX over this revised text (less one character) gives the length.
<length> = CHARINDEX(';', <revised_text>, 1) - 1 <revised_text> = SUBSTRING(Particulars, <start>, <len>) <start> = CHARINDEX(';',Particulars,1) + 1 <len> = 99 (maximum)
Substituting these back into the original SUBSTRING function gives
SUBSTRING( /* <text> */particulars, /* <start > */CHARINDEX(';', Particulars, 1) + 1, /* <length> */CHARINDEX(';', SUBSTRING(Particulars, CHARINDEX(';', Particulars, 1) + 1, 99)) - 1 )
This now only needs to be wrapped in LTRIM and RTRIM functions to remove leading and trailing blanks.
Additional code was also added to handle the condition where no second semi-colon was present.
Alternative Method
You can of course come up with other methods of doing this for example, the ELSE statement above can be substituted with:
-- else extract the full string from the first semi-colon up to second semi-colonELSE RTRIM(LTRIM(LEFT(
-- This is the text following the first semi-colon up to the max length of 99
SUBSTRING(Particulars,CHARINDEX(';',Particulars,1)+1,99),
-- This is the length, one character before the next semi-colon
CHARINDEX(';',SUBSTRING(Particulars,CHARINDEX(';',Particulars,1)+1,99)) - 1
) ) )
Summary
You have seen demonstrated a method of extracting text from between two embedded delimiting characters. This has made use of the SUBSTRING, CHARINDEX, LTRIM and RTRIM functions.
I have searched for simpler ways to do this without resorting to a cursor, but as yet have not found one. Any alternative suggestions would be most welcome.