Parse a string help

Question

Post reply

Parse a string help

Jeffery Williams

SSCertifiable

Points: 6441
More actions
March 6, 2018 at 12:59 pm

#410294

I have a string that I need to parse. The text will vary but the format always the same.
Example Text: Hello5-E-100
I need this parsed into two variables.
First being Hello5-E and the second simply 100
Can someone please assist with this? I have the following code that creates (HELLO5 and second as E100)
DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'
-- HOTEL5-E 100
SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'')))
,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))
<hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Jeffery Williams SSCertifiable Points: 6441 More actions · Answer 1

A little Closer now just need to move the 'E' over and I will be set... HELP! lol

DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'

-- HOTEL5-E 100

SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

<hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Jeffery Williams - Tuesday, March 6, 2018 1:02 PM
A little Closer now just need to move the 'E' over and I will be set... HELP! lol
DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'
-- HOTEL5-E 100
SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

Something like this?

DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100';

SELECT LEFT(@VendItemNumber,LEN(@VendItemNumber) - CHARINDEX('-',REVERSE(@VendItemNumber))), RIGHT(@VendItemNumber,CHARINDEX('-',REVERSE(@VendItemNumber)) - 1);

Jeffery Williams SSCertifiable Points: 6441 More actions · Answer 3

Lynn Pettis - Tuesday, March 6, 2018 1:55 PM
Jeffery Williams - Tuesday, March 6, 2018 1:02 PM
A little Closer now just need to move the 'E' over and I will be set... HELP! lol
DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'
-- HOTEL5-E 100
SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))
Something like this?
DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100';
SELECT LEFT(@VendItemNumber,LEN(@VendItemNumber) - CHARINDEX('-',REVERSE(@VendItemNumber))), RIGHT(@VendItemNumber,CHARINDEX('-',REVERSE(@VendItemNumber)) - 1);

Thank you so much

<hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 4

Lynn Pettis

SSC Guru

Points: 442467

March 6, 2018 at 2:03 pm

#1982152

My pleasure.

Alan Burstein SSC Guru Points: 61141 More actions · Answer 5

Some code formatting food for thought...
For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
CHARINDEX('-',REVERSE(@VendItemNumber)

You can simplify the code using the Table Value constructor like this:
SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1) FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);

For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.

There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:

-- sample data CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2)); INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4); GO

DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90; -- version 1 with repeated formula SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END, someRank = dense_rank() OVER (ORDER BY CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END) FROM #sometable WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END BETWEEN 900 AND 2000 ORDER BY -- simulate another event that causes a sort CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;

-- version 2 with inline alias SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = itvf.result, someRank = dense_rank() OVER (ORDER BY itvf.result) FROM #sometable CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)) itvf(result) WHERE itvf.result between 900 and 2000 ORDER BY itvf.result;

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

Alan.B - Wednesday, March 7, 2018 9:11 AM
Some code formatting food for thought...
For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
CHARINDEX('-',REVERSE(@VendItemNumber)
You can simplify the code using the Table Value constructor like this:
SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1) FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);
For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.
There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:
-- sample data CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2)); INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4); GO
DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90; -- version 1 with repeated formula SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END, someRank = dense_rank() OVER (ORDER BY CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END) FROM #sometable WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END BETWEEN 900 AND 2000 ORDER BY -- simulate another event that causes a sort CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;
-- version 2 with inline alias SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = itvf.result, someRank = dense_rank() OVER (ORDER BY itvf.result) FROM #sometable CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)) itvf(result) WHERE itvf.result between 900 and 2000 ORDER BY itvf.result;

Actually, thanks for showing this as it goes right along with the DRY principle. I just need to work harder and using it.

Alan Burstein SSC Guru Points: 61141 More actions · Answer 7

Lynn Pettis - Wednesday, March 7, 2018 9:48 AM
Alan.B - Wednesday, March 7, 2018 9:11 AM
Some code formatting food for thought...
For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
CHARINDEX('-',REVERSE(@VendItemNumber)
You can simplify the code using the Table Value constructor like this:
SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1) FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);
For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.
There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:
-- sample data CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2)); INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4); GO
DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90; -- version 1 with repeated formula SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END, someRank = dense_rank() OVER (ORDER BY CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END) FROM #sometable WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END BETWEEN 900 AND 2000 ORDER BY -- simulate another event that causes a sort CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;
-- version 2 with inline alias SELECT TOP (@topPct) PERCENT someid, somevalue, someCalculation = itvf.result, someRank = dense_rank() OVER (ORDER BY itvf.result) FROM #sometable CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)) itvf(result) WHERE itvf.result between 900 and 2000 ORDER BY itvf.result;
Actually, thanks for showing this as it goes right along with the DRY principle. I just need to work harder and using it.

😀

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

GaryV Hall of Fame Points: 3307 More actions · Answer 8

Jeffery Williams - Tuesday, March 6, 2018 12:59 PM
I have a string that I need to parse. The text will vary but the format always the same.
Example Text: Hello5-E-100
I need this parsed into two variables.
First being Hello5-E and the second simply 100

You said the format will always be the same.

Does that mean it will always be 6 a/n chars, a dash, a char, another dash, 3 numbers?

Or it will always be a string of varying length containing 2 dashes somewhere?

If the latter, simply using LEFT and RIGHT would suffice. But I suspect that's not the answer or you would have thought of it yourself - so see the other replies.

Jeffrey Williams SSC Guru Points: 90309 More actions · Answer 9

If the format will always be delimited by a dash - then you can easily utilize a string split utility. A normal string split will split the results into separate rows - but a modified utility can be built to return a specific element from a string. Using something like that - this becomes:

Declare @testTable Table (InputString varchar(100));

Insert Into @testTable Values ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001');

Select concat(p1.Item, '-', p2.Item) As Part1 , p3.Item As Part2 From @testTable t Cross Apply dbo.fnGetStringElement(t.InputString, '-', 1) As p1 Cross Apply dbo.fnGetStringElement(t.InputString, '-', 2) As p2 Cross Apply dbo.fnGetStringElement(t.InputString, '-', 3) As p3;

Here is the function...

Create Function [dbo].[fnGetStringElement] ( @pString varchar(8000) , @pDelimiter char(1) , @pElement int) Returns Table With Schemabinding As Return

With e1(n) As ( --=== Create Ten 1s Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 --10 ) , e2(n) As (Select 1 From e1 a, e1 b) -- 100 , e3(n) As (Select 1 From e2 a, e2 b) -- 10,000 , cteTally (Number) As ( Select Top (datalength(isnull(@pString, 0))) row_number() over(Order By (Select Null)) From e3 ) , cteStart(n1) As ( Select 1 Union All Select t.Number + 1 From cteTally t Where substring(@pString, t.Number, 1) = @pDelimiter ) , cteEnd (n1, l1) As ( Select s.n1 , coalesce(nullif(charindex(@pDelimiter, @pString, s.n1), 0) - s.n1, 8000) From cteStart s ) , cteSplit --==== Do the split As ( Select row_number() over(Order By e.n1) As ItemNumber , substring(@pString, e.n1, e.l1) As Item From cteEnd e ) Select ltrim(rtrim(Item)) As Item From cteSplit Where ItemNumber = @pElement; Go

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Alan Burstein SSC Guru Points: 61141 More actions · Answer 10

Jeffrey Williams 3188 - Friday, March 9, 2018 1:20 PM
If the format will always be delimited by a dash - then you can easily utilize a string split utility. A normal string split will split the results into separate rows - but a modified utility can be built to return a specific element from a string. Using something like that - this becomes:
Declare @testTable Table (InputString varchar(100));
Insert Into @testTable Values ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001');
Select concat(p1.Item, '-', p2.Item) As Part1 , p3.Item As Part2 From @testTable t Cross Apply dbo.fnGetStringElement(t.InputString, '-', 1) As p1 Cross Apply dbo.fnGetStringElement(t.InputString, '-', 2) As p2 Cross Apply dbo.fnGetStringElement(t.InputString, '-', 3) As p3;
Here is the function...
Create Function [dbo].[fnGetStringElement] ( @pString varchar(8000) , @pDelimiter char(1) , @pElement int) Returns Table With Schemabinding As Return
With e1(n) As ( --=== Create Ten 1s Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 --10 ) , e2(n) As (Select 1 From e1 a, e1 b) -- 100 , e3(n) As (Select 1 From e2 a, e2 b) -- 10,000 , cteTally (Number) As ( Select Top (datalength(isnull(@pString, 0))) row_number() over(Order By (Select Null)) From e3 ) , cteStart(n1) As ( Select 1 Union All Select t.Number + 1 From cteTally t Where substring(@pString, t.Number, 1) = @pDelimiter ) , cteEnd (n1, l1) As ( Select s.n1 , coalesce(nullif(charindex(@pDelimiter, @pString, s.n1), 0) - s.n1, 8000) From cteStart s ) , cteSplit --==== Do the split As ( Select row_number() over(Order By e.n1) As ItemNumber , substring(@pString, e.n1, e.l1) As Item From cteEnd e ) Select ltrim(rtrim(Item)) As Item From cteSplit Where ItemNumber = @pElement; Go

No.

Using a splitter for this is kind of like pole vaulting over mouse turds. Next, if you understand what DelimitedSplit8K.ItemNumber does then you don't need a function named fnGetStringElement based on DelimitedSplit8K. If you were to use a splitter you could just do this:
SELECT Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)), Part2 = Item FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s WHERE ItemNumber = 3

What Lynn posted is the way to go. Another efficient way (just for fun):
SELECT part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)), part2 = PARSENAME(clean.string,1) FROM @testTable t CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Jeffrey Williams SSC Guru Points: 90309 More actions · Answer 11

Alan.B - Friday, March 9, 2018 5:53 PM
No.
Using a splitter for this is kind of like pole vaulting over mouse turds. Next, if you understand what DelimitedSplit8K.ItemNumber does then you don't need a function named fnGetStringElement based on DelimitedSplit8K. If you were to use a splitter you could just do this:
SELECT Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)), Part2 = Item FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s WHERE ItemNumber = 3
What Lynn posted is the way to go. Another efficient way (just for fun):
SELECT part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)), part2 = PARSENAME(clean.string,1) FROM @testTable t CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

Here is a thought - change the specification slightly and how does your solution work? For example, instead of the first 3 elements of the string let's say we want to return the 2nd, 3rd and 5th elements. Or maybe we want the 1st, 3rd, 5th or any other elements that are non-contiguous? Your first solution won't work because you cannot eliminate the 1st or 4th elements of the string without additional code (charindex to locate the first dash but then how do you eliminate the 4th element?) - your second solution won't work because PARSENAME only works with 4 elements in a string.

How would you code for this: Return the 2nd through 10th pipe-delimited elements as separate fields from the 1st underscore element, the first and second caret (^) elements from the 6th pipe-delimited element of the second underscore element and the 4th caret element of the 11th pipe-delimited element of the second underscore element? Replace the underscores with CHAR(13) and add a whole lot more segments and you have a standard HL7 message.

Lynn's solution works - but has the same problem as it only addresses one specific requirement and cannot easily be adapted to resolve a slightly different requirement.

DelimitedSplit8K could be utilized to return single elements - but then you are doing the exact same thing I am doing in fnGetStringElement so I really see no difference other than the fact that for fnGetStringElement I don't need to specify the ItemNumber in the where clause to eliminate the other rows returned.

SELECT ... FROM Table t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p1 CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p2 CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p3 WHERE p1.ItemNumber = 2 AND p2.ItemNumber = 3 AND p3.ItemNumber = 5

SELECT ... FROM Table t CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p2 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 5) As p3

With that said - I really don't understand why you had to be so rude with your response...

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Alan Burstein SSC Guru Points: 61141 More actions · Answer 12

My apologies if I came off as rude. I was just trying to point out that what you posted was overkill for the requirement. I write must faster code than I used to because people on this forum have, for years, pointed out what my code could be optimized. The OP stated:

Example Text: Hello5-E-100
I need this parsed into two variables.
First being Hello5-E and the second simply 100

Let's assume we're dealing with two hyphens - everything before the last hyphen is part1, the text after is part2. In that scenario, the four solutions presented thusfar work:

Sample data
IF OBJECT_ID('tempdb..#testTable') IS NOT NULL DROP TABLE #testTable; CREATE TABLE #testTable(InputString varchar(100)); INSERT #testTable VALUES ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001'); GO

-- fnGetStringElement solution SELECT part1 = concat(p1.Item, '-', p2.Item), Part2 = p3.Item FROM #testTable t CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 2) As p2 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p3;

-- Alan Splitter Solution SELECT Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)), Part2 = Item FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s WHERE ItemNumber = 3

-- Lynn SELECT part1 = LEFT(t.InputString,LEN(t.InputString) - CHARINDEX('-',REVERSE(t.InputString))), part2 = RIGHT(t.InputString,CHARINDEX('-',REVERSE(t.InputString)) - 1) FROM #testTable t

-- PARSENAME SELECT part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)), part2 = PARSENAME(clean.string,1) FROM #testTable t CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

To understand where I was coming from let's start with the execution plans:

fnGetStringElement solution

This solution does three itvf function calls then three nested loop joins to merge the data back together.

Alan Splitter Solution

The est. subtree cost of this soution is roughtly 1/3rd (0.022 vs 0.06 on my PC). This is due to the fact that I am making only one itvf call vs. three. Lynn's solution and my PARSENAME solution produce the same plan:

Now a 1,000,000 row test harness
INSERT #testTable SELECT TOP (1000000) LEFT(newid(),17) FROM sys.all_columns a, sys.all_columns b;

PRINT 'fnGetStringElement solution'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100); SELECT @part1 = concat(p1.Item, '-', p2.Item), @Part2 = p3.Item FROM #testTable t CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 2) As p2 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p3; PRINT datediff(MS, @st, getdate()); GO 3

PRINT 'Alan Splitter Solution'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100); SELECT @part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)), @part2 = Item FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s WHERE ItemNumber = 3; PRINT datediff(MS, @st, getdate()); GO 3

PRINT 'LYNN'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100); SELECT @part1 = LEFT(t.InputString,LEN(t.InputString) - CHARINDEX('-',REVERSE(t.InputString))), @part2 = RIGHT(t.InputString,CHARINDEX('-',REVERSE(t.InputString)) - 1) FROM #testTable t PRINT datediff(MS, @st, getdate()); GO 3

PRINT 'Parsename'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100); SELECT @part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)), @part2 = PARSENAME(clean.string,1) FROM #testTable t CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string); PRINT datediff(MS, @st, getdate()); GO 3

Results;
fnGetStringElement solution
--------------------------------------------------
Beginning execution loop
4424
2470
2820
Batch execution completed 3 times.
Alan Splitter Solution
--------------------------------------------------
Beginning execution loop
990
960
956
Batch execution completed 3 times.
LYNN
--------------------------------------------------
Beginning execution loop
637
630
626
Batch execution completed 3 times.
Parsename
--------------------------------------------------
Beginning execution loop
1510
1520
1547
Batch execution completed 3 times.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Jeffrey Williams SSC Guru Points: 90309 More actions · Answer 13

Alan - I agree for this specific scenario the other options will perform better. However, you didn't answer the questions I asked - what if the requirements or data change?

The problem with Lynn's solution is that it cannot be utilized for anything other than splitting into 2 columns. If the requirement changes and the users now want the data split into 3 columns this solution doesn't work. If the users decide to add another element to the beginning - or add additional elements to the end but they still want the original 2 columns - this solution will not work. If anything changes (and it generally does) - this will require a complete code rewrite and testing.

The problem with PARSENAME is that it only works as long as there are no more than 4 elements in the string. As soon as we get a fifth element in the string PARSENAME no longer works at all. The same applies to this solution - once the requirements change it no longer works and will require a complete rewrite and testing of the new process.

What is very interesting here is that using DelimitedSplit8K actually performs worse than using fnGetStringElement. This test does not include Lynn's solution or PARSENAME - but includes 2 versions of DelimitedSplit8K. The first version uses 3 CROSS APPLY's to get each individual element - the second version uses one CROSS APPLY and a CROSS TAB to get the results.

SET NOCOUNT ON; IF OBJECT_ID('tempdb..#testTable', 'U') IS NOT NULL DROP TABLE #testTable;

SELECT TOP (1000000) cast(newid() As varchar(255)) As InputString INTO #TESTTABLE FROM sys.all_columns a, sys.all_columns b;

PRINT 'fnGetStringElement solution'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100); SELECT @Input = t.InputString, @part1 = p1.Item, @part3 = p2.Item, @Part5 = p3.Item FROM #testTable t CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p2 CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 5) As p3; PRINT datediff(MS, @st, getdate()); GO 3

PRINT 'Alan Splitter Solution'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100); SELECT @Input = t.InputString, @part1 = s.Item, @part3 = s3.Item, @part5 = s5.Item FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s3 CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s5 WHERE s.ItemNumber = 1 AND s3.ItemNumber = 3 AND s5.ItemNumber = 5; PRINT datediff(MS, @st, getdate()); GO 3

PRINT 'Modified Alan Splitter Solution'+char(10)+replicate('-',50); GO DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100); SELECT @Input = t.InputString, @part1 = max(Case When s.ItemNumber = 1 Then s.Item End), @part3 = max(Case When s.ItemNumber = 3 Then s.Item End), @part5 = max(Case When s.ItemNumber = 5 Then s.Item End) FROM #testTable t CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s GROUP BY t.InputString; PRINT datediff(MS, @st, getdate()); GO 3

The results of my test are:

fnGetStringElement solution
--------------------------------------------------
Beginning execution loop
4570
4913
4743
Batch execution completed 3 times.
Alan Splitter Solution
--------------------------------------------------
Beginning execution loop
7453
7480
7466
Batch execution completed 3 times.
Modified Alan Splitter Solution
--------------------------------------------------
Beginning execution loop
Warning: Null value is eliminated by an aggregate or other SET operation.
24540
Warning: Null value is eliminated by an aggregate or other SET operation.
24810
Warning: Null value is eliminated by an aggregate or other SET operation.
24503
Batch execution completed 3 times.

All I can say is - wow...the cross tab version is extremely bad and the DelimitedSplit8K version is a bit slower than the fnGetStringElement version. So - given the specific scenario from the OP then either Lynn's solution of PARSENAME would be best - when and only when it is guaranteed that the requirements will never change. If the requirements change - even slightly - then another solution would be required.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

If the requirements change then obviously the code has to change. Sometimes you just need what works rather than trying to code to a bunch of what ifs that may never occur.