Uncommon Table Expressions
I present here a use for Common Table Expressions which may be helpful in your programming environment.
The Fundamentals
The basic component of this article is a CTE that is used to parse a string into a table with a single character column. This is done to allow string processing without WHILE loops or TALLY tables. Once your string is in this format many rowset-based operations can be applied to it resulting in the creation of SQL-based string functionality.
Following is the basic CTE used for the string functions:
Declare @Str VarChar(max); Select @Str='This is a test...'; With Parse as ( Select SubString(@Str,1,1)[Chr], 1[Idx] Union All Select SubString(@Str,Idx+1,1), Idx+1 from Parse where (Idx+1)<=Len(@Str) ) Select * from Parse option (MaxRecursion 0);
Let’s break this recursive CTE down into its functional components for analysis.
First we have the anchor component, so called because it forms the initial result set from which subsequent iterations (recursions) will process.
Select SubString(@Str,1,1)[Chr], 1[Idx]
Here is it's initial contribution to the output rowset:
|Chr|Idx| 1|T|1|
Next we have the recursion component appended to the anchor via Union All.
Notice how it refers to the column values in the initial row.
Union All Select SubString(@Str,Idx+1,1), Idx+1 from Parse
and it's intermediate, iteration 1, contribution to the output result set:
|Chr|Idx| 1|T|1| 2|h|2|
Iteration 2:
|Chr|Idx| 1|T|1| 2|h|2| 3|i|3|
And finally, the completed output result set:
|Chr|Idx| 1|T|1| 2|h|2| 3|i|3| 4|s|4| 5| |5| 6|i|6| 7|s|7| 8| |8| 9|a|9| 10| |10| 11|t|11| 12|e|12| 13|s|13| 14|t|14| 15|.|15| 16|.|16| 17|.|17|
But what caused the recursion to stop? From SQL Server 2005 Books Online
The termination check is implicit; recursion stops when no rows are returned from the previous invocation.
This is accomplished via the WHERE clause on the recursion component.
Recursion stops after the last character of the string has been parsed.
Union All Select SubString(@Str,Idx+1,1), Idx+1 from Parse where (Idx+1)<=Len(@Str)
Now that we understand how the recursive CTE is processing our data lets create a few useful functions.
The OCCURS Function
This function will count the occurrences of a specified character in a string.
Declare @Str VarChar(max); Select @Str='This is a test...'; With Occurs as ( Select SubString(@Str,1,1)[Chr], 1[Idx] Union All Select SubString(@Str,Idx+1,1), Idx+1 from Occurs where (Idx+1)<=Len(@Str) ) Select Count(*) from Occurs where Chr='t' option (MaxRecursion 0);
Here the Occurs CTE does the same thing as the Parse CTE in the initial example. The occurrence counting actually happens in the SELECT statement that calls the CTE in conjunction with it's WHERE clause.
The PHONE_CLEAN Function
This function will remove all non-numeric characters from a string.
Declare @Str1 VarChar(max), @Str2 VarChar(max); Select @Str1='(406) 555-1212', @Str2=''; With PhoneClean as ( Select Case when SubString(@Str1,1,1) like '[0-9]' then SubString(@Str1,1,1) else '' End[Chr], 1[Idx] Union All Select Case when SubString(@Str1,Idx+1,1) like '[0-9]' then SubString(@Str1,Idx+1,1) else '' End, Idx+1 from PhoneClean where (Idx+1)<=Len(@Str1) ) Select @Str2=@Str2+Chr from PhoneClean option (MaxRecursion 0); Select @Str2;
In this example the code in the CTE will only emit characters that are numeric into the output result set. The expected result is realized via a running concatenation in the SELECT statement that calls the CTE.
The REPLACE_Nth_CHAR Function
This function will replace the Nth occurrence of a character in a string with a specified character.
Declare @Str1 VarChar(max), @Str2 VarChar(max); Select @Str1='XXXXX', @Str2=''; With ReplaceNthChar as ( Select SubString(@Str1,1,1)[Chr], 1[Idx] Union All Select SubString(@Str1,Idx+1,1), Idx+1 from ReplaceNthChar where (Idx+1)<=Len(@Str1) ) Select @Str2=@Str2+ Case Idx when 3 then '0' else Chr End from ReplaceNthChar option (MaxRecursion 0); Select @Str2;
Here the ReplaceNthChar CTE does the same thing as the Parse CTE in the initial example. The Nth occurrence is replaced in the SELECT statement that calls the CTE via a CASE statement in a running concatenation.
I think you're getting the picture!
I leave it to you to repackage these examples as functions or procedures for use in your shop.
DISCLAIMER
For large character counts this technique won't perform as well as a routine using a pre-created "tally" table as this technique essentially creates it's "tally" table on the fly each time it is executed.