May 17, 2016 at 11:08 am
Comments posted to this topic are about the item DigitsOnlyEE and AlphaNumericOnly
-- Itzik Ben-Gan 2001
May 17, 2016 at 11:28 am
Argh! The SQL code get messed up for AlphanumericOnly somehow (bring on the contribution preview button 😉 ).
The code is still good, it just looks odd at the moment. I have fixed the article and hopefully the corrected version will be posted soon. In the meantime Here's the correctly formatted AlphanumericOnly code:
IF OBJECT_ID('dbo.AlphaNumericOnly') IS NOT NULL DROP FUNCTION dbo.AlphaNumericOnly;
GO
CREATE FUNCTION dbo.AlphaNumericOnly (@pString varchar(8000))
RETURNS TABLE WITH SCHEMABINDING AS RETURN
/****************************************************************************************
Purpose:
Given a VARCHAR(8000) or less string, returns only the alphanumeric digits from the
string.
Compatibility:
SQL Server 2008+, Azure SQL Database, Azure SQL Data Warehouse & Parallel Data Warehouse
Parameters:
@pString = varchar(8000); Input string to be cleaned
Returns:
AlphaNumericOnly - nvarchar(max)
Syntax:
--===== Autonomous
SELECT ca.AlphaNumericOnly
FROM dbo.AlphaNumericOnly(@pString) ca;
--===== CROSS APPLY example
SELECT ca.AlphaNumericOnly
FROM dbo.SomeTable st
CROSS APPLY dbo.AlphaNumericOnly(st.SomeVarcharCol) ca;
Programmer's Notes:
1. Based on Jeff Moden/Eirikur Eiriksson's DigitsOnlyEE function. For more details see:
http://www.sqlservercentral.com/Forums/Topic1585850-391-2.aspx#bm1629360
2. This is an iTVF (Inline Table Valued Function) that performs the same task as a
scalar user defined function (UDF) accept that it requires the APPLY table operator.
Note the usage examples below and see this article for more details:
http://www.sqlservercentral.com/articles/T-SQL/91724/
The function will be slightly more complicated to use than a scalar UDF but will yeild
much better performance. For example - unlike a scalar UDF, this function does not
restrict the query optimizer's ability generate a parallel query plan. Initial testing
showed that the function generally gets a
3. AlphaNumericOnly runs 2-4 times faster when using make_parallel() (provided that you
have two or more logical CPU's and MAXDOP is not set to 1 on your SQL Instance).
4. This is an iTVF (Inline Table Valued Function) that will be used as an iSF (Inline
Scalar Function) in that it returns a single value in the returned table and should
normally be used in the FROM clause as with any other iTVF.
5. CHECKSUM returns an INT and will return the exact number given if given an INT to
begin with. It's also faster than a CAST or CONVERT and is used as a performance
enhancer by changing the BIGINT of ROW_NUMBER() to a more appropriately sized INT.
6. Another performance enhancement is using a WHERE clause calculation to prevent
the relatively expensive XML PATH concatentation of empty strings normally
determined by a CASE statement in the XML "loop".
7. Note that AlphaNumericOnly returns an nvarchar(max) value. If you are returning small
numbers consider casting or converting yout values to a numeric data type if you are
inserting the return value into a new table or using it for joins or comparison
purposes.
8. AlphaNumericOnly is deterministic; for more about deterministic and nondeterministic
functions see https://msdn.microsoft.com/en-us/library/ms178091.aspx
Usage Examples:
--===== 1. Basic use against a literal
SELECT ao.AlphaNumericOnly
FROM dbo.AlphaNumericOnly('xxx123abc999!!!') ao;
--===== 2. Against a table
DECLARE @sampleTxt TABLE (txtID int identity, txt varchar(100));
INSERT @sampleTxt(txt) VALUES ('!!!A555A!!!'),(NULL),('AAA.999');
SELECT txtID, OldTxt = txt, AlphaNumericOnly
FROM @sampleTxt st
CROSS APPLY dbo.AlphaNumericOnly(st.txt);
---------------------------------------------------------------------------------------
Revision History:
Rev 00 - 20150526 - Inital Creation - Alan Burstein
Rev 00 - 20150526 - 3rd line in WHERE clause to correct something that was missed
- Eirikur Eiriksson
****************************************************************************************/
WITH
E1(N) AS ( SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))X(N) ),
iTally(N) AS
(
SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))))
FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c CROSS JOIN E1 d
)
SELECT AlphaNumericOnly =
(
SELECT SUBSTRING(@pString,N,1)
FROM iTally
WHERE ((ASCII(SUBSTRING(@pString,N,1)) - 48) & 0x7FFF) < 10
OR ((ASCII(SUBSTRING(@pString,N,1)) - 65) & 0x7FFF) < 26
OR ((ASCII(SUBSTRING(@pString,N,1)) - 97) & 0x7FFF) < 26 FOR XML PATH('')
);
-- Itzik Ben-Gan 2001
December 7, 2016 at 7:14 am
Interesting script, thanks.
March 6, 2019 at 5:51 pm
Should this (and PatExclude8K/PatReplace8K,et al) have Paul White's ORDER BY bug fixes put in place?
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 7, 2019 at 1:46 pm
TheSQLGuru - Wednesday, March 6, 2019 5:51 PMShould this (and PatExclude8K/PatReplace8K,et al) have Paul White's ORDER BY bug fixes put in place?
Yes. I have slacked on this but did rewrite them with the intention of writing an article but am going to just post the updated functions instead. Thanks for the reminder!
-- Itzik Ben-Gan 2001
May 8, 2020 at 4:06 am
You can simplify the WHERE clause in AlphaNumericOnly and AlphaOnly by taking advantage of the fact that upper and lower case letters differ only by the 6th bit, which is 0 for upper case letters and 1 for lower case. So, the following code from AlphaOnly...
WHERE ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 65) & 0x7FFF) < 26
OR ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 97) & 0x7FFF) < 26
can be changed to...
WHERE ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 65) & 0x7FDF) < 26
Changing the mask from 0x7FFF to 0x7FDF makes the first expression case insensitive, which eliminates the need for a separate lower case expression and makes these functions more efficient.
- Anthony
December 7, 2024 at 6:57 pm
I'm a little late here but great observation, Anthony. I'll have to give that a try.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 7, 2024 at 7:58 pm
You can simplify the WHERE clause in AlphaNumericOnly and AlphaOnly by taking advantage of the fact that upper and lower case letters differ only by the 6th bit, which is 0 for upper case letters and 1 for lower case. So, the following code from AlphaOnly...
WHERE ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 65) & 0x7FFF) < 26
OR ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 97) & 0x7FFF) < 26can be changed to...
WHERE ((ASCII(SUBSTRING(@pString,CHECKSUM(i.N),1)) - 65) & 0x7FDF) < 26Changing the mask from 0x7FFF to 0x7FDF makes the first expression case insensitive, which eliminates the need for a separate lower case expression and makes these functions more efficient.
- Anthony
That worked, great. Thanks for the suggestion!
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply