February 14, 2006 at 7:04 pm
Hi, I was asked to help a newbie on a wierd request. There is a column in a table having characters like letters, numbers, brackets, single quotes, double quotes, etc. He wants to remove all characters but letters only. Can we build a query removing characters not like letters from a to z? In other words, keep letters and delete other characters? I am a pure SQL server admin with some light T-SQL skills so this looks like a big problem for me.
Thanks a lot.
February 14, 2006 at 7:38 pm
Try this:
SELECT TOP 255
Identity (int, 1,1) as i
INTO #Numbers
from syscolumns C1, syscolumns C2
declare @Chars table (ToRemove int)
INSERT INTO @Chars (ToRemove)
select i from #NUMBERS
where CHAR(i) not like '[A-Z]' and CHAR(i) not like '[0-9]'
declare @String as nvarchar(50)
select @String = 'Initial , $%&^*
String - '
select @String = replace (@String, char(ToRemove), '')
FROM @Chars
select @String
drop table #Numbers
_____________
Code for TallyGenerator
February 15, 2006 at 11:16 am
thanks a lot.
February 16, 2006 at 12:43 am
Funny, I just posted a similar solution to remove everything except 0-9...
If you don't already have a "Tally" table, now's the time to make one... here's the code to make a Tally table...
--===== Create and populate a Tally table on the fly
SELECT TOP 9999
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysComments sc1,
Master.dbo.SysComments sc2
--===== Give it a primary key
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N)
--===== Give everyone "SELECT" rights
GRANT SELECT ON dbo.Tally TO PUBLIC
...and that makes the solution to your problem both easy and high speed...
CREATE FUNCTION dbo.StripChars( @String VARCHAR(32))
RETURNS VARCHAR(32)
AS
BEGIN
DECLARE @Result VARCHAR(32)
SET @Result = ''
SELECT @Result = @Result + SUBSTRING(@String,N,1)
FROM dbo.Tally
WHERE N<=LEN(@String)
AND SUBSTRING(@String,N,1) LIKE '[A-Z]'
RETURN (NULLIF(@Result,''))
END
... It will resolve 100,000 records in about 6 seconds on my humble 1.8 Ghz desktop machine.
Here's the usage example...
SELECT dbo.StripChars(somecolname)
FROM dbo.sometablename
--Jeff Moden
Change is inevitable... Change for the better is not.
February 16, 2006 at 10:57 am
I use the function listed below without any added tables. So I created the given solution above and compared execution plans: they came out the same.
ALTER FUNCTION dbo.udf_strip_non_alphabetic ( @IN_STRING VARCHAR(8000) )
RETURNS VARCHAR(8000)
AS
BEGIN
-- If string is null, return empty
IF ISNULL( @IN_STRING, '' ) = ''
BEGIN
RETURN('')
END
DECLARE @OUT_STRING AS VARCHAR( 2000 )
SET @OUT_STRING = ''
DECLARE @currentChar CHAR
DECLARE @tempInt INT
SET @currentChar = Substring(@IN_STRING, 1, 1)
-- While the input string has length.
WHILE ( Len( @IN_STRING ) > 0 )
BEGIN
-- Get the decimal value of the ASCII character, then compare
SET @tempInt = ASCII( @currentChar )
IF(@tempInt between 65 and 90
OR @tempInt between 97 and 122)
BEGIN
SET @OUT_STRING = @OUT_STRING + @currentChar
END
-- Overwrite what's left of the input string, slicing off the left character
SET @IN_STRING = substring( @IN_STRING, 2, Len( @IN_STRING ) )
-- Get the next character
SET @currentChar = substring( @IN_STRING, 1, 1 )
END
RETURN( @OUT_STRING )
END
February 16, 2006 at 2:14 pm
you guys are great. thx.
February 16, 2006 at 6:30 pm
Ya just gotta love what Microsoft did with their execution plans ... I absolutely agree that both execution plans came out the same for the following...
SELECT dbo.udf_strip_non_alphabetic (ToStrip)
FROM dbo.AlphaTest
SELECT dbo.StripChars (ToStrip) --This one is twice as fast even though plan doesn't say so
FROM dbo.AlphaTest
It even says they'll both take the same amount of time (50%/50%). But ... the first one takes 13 seconds to return the stripped 100,000 k rows from my test table and the second on takes only 6. Doesn't sound like a lot in seconds but the second query is more than twice as fast.
AND, If you run the execution plans (simultaneously) on the code within each function, it shows the second query loosing big time (.02% to 99.98%)! Still, the second query runs twice as fast... Microsoft will tell us it's not a fault... it's a feature
Anyway, this isn't to badmouth anyone's code... it's to show 2 things... the power of a Tally table (should make one permanent) and to demonstrate that, once again, Microsoft generated execution (or is that "excretion" ) plans lie like a rug . AND, if you bring up the performance tab of Windows Task Manager to watch the cpu usage (set refresh to high speed), they both peg the cpu at 100% for the duration of their run. Dunno about you but if I gotta peg a cpu, I'd rather do it for 6 seconds than 13.
Just for grins, I ran both on a million rows of data (change 100,000 to a million in the code below)... the first query took 140 seconds and the second query only took 56. The cpu on my desktop box was pegged for both runs (I gotta hook a garden hose up to that poor baby ).
So, my recommendation is that execution plans are good as an "estimate" but nothing beats a test... like my ol' physics professor used to say, "One measurement is worth a thousand calculations.
Now, don't take my word for it ... here's the code I used to build the test table... do your own testing and give us a jingle back... I'm always curious how these things run on different boxes so I'd really appreciate your feedback...
--===== Create and populate a test table by using a cross join
SELECT TOP 100000
IDENTITY(INT,1,1) AS ID,
CAST('AFGBS004SA55' AS VARCHAR(32)) AS ToStrip
INTO dbo.AlphaTest
FROM Master.dbo.SysComments sc1,
Master.dbo.SysComments sc2
--===== Add a primary key
ALTER TABLE [dbo].[AlphaTest]
ADD PRIMARY KEY CLUSTERED ([ID])
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply