April 25, 2008 at 11:33 am
one of the ways to do this kind of replacements is to go one level deeper in regex matching using Capturing Groups of a Match:
consider this input string;
string MySrtToClean = @"this number sh be intact: 3.4E-10; the rest of stuff d^&*( should go";
our goal is : to retain number 3.4E-10 AND get rid of the rest of non-digits [\D]
we do matching using this OR-ed regex:
([1-9]\d*(?:\.\d*)?[Ee]-?\d+)|(\D+)
[capturing groups are in parentheses embracing both parts or the OR-ed *piped* regex]
we get these 3 matches and CAPTURING GROUPS $1 and $2
Match $1 $2
this number sh be intact: this number sh be intact:
3.4E-10 3.4E-10
; the rest of stuff d^&* ( should go ; the rest of stuff d^&*( should go
now the fun part: we are going to REPLACE with an empty string ONLY GROUP $2 i.e rest of garbage, resulting in this cleaned output:
3.4E-10
to code this in C# u'll have to associate .NET Match Evaluator delegate with your CLR regexfunction
dbo.regexreplace(MySrtToClean, '([1-9]\d*(?:\.\d*)?[Ee]-?\d+)|(\D+)', '')
Match Evaluator will take both Capturing Groups and replace only the second one with empty string. You can use logic of any depth there.
Match Evaluator will return modified match back to function dbo.regexreplace.
April 25, 2008 at 11:41 am
try and see how the above regex works in this online tester:
http://regexlib.com/RETester.aspx
it will not do replace, just will show u matches and groups
June 30, 2008 at 2:24 pm
Hi Friends,
What I am looking for is to replace all non-us special characters with a ' ' (Space)
This can be for a column or a whole table.
Help please...
July 1, 2008 at 12:20 am
Please define what the US non-special characters are...
--Jeff Moden
Change is inevitable... Change for the better is not.
July 2, 2008 at 5:11 am
CREATE function dbo.udfGetNumVal
(
@strInput nvarchar(50)
)
Returns DECIMAL(18,0)
AS
BEGIN
DECLARE @iCtr INT
DECLARE @AscVal INT
DECLARE @NumVal NVARCHAR(100)
SET @iCtr = 1
SET @NumVal = ''
WHILE @iCtr <= Len(@strInput)
BEGIN
SET @AscVal = Ascii(substring(@strInput,@iCtr,1))
IF (@AscVal >= 48) and (@AscVal <= 57)
BEGIN
SET @NumVal = @NumVal + char(@AscVal)
END
SET @iCtr = @iCtr + 1
END
Return Convert(decimal,Coalesce(@NumVal,0))
END
BEST OF LUCK...!!!
July 2, 2008 at 5:59 am
gaurava16fc (7/2/2008)
CREATE function dbo.udfGetNumVal(
@strInput nvarchar(50)
)
Returns DECIMAL(18,0)
AS
BEGIN
DECLARE @iCtr INT
DECLARE @AscVal INT
DECLARE @NumVal NVARCHAR(100)
SET @iCtr = 1
SET @NumVal = ''
WHILE @iCtr <= Len(@strInput)
BEGIN
SET @AscVal = Ascii(substring(@strInput,@iCtr,1))
IF (@AscVal >= 48) and (@AscVal <= 57)
BEGIN
SET @NumVal = @NumVal + char(@AscVal)
END
SET @iCtr = @iCtr + 1
END
Return Convert(decimal,Coalesce(@NumVal,0))
END
BEST OF LUCK...!!!
Good solution but you don't need a WHILE loop for this...
CREATE FUNCTION dbo.udfGetNumVal
(
@StrInput NVARCHAR(50)
)
RETURNS DECIMAL(18,0)
AS
BEGIN
--===== Return Variable
DECLARE @NumVal NVARCHAR(50)
--===== Keep only digits
SELECT @NumVal = COALESCE(@NumVal,'') + SUBSTRING(@StrInput,N,1)
FROM dbo.Tally
WHERE SUBSTRING(@StrInput,N,1) LIKE '[0-9]'
AND N <= LEN(@StrInput)
RETURN @NumVal
END
If you don't already know what a Tally table is, take a look at the following article...
http://www.sqlservercentral.com/articles/TSQL/62867/
Change the operand of the LIKE to suit your needs ...
--Jeff Moden
Change is inevitable... Change for the better is not.
July 2, 2008 at 10:17 am
Ok... no WHILE loop, no TALLY table...
Declare @OldStr VarChar(max),@NewStr VarChar(max);
Select @NewStr='',@OldStr='ab123c';
With NumOnly as
(
Select Case when SubString(@OldStr,1,1) like '[0-9]' then SubString(@OldStr,1,1) else '' End[Chr],1[Idx]
Union All
Select Case when SubString(@OldStr,Idx+1,1) like '[0-9]' then SubString(@OldStr,Idx+1,1) else '' End,Idx+1
from NumOnly
where Idx<Len(@OldStr)
)
Select @NewStr=@NewStr+Chr from NumOnly Option (MaxRecursion 32767);
Print @NewStr;
And this can be extended into many other areas! 🙂
July 2, 2008 at 10:41 am
Ups, had to changed a couple of things I missed when I constructed this from a couple of different functions...
July 2, 2008 at 10:43 am
And here's an OCCURS function using a CTE...
Declare @STR VarChar(max);
Select @STR='This is a test...';
With Occurs as
(
Select SubString(@Str,1,1)[Chr],1[Idx]
Union All
Select SubString(@Str,Idx+1,1),Idx+1
from Occurs
where Idx<Len(@Str)
)
Select Count(*) from Occurs where Chr='t' Option (MaxRecursion 32767)
July 2, 2008 at 3:05 pm
Peter E. Kierstead (7/2/2008)
Ok... no WHILE loop, no TALLY table...Declare @OldStr VarChar(max),@NewStr VarChar(max);
Select @NewStr='',@OldStr='ab123c';
With NumOnly as
(
Select Case when SubString(@OldStr,1,1) like '[0-9]' then SubString(@OldStr,1,1) else '' End[Chr],1[Idx]
Union All
Select Case when SubString(@OldStr,Idx+1,1) like '[0-9]' then SubString(@OldStr,Idx+1,1) else '' End,Idx+1
from NumOnly
where Idx<Len(@OldStr)
)
Select @NewStr=@NewStr+Chr from NumOnly Option (MaxRecursion 32767);
Print @NewStr;
And this can be extended into many other areas! 🙂
Many have come up with a similar solution... try it on a million rows and then try the Tally table... see what happens for yourself. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
July 2, 2008 at 3:14 pm
Yes, every technique has its achilles heel... one of the CTE's is the recursion level...
But this works great for most things I used to use WHILE loops or TALLY tables for. I like things that are self-contained with no external components (TALLY tables).
Hey :cool:, I'm just trying to present alternatives 🙂
July 2, 2008 at 4:27 pm
Peter E. Kierstead (7/2/2008)
Yes, every technique has its achilles heel... one of the CTE's is the recursion level...But this works great for most things I used to use WHILE loops or TALLY tables for. I like things that are self-contained with no external components (TALLY tables).
Hey :cool:, I'm just trying to present alternatives 🙂
Heh... no problem... I've just found recurrsion to be as slow or slower than a loop and wanted to make sure that folks understood the performance risks of using, as you say, the achilles heel of CTE's.
Speaking of "alternatives"... this will blow most recursive methods away for creating sequenced sets of numbers... will actually beat some of the larger Tally tables... IN FACT, here's some test code for just some of the methods ... this one shows just how slow recurrsion really is...
SET NOCOUNT ON
--=============================================================================
-- Recursive CTE does the count
--=============================================================================
PRINT '========== Recursive CTE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Top INT
SET @Top = 100000
;WITH cteTally
AS (
SELECT 1 AS N
UNION ALL
SELECT N+1 FROM cteTally WHERE N<@Top
)
SELECT N
INTO #Test1
FROM cteTally
OPTION (MAXRECURSION 0)
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100)
GO
--=============================================================================
-- ROW_NUMBER CTE does the count
--=============================================================================
PRINT '========== ROW_NUMBER CTE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Top INT
SET @Top = 100000
;WITH cteTally
AS (
SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Type) AS N
FROM Master.dbo.spt_Values sc1,
Master.dbo.spt_Values sc2
)
SELECT *
INTO #Test2
FROM cteTally
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100)
GO
--=============================================================================
-- ROW_NUMBER query does the job directly
--=============================================================================
PRINT '========== ROW_NUMBER Query =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Top INT
SET @Top = 100000
SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Type) AS N
INTO #Test3
FROM Master.dbo.spt_Values sc1,
Master.dbo.spt_Values sc2
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100)
GO
--=============================================================================
-- IDENTITY query does the job directly
--=============================================================================
PRINT '========== IDENTITY =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Top INT
SET @Top = 100000
SELECT TOP(@Top) IDENTITY(INT,1,1) AS N
INTO #Test4
FROM Master.dbo.spt_Values sc1,
Master.dbo.spt_Values sc2
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100)
GO
DROP TABLE #Test1,#Test2,#Test3,#Test4
--Jeff Moden
Change is inevitable... Change for the better is not.
July 3, 2008 at 8:14 am
I do agree... at the end of the day performance generally decides the issue. Its just that even a simple language like TSQL allows for a suprising number of ways to git-er-dun. I was just contibuting to the diversity!
Jeff, nice examples.
July 3, 2008 at 8:29 am
and... CRAP :angry:
while running Jeff's examples I discovered torn pages in my master database...
Msg 824, Level 24, State 2, Line 12
SQL Server detected a logical consistency-based I/O error: torn page (expected signature: 0xffffffff; actual signature: 0xdd88ffff). It occurred during a read of page (1:353) in database ID 1 at offset 0x000000002c2000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\master.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
July 3, 2008 at 8:49 am
Hooo-boy! I've not had to deal with that one but I know it's serious and not so easy to fix. Good luck, Peter.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 46 through 60 (of 81 total)
You must be logged in to reply to this topic. Login to reply