A quick query puzzle:

  • Thank you for putting in time for this...

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • Although it works fine but why why do we need the UNION ALL part in the CTE?

    mickyT (10/8/2012)


    Hi

    Here's another variation that should do the trick. No function this time, just a query

    create table #t1(name varchar (100))

    GO

    INsert into #t1

    select '1234ABC123456XYZ1234567890ADS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS2134488'

    GO

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    1 mycount,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    mycount + 1,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    ),

    maxcte as (select grouper, max(mycount) lastres from cte group by grouper)

    SELECT res

    FROM cte c

    inner join maxcte m on mycount = lastres and c.grouper = m.grouper

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • S_Kumar_S (10/17/2012)


    Although it works fine but why why do we need the UNION ALL part in the CTE?

    This is a recursive CTE, so it will step through the string and replace the first group of 6 numbers in each step. A recursive CTE requires a top-level UNION ALL. You will get an error if you try a UNION or if you remove from the UNION ALL on, it will only replace the first set of six.

    To get an idea of what it is doing run the following

    create table #t1(name varchar (100))

    GO

    insert into #t1

    select '1234ABC123456XYZ1234567890ADS8785445'

    GO

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    1 mycount,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    mycount + 1,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    ),

    maxcte as (select grouper, max(mycount) lastres from cte group by grouper)

    select * from cte

    This will give you the following results

    res mycountgrouper

    --------------------------------------------------

    1234ABCxxxxxxXYZ1234567890ADS878544511

    1234ABCxxxxxxXYZxxxxxxxxxxADS878544521

    1234ABCxxxxxxXYZxxxxxxxxxxADSxxxxxxx31

    As you can see, the row has been passed through the recursive CTE three times.

    Hope this helps

  • mickyT (10/17/2012)


    S_Kumar_S (10/17/2012)


    Although it works fine but why why do we need the UNION ALL part in the CTE?

    This is a recursive CTE, so it will step through the string and replace the first group of 6 numbers in each step. A recursive CTE requires a top-level UNION ALL. You will get an error if you try a UNION or if you remove from the UNION ALL on, it will only replace the first set of six.

    To get an idea of what it is doing run the following

    create table #t1(name varchar (100))

    GO

    insert into #t1

    select '1234ABC123456XYZ1234567890ADS8785445'

    GO

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    1 mycount,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    mycount + 1,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    ),

    maxcte as (select grouper, max(mycount) lastres from cte group by grouper)

    select * from cte

    This will give you the following results

    res mycountgrouper

    --------------------------------------------------

    1234ABCxxxxxxXYZ1234567890ADS878544511

    1234ABCxxxxxxXYZxxxxxxxxxxADS878544521

    1234ABCxxxxxxXYZxxxxxxxxxxADSxxxxxxx31

    As you can see, the row has been passed through the recursive CTE three times.

    Hope this helps

    Be careful, now. Recursive CTE's that count or do other RBAR ops are nearly as bad and sometimes worse than While loops. Please see the following article for alternatives and comparisons.

    http://www.sqlservercentral.com/articles/T-SQL/74118/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Be careful, now. Recursive CTE's that count or do other RBAR ops are nearly as bad and sometimes worse than While loops. Please see the following article for alternatives and comparisons.

    http://www.sqlservercentral.com/articles/T-SQL/74118/

    Thanks Jeff

    I revisited this to see if I could eliminate the count and improve the performance. Unfortunately it is still recursive, but it did improve the performance a bit

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    )

    SELECT res

    FROM cte c

    where res not like '%[0-9][0-9][0-9][0-9][0-9][0-9]%'

    The difference between the two queries over 10,000 rows is approximately 700ms (altered) and 1300ms(original).

    (10000 row(s) affected)

    Table 'Worktable'. Scan count 2, logical reads 116460, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#t1_________________________________________________________________________________________________________________000000000A72'. Scan count 1, logical reads 57, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 702 ms, elapsed time = 715 ms.

    (10000 row(s) affected)

    Table 'Worktable'. Scan count 4, logical reads 233082, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#t1_________________________________________________________________________________________________________________000000000A72'. Scan count 2, logical reads 114, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1310 ms, elapsed time = 1297 ms.

    I wonder how much impact the max with a group by had on the original?

  • ChrisM@Work (10/5/2012)


    Edit: found an error in an extended sample data set.

    I tried it and it doesn't work for the first example of '1234ABC123456XYZ1234567890ADS'. It x's out the first 4 characters.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Of the code that actually returns the correct answers, here's the performance test of 10,000 rows.

    --===== Conditionally drop the test table to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#T1','U') IS NOT NULL

    DROP TABLE #T1

    ;

    --===== Create and popululate the test table on-the-fly.

    SELECT TOP 10000

    RowNum = IDENTITY(INT,1,1),

    Name = CAST(d.Name AS VARCHAR(100))

    INTO #T1

    FROM (SELECT '1234ABC123456XYZ1234567890ADS'

    UNION ALL

    SELECT 'cbv736456XYZ543534534545XLS')d(Name)

    CROSS JOIN sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    GO

    --=============================================================================

    -- Create any functions/procs in TempDB so we don't make a mess in a

    -- real database.

    --=============================================================================

    USE tempdb;

    SET NOCOUNT ON;

    GO

    IF OBJECT_ID('dbo.Replace6IntWithX') IS NOT NULL

    DROP FUNCTION dbo.Replace6IntWithX;

    GO

    CREATE FUNCTION dbo.Replace6IntWithX

    /******************************************************************************

    Purpose:

    Replace any consecutive sets of 6 or more digits with the same length of

    replacement characters.

    Revision History:

    Rev 00 - 17 Oct 2012 - Jeff Moden

    - Initial creation.

    ******************************************************************************/

    --===== Declare the I/O for this function

    (@pString VARCHAR(8000), @pReplacementCharacter CHAR(1))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== Do the replacement

    WITH

    cteFindStart AS

    ( --=== Find where character type changes occur and

    -- number the postion of the change.

    SELECT Element# = ROW_NUMBER()OVER(ORDER BY t.N),

    N = t.N

    FROM dbo.Tally t

    WHERE t.N <= LEN(@pString)+1

    AND ( SUBSTRING(@pString,t.N-1,2) LIKE '[0-9][^0-9]' COLLATE Latin1_General_BIN

    OR SUBSTRING(@pString,t.N-1,2) LIKE '[^0-9][0-9]' COLLATE Latin1_General_BIN

    OR t.N IN (1,LEN(@pString)+1)

    )

    ),

    cteSplitAndSize AS

    ( --=== Split the string at the character type changes

    -- and remember the length of the string for later.

    SELECT SplitString = SUBSTRING(@pString,begpos.N,endpos.N-begpos.N),

    [Length] = endpos.N - begpos.N

    FROM cteFindStart begpos

    JOIN cteFindStart endpos

    ON begpos.Element# + 1 = endpos.Element#

    )

    --===== Put the string back together making sure that if

    -- numeric strings with >= 6 characters are replaced with "x"s.

    SELECT XString =

    ( --=== Test the first character of each split string.

    -- If it's not a digit or the length < 6, then

    -- use the split string. Otherwise, replace the

    -- length of the split string with "x"s

    SELECT CASE

    WHEN LEFT(SplitString,1) LIKE '[^0-9]'

    OR [Length] < 6

    THEN SplitString

    ELSE REPLICATE(@pReplacementCharacter,[Length])

    END

    FROM cteSplitAndSize

    FOR XML PATH(''),TYPE

    ).value('.','varchar(8000)')

    ;

    GO

    --=============================================================================

    -- Start the tests. I use pure duration as a measure of performance

    -- because SET STATISTICS can throw a hammer into the gears.

    -- See the following URL for what I mean.

    -- http://www.sqlservercentral.com/articles/T-SQL/91724/

    -- I also throw away the output to take display and disk time out of the

    -- picture.

    --=============================================================================

    RAISERROR ('================================================',0,1) WITH NOWAIT;

    RAISERROR ('========== Jeff''s function ==========',0,1) WITH NOWAIT;

    DECLARE @Bitbucket VARCHAR(100),

    @StartTime DATETIME;

    SELECT @StartTime = GETDATE()

    ;

    SELECT @Bitbucket = ca.XString

    FROM #T1 t1

    CROSS APPLY Replace6IntWithX(t1.Name,'x')ca

    ;

    PRINT 'Duration = ' + CONVERT(CHAR(12),GETDATE()-@StartTime,114)

    ;

    GO

    --=============================================================================

    RAISERROR ('================================================',0,1) WITH NOWAIT;

    RAISERROR ('========== MickyT''s Latest ==========',0,1) WITH NOWAIT;

    DECLARE @Bitbucket VARCHAR(100),

    @StartTime DATETIME;

    SELECT @StartTime = GETDATE()

    ;

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    )

    SELECT @Bitbucket = res

    FROM cte c

    where res not like '%[0-9][0-9][0-9][0-9][0-9][0-9]%'

    ;

    PRINT 'Duration = ' + CONVERT(CHAR(12),GETDATE()-@StartTime,114)

    ;

    GO

    --=============================================================================

    RAISERROR ('================================================',0,1) WITH NOWAIT;

    RAISERROR ('========== Dwain''s "Nastiest" ==========',0,1) WITH NOWAIT;

    RAISERROR ('(Killed it after 2 minutes)',0,1) WITH NOWAIT;

    --

    --DECLARE @Bitbucket VARCHAR(100),

    -- @StartTime DATETIME;

    -- SELECT @StartTime = GETDATE()

    --;

    --;WITH ChunkIt (RowID, n, str1, str2, str3) AS (

    -- SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), n=1

    -- ,CASE WHEN dig < alp THEN SUBSTRING(name, 1, alp-1) ELSE SUBSTRING(name, 1, dig-1) END

    -- ,CASE WHEN dig < alp THEN SUBSTRING(name, alp, LEN(name)) ELSE SUBSTRING(name, dig, LEN(name)) END

    -- ,CAST('' AS VARCHAR(100))

    -- FROM #t1

    -- CROSS APPLY (SELECT PATINDEX('%[0-9]%', name), PATINDEX('%[A-Za-z]%', name)) a(dig, alp)

    -- UNION ALL

    -- SELECT RowID, n+1

    -- ,b.str1

    -- ,b.str2

    -- ,CASE WHEN PATINDEX('%[0-9]%', b.str2) = 0 OR PATINDEX('%[A-Za-z]%', b.str2) = 0 THEN b.str2 ELSE '' END

    -- FROM ChunkIt

    -- CROSS APPLY (SELECT PATINDEX('%[0-9]%', str2), PATINDEX('%[A-Za-z]%', str2)) a(dig, alp)

    -- CROSS APPLY (

    -- SELECT CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, 1, a.alp-1) ELSE SUBSTRING(str2, 1, a.dig-1) END

    -- ,CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, a.alp, LEN(str2)) ELSE SUBSTRING(str2, a.dig, LEN(str2)) END

    -- ) b(str1, str2)

    -- WHERE a.dig > 0 AND a.alp > 0

    -- ),

    -- ForGrouping AS (

    -- SELECT RowID, n

    -- ,str1=CASE WHEN LEN(str1) >= 6 AND PATINDEX('%[0-9]%', str1) > 0 THEN REPLICATE('x', LEN(str1)) ELSE str1 END

    -- ,str2=CASE WHEN LEN(str3) >= 6 AND PATINDEX('%[0-9]%', str3) > 0 THEN REPLICATE('x', LEN(str3)) ELSE str3 END

    -- FROM ChunkIt

    -- )

    --SELECT @Bitbucket =(

    -- SELECT str1 + str2

    -- FROM ForGrouping b

    -- WHERE a.RowID = b.RowID

    -- FOR XML PATH(''))

    --FROM ForGrouping a

    --GROUP BY RowID

    -- PRINT 'Duration = ' + CONVERT(CHAR(12),GETDATE()-@StartTime,114)

    --;

    GO

    Here are the results...

    ================================================

    ========== Jeff's function ==========

    Duration = 00:00:00:257

    ================================================

    ========== MickyT's Latest ==========

    Duration = 00:00:01:380

    ================================================

    ========== Dwain's "Nastiest" ==========

    (Killed it after 2 minutes)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Uggh! :w00t:

    I knew my "nastiest" would also be the "costliest."


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • This a non- recursice solution.My Numbers table starts with 1.

    WITH t as

    (

    select name = '1234ABC123456XYZ12345678901234567890ADS'

    ),

    Symbols as

    (

    select Symbol = substring(name,number,1), Position = Number from t join numbers on number <= len(name)

    ),

    NeighborComparison AS

    (

    select s.Symbol,s.Position, NextNeighbor = n.Symbol,PreviousNeighbor = p.Symbol,

    IsConsecutiveNumber = CASE WHEN (ISNUMERIC(s.Symbol)=1 AND ISNUMERIC(n.Symbol)=1 AND

    (ASCII(n.Symbol) - ASCII(s.Symbol) = 1 OR n.Symbol+s.Symbol='90')

    )

    OR

    (ISNUMERIC(s.Symbol)=1 AND ISNUMERIC(p.Symbol)=1 AND

    (ASCII(s.Symbol) - ASCII(p.Symbol) = 1 OR s.Symbol+p.Symbol='09')

    )

    THEN 1

    ELSE 0

    END

    from Symbols s

    left join Symbols n on n.Position = s.Position + 1

    left join Symbols p on p.Position = s.Position - 1

    )

    select Position, NewSymbol= CASE WHEN [Group] > 0 AND COUNT(*) OVER (Partition by [Group]) > 5

    THEN 'x'

    ELSE Symbol

    END

    from

    ( select Symbol,Position, [Group] = (position - row_number () OVER (ORDER by isconsecutivenumber desc,Position)+1)* isconsecutivenumber

    from NeighborComparison

    ) g

    order by position

  • This is other solution. My Numbers table starts with 1.

    WITH t as

    (

    select name = '1234ABC123456XYZ67890123456789012ADS89012hhh'

    ),

    Symbols as

    (

    select Symbol = substring(name,number,1), Position = Number from t join numbers on number <= len(name)

    ),

    NumberGroups as

    (

    select Symbol = CONVERT(INT,Symbol),Position , NumberGroup = Position - symbol + 1 from symbols where isnumeric(symbol)=1

    ),

    UnifiedCrossOverGroups as

    (

    select * ,

    ConsecutiveSymbol = NumberGroupTeenths+Symbol,

    UnifiedGroup = Position - (NumberGroupTeenths + symbol ) + 1

    from NumberGroups

    cross apply( select NumberGroupTeenths = floor(NumberGroup/10)*10 ) nt

    ),

    NewSymbols as

    (

    select NewSymbol = CASE WHEN g.RangeStart IS NULL THEN s.Symbol ELSE 'x' END ,Position

    from symbols s

    left join

    (

    select RangeStart = MIN(Position) , RangeEnd = MAX(Position) from UnifiedCrossOverGroups

    group by UnifiedGroup having count(*) > 5

    ) g ON s.Position between g.RangeStart AND g.RangeEnd

    )

    select NewText = (select NewSymbol as 'text()' from Newsymbols order by position for xml path(''), type).value('.','varchar(max)')

  • This one is insensitive to number order. My computer says it ran in 00:00:00:030, maybe you can tell me how it ran in yours.

    ;WITH t as

    (

    select name = '1234ABC123456XYZ67890123456789012ADS890112hhh'

    ),

    Symbols as

    (

    select Symbol = substring(name,number,1),

    Position = Number

    from t join numbers on number between 1 and len(name)

    ),

    Groups as

    (

    select Symbol ,Position , NumberGroup = Position + 1 - row_number () OVER (ORDER BY POSITION)

    from symbols where isnumeric(symbol)=1

    )

    select string = (select case when g.RangeStart is null then s.Symbol else 'x' end as 'text()' from symbols s

    left join ( select RangeStart = min(Position),RangeEnd = max(position),NumberGroup from groups

    group by NumberGroup having count(*) > 5

    ) g

    on s.Position between g.RangeStart AND g.RangeEnd

    for xml path(''),type

    ).value('.','varchar(8000)')

  • forget what i just said, i tested with only one row hahahah

  • If you need elegant and efficient solution for this problem then CLR using Static RegEx with compiled pattern is the way to go. I don't have VS with me now, but I will update on it tomorrow.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Look, i came up with this, how do you like it?. I test the function in Winforms project but im pretty sure it will work nice in sql clr

    Imports System

    Imports System.Data

    Imports System.Data.SqlClient

    Imports System.Data.SqlTypes

    Imports Microsoft.SqlServer.Server

    Imports System.Text.RegularExpressions

    Partial Public Class UserDefinedFunctions

    <Microsoft.SqlServer.Server.SqlFunction()> _

    Public Shared Function ReplaceWithX(ByVal StringToReplace As SqlString) As SqlString

    ' Add your code here

    Dim expression As New Regex("[0-9]{6,}", RegexOptions.Compiled)

    'Dim evaluator As MatchEvaluator = AddressOf CharReplacement

    Dim Result As String = expression.Replace(StringToReplace.ToString, Function(m As Match) New String("x", m.Length))

    Return New SqlString(Result)

    End Function

    End Class

  • it worked on sql server =D

Viewing 15 posts - 16 through 30 (of 44 total)

You must be logged in to reply to this topic. Login to reply