REPLACE Multiple Spaces with One

  • isn't Michael's explained by the fact that you're stuffing additional characters into the string, pushing the final asterisk outside of the 8000 character limit?

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • jcrawf02 (11/24/2009)


    isn't Michael's explained by the fact that you're stuffing additional characters into the string, pushing the final asterisk outside of the 8000 character limit?

    but there is no 8k limit on varchar(max)

  • The problem is replicate which will not return more than 8000 characters.

    Thus this works just fine.

    declare @STR varchar(max)

    declare @s1 varchar(max)

    declare @s2 varchar(max)

    declare @s3 varchar(max)

    set @s1=replicate(' ',4000)

    set @s2=replicate(' ',4000)

    set @s3=replicate(' ',4000)

    set @STR='*'+@s1+@s2+@s3+'*'

    select datalength(@Str)

    select @STR = REPLACE(REPLACE(REPLACE(@Str,' ','$#$'),'$$#',''),'$#$',' ')

    select @STR

    select datalength(@Str)

    So it's time for some benchmarking C#Screw.

    My guess is it will be much worse than Jeff's performance.

  • Solved!

    By the way - what is the Pork Chop joke :ermm: you all keep mentioning?:-)

  • C# Screw (11/24/2009)


    Solved!

    By the way - what is the Pork Chop joke :ermm: you all keep mentioning?:-)

    You'll have to ask the Porkslinger...author of this fine article 😉

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • C# Screw (11/24/2009)


    jcrawf02 (11/24/2009)


    isn't Michael's explained by the fact that you're stuffing additional characters into the string, pushing the final asterisk outside of the 8000 character limit?

    but there is no 8k limit on varchar(max)

    I went back too many posts, sorry.

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • jcrawf02 (11/24/2009)


    C# Screw (11/24/2009)


    Solved!

    By the way - what is the Pork Chop joke :ermm: you all keep mentioning?:-)

    You'll have to ask the Porkslinger...author of this fine article 😉

    Porkslinger !! lol 😀 what!!

    That should surely be added to our list of rude-replace words!!!

  • Evening,

    here are the results : I have ommitted CLR solutions for some clarity:

    New ideas

    ----------

    New Kids on the block include :

    1/Mitesh new pattern replace -> this is tested above in two places: as pure InLine SQL, as an SQL and embedded within a new version Generic 'Data driven :clean up anything' function.

    2/HasSpace -> By creating a calculated persisted column containg expression CHARINDEX(SPACE(2),col2). This column is added in the test script, and a non-clustered index is created on this new column, the column is called HasSpace. HasSpace is tested in the InLine SQL solutions by Jeff & Michael by adding 'HasSpace>0' to the WHERE clause.

    TestData

    ---------

    The results were obtained using 5k rows containing double spaces plus 5k rows without any double spaces.

    SQL

    ----

    Here is the SQL for the batch so you can check code or have a go (CLR solutions are commented out)

    DBCC FREEPROCCACHE

    go

    --PREPARE

    SET NOCOUNT ON

    go

    ---------------------------------------------------------

    ---------- Create SQL Functions Solutions

    ---------------------------------------------------------

    go

    CREATE FUNCTION dbo.fn_CleanUp_JeffOriginal(@S VARCHAR(max))

    RETURNS VARCHAR(max)

    BEGIN

    RETURN REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(@S)) ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') --Changes the remaining X's to nothing

    END

    go

    --Looping

    CREATE FUNCTION dbo.fn_CleanUp_Brigzy(@S VARCHAR(max))

    RETURNS VARCHAR(max)

    BEGIN

    WHILE CHARINDEX(' ',@S) > 0

    SELECT @s-2 = REPLACE(@S,' ',' ')

    RETURN @s-2

    END

    go

    --Recursion

    CREATE FUNCTION dbo.fn_CleanUp_Recursion_Brigzy(@Data varchar(8000))

    RETURNS VarChar(Max)

    AS

    BEGIN

    IF CHARINDEX(' ',@Data) > 0

    BEGIN

    SET @data = REPLACE(@Data,' ',' ')

    IF CHARINDEX(' ',@Data) > 0

    SELECT @data = dbo.fn_CleanUp_Recursion_Brigzy(@Data)

    END

    RETURN @data

    END

    --$#$

    go

    CREATE FUNCTION dbo.fn_CleanUp_Mitish(@S VARCHAR(max))

    RETURNS VARCHAR(max)

    BEGIN

    RETURN REPLACE(REPLACE(REPLACE(@S,' ','$#$'),'$$#',''),'$#$',' ')

    END

    GO

    -- Michael

    CREATE FUNCTION dbo.fn_CleanUp_MichaelMeierruth(@S VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    RETURN replace(replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' ')

    END

    go

    -- C#Screw : generic replace anyting function using Michaels patters (WHERE clause ok)

    CREATE FUNCTION dbo.fn_CleanUp_DataDriven(@MyString VARCHAR(max))

    RETURNS VarChar(MAX)

    AS

    BEGIN

    -- Remove unwanted strings

    SELECT @MyString = REPLACE(@MyString, RemovePattern, ReplaceWith)

    FROM tPatterns

    WHERE CHARINDEX(RemovePattern, @MyString) > 0

    RETURN @MyString

    END

    go

    -- C#Screw : generic replace anyting function : using Matish patterbs (WHERE clause NOT ok)

    CREATE FUNCTION dbo.fn_CleanUp_DataDriven_MetishPatterns(@MyString VARCHAR(max))

    RETURNS VarChar(MAX)

    AS

    BEGIN

    -- Remove unwanted strings

    SELECT @MyString = REPLACE(@MyString, RemovePattern, ReplaceWith)

    FROM tPatterns

    -- cannot use where clause

    --WHERE CHARINDEX(RemovePattern, @MyString) > 0

    RETURN @MyString

    END

    go

    ------------------------------------

    --- CREATE TEST DATA

    ------------------------------------

    CREATE TABLE #TEMP1 (COL1 VARCHAR(900))

    CREATE TABLE #TEMP2 (COL2 VARCHAR(900), COL3 VARCHAR(900), COL4 VARCHAR(900),COL5 VARCHAR(900))

    go

    --INSERT 5k ROWS WITH RANDOM SPACES ON EACH TABLE, SEPARATE TABLES ARE USED TO AVOID CACHING, THIS MAY TAKE QUITE AWHILE

    DECLARE @SPACECOUNT1 INT,@SPACECOUNT2 INT,@SPACECOUNT3 INT,@SPACECOUNT4 INT

    SELECT @SPACECOUNT1 = CAST(CAST(NEWID() AS VARBINARY(1)) AS INT),@SPACECOUNT2 = CAST(CAST(NEWID() AS VARBINARY(1)) AS INT)

    INSERT INTO #TEMP1 (COL1)

    OUTPUT inserted.COL1 INTO #TEMP2 (COL2)

    SELECT ' '+REPLICATE('X',@SPACECOUNT1)+' '+REPLICATE('X',@SPACECOUNT2)+' '

    --INSERT 5k ROWS WITHOUT DOUBLE SPACES

    GO 5000

    DECLARE @SPACECOUNT1 INT,@SPACECOUNT2 INT,@SPACECOUNT3 INT,@SPACECOUNT4 INT

    SELECT @SPACECOUNT1 = CAST(CAST(NEWID() AS VARBINARY(1)) AS INT),@SPACECOUNT2 = CAST(CAST(NEWID() AS VARBINARY(1)) AS INT)

    INSERT INTO #TEMP1 (COL1)

    OUTPUT inserted.COL1 INTO #TEMP2 (COL2)

    SELECT 'TEST1'+SPACE(@SPACECOUNT1)+'TEST2'+SPACE(@SPACECOUNT2)+'TEST3'

    GO 5000

    ------------------------------------

    --- TESTS BEGIN

    ------------------------------------

    -- Test result table

    CREATE TABLE #tResults(Tag VARCHAR(100),StartTime DATETIME, EndTime DATETIME, Result int)

    go

    ---------------------------------------------------------

    ---------- TEST SQL INLINE Solutions

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SET @StartTime=GETDATE()

    UPDATE #TEMP2 SET COL5= REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(col2)),' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') --Changes the remaining X's to nothing

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('SQL: Jeff Original INLINE SQL ' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SET @StartTime=GETDATE()

    UPDATE #TEMP2 SET COL5= replace(replace(replace(replace(replace(replace(replace(ltrim(rtrim(col2)),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' ')

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('SQL: Michael INLINE SQL (hard coded spaces)' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    GO 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SET @StartTime=GETDATE()

    UPDATE #TEMP2 SET COL5= REPLACE(REPLACE(REPLACE(col2,' ','$#$'),'$$#',''),'$#$',' ')

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('SQL: Matish INLINE SQL $#$ Patterns' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    ---------- TEST SQL Function Solutions

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL3 = dbo.fn_CleanUp_Recursion_Brigzy(COL2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)

    VALUES ('SQL function: C#Screw : Recursive',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL3 = dbo.fn_CleanUp_JeffOriginal(Col2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('SQL function: Jeff Original : single bell char',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL4= dbo.fn_CleanUp_Brigzy(COL2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('SQL function :Brigzy (C#Screw) Looping',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL4= dbo.fn_CleanUp_MichaelMeierruth(COL2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('SQL function: Michael Meierruth with hard coded spaces',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    -- Data Driven solutiion uses a lookup table to store replace strings

    -- Create a table of all the rotten stuff we want to remove

    SELECT replicate(' ',32+1) RemovePattern, ' ' ReplaceWith, 1 ReplaceOrder

    INTO tPatterns

    UNION ALL SELECT replicate(' ',16+1),' ',2

    UNION ALL SELECT replicate(' ',8+1), ' ',3

    UNION ALL SELECT replicate(' ',4+1), ' ',4

    UNION ALL SELECT replicate(' ',2+1), ' ',5

    UNION ALL SELECT replicate(' ',1+1), ' ',6

    UNION ALL SELECT replicate(' ',1+1), ' ',7

    go

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL4= dbo.fn_CleanUp_DataDriven(COL2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('SQL function: #Screw: Data Driven Clean Anything - using Michaels patters',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    --SELECT dbo.fn_CleanUp_DataDriven(COL2) FROM #TEMP2

    --------------------------------------------------------

    -- Create a table of all the rotten stuff we want to remove - Mitesh patterns

    DROP TABLE tPatterns

    Go

    SELECT ' ' RemovePattern, '$#$' ReplaceWith, 1 AS ReplaceOrder

    INTO tPatterns

    UNION ALL SELECT '$$#','',2 AS ReplaceOrder

    UNION ALL SELECT '$#$', ' ',3 AS ReplaceOrder

    go

    --SELECT dbo.fn_CleanUp_DataDriven_MetishPatterns(COL2) FROM #TEMP2

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SELECT @StartTime= GETDATE()

    UPDATE #TEMP2 SET COL4= dbo.fn_CleanUp_DataDriven_MetishPatterns(COL2)

    FROM #TEMP2

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('SQL function: #Screw: Data Driven Clean anything - using Mitesh patters',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go

    DROP TABLE tPatterns

    go

    ---------------------------------------------------------

    -------------- Add 'HasSpace' persisted column

    ALTER TABLE #TEMP2 ADD

    HasSpace AS CHARINDEX(SPACE(2),Col2) PERSISTED

    CREATE NONCLUSTERED INDEX idx_HasSpace ON #TEMP2(HasSpace)

    UPDATE STATISTICS #temp2

    go

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SET @StartTime=GETDATE()

    UPDATE #TEMP2 SET COL5= REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(col2)) ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') --Changes the remaining X's to nothing

    FROM #TEMP2

    WHERE HasSpace > 0

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('SQL: Jeff INLINE SQL With Persisted Column HasSpace' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    DECLARE @StartTime DATETIME

    SET @StartTime=GETDATE()

    UPDATE #TEMP2 SET COL5= replace(replace(replace(replace(replace(replace(ltrim(rtrim(col2)),

    replicate(' ',32+1),' '),replicate(' ',16+1),' '),replicate(' ',8+1),' '),replicate(' ',4+1),' '),replicate(' ',2+1),' '),replicate(' ',1+1),' ')

    FROM #TEMP2

    WHERE HasSpace > 0

    INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('SQL: Michael INLINE SQL With Persisted Column HasSpace' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    go 10

    ---------------------------------------------------------

    -- Output Simple Results

    SELECT Tag + CHAR(13) + CAST(FLOOR(AVG(Result)) AS VARCHAR(20)) FROM #tResults GROUP BY

    Tag ORDER BY SUM(Result) DESC

    go

    ---------------------------------------------------------

    -- Output DMV Results

    SET STATISTICS TIME OFF

    SELECT --query = QT.[text],

    SUBSTRING(QT.[text],CHARINDEX('VALUES',text)+8,(CHARINDEX('DATEDIFF(',text)-CHARINDEX('VALUES',text))-8) QueryName,

    --execution_plan = QP.query_plan,

    run_count = QS.execution_count,

    --total_cpu_time_ms = QS.total_worker_time/1000,

    --total_logical_reads = QS.total_logical_reads,

    --total_elapsed_time_ms = QS.total_elapsed_time/1000,

    avg_cpu_time_ms = (QS.total_worker_time / QS.execution_count)/1000,

    avg_logical_reads = QS.total_logical_reads / QS.execution_count,

    avg_elapsed_time_ms = (QS.total_elapsed_time / QS.execution_count)/1000--,

    -- min_clr_time/1000, max_clr_time/1000, total_clr_time/1000

    INTO #tResults2

    FROM sys.dm_exec_query_stats QS

    CROSS

    APPLY sys.dm_exec_sql_text (QS.[sql_handle]) QT

    CROSS

    APPLY sys.dm_exec_query_plan (QS.[plan_handle]) QP

    WHERE QT.[text] LIKE '%INSERT INTO #tResults%'

    AND QT.[text] NOT LIKE '%dm_exec_query_stats%'

    ORDER BY

    --QS.last_execution_time ASC;

    QS.total_elapsed_time / QS.execution_count DESC

    DELETE #tresults2 WHERE avg_cpu_time_ms = 0

    SELECT Tag, MIN(StartTime) MultiBatch_StartTime, MAX(EndTime) MultiBtach_EndTime

    INTO #tResults3

    FROM #tresults

    GROUP BY Tag

    SELECT Tag, /*MultiBatch_StartTime, MultiBtach_EndTime,*/ run_count, avg_cpu_time_ms, avg_logical_reads, avg_elapsed_time_ms

    FROM #tResults2, #tResults3

    WHERE REPLACE(QueryName,'''','') LIKE '%'+REPLACE(Tag,'''','')+'%'

    ORDER BY avg_elapsed_time_ms desc

    GO

    ---------------------------------------------------------------------

    ---- TEST CLR Solutions

    ---------------------------------------------------------------------

    ----DECLARE @StartTime DATETIME

    ----SET @StartTime=GETDATE()

    ----UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_Looping_CLR(COL2)

    ----FROM #TEMP2

    ----INSERT INTO #tResults(Tag,Result)VALUES('CLR: Brigzy (C#Screw) Looping',DATEDIFF(ms,@StartTime,GETDATE()))

    ----go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_NotLooping_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('CLR: Not looping using ||*9*9|| technique in C#',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_SingleChar_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: using Jeff''s single char technique in C#',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_JCB_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('CLR :using JCBnew space replacer C#',DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_PW_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES('CLR: using Paul White space replacer C#' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_FLO_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: using Flo s space replacer C#' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_MMSharpScrew_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: Michael Meierruth - C#Screw conversion: hard coded spaces' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_MMSharpScrew_ExtraReplace_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: Michael Meierruth - C#Screw conversion: with extra Replace' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_SharpScrewII_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: C#Screw II - Spaces and TABS' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_FLO_SpaceAndTab_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: Flo - Spaces and TABS - v.compact' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_and_TABS_SharpScrew_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: C#Screw : Slightly reduced Paul White''s function and Added Support TABS' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_PWSpaceReplacerUnsafe_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: Paul White''s function with Pointers by Mark-101232' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    --go 10

    --DECLARE @StartTime DATETIME

    --SET @StartTime=GETDATE()

    --UPDATE #TEMP2 SET COL5= dbo.fn_TidySpace_and_TABS_SharpScrew_WithPointers_CLR(COL2)

    --FROM #TEMP2

    --INSERT INTO #tResults(Tag,Result,StartTime,EndTime)VALUES ('CLR: Paul White''s function, reduced by C#Screw with Pointers ideas from Mark-101232' ,DATEDIFF(ms,@StartTime,GETDATE()),@StartTime,GETDATE())

    ---------------------------------------------------------------------

    --CLEANUP and go home

    DROP FUNCTION dbo.fn_CleanUp_Brigzy

    DROP FUNCTION fn_CleanUp_JeffOriginal

    DROP FUNCTION fn_CleanUp_MichaelMeierruth

    DROP FUNCTION fn_CleanUp_Recursion_Brigzy

    DROP FUNCTION fn_CleanUp_DataDriven

    DROP FUNCTION fn_CleanUp_Mitish

    DROP FUNCTION fn_CleanUp_DataDriven_MetishPatterns

    DROP TABLE #TEMP1

    DROP TABLE #TEMP2

    DROP TABLE #tResults

    DROP TABLE #tResults2

    DROP TABLE #tResults3

    DROP TABLE tPatterns

  • Michael Meierruth (11/24/2009)


    Very strange this MAX stuff on SS2005. The code below behaves very strange. I.e. I set a string to 8004 characters and yet the LEN function shows it to be 8002.

    But the code to remove the extra blanks correctly returns * *.

    Strange indeed!

    Just the old implicit type thing again:

    declare @s2 varchar(max)

    set @s2 = replicate(' ',8002) -- 8000

    print datalength(@s2)

    set @s2 = replicate(convert(varchar(max), ' '),8002) -- 8002

    print datalength(@s2)

    The implicit type of the single literal space used in the first replicate is VARCHAR(8000).

  • Michael Meierruth (11/24/2009)


    The problem is replicate which will not return more than 8000 characters.

    Books Online


    If string_expression is not of type varchar(max) or nvarchar(max), REPLICATE truncates the return value at 8,000 bytes. To return values greater than 8,000 bytes, string_expression must be explicitly cast to the appropriate large-value data type.

    http://technet.microsoft.com/en-us/library/ms174383(SQL.90).aspx

    (Just to back up my previous post!)

  • C#Screw

    I decided to look at your benchmarking script and I must say there are many interesting things.

    By the way, why does Mitesh have a run count of only 1?

    And why does #temp1.col1 and #temp2.col2 have the same values? Thus it seems that you are not testing spaces at all.

    But my main question is this use of the HasSpaces column. Even if this column is persisted, I don't see how it can possibly work, much less decrease the resource consumptions.

  • Michael Meierruth (11/25/2009)


    C#Screw

    I decided to look at your benchmarking script and I must say there are many interesting things.

    By the way, why does Mitesh have a run count of only 1?

    Yes I noticed that it should have Go 10 after the Mitesh test, so the result above show results of Mitesh first run rather than average of 10 runs.

    And why does #temp1.col1 and #temp2.col2 have the same values? Thus it seems that you are not testing spaces at all.

    The test data generation goes way back from a post by Thiago - we adopted the script to make a fair compare - if there is something wrong there - perhaps we should ask Thiago ?

    But my main question is this use of the HasSpaces column. Even if this column is persisted, I don't see how it can possibly work, much less decrease the resource consumptions.

    My thoughts were that SQL will only perform the cleanup on the rows where HasSpace >0, therefore reducing cleanup operations from 10k rows to 5k rows (in the sample data only half the rows should have double spaces)

  • OK. I see it now. Missed out on the 5K+5K.

    And the HasSpaces will benefit more any algorithm that does particularly poor with strings having no double spaces.

  • Michael Meierruth (11/25/2009)


    OK. I see it now. Missed out on the 5K+5K.

    And the HasSpaces will benefit more any algorithm that does particularly poor with strings having no double spaces.

    Thanks Michael

    I suppose in the real world your choice of solution would depend on when you intend to clean the data:

    1/ETL - data import, where every row has to be crunched regardless

    2/Record Single Row Cleanup before save - perhaps via a trigger

    3/Daily cleanup - where HasSpace would effectively mark records that need to be cleaned

  • Hello Jeff

    would you think the HasSpace is finally getting rid of the RBAR? - I am thinking this because this is the only solution where SQL can use an index to eliminate a set of rows without 'touching them'?

    It has been very interesting reading your artical on using Tally instead of while loops - I will try and think of solutions using Tally in future.

    And one day I also hope to understand the Pork Chop thing as well!:-)

    Thanks

Viewing 15 posts - 286 through 300 (of 425 total)

You must be logged in to reply to this topic. Login to reply