Searching set of words by ignoring spaces

  • Luis Cazares (8/25/2016)


    The Dixie Flatline (8/25/2016)


    But isn't the problem with just replacing spaces is that it doesn't deal with other special characters?

    It depends on what you're doing.

    How can I argue with an "it depends" answer? 😀

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • DECLARE @text VARCHAR(100);

    SET @text= ' set

    nocounto

    n

    '

    WHILE PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ) > 0

    SET @text = REPLACE( @text, SUBSTRING( @text, PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ), 1 ),'')

    SELECT (@text), len(@text)

    I replace any space, tab, carriage return or line feed with an empty string in the above.

    ----------------------------------------------------

  • MMartin1 (8/26/2016)


    DECLARE @text VARCHAR(100);

    SET @text= ' set

    nocounto

    n

    '

    WHILE PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ) > 0

    SET @text = REPLACE( @text, SUBSTRING( @text, PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ), 1 ),'')

    SELECT (@text), len(@text)

    I replace any space, tab, carriage return or line feed with an empty string in the above.

    The only way to use this logic against multiple rows in a table would be to put this logic into a scalar function (you could not use apply). Then it would be very slow. A better approach would be:

    DECLARE @text varchar(1000) = 'xx'+CHAR(9)+'ddd gggg '+CHAR(10)+'fff ggg ';

    SELECT

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@Text)),

    CHAR(9), ' '),

    CHAR(10), ' '),

    CHAR(13), ' '),

    CHAR(160),' '),

    SPACE(33),' '),

    SPACE(17),' '),

    SPACE(9), ' '),

    SPACE(5), ' '),

    SPACE(3), ' '),

    SPACE(2), ' '),

    SPACE(2), ' ') AS NewText;

    ... and you could easily apply this logic using APPLY like this:

    DECLARE @table TABLE(id int, txt varchar(100));

    INSERT @table VALUES (1,'xx xx c'),(2, 'xx '+CHAR(10)+CHAR(13)+' ffff ffff');

    SELECT *

    FROM @table t

    CROSS APPLY

    (

    SELECT

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(t.txt)),

    CHAR(9), ' '),CHAR(10), ' '),CHAR(13), ' '),CHAR(160),' '),

    SPACE(33),' '),SPACE(17),' '),SPACE(9), ' '),SPACE(5), ' '),

    SPACE(3), ' '),SPACE(2), ' '),SPACE(2), ' ')

    ) AS NewText(NewText);

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (8/26/2016)


    MMartin1 (8/26/2016)


    DECLARE @text VARCHAR(100);

    SET @text= ' set

    nocounto

    n

    '

    WHILE PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ) > 0

    SET @text = REPLACE( @text, SUBSTRING( @text, PATINDEX( '%['+char(32)+','+char(9)+','+char(10)+','+char(13)+']%', @text ), 1 ),'')

    SELECT (@text), len(@text)

    I replace any space, tab, carriage return or line feed with an empty string in the above.

    The only way to use this logic against multiple rows in a table would be to put this logic into a scalar function (you could not use apply). Then it would be very slow. A better approach would be:

    DECLARE @text varchar(1000) = 'xx'+CHAR(9)+'ddd gggg '+CHAR(10)+'fff ggg ';

    SELECT

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@Text)),

    CHAR(9), ' '),

    CHAR(10), ' '),

    CHAR(13), ' '),

    CHAR(160),' '),

    SPACE(33),' '),

    SPACE(17),' '),

    SPACE(9), ' '),

    SPACE(5), ' '),

    SPACE(3), ' '),

    SPACE(2), ' '),

    SPACE(2), ' ') AS NewText;

    ... and you could easily apply this logic using APPLY like this:

    DECLARE @table TABLE(id int, txt varchar(100));

    INSERT @table VALUES (1,'xx xx c'),(2, 'xx '+CHAR(10)+CHAR(13)+' ffff ffff');

    SELECT *

    FROM @table t

    CROSS APPLY

    (

    SELECT

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(t.txt)),

    CHAR(9), ' '),CHAR(10), ' '),CHAR(13), ' '),CHAR(160),' '),

    SPACE(33),' '),SPACE(17),' '),SPACE(9), ' '),SPACE(5), ' '),

    SPACE(3), ' '),SPACE(2), ' '),SPACE(2), ' ')

    ) AS NewText(NewText);

    When I run the script with my value for @text

    DECLARE @text varchar(1000) = ' set

    nocounto

    n

    ';

    SELECT

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@Text)),

    CHAR(9), ' '),

    CHAR(10), ' '),

    CHAR(13), ' '),

    CHAR(160),' '),

    SPACE(33),' '),

    SPACE(17),' '),

    SPACE(9), ' '),

    SPACE(5), ' '),

    SPACE(3), ' '),

    SPACE(2), ' '),

    SPACE(2), ' ') AS NewText;

    I get as a result " set nocount o n ". The word 'on' is broken up by a new line. I dont know if this will or will not be encountered in the situation , so thought I would cover it.

    Although yes a cross apply to a function would be slow, I think it would get the job done in this case.

    ----------------------------------------------------

  • The word 'on' is broken up by a new line. I dont know if this will or will not be encountered in the situation , so thought I would cover it

    It would not be, only if the sql text parses correctly. I dont think the OP made this clear.

    ----------------------------------------------------

  • ON needs to be on a different line in order to test if the code handle carriage returns/new lines correctly.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • thank you so much everyone!.

  • Alan.B (8/25/2016)


    ScottPletcher (8/25/2016)


    Rather than literal strings of spaces, I recommend using SPACE(), just for readability:

    /*CREATE FUNCTION ...*/

    SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@Text)),

    SPACE(33), ' '),

    SPACE(17), ' '),

    SPACE(9), ' '),

    SPACE(5), ' '),

    SPACE(3), ' '),

    SPACE(2), ' '),

    SPACE(2), ' ') AS NewText

    +1

    Excuse me, I need to go re-write some code.

    While you're at it, test for performance. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply