speed up LEN(RTRIM(LTRIM(ERR))) > 0????

  • They do.

    And you can see it using function "datalength".

    But LEN ignores trailing spaces.

    _____________
    Code for TallyGenerator

  • Sorry but this was the post:

    LEN(RTRIM(LTRIM(ERR))) > 0

    is equal to

    LEN(ERR) > 0

    and that is not correct. Len ignores trailing spaces but not the begining ones.

    select Len ( '  Space in the begining '), Len (rtrim(ltrim('  Space in the begining ')))

    returns : 23, 21

     

    datalength is a completly different function

     

     


    * Noel

  • noeld,

    when LEN(Err) = 0 there are no beginning spaces.

    All spaces are trailing.

    Otherwise LEN gonna be > 0

    In this task we don't care about not empty strings, string is not empty no matter is it 21 or 23 charachters long.

    So,

    LEN(RTRIM(LTRIM(ERR))) > 0

    is equal to

    LEN(ERR) > 0

    and that is correct.

    _____________
    Code for TallyGenerator

  • declare @ERR char(20)

    set @ERR = '  '

    select 1

    where @ERR > ''

    select 1

    where len(@ERR) > 0

    It does not return anything. No matter that @ERR has actually 20 spaces inside.

    _____________
    Code for TallyGenerator

  • Am I correct in saying that you're looking for records where err is not full full of spaces?

    Put an index on the err column and try this

    SELECT <fields> FROM <table> WHERE Err != SPACE(255)

    Any form of function will prevent index use. the column can't be null and since it's a char it can't be an empty string, but will be always padded out to 255 char. The only way is can be empty, if I understand properly, is if all 255 characters are spaces, hence the comparison with SPACE(255).

    Does this do what you need?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • First off, thanks for all the replies.

    The row count is somewhere near 5,000,000-7,000,000.  About 100,000 total will be updated daily.  The LEN(RTRIM(LTRIM(ERR))) > 0 will be evaluated probably about 5,000 times a day, maybe more.  Allowing nulls to speed up the evaluation, but slowing down the updates is not a fair trade.

    All I really need to check for is that if the field has any text in it.  If it has any text in it besides spaces, then I assume that it contains a error message for the row.  So I don't care if the spaces are before or after the text, just if there is any text in the message.

    So do we have anything that's faster than RTRIM(ERR)>''?


    Live to Throw
    Throw to Live
    Will Summers

  • Whether the column allows NULLs or not, the following will produce returns only if the column is NOT NULL and the column has something other than all blanks regardless of the number of blanks...  In other words, the following is equivelent to saying "Find everything that is not null and is not blank"...

    SELECT *
    FROM yourtable
    WHERE yourcolumnname > ''

    Because there is no calculation on "yourcolumnname", it is possible for an index to be used if one is available and the optimizer chooses to do so.

    In other words, go back and look at Serqiy's post because he's right

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • So do we have anything that's faster than RTRIM(ERR)>''?

    Did you read my last post&

    WHERE Err>''

    is all you need.

    _____________
    Code for TallyGenerator

  • Is '5 spaces'>'' ?

    Can you store '5 spaces' in a varchar, or will it just reduce it to ''?

    I am contenplating converting the column to varchar to save space, but I want to make sure it won't mess up this comparison.


    Live to Throw
    Throw to Live
    Will Summers

  • if we are talking about 5 - 7 millions I still think that the calculated column (with an index) will be a lot faster for two reasons.

    1.The value is an integer (could be a tinyint) therefore fast searches can be performed

    2. index width is very narrow storage requirement for the index is a lot less than what char or varchar equivqlent will offer.

    Just make sure that on the query plan that index is used.

     

    Can you store '5 spaces' in a varchar, or will it just reduce it to ''?

    it will treat them as '' in a comparison with the standard connection defaults.

     

     


    * Noel

  • They say that one test is worth a thousand calculations and a million guesses... Noeld is correct.  But don't take my word for it... I expanded my normally well behaved 1-million row test into a nasty 10-million row test and here's the results (see bold areas)...

    Creating a 10-million row test table
    (10000000 row(s) affected)
    Adding the calculated column
    Adding the primary key
    Adding an index to the calculated column
    Wounding the "Kicks" column with spaces
    (1000000 row(s) affected)
    Add the index on the "Kicks" column
    Heres the stats on how many different groups of spacing there are
    String                           NumberOfSpaces Length      NumberPresent 
    -------------------------------- -------------- ----------- ------------- 
    ||                               0              0           99887
    | |                              1              0           100226
    |  |                             2              0           99861
    |   |                            3              0           99682
    |    |                           4              0           99838
    |     |                          5              0           99906
    |      |                         6              0           100278
    |       |                        7              0           100075
    |        |                       8              0           100235
    |         |                      9              0           100012
    (10 row(s) affected)
    ==============================================================================
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
                
    ----------- 
    9000000
    (1 row(s) affected)
         9.826 Seconds for CalcCol > 0
    ==============================================================================
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
                
    ----------- 
    9000000
    (1 row(s) affected)
        21.486 Seconds for LEN(Kicks) > 0
    ==============================================================================
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
                
    ----------- 
    9000000
    (1 row(s) affected)
        20.500 Seconds for Kicks > ''
    ==============================================================================
    ==============================================================================
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
                
    ----------- 
    9000000
    (1 row(s) affected)
        22.486 Seconds for LEN(LTRIM(RTRIM(Kicks))) > 0
    

    Here's the code to make the test table (note that the "Kicks" column is the column under test and it's setup as a VARCHAR column), but before you decide to run it on a production box, be advised....

    WARNING!  WARNING!  WARNING!  WARNING!  WARNING!

    DO NOT RUN THE FOLLOWING CODE ON A PRODUCTION SERVER!!!

    THE FOLLOWING CODE USES A HIGHLY AGRESSIVE TRIPLE CROSS-JOIN TO CREATE THE ROWS IN THE TEST TABLE THAT WILL CAUSE 1 OR MORE CPU'S TO MAX OUT AND TAKES NEARLY 100% OF THE HARD-DISK BUSS BANDWIDTH FOR APPROXIMATELY 6.5 MINUTES AND IT DOES SO IN AN EXCLUSIVE MANNER... NO ONE ELSE CAN GET IN OR DO ANYTHING FOR THAT TIME PERIOD.  FURTHER, THE TEST TABLE WILL CONSUME 1.7 GIGA-BYTES OF HARD-DISK SPACE AND THE INDEXES WILL CONSUME 0.7 GIGA-BYTES OF HARD-DISK SPACE.

    WARNING!  WARNING!  WARNING!  WARNING!  WARNING!

    DO NOT RUN THE FOLLOWING CODE ON A PRODUCTION SERVER!!!

    Here's the code to create the test table, indexes, calculated column, etc... do be careful...

      PRINT 'Creating a 10-million row test table'
    GO
    --===== Create a populate a 10-million row test table
         -- DO NOT RUN THIS CODE ON A PRODUCTION SERVER!!!! 
     SELECT TOP 10000000
            IDENTITY(INT,1,1) AS RowNum,
            CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT) AS UserID, --1 thru 50000 randomly
            CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))
          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS SomeValue, --'AA' thru 'ZZ' randomly
            CAST(' A column for kicks ' AS VARCHAR(30)) AS Kicks,
            'Still another column just for proofing' AS StillAnother,
            CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY) AS SomeNumber, --0.0000 thru 99.9999 randomly
            CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME) AS ADate --(>=01/01/2000  <01/01/2010)
       INTO dbo.BigTest
       FROM Master.dbo.SysColumns sc1,
            Master.dbo.SysColumns sc2,
            Master.dbo.SysColumns sc3
      PRINT 'Adding the calculated column'
    GO
    --===== Add a calculated column
      ALTER TABLE dbo.BigTest 
         ADD CalcCol AS LEN(Kicks)
      PRINT 'Adding the primary key'
    GO
    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.BigTest
            ADD PRIMARY KEY NONCLUSTERED (RowNum)
      PRINT 'Adding an index to the calculated column'
    GO
    --===== Add an index to the calculated column
     CREATE INDEX IX_BigTest_CalcCol
            ON dbo.BigTest (CalcCol)
      PRINT 'Wounding the "Kicks" column with spaces'
    GO
    --===== Change every 10th row to have a random number of spaces from 0 to 9
     UPDATE dbo.BigTest
        SET Kicks = SPACE(CAST(RAND(CAST(NEWID() AS VARBINARY))*10 AS INT))
      WHERE RowNum%10 = 0
      PRINT 'Add the index on the "Kicks" column'
    --===== Add an index to the "Kicks" column (just for "Kicks" and to demo indexing)
     CREATE INDEX IX_BigTest_Kicks
            ON dbo.BigTest (Kicks)
      PRINT 'Heres the stats on how many different groups of spacing there are'
    GO
    --===== Prove the spaces are there including "0" spaces which is an empty string
     SELECT '|'+Kicks+'|'        AS String,
            LEN('|'+Kicks+'|')-2 AS NumberOfSpaces,
            LEN(Kicks)           AS Length,
            COUNT(*)             AS NumberPresent
       FROM dbo.BIGTEST
      WHERE RowNum%10 = 0
      GROUP BY '|'+Kicks+'|',LEN('|'+Kicks+'|')-2,LEN(Kicks)
      ORDER BY '|'+Kicks+'|' DESC
    GO

    ... and here's the code that does the actual testing...

    /*************************************************************************************************/
    --===== Create some performance measuring variables
    DECLARE @StartTime DATETIME --Start of server duration time for run
    DECLARE @EndTime   DATETIME --Start of server duration time for run
    --=================================================================================================
    --===== Clear the cache, do a measured run for CalcCol > 0
      PRINT REPLICATE('=',78)
       DBCC DROPCLEANBUFFERS
       DBCC FREEPROCCACHE
        SET @StartTime = GETDATE()
    -------------------------------
     SELECT COUNT(*)
       FROM dbo.BigTest
      WHERE CalcCol > 0
    -------------------------------
        SET @EndTime = GETDATE()
      PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for CalcCol > 0'
    --=================================================================================================
    --===== Clear the cache, do a measured run for LEN(Kicks) > 0
      PRINT REPLICATE('=',78)
       DBCC DROPCLEANBUFFERS
       DBCC FREEPROCCACHE
        SET @StartTime = GETDATE()
    -------------------------------
     SELECT COUNT(*)
       FROM dbo.BigTest
      WHERE LEN(Kicks) > 0
    -------------------------------
        SET @EndTime = GETDATE()
      PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for LEN(Kicks) > 0'
    --=================================================================================================
    --===== Clear the cache, do a measured run for Kicks > ''
      PRINT REPLICATE('=',78)
       DBCC DROPCLEANBUFFERS
       DBCC FREEPROCCACHE
        SET @StartTime = GETDATE()
    -------------------------------
     SELECT COUNT(*)
       FROM dbo.BigTest
      WHERE Kicks > ''
    -------------------------------
        SET @EndTime = GETDATE()
      PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for Kicks > '''''
      PRINT REPLICATE('=',78)
    --=================================================================================================
    --===== Clear the cache, do a measured run for LEN(LTRIM(RTRIM(Kicks))) > 0
      PRINT REPLICATE('=',78)
       DBCC DROPCLEANBUFFERS
       DBCC FREEPROCCACHE
        SET @StartTime = GETDATE()
    -------------------------------
     SELECT COUNT(*)
       FROM dbo.BigTest
      WHERE LEN(LTRIM(RTRIM(Kicks))) > 0
    -------------------------------
        SET @EndTime = GETDATE()
      PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for LEN(LTRIM(RTRIM(Kicks))) > 0'

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • '5 spaces' = ''

    Is it so difficult to test it?

    DECLARE @a char(5), @b-2 varchar(5)

    SET @a = ''

    SET @b-2 = ''

    SELECT DATALENGTH(@A), DATALENGTH(@B)

    WHERE @a = @b-2

    DATALENGTH shows you real number of bytes the value takes.

    _____________
    Code for TallyGenerator

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply