Like Operator.

  • Im trying to find ssn's that do not have a complete number, i.e. 213 instead of 213-12-3456 ect. all the numbers i need to find start with and only have the first 3 characters, how can i write this in a like statement?

    Would something like this work?

    select first_name

    ,last_name

    ,government_id

    from people

    where government_id like '[0-9][0-9][0-9]'

  • Perhaps, something like this...

    select first_name

    ,last_name

    ,government_id

    from people

    where government_id like '[0-9][0-9][0-9]-%'

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The ssn im looking for have ONLY 3 digits, not the full 9. So i dont need to liook further than the first three digits?

  • Oh, sorry... yes, your query is the ticket... mine is wrong.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • For the sake of discussion...

    I have in the past done this:

    select first_name

    ,last_name

    ,government_id

    from people

    where len(government_id ) < 9

    Would this be faster, slower?

  • The LIKE will be just a tiny bit faster in this case because it returns only the rows that have precisely 3 digits... the LEN()<9 will return things that have 1 or 2 characters (actually, up to 8 characters) which may be more rows. The two methods are not functionally identical and rely on the condition of the data to come up with the same answer.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Edited for correctness.

    And if there's an index on government_id, the like might allow the optimiser to use it for a partial scan, the len (since it's a function on a column) will force a full scan.

    Depending on a really large number of factors, including the data values in the column

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Actually, both will do an index scan, I believe...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Not sure. I'll have to check. (Tomorrow. Is bed time here. I shouldn't post this late at night)

    Since the like determines that the first character is between 0 and 9, the optimiser might be able to do a partial scan, depending, of course, on what else is in the table. (emphasis on might. Thinking a second time, I'm not sure it's that smart)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Heh... no need :blush: ... dunno what I was thinking about, but I was wrong... the LIKE will use and Index Seek... the LEN will not... here's the code...

    DROP TABLE JBMTest

    GO

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    SomeDigits = CAST(ABS(CHECKSUM(NEWID()))%50000+1 AS VARCHAR(10))

    INTO dbo.JBMTest

    FROM Master.dbo.SysColumns t1,

    Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned

    -- Takes about 12 seconds to execute.

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Add an index...

    CREATE INDEX IX_JBMTest_SomeDigits

    ON dbo.JBMTest (SomeDigits)

    GO

    SET STATISTICS TIME ON

    SET STATISTICS IO ON

    --===== This does not...

    SELECT RowNum,SomeDigits

    FROM dbo.JBMTest

    WHERE LEN(SomeDigits) < 4

    PRINT '----------------------------------------------'

    --===== This uses an index seek

    SELECT RowNum,SomeDigits

    FROM dbo.JBMTest

    WHERE SomeDigits LIKE '[0-9][0-9][0-9]'

    BUT... take a look at the execution times and the CPU usage... the Index Seek is twice as slow as the Index Scan in this case (3 times slower if you look at the CPU time), even though the Index Scan returned more rows...

    Surprise, surprise... in the presence of the "correct" index, LEN() took less time than the LIKE...

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 3 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (19773 row(s) affected)

    Table 'JBMTest'. Scan count 1, logical reads 2324, physical reads 0, read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 503 ms.

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 503 ms.

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 503 ms.

    ----------------------------------------------

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (17723 row(s) affected)

    Table 'JBMTest'. Scan count 1, logical reads 2324, physical reads 0, read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 797 ms, elapsed time = 984 ms.

    SQL Server Execution Times:

    CPU time = 797 ms, elapsed time = 984 ms.

    SQL Server Execution Times:

    CPU time = 797 ms, elapsed time = 984 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • P.S.

    Technically speaking, the LEN() example does, in fact, come up with the WRONG answer if you are looking for precisely 3 digits...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/19/2007)


    BUT... take a look at the execution times and the CPU usage... the Index Seek is twice as slow as the Index Scan in this case (3 times slower if you look at the CPU time), even though the Index Scan returned more rows...

    Interesting. Showing (again) that sometimes an index scan IS the best way to run a query.

    Did you try that on 2000 or 2005? I want to take a look at the exec plan, see if I can figure out why the seek is taking longer. One thing I notice is that the reads for the two are the same. Implies that the 'seek' read all of the leaf pages, just like the scan did.

    The seek would probably perform better than the scan if there were rows in the table that didn't start with a digit.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It was in 2k... Heh... I'm thinking that there's a hidden "bookmark" that didn't show up in the EP. But, correct, it does show that just because an index is used, it doesn't always mean faster or better... can't put my fingers on the code just now, but I've seen where a Table Scan proved to be a bit faster than an Index Seek... of course, it wasn't a clustered index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'll try it out in 2005, see if the results are different.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Finally got round to trying on sql 2005. (SP 2 developer edition)

    The LEN - index scan. Rated by the execution plan as 98% of the overall cost of the two queries

    Table 'JBMTest'. Scan count 1, logical reads 2333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 472 ms.

    ----------------------------------------------

    The Like - index seek. Rated by the execution plan as 2% of the overall cost of the two queries

    Table 'JBMTest'. Scan count 1, logical reads 2333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 672 ms, elapsed time = 691 ms.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply