Which is more efficient? WHERE SUBSTRING = ...or... WHERE fldField LIKE 'x%'

  • Jeff Moden (10/15/2015)


    To be honest, I trust no one that uses pure RBAR to create test data nor do I trust fixed width/grooved test data where the data is usually variable in length in real life. It's almost always an indication that something else will be missed or done incorrectly that will end up in the results being skewed.

    Yes, this too. I'll take my hat off to anyone who is prepared to put some effort into a piece of work such as this, but there are some odd features in the article. I might not write off a comparison just because the data setup used RBAR methods, but I have real trouble with the conclusions in this article because of the omission of the SARGable LIKE case. It matters. You can't just state "CHARINDEX is faster than LIKE" without this caveat.

    None of the queries shown are SARGable and most (if not all of them) use OR in the filter - which can result in some very funky execution plans using table spools. Also, pretty much every query except the LIKE query uses a combination of functions. The playing field is not level. Personally I take the results & conclusions with a hefty pinch of salt.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Did you read Matt Miller's comment a few posts above? Try making the query SARGable, re test and post the results.

    With all due respect, anyone is just as capable of taking the code from his blog, making the query SARGable, running it against SQL 2014, and posting the results.

    It's the people who are disputing the results who should be proving them wrong. If you think LIKE runs faster, well, prove it. 🙂

    Don't shoot the messenger just because I thought it was a good article and posted the link.

  • _watching (10/16/2015)


    Did you read Matt Miller's comment a few posts above? Try making the query SARGable, re test and post the results.

    With all due respect, anyone is just as capable of taking the code from his blog, making the query SARGable, running it against SQL 2014, and posting the results.

    It's the people who are disputing the results who should be proving them wrong. If you think LIKE runs faster, well, prove it. 🙂

    Don't shoot the messenger just because I thought it was a good article and posted the link.

    So why don't you pick up the baton? All of us who have contributed to this dispute know for sure that LIKE can be SARGable.

    As an aside, a recent tuning win involved changing this:

    WHERE LEFT(a.somecolumn,6) = b.someothercolumn --(a VARCHAR(6))

    to this:

    WHERE a.somecolumn LIKE b.someothercolumn + '%'

    The execution time improved from about 25 minutes to milliseconds.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • _watching (10/16/2015)


    Did you read Matt Miller's comment a few posts above? Try making the query SARGable, re test and post the results.

    With all due respect, anyone is just as capable of taking the code from his blog, making the query SARGable, running it against SQL 2014, and posting the results.

    It's the people who are disputing the results who should be proving them wrong. If you think LIKE runs faster, well, prove it. 🙂

    Don't shoot the messenger just because I thought it was a good article and posted the link.

    Folks who lurk around ssc for any length of time quickly pick up the importance of testing, because the answer to many questions is all too often "it depends". With so many folks posting on-the-fly performance comparison scripts here, it quickly becomes second nature to knock them up during development at work.

    Here's a very simple script which nocks up a million rows of data with probably insufficient row variability to please purists such as Jeff, but I think it's more than sufficient to meet the requirements for the test. The results, which speak for themselves, are at the bottom:

    ----------------------------------------------------------------------------------------------

    -- Data setup

    ----------------------------------------------------------------------------------------------

    DROP TABLE #TestData

    ;WITH

    E1 (x) AS (SELECT 0 FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (x)),

    E2 (x) AS (SELECT 0 FROM E1 a, E1 b),

    E4 (x) AS (SELECT 0 FROM E2 a, E2 b),

    iTally AS (SELECT n = CHECKSUM(ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1) FROM E4 a, E2 b)

    SELECT

    n,

    aDate = DATEADD(HOUR,n,'19100101'),

    IntValue = CHECKSUM(NEWID()),

    CharValue = CAST(NEWID() AS CHAR(36))

    INTO #TestData

    FROM iTally

    CREATE UNIQUE CLUSTERED INDEX ucx_n ON #TestData (n)

    CREATE INDEX ix_CharValue ON #TestData (CharValue)

    ----------------------------------------------------------------------------------------------

    -- Tests

    ----------------------------------------------------------------------------------------------

    -- Pick a row at random, grab the first 8 characters from CharValue

    DECLARE @StringToFind CHAR(8)

    SELECT TOP 1

    @StringToFind = LEFT(CharValue,8)

    FROM #TestData ORDER BY NEWID()

    PRINT 'LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT CharValue FROM #TestData WHERE CharValue LIKE @StringToFind+'%'

    SET STATISTICS IO, TIME OFF

    PRINT 'SUBSTRING =================================================='

    SET STATISTICS IO, TIME ON

    SELECT CharValue FROM #TestData WHERE SUBSTRING(CharValue,1,8) = @StringToFind

    SET STATISTICS IO, TIME OFF

    PRINT 'LEFT =================================================='

    SET STATISTICS IO, TIME ON

    SELECT CharValue FROM #TestData WHERE LEFT(CharValue,8) = @StringToFind

    SET STATISTICS IO, TIME OFF

    PRINT 'CHARINDEX =================================================='

    SET STATISTICS IO, TIME ON

    SELECT CharValue FROM #TestData WHERE CHARINDEX(@StringToFind,CharValue) > 0

    SET STATISTICS IO, TIME OFF

    ----------------------------------------------------------------------------------------------

    -- Results:

    ----------------------------------------------------------------------------------------------

    --LIKE ==================================================

    --Table '#TestData'. Scan count 1, logical reads 4.

    -- SQL Server Execution Times:

    -- CPU time = 0 ms, elapsed time = 0 ms.

    --SUBSTRING ==================================================

    --Table '#TestData'. Scan count 9, logical reads 5727.

    -- SQL Server Execution Times:

    -- CPU time = 376 ms, elapsed time = 52 ms.

    --LEFT ==================================================

    --Table '#TestData'. Scan count 9, logical reads 5727.

    -- SQL Server Execution Times:

    -- CPU time = 376 ms, elapsed time = 53 ms.

    --CHARINDEX ==================================================

    --Table '#TestData'. Scan count 9, logical reads 5727.

    -- SQL Server Execution Times:

    -- CPU time = 1544 ms, elapsed time = 206 ms.

    --------------------------------------------------------------

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • _watching (10/15/2015)


    Matt Miller (#4) (10/15/2015)


    ChrisM@Work (10/15/2015)


    _watching (10/14/2015)


    If you're interested in updated speed comparisons on how fast each is (well, on SQL 2014 anyway), this ➡ LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX article with benchmarks makes for a great read.

    ChrisM@Work (10/7/2015)


    With no help whatsoever from indexing, it's interesting to see that using the function is faster than not using it, but otherwise I am not sure what else it proves. With indexing though - LIKE wins fairly handily.

    Hmmm... not why you're saying LIKE wins fairly handily with indexing where according to the page, the table with the first results on a clustered index column, LIKE is more than 4x slower.

    ??

    The nature of the query makes it so that the indexing is not used at all. You can see that in the exec plan. I actually was about to post exactly this on Dave Lozinsky's page - but it's locked down with Disqus (which I happen to hate given its propensity to spam my inbox with irrelevant junk).

    I'd be happy to post my findings a bit later today (still working), but they fall in line with what Chris M posted.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I altered the test from Chris and run it on my laptop with the following results.

    DECLARE @StringToFind CHAR(8), @Dummy CHAR(36)

    SELECT TOP 1

    @StringToFind = LEFT(CharValue,8)

    FROM #TestData ORDER BY NEWID()

    CHECKPOINT

    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

    DBCC FREEPROCCACHE WITH NO_INFOMSGS

    PRINT 'LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = CharValue FROM #TestData WHERE CharValue LIKE @StringToFind+'%'

    SET STATISTICS IO, TIME OFF

    CHECKPOINT

    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

    DBCC FREEPROCCACHE WITH NO_INFOMSGS

    PRINT 'SUBSTRING =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = CharValue FROM #TestData WHERE SUBSTRING(CharValue,1,8) = @StringToFind

    SET STATISTICS IO, TIME OFF

    CHECKPOINT

    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

    DBCC FREEPROCCACHE WITH NO_INFOMSGS

    PRINT 'LEFT =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = CharValue FROM #TestData WHERE LEFT(CharValue,8) = @StringToFind

    SET STATISTICS IO, TIME OFF

    CHECKPOINT

    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

    DBCC FREEPROCCACHE WITH NO_INFOMSGS

    PRINT 'CHARINDEX =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = CharValue FROM #TestData WHERE CHARINDEX(@StringToFind,CharValue) = 1

    SET STATISTICS IO, TIME OFF

    --====================================================================================================

    --Results

    --====================================================================================================

    --LIKE ==================================================

    --SQL Server parse and compile time:

    -- CPU time = 1 ms, elapsed time = 1 ms.

    --Table '#TestData'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -- SQL Server Execution Times:

    -- CPU time = 0 ms, elapsed time = 20 ms.

    --SUBSTRING ==================================================

    --SQL Server parse and compile time:

    -- CPU time = 0 ms, elapsed time = 1 ms.

    --Table '#TestData'. Scan count 1, logical reads 5720, physical reads 0, read-ahead reads 5672, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -- SQL Server Execution Times:

    -- CPU time = 141 ms, elapsed time = 857 ms.

    --LEFT ==================================================

    --SQL Server parse and compile time:

    -- CPU time = 0 ms, elapsed time = 2 ms.

    --Table '#TestData'. Scan count 1, logical reads 5720, physical reads 0, read-ahead reads 5672, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -- SQL Server Execution Times:

    -- CPU time = 234 ms, elapsed time = 865 ms.

    --CHARINDEX ==================================================

    --SQL Server parse and compile time:

    -- CPU time = 0 ms, elapsed time = 2 ms.

    --Table '#TestData'. Scan count 1, logical reads 5720, physical reads 0, read-ahead reads 5672, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -- SQL Server Execution Times:

    -- CPU time = 515 ms, elapsed time = 934 ms.

    The changes were to ensure the CHARINDEX found the string at the start and to reduce advantage from data in buffers or cache.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Renée (6/23/2011)


    Hi all,

    I'm a SQL newbie who sort of inherited a 2005 db. I've got a query that returns a SUBSTRING of a field value in one column. I need to run this query today to only pull records that have a specific substring in that column.

    The column containing the values is assigned an alias in the SELECT statement, and I've now learned the SELECT statement isn't processed until after the WHERE statement (yes, that's how new I am!!), so [Column Alias] = 'NCB' won't work in the WHERE statement.

    So, which is more efficient?

    (1)... WHERE SUBSTRING(fldField,1,3) = 'NCB'

    or

    (2)... WHERE fldField LIKE 'NCB%'

    ...or maybe something else I haven't thought of?

    I know having the wildcard at the beginning of a LIKE pattern is slower, but since that's not the case here, I have no clue which of the two options above are faster for the SQL engine.

    Thanks so much!

    Regardless of whether you use LIKE or SUBSTRING, either way you're doing a full table scan. However, if we can turn this into a partial index scan, then that's a game changer.

    create table #T ( fldField varchar(30) not null );

    Unfortunately we can't use SUBSTRING or LIKE when creating a filtered index.

    create index ix_T_is_NCB on #T (fldField)

    where SUBSTRING(fldField,1,3) = 'NCB'

    or fldField LIKE 'NCB%';

    Msg 10735, Level 15, State 1, Line 6

    Incorrect WHERE clause for filtered index 'ix_T_is_NCB' on table '#T'.

    However, we can add a non-persisted computed column that returns 0 or 1 to indicate of the field contains 'NCB', and then index that.

    alter table #T add is_NCB

    as (case when SUBSTRING(fldField,1,3) = 'NCB'

    OR fldField LIKE 'NCB%' then 1 else 0 end);

    create index ix_T_is_NCB on #T ( is_NCB );

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • ChrisM@Work (10/16/2015)


    Jeff Moden (10/15/2015)


    To be honest, I trust no one that uses pure RBAR to create test data nor do I trust fixed width/grooved test data where the data is usually variable in length in real life. It's almost always an indication that something else will be missed or done incorrectly that will end up in the results being skewed.

    Yes, this too. I'll take my hat off to anyone who is prepared to put some effort into a piece of work such as this, but there are some odd features in the article. I might not write off a comparison just because the data setup used RBAR methods, but I have real trouble with the conclusions in this article because of the omission of the SARGable LIKE case. It matters. You can't just state "CHARINDEX is faster than LIKE" without this caveat.

    None of the queries shown are SARGable and most (if not all of them) use OR in the filter - which can result in some very funky execution plans using table spools. Also, pretty much every query except the LIKE query uses a combination of functions. The playing field is not level. Personally I take the results & conclusions with a hefty pinch of salt.

    Exactly, especially on the non-SARGable stuff. And, to be sure, I don't just write-off such articles no matter how poorly the test harness is written but it does usually indicate that there's going to be something wrong with the rest of the test harness and the conclusions derived from it. To date, the ratio I've seen is about 7 out of 10 that use RBAR to gen test data come up with either the wrong conclusion or missed/partially incorrect conclusions.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • _watching (10/16/2015)


    Did you read Matt Miller's comment a few posts above? Try making the query SARGable, re test and post the results.

    With all due respect, anyone is just as capable of taking the code from his blog, making the query SARGable, running it against SQL 2014, and posting the results.

    It's the people who are disputing the results who should be proving them wrong. If you think LIKE runs faster, well, prove it. 🙂

    We're in the process of doing just that.

    Don't shoot the messenger just because I thought it was a good article and posted the link.

    Have to. It's a tradition... 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Isn't it also the case that using LIKE allows the QO to use statistics to estimate the selectivity of the query.

    According to this White Paper: https://msdn.microsoft.com/en-us/library/dd535534(v=sql.100).aspx

    String Summary Statistics

    SQL Server 2008 includes patented technology for estimating the selectivity of LIKE conditions. It builds a statistical summary of substring frequency distribution for character columns (a string summary). This includes columns of type text, ntext, char, varchar, and nvarchar. Using the string summary, SQL Server can accurately estimate the selectivity of LIKE conditions where the pattern may have any number of wildcards in any combination. For example, SQL Server can estimate the selectivity of predicates of the following form.

  • Column LIKE 'string%'
  • Column LIKE '%string'
  • Column LIKE '%string%'
  • Column LIKE 'string'
  • Column LIKE 'str_ing'
  • Column LIKE 'str[abc]ing'
  • Column LIKE '%abc%xy'
  • If there is a user-specified escape character in a LIKE pattern (that is, the pattern is of the form LIKE pattern ESCAPE escape_character), SQL Server 2008 guesses selectivity.

    This is an improvement over SQL Server 2000, which uses a guess for selectivity when any wildcard other than a trailing wildcard % is used in the LIKE pattern, and it has limited accuracy in its estimates in that case.

    The String Index field in the first row set returned by DBCC SHOW_STATISTICS includes the value YES if the statistics object also includes a string summary. The contents of the string summary are not shown. The string summary includes additional information beyond what is shown in the histogram.

    For strings longer than 80 characters, the first and last 40 characters are extracted from the string and concatenated prior to considering the string in the creation of the string summary. Hence, accurate frequency estimates for substrings that appear only in the ignored portion of a string are not available.

    Which means that using LIKE not only makes the predicate SARGable, but also allows for the use of statistics when deciding on an execution plan.

    This can be seen in the test script provided by Chris, although it is a simple one table select.

    The attached images show row estimates for the LIKE (which actually becomes a greater than / less than query) , but not for the LEFT (which becomes a substring query).

    I haven't come across any similar papers for other versions of SQL, but the images are taken from SQL 2016.

    It specifically references SQL 2008 compared to SQL 2000, so I don't know if the same holds true for SQL 2005, and don't have it handy to test.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • As promised, if you make is so the indexing can help to some degree:

    LIKE test against clustered indexed column 126 milliseconds

    SUBSTRING test against clustered indexed column330 milliseconds

    LEFT/RIGHT test against clustered indexed column316 milliseconds

    CHARINDEX test against clustered indexed column170 milliseconds

    Keep in mind - the INSERT in the test is a big part of the time. If you to check the differences on filtering - perhaps cut the insert step out:

    LIKE test against clustered indexed column 43 milliseconds

    SUBSTRING test against clustered indexed column90 milliseconds

    LEFT/RIGHT test against clustered indexed column83 milliseconds

    CHARINDEX test against clustered indexed column130 milliseconds

    In any case - this doesn't show that LIKE is the clear winner either, just means I fashioned a use case that works well for it. I think it just goes to show our usual findings... IT DEPENDS.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • mister.magoo (10/16/2015)


    Isn't it also the case that using LIKE allows the QO to use statistics to estimate the selectivity of the query.

    According to this White Paper: https://msdn.microsoft.com/en-us/library/dd535534(v=sql.100).aspx

    String Summary Statistics

    SQL Server 2008 includes patented technology for estimating the selectivity of LIKE conditions. It builds a statistical summary of substring frequency distribution for character columns (a string summary). This includes columns of type text, ntext, char, varchar, and nvarchar. Using the string summary, SQL Server can accurately estimate the selectivity of LIKE conditions where the pattern may have any number of wildcards in any combination. For example, SQL Server can estimate the selectivity of predicates of the following form.

  • Column LIKE 'string%'
  • Column LIKE '%string'
  • Column LIKE '%string%'
  • Column LIKE 'string'
  • Column LIKE 'str_ing'
  • Column LIKE 'str[abc]ing'
  • Column LIKE '%abc%xy'
  • If there is a user-specified escape character in a LIKE pattern (that is, the pattern is of the form LIKE pattern ESCAPE escape_character), SQL Server 2008 guesses selectivity.

    This is an improvement over SQL Server 2000, which uses a guess for selectivity when any wildcard other than a trailing wildcard % is used in the LIKE pattern, and it has limited accuracy in its estimates in that case.

    The String Index field in the first row set returned by DBCC SHOW_STATISTICS includes the value YES if the statistics object also includes a string summary. The contents of the string summary are not shown. The string summary includes additional information beyond what is shown in the histogram.

    For strings longer than 80 characters, the first and last 40 characters are extracted from the string and concatenated prior to considering the string in the creation of the string summary. Hence, accurate frequency estimates for substrings that appear only in the ignored portion of a string are not available.

    Which means that using LIKE not only makes the predicate SARGable, but also allows for the use of statistics when deciding on an execution plan.

    This can be seen in the test script provided by Chris, although it is a simple one table select.

    The attached images show row estimates for the LIKE (which actually becomes a greater than / less than query) , but not for the LEFT (which becomes a substring query).

    I haven't come across any similar papers for other versions of SQL, but the images are taken from SQL 2016.

    It specifically references SQL 2008 compared to SQL 2000, so I don't know if the same holds true for SQL 2005, and don't have it handy to test.

    Thanks MM. On first reading this I figured it would be trivial to set up a test or two employing a two-table join to amplify the efficiency of LIKE over CHARINDEX. Although the outcome is generally as you would expect, the route taken is far from predictable – at least for me. One test uses a string comparison to a unique column as before, the other uses a string comparison to a column containing 26 distinct strings. There are three uses cases within each test: LIKE, CHARINDEX and “BETWEEN & LIKE” - since LIKE uses a range seek and a residual predicate (as you pointed out) I figured it would be fun to explicitly code for this, and in practice it’s generally faster than LIKE alone.

    Have fun 🙂

    ----------------------------------------------------------------------------------------------

    -- Data setup

    ----------------------------------------------------------------------------------------------

    DROP TABLE #TestData

    ;WITH

    E1 (x) AS (SELECT 0 FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (x)),

    E2 (x) AS (SELECT 0 FROM E1 a, E1 b),

    E4 (x) AS (SELECT 0 FROM E2 a, E2 b),

    iTally AS (SELECT n = CHECKSUM(ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1) FROM E4 a, E2 b)

    SELECT

    i.n,

    aDate = DATEADD(HOUR,i.n,'19100101'),

    IntValue = CHECKSUM(NEWID()),

    x.Stringy,

    CharValue = CAST(REVERSE(NEWID()) AS VARCHAR(36))

    INTO #TestData

    FROM iTally i

    CROSS APPLY (

    SELECT TOP 1 Stringy

    FROM (VALUES

    ('Onestringy'),('twostringy'),('threestringy'),('fourstringy'),('fivestringy') ,('sixstringy') ,('sevenstringy') ,('eightstringy'),('ninestringy'),('tenstringy'),

    ('elevenstringy'),('twelvestringy'),('thirteenstringy'),('fourteenstringy'),('fifteenstringy'),('sixteenstringy'),('seventeenstringy'),('eighteenstringy'),('nineteenstringy'),('twentystringy'),

    ('twentyonestringy'),('twentytwostringy'),('twentythreestringy'),('twentyfourstringy'),('twentyfivestringy'),('twentysixstringy')

    ) d (Stringy)

    ORDER BY n % 2, NEWID()

    ) x

    OPTION (MAXDOP 1)

    CREATE UNIQUE CLUSTERED INDEX ucx_n ON #TestData (n)

    CREATE UNIQUE INDEX ix_CharValue ON #TestData (CharValue)

    CREATE UNIQUE INDEX ix_n_CharValue ON #TestData (n, CharValue)

    CREATE UNIQUE INDEX ix_n ON #TestData (n)

    CREATE INDEX ix_Stringy ON #TestData (Stringy)

    CREATE UNIQUE INDEX ix_n_Stringy ON #TestData (n, Stringy)

    ----------------------------------------------------------------------------------------------

    -- Test 1: Unique values in character column

    ----------------------------------------------------------------------------------------------

    DECLARE @StringToFind VARCHAR(10), @Dummy VARCHAR(36)

    SELECT TOP 1

    @StringToFind = LEFT(CharValue,10)

    FROM #TestData ORDER BY NEWID()

    SELECT @StringToFind

    PRINT ''

    PRINT 'LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE t1.CharValue LIKE @StringToFind+'%'

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    PRINT ''

    PRINT 'RANGE & LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE t1.CharValue >= @StringToFind AND t1.CharValue < @StringToFind+CHAR(255)

    AND t1.CharValue LIKE @StringToFind+'%'

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    PRINT ''

    PRINT 'CHARINDEX =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE CHARINDEX(@StringToFind,t1.CharValue) > 0

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    ----------------------------------------------------------------------------------------------

    -- Test 2: Non-unique values in character column

    ----------------------------------------------------------------------------------------------

    DECLARE @StringToFind VARCHAR(10), @Dummy VARCHAR(36)

    SELECT TOP 1

    @StringToFind = LEFT(Stringy,10)

    FROM #TestData ORDER BY NEWID()

    SELECT @StringToFind, COUNT(*) FROM #TestData WHERE Stringy LIKE @StringToFind+'%'

    PRINT ''

    PRINT 'LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE t1.Stringy LIKE @StringToFind+'%'

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    PRINT ''

    PRINT 'RANGE & LIKE =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE t1.Stringy >= @StringToFind AND t1.Stringy < @StringToFind+CHAR(255)

    AND t1.Stringy LIKE @StringToFind+'%'

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    PRINT ''

    PRINT 'CHARINDEX =================================================='

    SET STATISTICS IO, TIME ON

    SELECT @Dummy = t2.CharValue

    FROM #TestData t1

    INNER JOIN #TestData t2 ON t2.n = t1.n

    WHERE CHARINDEX(@StringToFind,t1.CharValue) > 0

    OPTION(RECOMPILE)

    SET STATISTICS IO, TIME OFF

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Viewing 12 posts - 16 through 26 (of 26 total)

    You must be logged in to reply to this topic. Login to reply