Case when index scan faster then index seek (see example)

  • Hi,

    we had case today  where query   with index seek   provided  much   worst performance compare to   index scan.

    original query using 40+ union  and 5+ intersect ( it is machine build by java script..)

    I used AdventureWorks2017  to recreate similar scenario

    Step 1 

    create nonclustered index IDX_AG_001 on [Person].[Address] (AddressLine1) with(online =on)

    Step 2

    set statistics time on

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    [AddressLine1] like '6%e'

    intersect

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    [AddressLine1] like '6%e'

    intersect

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    [AddressLine1] like '6%e'

    intersect

    (SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    [AddressLine1] like '6%e'

    union

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    [AddressLine1] like '6%e'

    )

    set statistics time off

    on my laptop it provide  index seek with stat

    SQL Server Execution Times: CPU time = 46 ms, elapsed time = 247 ms.

    (original query was stopped after 15 minutes of execution)

     

    Then application changed code to 

    Step 2 

    set statistics time on

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    left([AddressLine1],1) ='6'

    and

    right([AddressLine1],1) = 'e'

    intersect

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    left([AddressLine1],1) ='6'

    and

    right([AddressLine1],1) = 'e'

    intersect

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    left([AddressLine1],1) ='6'

    and

    right([AddressLine1],1) = 'e'

    intersect

    (SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    left([AddressLine1],1) ='6'

    and

    right([AddressLine1],1) = 'e'

    union

    SELECT

    [AddressLine1]

    FROM [Person].[Address]

    where

    left([AddressLine1],1) ='6'

    and

    right([AddressLine1],1) = 'e'

    )

    set statistics time off

    on my laptop it provide  index scan  with stat

     SQL Server Execution Times: CPU time = 16 ms, elapsed time = 169 ms.

    ( original query executed 363 ms)

    Question

    Why  second query runs faster ?

    Thank you

     

     

     

     

     

     

     

     

    • This topic was modified 4 years, 10 months ago by  ebooklub.
    • This topic was modified 4 years, 10 months ago by  ebooklub.
    • This topic was modified 4 years, 10 months ago by  ebooklub.
  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • It's not just a seek versus a scan issue here. There are a whole slew of differences between these two queries, even though it looks like the result sets are the same. If you look at the two plans, the second one is filtering information as it retrieves it and/or through the join operations. The first one on the other hand is running Filter operators to remove data after retrieving it.

    So, despite the fact that one has more reads, caused by the scan, the extra CPU time for the Filter operations is resulting in a longer execution time.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Seek and Scan is not under your control unless you specifically instruct SQL Server Optimizer to do it using Hints. In the above case it has nothing to do with scan and seek because hints are nowhere mentioned. The difference in the Execution Time is due to Filter clause (WHERE clause). In the first statement you have done wildcard search using LIKE whereas in the second statement exact search with comparatively strong filter clause. In my opinion the second statement should perform better.

    I'm wondering why you need so many UNION's and INTERSECTION's? I can see in each query only a single column ADDRESSLINE1 is used both in SELECT as well as WHERE. I think you can achieve the same result by avoiding UNION and INTERSECTION. Your existing query will cause overhead both in terms of Disk IO as well as CPU due to repeated task of scanning (Logical & Physical) the same dataset/pages again and again.

  • Thanks everybody for help.

    I  was assigned to project to check  why Batch processing  takes 100% of CPU

    Using query store I found that during batch processing  job it calls from 50  to  200 + queries where only difference is number of unions and intersect operations, because  of it  each query get unique query_id

    SELECT T0.product_pk

    FROM products T0

    WHERE(T0.product_pk IN

    (

    (

    select

    product01

    from

    customer as customer_7

    WHERE((customer_7.p_validation LIKE 'productsssss/13333.0/%.kzpd10007666c'

    OR customer_7.p_desc LIKE 'Dumbler/KZKTD/%.kzpd10007666c')

    AND customer_7.p_condition = '8559317127113')

    AND (customer_7.CodeP = @P0)

    )

    INTERSECT

    (

    select

    product01

    from

    customer as customer_8

    WHERE((customer_8.p_validation LIKE 'productsssss/13333.0/%.kzpd10007666a'

    OR customer_8.p_desc LIKE 'Dumbler/KZKTD/%.kzpd10007666a')

    AND customer_8.p_condition = '8559317127113')

    AND (customer_8.CodeP = @P1)

    )

    INTERSECT

    (

    select

    product01

    from

    customer as customer_9

    WHERE((customer_9.p_validation LIKE 'productsssss/13333.0/%.kzpd10007666d'

    OR customer_9.p_desc LIKE 'Dumbler/KZKTD/%.kzpd10007666d')

    AND customer_9.p_condition = '8559317127113')

    AND (customer_9.CodeP = @P2)

    union

    select

    product01

    from

    customer as customer_9

    WHERE((customer_7.p_validation LIKE 'productsssss/13333.0/%.kzpd10007666k'

    OR customer_9.p_desc LIKE 'Dumbler/KZKTD/%.kzpd10007666k')

    AND customer_9.p_condition = '8559317127113')

    AND (customer_9.CodeP = @P3)

    )

    )

    .. add another 150 unions ...

    )

    I was told told that code generated in application by Java class and people  can't/want to change it ...

    I run

    select

    count(1),

    count(distinct(p_validation)),

    count(distinct(product01)),

    count(distinct(p_condition)),

    count(distinct(CodeP))

    from Customer

    and got following result

    Count(1) = 6885256

    count(distinct(p_validation)) = 504100

    count(distinct(product01)) = 15

    count(distinct(p_condition))  =10

    count(distinct(CodeP)) = 1

    I created  index bellow

    CREATE NONCLUSTERED INDEX [IDX_customer_7_DBA_001] ON [dbo].[customer]([p_validation])INCLUDE ( [product01],[p_condition],[CodeP]) with (online =on)

    index help to reduce CPU from 100% to 11% ,but job would continue to run from 5 to 12 + hrs

    at this point I told developers that  they need to  consider code changes

    one of the developers accessed java class and  changed code use different where clause

    WHERE

    (

    (

    (LEFT(customer_8.p_validation,37) = 'Eproductsssss/13333.0/%.kzpd10007666a' and RIGHT(customer_8.p_validation,13) = 'kzpd10007666a')

    or

    (LEFT(customer_8.p_desc,14) = 'Dumbler/KZKTD/' and RIGHT(customer_8.p_desc,9) = 'kzpd10007666a')

    )

    AND customer_8.p_condition = '8796286059104'

    )

    AND (

    customer_8.CodeP = @P0

    )

    ... and  job execution reduced from 12 hrs to 45 minutes

    Modified  query use read more data but run faster

    As you point out in "the second one is filtering information as it retrieves it and/or through the join operations"

     

     

     

  • All those functions on the columns, LEFT/RIGHT, that's horrible. Before we get to why all the INTERSECT & UNION from the generated code, that stuff is going to inflict pain. Then, 100+ UNION queries is also going to be problematic. Also, all the wild card searches in the middle of strings, same problem. This is seriously bad code. No index is going to fix it. The code needs to be adjusted and, just guessing, you probably have some table & design issues too.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply