Almost same query, different execution time

  • Hi everyone,

    We have sql server 2000 and the execution time for the last query is suprising me. Almost the same query, same execution plan except for the cost of some operation, same data, but different execution timing ? any suggestion ? Thanks a lot.

    SELECT addr.sysrecno,addr.access,addr.address_seq_no

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE addr.address_seq_no = clmk.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE addr.address_seq_no = agar.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM carrier carr

                        WHERE addr.address_seq_no = carr.address_seq_no  ) AND

    ...  01:15 sec approximately

    vs

    SELECT addr.sysrecno,addr.access,addr.address_seq_no

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE clmk.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE agar.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM carrier carr

                        WHERE carr.address_seq_no = addr.address_seq_no  ) AND

    ...  0:04 sec approximately

  • I would run this with the statistics io on.  This will give you the number of scans happening on the tables.  My guess is that the address table is considerably bigger than the other tables and it is scanning that table more in the first query than the second query.  You can also use statistics time on to see the execution time in milleseconds.

    set statistics io on

    set statistics time on

    Execute code here and look at the results tab.

    Let us know.

    Tom

  • I see two potential reasons behind this, (I have done similar testing in pushing the use of JOINs over WHERE clauses). 

    First, the Optimizer will buffer up the plan for these queries.  Try running the second select first and see how your time looks.  This should not account for such a huge difference, though. 

    Second, I have read documentation indicating the you should put the smaller table on the left for an equality clause.  I have no idea about your tables, but it would be my guess that the "address" table may be much larger than the the other three you are using.  If this is the case, (and this does not include any issues of Indexing) it may be that your reads from the smaller tables consumes much less time by putting them to the left of the equality. 

    I wasn't born stupid - I had to study.

  • Also try something like this to see if it doesn't perform better:

    SELECT

     addr.sysrecno,

     addr.access,

     addr.address_seq_no

    FROM

     dbo.address addr

    LEFT JOIN

     dbo.commlmk clmk

    ON

     clmk.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.garacc agar

    ON

     agar.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.carrier carr

    ON

     carr.address_seq_no = addr.address_seq_no

    WHERE

     -- If SQL 2000

     Coalesce(clmk.address_seq_no, agar.address_seq_no, carr.address_seq_no) IS NULL

     -- IF SQL 7

     clmk.address_seq_no IS NULL AND

     agar.address_seq_no IS NULL AND

     carr.address_seq_no IS NULL

  • Hy guys,

    First at all, the table address is not big table ( 20821 rows ). The only two tables bigger than address is runactv (1293406 rows) and tripacc (837146 rows).

    Here is the output with the statistics io on :

    SELECT addr.sysrecno,addr.access,addr.address_seq_no

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE addr.address_seq_no = clmk.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE addr.address_seq_no = agar.address_seq_no  ) AND

    ...

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE addr.address_seq_no = ract.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE addr.address_seq_no = atrp.pu_place OR

                                  addr.address_seq_no = atrp.do_place )

    SQL Server parse and compile time:

       CPU time = 266 ms, elapsed time = 295 ms.

    Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'predaddr'. Scan cnt 3, log reads 279, phys reads 0, read-ahead reads 0.

    Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'carrier'. Scan cnt 1, log reads 3, phys reads 0, read-ahead reads 0.

    Table 'garacc'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'commlmk'. Scan cnt 3, log reads 4, phys reads 0, read-ahead reads 0.

    Table 'address'. Scan cnt 3, log reads 205, phys reads 0, read-ahead 0.

    Table 'custaddr'. Scan cnt 3, log reads 98, phys reads 0, read-ahead 0.

    Table 'runactv'. Scan cnt 170, log reads 34457, phys reads 0, read-ahead 0.

    Table 'tripacc'. Scan cnt 451, logical reads 17378386, physical reads 0, read-ahead reads 48.

    SQL Server Execution Times:

       CPU time = 224125 ms,  elapsed time = 93857 ms.

    VS

    SELECT addr.sysrecno,addr.access,addr.address_seq_no

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE clmk.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE agar.address_seq_no = addr.address_seq_no  ) AND

    ...

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE ract.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE atrp.pu_place = addr.address_seq_no OR

                                  atrp.do_place = addr.address_seq_no )

    Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 0, read-ahead 0.

    Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'carrier'. Scan cnt 1, log reads 3, phys reads 0, read-ahead reads 0.

    Table 'garacc'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.

    Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 0, read-ahead reads 0.

    Table 'address'. Scan cnt 4, log reads 205, phys reads 0, read-ahead 0.

    Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 0, read-ahead 0.

    Table 'runactv'. Scan cnt 170, log reads 34457, phys reads 0, read-ahead 0.

    Table 'tripacc'. Scan cnt 8, log reads 78588, phys reads 70, read-ahead  8351.

    SQL Server Execution Times:

       CPU time = 5454 ms,  elapsed time = 5732 ms.

    SQL Server Execution Times:

       CPU time = 5454 ms,  elapsed time = 5766 ms.

    SQL Server Execution Times:

       CPU time = 5454 ms,  elapsed time = 5767 ms.

    After that i try modify the query putting the small table on the left of the equality. Well, nothing change for the execution time. Still take 1:15 sec

    SELECT addr.sysrecno,addr.access,addr.address_seq_no

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE clmk.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE agar.address_seq_no = addr.address_seq_no  ) AND

    ...

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE addr.address_seq_no = ract.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE addr.address_seq_no = atrp.pu_place OR

                                  addr.address_seq_no = atrp.do_place )

    And the last thing that i try is to modify the query with the left join as suggested but it's worst than the original query execution time. I have to cancel myself the execution of the query after 23 minutes.

    Well, any clue ? Thanks a lot. Tran

     

  • Wait a minute.  The code seems to be changing here.  Your original posting had an additional AND at the end, this new positng now contains OR's.  Is there more we should know? 

    Also, I shyed away from Indexes in my first response.  Might this be an issue here? 

    I wasn't born stupid - I had to study.

  • From looking at the statistics, the problem seems to be with:

    Table 'tripacc'. Scan cnt 451, logical reads 17378386, physical reads 0, read-ahead reads 48

    That's alot of logical reads on that table, hence the time.  Even if the execution plan is the same Sql server is doing different things when it is reading through the data.

    I would mess with the order of the various exists clauses and also with indexes.

    Tom

  • ANd I think the entire segment of code would be usefull for us to understand what you are really doing. You actually should be able to get better preformance out of what I suggest in many cases but I had only a portion of the code to base on.

  • Are you Flushing the CACHE between the two batches?

    menaning

    dbcc dropcleanbuffers

    dbcc freeproccache


    * Noel

  • Hi,

    A mistake of my part. Sorry for that.

    set statistics io on

    set statistics time on

    SELECT count(*)

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE addr.address_seq_no = clmk.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE addr.address_seq_no = agar.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM carrier carr

                        WHERE addr.address_seq_no = carr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM sponsor sps

                        WHERE addr.address_seq_no = sps.address_seq_no   ) AND

           not exists ( SELECT 'x' FROM predaddr caddr

                        WHERE addr.address_seq_no = caddr.address_seq_no ) AND

           not exists ( SELECT 'x' FROM custaddr cuad

                        WHERE addr.address_seq_no = cuad.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM batchcan tbca

                        WHERE addr.address_seq_no = tbca.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE addr.address_seq_no = ract.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE addr.address_seq_no = atrp.pu_place )        AND

           not exists  ( SELECT 'x' FROM tripacc atrp

                        WHERE addr.address_seq_no = atrp.do_place )     

    Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 1, read-ahead 278.

    Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'carrier'. Scan cnt 1, log reads 3, phys reads 1, read-ahead reads 2.

    Table 'garacc'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'address'. Scan cnt 4, log reads 205, phys reads 1, read-ahead 204.

    Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 2, read-ahead reads 2.

    Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 3, read-ahead 97.

    Table 'runactv'. Scan cnt 50, log reads 34166, phys reads 13, read-ahead reads 34074.

    Table 'tripacc'. Scan cnt 452, log reads 17780756, phys reads 70, read-ahead reads 40459.

    SQL Server Execution Times:

       CPU time = 263654 ms,  elapsed time = 105946 ms.

    VS

    set statistics io on

    set statistics time on

    SELECT count(*)

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE clmk.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE agar.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM carrier carr

                        WHERE carr.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM sponsor sps

                        WHERE sps.address_seq_no = addr.address_seq_no   ) AND

           not exists ( SELECT 'x' FROM predaddr caddr

                        WHERE caddr.address_seq_no = addr.address_seq_no ) AND

           not exists ( SELECT 'x' FROM custaddr cuad

                        WHERE cuad.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM batchcan tbca

                        WHERE tbca.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE ract.address_seq_no = addr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE atrp.pu_place = addr.address_seq_no )        AND

           not exists  ( SELECT 'x' FROM tripacc atrp

                        WHERE atrp.do_place = addr.address_seq_no )     

    Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 1, read-ahead 278.

    Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'carrier'. Scan cnt 1, log reads 3, phys reads 1, read-ahead reads 2.

    Table 'garacc'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.

    Table 'address'. Scan cnt 4, log reads 205, phys reads 1, read-ahead 204.

    Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 2, read-ahead reads 2.

    Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 4, read-ahead 97.

    Table 'runactv'. Scan cnt 51, log reads 34169, phys reads 13, read-ahead reads 34070.

    Table 'tripacc'. Scan cnt 8, log reads 80408, phys reads 70, read-ahead reads 40459.

    SQL Server Execution Times:

       CPU time = 3860 ms,  elapsed time = 27156 ms.

  • The big difference is in the exists for tripacc which you make two different hits on.

    I would be curious if this doesn't produce the same effect. '

    SELECT count(*)

    FROM   address addr

    WHERE  not exists ( SELECT 'x' FROM commlmk clmk

                        WHERE addr.address_seq_no = clmk.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM garacc agar

                        WHERE addr.address_seq_no = agar.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM carrier carr

                        WHERE addr.address_seq_no = carr.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM sponsor sps

                        WHERE addr.address_seq_no = sps.address_seq_no   ) AND

           not exists ( SELECT 'x' FROM predaddr caddr

                        WHERE addr.address_seq_no = caddr.address_seq_no ) AND

           not exists ( SELECT 'x' FROM custaddr cuad

                        WHERE addr.address_seq_no = cuad.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM batchcan tbca

                        WHERE addr.address_seq_no = tbca.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM runactv ract

                        WHERE addr.address_seq_no = ract.address_seq_no  ) AND

           not exists ( SELECT 'x' FROM tripacc atrp

                        WHERE atrp.pu_place = addr.address_seq_no OR atrp.do_place = addr.address_seq_no )

    First at all, the table address is not big table ( 20821 rows ).

    The only two tables bigger than address is runactv (1293406 rows) and tripacc (837146 rows).

    Based on this also tripacc (837146 rows) depending on the width the logical reads the first time as oppsoed to the second time suggest to me that

    1) You may need to do some index and other maintainence on that table.

    2) pu_place and do_place possible have an index on them.

    3) Changing addr.address_seq_no in relation to = sign is causing one to do a table scan and the other to take advantage of an index.

    runactv on the other hand produces no noticeable difference between runs but may need so index maintainence done still to improve performance.

    Had you tried DBCC SHOWCONTIG to see how badly these table may be fragmented?

    Now exists (especially not exists) are normally going to slow you down in that they execute once per row every row that enters the buffer from the main table.

    You also have a lot of these which means a lot of extra executions occurring to get at the data.

    This should out perform your current query but may need some minor tweaking to get it just right.

    SELECT

     count(*)

    FROM

     dbo.address addr

    LEFT JOIN

     dbo.commlmk clmk

    ON

     clmk.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.garacc agar

    ON

     agar.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.carrier carr

    ON

     carr.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.sponsor sps

    ON

     sps.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.predaddr caddr

    ON

     caddr.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.custaddr cuad

    ON

     cuad.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.batchcan tbca

    ON

     tbca.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.runactv ract

    ON

     ract.address_seq_no = addr.address_seq_no

    LEFT JOIN

     dbo.tripacc atrp

    ON

     atrp.pu_place = addr.address_seq_no OR

     atrp.do_place = addr.address_seq_no

    WHERE

     -- If SQL 2000

     Coalesce(clmk.address_seq_no, agar.address_seq_no, carr.address_seq_no, sps.address_seq_no, caddr.address_seq_no, cuad.address_seq_no, tbca.address_seq_no, ract.address_seq_no, atrp.pu_place, atrp.do_place) IS NULL

     -- IF SQL 7

     clmk.address_seq_no  IS NULL AND

     agar.address_seq_no  IS NULL AND

     carr.address_seq_no  IS NULL AND

     sps.address_seq_no  IS NULL AND

     caddr.address_seq_no IS NULL AND

     cuad.address_seq_no  IS NULL AND

     tbca.address_seq_no  IS NULL AND

     ract.address_seq_no  IS NULL AND

     atrp.pu_place   IS NULL AND

     atrp.do_place   IS NULL

    -- OR In SQL 7 - but is a bit harder to read and easier to get incorrect when changing.

     IsNull(clmk.address_seq_no, IsNull(agar.address_seq_no, IsNull(carr.address_seq_no, IsNull(sps.address_seq_no, IsNull(caddr.address_seq_no, IsNull(cuad.address_seq_no, IsNull(tbca.address_seq_no, IsNull(ract.address_seq_no, IsNull(atrp.pu_place, atrp.do_place))))))))) IS NULL

    But you should look at the execution plan and make sure you have an index on the columns being used for the join in each ON to get best performance.

    There are still a couple of other ways you could right using Having and Group By that can have unexpected results in the way of performance.

  • On top of the index analysis, I would look at changing the order of the exists clause and put the bigger tables to the top of the statement. 

    set statistics io on

    set statistics time on

    SELECT count(*)

    FROM   address addr

    WHERE 

    not exists 

        (

            SELECT 'x' FROM tripacc atrp

            WHERE atrp.do_place = addr.address_seq_no

        )

    AND not exists

        (    

            SELECT 'x' FROM runactv ract

            WHERE ract.address_seq_no = addr.address_seq_no 

        )

    and not exists

        (   

            SELECT 'x' FROM commlmk clmk

            WHERE clmk.address_seq_no = addr.address_seq_no 

        )

    AND not exists

        (

            SELECT 'x' FROM garacc agar

            WHERE agar.address_seq_no = addr.address_seq_no 

        )

    AND not exists

        (

            SELECT 'x' FROM carrier carr

            WHERE carr.address_seq_no = addr.address_seq_no 

        )

    AND not exists

        (     SELECT 'x' FROM sponsor sps

              WHERE sps.address_seq_no = addr.address_seq_no  

        )

    AND not exists

        (

            SELECT 'x' FROM predaddr caddr

            WHERE caddr.address_seq_no = addr.address_seq_no

        )

    AND not exists

        (    

            SELECT 'x' FROM custaddr cuad

            WHERE cuad.address_seq_no = addr.address_seq_no 

        )

    AND not exists

        (

            SELECT 'x' FROM batchcan tbca

            WHERE tbca.address_seq_no = addr.address_seq_no 

        )

    AND not exists

        (    

            SELECT 'x' FROM tripacc atrp

            WHERE atrp.pu_place = addr.address_seq_no

        )

    If it looks like it is executing the same way, I would add the following code at the end:  option (force order)

    That will make sql run the tables in the order you have them in the query.  Have you looked at your statistics lately.  I would update all of the statistics on these tables.

     

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply