January 7, 2005 at 7:28 am
Hi everyone,
We have sql server 2000 and the execution time for the last query is suprising me. Almost the same query, same execution plan except for the cost of some operation, same data, but different execution timing ? any suggestion ? Thanks a lot.
SELECT addr.sysrecno,addr.access,addr.address_seq_no
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE addr.address_seq_no = clmk.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE addr.address_seq_no = agar.address_seq_no ) AND
not exists ( SELECT 'x' FROM carrier carr
WHERE addr.address_seq_no = carr.address_seq_no ) AND
... 01:15 sec approximately
vs
SELECT addr.sysrecno,addr.access,addr.address_seq_no
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE clmk.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE agar.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM carrier carr
WHERE carr.address_seq_no = addr.address_seq_no ) AND
... 0:04 sec approximately
January 7, 2005 at 8:07 am
I would run this with the statistics io on. This will give you the number of scans happening on the tables. My guess is that the address table is considerably bigger than the other tables and it is scanning that table more in the first query than the second query. You can also use statistics time on to see the execution time in milleseconds.
set statistics io on
set statistics time on
Execute code here and look at the results tab.
Let us know.
Tom
January 7, 2005 at 8:12 am
I see two potential reasons behind this, (I have done similar testing in pushing the use of JOINs over WHERE clauses).
First, the Optimizer will buffer up the plan for these queries. Try running the second select first and see how your time looks. This should not account for such a huge difference, though.
Second, I have read documentation indicating the you should put the smaller table on the left for an equality clause. I have no idea about your tables, but it would be my guess that the "address" table may be much larger than the the other three you are using. If this is the case, (and this does not include any issues of Indexing) it may be that your reads from the smaller tables consumes much less time by putting them to the left of the equality.
I wasn't born stupid - I had to study.
January 7, 2005 at 8:41 am
Also try something like this to see if it doesn't perform better:
SELECT
addr.sysrecno,
addr.access,
addr.address_seq_no
FROM
dbo.address addr
LEFT JOIN
dbo.commlmk clmk
ON
clmk.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.garacc agar
ON
agar.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.carrier carr
ON
carr.address_seq_no = addr.address_seq_no
WHERE
-- If SQL 2000
Coalesce(clmk.address_seq_no, agar.address_seq_no, carr.address_seq_no) IS NULL
-- IF SQL 7
clmk.address_seq_no IS NULL AND
agar.address_seq_no IS NULL AND
carr.address_seq_no IS NULL
January 7, 2005 at 2:17 pm
Hy guys,
First at all, the table address is not big table ( 20821 rows ). The only two tables bigger than address is runactv (1293406 rows) and tripacc (837146 rows).
Here is the output with the statistics io on :
SELECT addr.sysrecno,addr.access,addr.address_seq_no
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE addr.address_seq_no = clmk.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE addr.address_seq_no = agar.address_seq_no ) AND
...
not exists ( SELECT 'x' FROM runactv ract
WHERE addr.address_seq_no = ract.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE addr.address_seq_no = atrp.pu_place OR
addr.address_seq_no = atrp.do_place )
SQL Server parse and compile time:
CPU time = 266 ms, elapsed time = 295 ms.
Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'predaddr'. Scan cnt 3, log reads 279, phys reads 0, read-ahead reads 0.
Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'carrier'. Scan cnt 1, log reads 3, phys reads 0, read-ahead reads 0.
Table 'garacc'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'commlmk'. Scan cnt 3, log reads 4, phys reads 0, read-ahead reads 0.
Table 'address'. Scan cnt 3, log reads 205, phys reads 0, read-ahead 0.
Table 'custaddr'. Scan cnt 3, log reads 98, phys reads 0, read-ahead 0.
Table 'runactv'. Scan cnt 170, log reads 34457, phys reads 0, read-ahead 0.
Table 'tripacc'. Scan cnt 451, logical reads 17378386, physical reads 0, read-ahead reads 48.
SQL Server Execution Times:
CPU time = 224125 ms, elapsed time = 93857 ms.
VS
SELECT addr.sysrecno,addr.access,addr.address_seq_no
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE clmk.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE agar.address_seq_no = addr.address_seq_no ) AND
...
not exists ( SELECT 'x' FROM runactv ract
WHERE ract.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE atrp.pu_place = addr.address_seq_no OR
atrp.do_place = addr.address_seq_no )
Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 0, read-ahead 0.
Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'carrier'. Scan cnt 1, log reads 3, phys reads 0, read-ahead reads 0.
Table 'garacc'. Scan cnt 1, log reads 2, phys reads 0, read-ahead reads 0.
Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 0, read-ahead reads 0.
Table 'address'. Scan cnt 4, log reads 205, phys reads 0, read-ahead 0.
Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 0, read-ahead 0.
Table 'runactv'. Scan cnt 170, log reads 34457, phys reads 0, read-ahead 0.
Table 'tripacc'. Scan cnt 8, log reads 78588, phys reads 70, read-ahead 8351.
SQL Server Execution Times:
CPU time = 5454 ms, elapsed time = 5732 ms.
SQL Server Execution Times:
CPU time = 5454 ms, elapsed time = 5766 ms.
SQL Server Execution Times:
CPU time = 5454 ms, elapsed time = 5767 ms.
After that i try modify the query putting the small table on the left of the equality. Well, nothing change for the execution time. Still take 1:15 sec
SELECT addr.sysrecno,addr.access,addr.address_seq_no
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE clmk.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE agar.address_seq_no = addr.address_seq_no ) AND
...
not exists ( SELECT 'x' FROM runactv ract
WHERE addr.address_seq_no = ract.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE addr.address_seq_no = atrp.pu_place OR
addr.address_seq_no = atrp.do_place )
And the last thing that i try is to modify the query with the left join as suggested but it's worst than the original query execution time. I have to cancel myself the execution of the query after 23 minutes.
Well, any clue ? Thanks a lot. Tran
January 7, 2005 at 2:45 pm
Wait a minute. The code seems to be changing here. Your original posting had an additional AND at the end, this new positng now contains OR's. Is there more we should know?
Also, I shyed away from Indexes in my first response. Might this be an issue here?
I wasn't born stupid - I had to study.
January 10, 2005 at 10:45 am
From looking at the statistics, the problem seems to be with:
Table 'tripacc'. Scan cnt 451, logical reads 17378386, physical reads 0, read-ahead reads 48
That's alot of logical reads on that table, hence the time. Even if the execution plan is the same Sql server is doing different things when it is reading through the data.
I would mess with the order of the various exists clauses and also with indexes.
Tom
January 10, 2005 at 2:42 pm
ANd I think the entire segment of code would be usefull for us to understand what you are really doing. You actually should be able to get better preformance out of what I suggest in many cases but I had only a portion of the code to base on.
January 10, 2005 at 2:59 pm
Are you Flushing the CACHE between the two batches?
menaning
dbcc dropcleanbuffers
dbcc freeproccache
* Noel
January 11, 2005 at 9:57 am
Hi,
A mistake of my part. Sorry for that.
set statistics io on
set statistics time on
SELECT count(*)
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE addr.address_seq_no = clmk.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE addr.address_seq_no = agar.address_seq_no ) AND
not exists ( SELECT 'x' FROM carrier carr
WHERE addr.address_seq_no = carr.address_seq_no ) AND
not exists ( SELECT 'x' FROM sponsor sps
WHERE addr.address_seq_no = sps.address_seq_no ) AND
not exists ( SELECT 'x' FROM predaddr caddr
WHERE addr.address_seq_no = caddr.address_seq_no ) AND
not exists ( SELECT 'x' FROM custaddr cuad
WHERE addr.address_seq_no = cuad.address_seq_no ) AND
not exists ( SELECT 'x' FROM batchcan tbca
WHERE addr.address_seq_no = tbca.address_seq_no ) AND
not exists ( SELECT 'x' FROM runactv ract
WHERE addr.address_seq_no = ract.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE addr.address_seq_no = atrp.pu_place ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE addr.address_seq_no = atrp.do_place )
Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 1, read-ahead 278.
Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'carrier'. Scan cnt 1, log reads 3, phys reads 1, read-ahead reads 2.
Table 'garacc'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'address'. Scan cnt 4, log reads 205, phys reads 1, read-ahead 204.
Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 2, read-ahead reads 2.
Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 3, read-ahead 97.
Table 'runactv'. Scan cnt 50, log reads 34166, phys reads 13, read-ahead reads 34074.
Table 'tripacc'. Scan cnt 452, log reads 17780756, phys reads 70, read-ahead reads 40459.
SQL Server Execution Times:
CPU time = 263654 ms, elapsed time = 105946 ms.
VS
set statistics io on
set statistics time on
SELECT count(*)
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE clmk.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE agar.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM carrier carr
WHERE carr.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM sponsor sps
WHERE sps.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM predaddr caddr
WHERE caddr.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM custaddr cuad
WHERE cuad.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM batchcan tbca
WHERE tbca.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM runactv ract
WHERE ract.address_seq_no = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE atrp.pu_place = addr.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE atrp.do_place = addr.address_seq_no )
Table 'batchcan'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'predaddr'. Scan cnt 4, log reads 279, phys reads 1, read-ahead 278.
Table 'sponsor'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'carrier'. Scan cnt 1, log reads 3, phys reads 1, read-ahead reads 2.
Table 'garacc'. Scan cnt 1, log reads 2, phys reads 2, read-ahead reads 0.
Table 'address'. Scan cnt 4, log reads 205, phys reads 1, read-ahead 204.
Table 'commlmk'. Scan cnt 4, log reads 4, phys reads 2, read-ahead reads 2.
Table 'custaddr'. Scan cnt 4, log reads 98, phys reads 4, read-ahead 97.
Table 'runactv'. Scan cnt 51, log reads 34169, phys reads 13, read-ahead reads 34070.
Table 'tripacc'. Scan cnt 8, log reads 80408, phys reads 70, read-ahead reads 40459.
SQL Server Execution Times:
CPU time = 3860 ms, elapsed time = 27156 ms.
January 12, 2005 at 8:33 am
The big difference is in the exists for tripacc which you make two different hits on.
I would be curious if this doesn't produce the same effect. '
SELECT count(*)
FROM address addr
WHERE not exists ( SELECT 'x' FROM commlmk clmk
WHERE addr.address_seq_no = clmk.address_seq_no ) AND
not exists ( SELECT 'x' FROM garacc agar
WHERE addr.address_seq_no = agar.address_seq_no ) AND
not exists ( SELECT 'x' FROM carrier carr
WHERE addr.address_seq_no = carr.address_seq_no ) AND
not exists ( SELECT 'x' FROM sponsor sps
WHERE addr.address_seq_no = sps.address_seq_no ) AND
not exists ( SELECT 'x' FROM predaddr caddr
WHERE addr.address_seq_no = caddr.address_seq_no ) AND
not exists ( SELECT 'x' FROM custaddr cuad
WHERE addr.address_seq_no = cuad.address_seq_no ) AND
not exists ( SELECT 'x' FROM batchcan tbca
WHERE addr.address_seq_no = tbca.address_seq_no ) AND
not exists ( SELECT 'x' FROM runactv ract
WHERE addr.address_seq_no = ract.address_seq_no ) AND
not exists ( SELECT 'x' FROM tripacc atrp
WHERE atrp.pu_place = addr.address_seq_no OR atrp.do_place = addr.address_seq_no )
First at all, the table address is not big table ( 20821 rows ).
The only two tables bigger than address is runactv (1293406 rows) and tripacc (837146 rows).
Based on this also tripacc (837146 rows) depending on the width the logical reads the first time as oppsoed to the second time suggest to me that
1) You may need to do some index and other maintainence on that table.
2) pu_place and do_place possible have an index on them.
3) Changing addr.address_seq_no in relation to = sign is causing one to do a table scan and the other to take advantage of an index.
runactv on the other hand produces no noticeable difference between runs but may need so index maintainence done still to improve performance.
Had you tried DBCC SHOWCONTIG to see how badly these table may be fragmented?
Now exists (especially not exists) are normally going to slow you down in that they execute once per row every row that enters the buffer from the main table.
You also have a lot of these which means a lot of extra executions occurring to get at the data.
This should out perform your current query but may need some minor tweaking to get it just right.
SELECT
count(*)
FROM
dbo.address addr
LEFT JOIN
dbo.commlmk clmk
ON
clmk.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.garacc agar
ON
agar.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.carrier carr
ON
carr.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.sponsor sps
ON
sps.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.predaddr caddr
ON
caddr.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.custaddr cuad
ON
cuad.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.batchcan tbca
ON
tbca.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.runactv ract
ON
ract.address_seq_no = addr.address_seq_no
LEFT JOIN
dbo.tripacc atrp
ON
atrp.pu_place = addr.address_seq_no OR
atrp.do_place = addr.address_seq_no
WHERE
-- If SQL 2000
Coalesce(clmk.address_seq_no, agar.address_seq_no, carr.address_seq_no, sps.address_seq_no, caddr.address_seq_no, cuad.address_seq_no, tbca.address_seq_no, ract.address_seq_no, atrp.pu_place, atrp.do_place) IS NULL
-- IF SQL 7
clmk.address_seq_no IS NULL AND
agar.address_seq_no IS NULL AND
carr.address_seq_no IS NULL AND
sps.address_seq_no IS NULL AND
caddr.address_seq_no IS NULL AND
cuad.address_seq_no IS NULL AND
tbca.address_seq_no IS NULL AND
ract.address_seq_no IS NULL AND
atrp.pu_place IS NULL AND
atrp.do_place IS NULL
-- OR In SQL 7 - but is a bit harder to read and easier to get incorrect when changing.
IsNull(clmk.address_seq_no, IsNull(agar.address_seq_no, IsNull(carr.address_seq_no, IsNull(sps.address_seq_no, IsNull(caddr.address_seq_no, IsNull(cuad.address_seq_no, IsNull(tbca.address_seq_no, IsNull(ract.address_seq_no, IsNull(atrp.pu_place, atrp.do_place))))))))) IS NULL
But you should look at the execution plan and make sure you have an index on the columns being used for the join in each ON to get best performance.
There are still a couple of other ways you could right using Having and Group By that can have unexpected results in the way of performance.
January 12, 2005 at 12:41 pm
On top of the index analysis, I would look at changing the order of the exists clause and put the bigger tables to the top of the statement.
set statistics io on
set statistics time on
SELECT count(*)
FROM address addr
WHERE
not exists
(
SELECT 'x' FROM tripacc atrp
WHERE atrp.do_place = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM runactv ract
WHERE ract.address_seq_no = addr.address_seq_no
)
and not exists
(
SELECT 'x' FROM commlmk clmk
WHERE clmk.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM garacc agar
WHERE agar.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM carrier carr
WHERE carr.address_seq_no = addr.address_seq_no
)
AND not exists
( SELECT 'x' FROM sponsor sps
WHERE sps.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM predaddr caddr
WHERE caddr.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM custaddr cuad
WHERE cuad.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM batchcan tbca
WHERE tbca.address_seq_no = addr.address_seq_no
)
AND not exists
(
SELECT 'x' FROM tripacc atrp
WHERE atrp.pu_place = addr.address_seq_no
)
If it looks like it is executing the same way, I would add the following code at the end: option (force order)
That will make sql run the tables in the order you have them in the query. Have you looked at your statistics lately. I would update all of the statistics on these tables.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply