EXIST IN JOIN, Results vary....

Question

EXIST IN JOIN, Results vary....

SeeCoolGuy

SSCrazy

Points: 2902
More actions
March 15, 2004 at 11:22 am

#107943

I am running time diffrences in the following MAX inner queries, as you can see the first and last which use an IN clause or a Join clause to find the max entryTIME of a machineinstall returns the same number of rows however even while doing the scan count and the logical reads being the same, the latter (JOIN) is even slower by an additional second... the EXISTS is the fastest one, but the results vary and I wanted to get an outsiders persective if I'm even doing this part correctly...
thanks for your time.
--RUNTIME:04
--(80411 row(s) affected)
--Table 'tbl_MachineInstalls'. Scan count 2, logical reads 3264, physical reads 0, read-ahead reads 0.
SELECT MIo.MachInstallID AS CurrentInstallID,
MIo.ActiveServiceDlrID AS ServicedByID,
MIo.InstallDate AS CurrentInstallDate,
MIo.MachineID,
MIo.entryTIME
FROM dbo.tbl_MachineInstalls AS MIo WITH (NOLOCK)
WHERE MIo.entryTIME IN
(SELECT MAX(MIi.entryTime)
FROM dbo.tbl_MachineInstalls AS MIi WITH (NOLOCK)
WHERE MIi.MachineID = MIo.MachineID)
--RUNTIME:03
--(81898 row(s) affected)
--Table 'tbl_MachineInstalls'. Scan count 1, logical reads 1632, physical reads 0, read-ahead reads 0.
SELECT MIo.MachInstallID AS CurrentInstallID,
MIo.ActiveServiceDlrID AS ServicedByID,
MIo.InstallDate AS CurrentInstallDate,
MIo.MachineID,
MIo.entryTIME
FROM dbo.tbl_MachineInstalls AS MIo WITH (NOLOCK)
WHERE EXISTS
(SELECT MAX(MIi.entryTime)
FROM dbo.tbl_MachineInstalls AS MIi WITH (NOLOCK)
WHERE MIi.MachineID = MIo.MachineID
AND MIi.entryTime = MIo.entryTime)
--RUNTIME:05
--(80411 row(s) affected)
--Table 'tbl_MachineInstalls'. Scan count 2, logical reads 3264, physical reads 0, read-ahead reads 0.
SELECT MIo.MachInstallID AS CurrentInstallID,
MIo.ActiveServiceDlrID AS ServicedByID,
MIo.InstallDate AS CurrentInstallDate,
MIo.MachineID,
MIo.entryTIME
FROM dbo.tbl_MachineInstalls AS MIo WITH (NOLOCK)
INNER JOIN (SELECT MachineID, MAX(entryTime) AS entryTIME
FROM dbo.tbl_MachineInstalls WITH (NOLOCK)
GROUP BY MachineID) AS MIi
ON MIi.MachineID = MIo.MachineID AND MIi.entryTIME = MIo.entryTIME
-- Francisco

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 1

I forgot to mention that the 81898 results are ALL the records in that table, why would the EXISTS not filter out any data?

Thanks,

-- Francisco

Jonathan SSC-Insane Points: 20427 More actions · Answer 2

Your first and last queries are essentially the same, and I doubt the execution plans differ. When you're dealing with such short times, one second is probably within the margin of error introduced by other factors and you will probably not get these timings repeatedly if you rerun the same queries. The second query is entirely different and of course will not give you the same result (and will use a different execution plan). IF EXISTS (SELECT MAX(mi.EntryTime)... is the same as IF EXISTS (SELECT *... as you're not correlating on MAX(EntryTime) but instead on just EntryTime.

--Jonathan

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 3

I modified the where clause to show the having, which I missed and posted way to early to this forumn, but maybe there is something that can be done? :S

--RUNTIME:18

--(80411 row(s) affected)

--Table 'Worktable'. Scan count 81888, logical reads 459541, physical reads 0, read-ahead reads 0.

--Table 'tbl_MachineInstalls'. Scan count 2, logical reads 3264, physical reads 0, read-ahead reads 0.

SELECT MIo.MachInstallID AS CurrentInstallID,

MIo.ActiveServiceDlrID AS ServicedByID,

MIo.InstallDate AS CurrentInstallDate,

MIo.MachineID,

MIo.entryTIME

FROM dbo.tbl_MachineInstalls AS MIo WITH (NOLOCK)

WHERE EXISTS

(SELECT MAX(MIi.entryTime)

FROM dbo.tbl_MachineInstalls AS MIi WITH(NOLOCK)

WHERE MIi.MachineID = MIo.MachineID

HAVING MAX(MIi.entryTime) = MIo.entryTIME)

-- Francisco

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 4

Jonothan,

Thanks for your post, and yes, the query plan for the 1st and 3rd query were the same, it's kinda bizzar that the 3rd is still even a second slower than the 1st. I consistently re-ran each set and continued to get the same results each time... and now that I've re-written the EXSTS it's even slower. I was trying to optimize this section because I have a udf that returns a table and when observing it's query plan I thought that perhaps I could optimize this section first.

the entire UDF runs at about 40 seconds which is not terrible, but is seemingly causing problems on another sproc which it is the basis for, on Friday the same procedure would run in about 8-20 seconds, now today on monday it is taking all of 8 minutes... I'm not sure what could have changed so drasticaly

-- Francisco

Jonathan SSC-Insane Points: 20427 More actions · Answer 5

You may want to study some more basic SQL, as your new subquery used with EXISTS should of course be just:

(SELECT *

FROM dbo.tbl_MachineInstalls AS MIi WITH(NOLOCK)

WHERE MIi.MachineID = MIo.MachineID AND MAX(MIi.entryTime) = MIo.entryTIME)

I would not expect this to be faster than the other variants, though.

Try taking your timings in milliseconds rather than seconds. I think you'll find the variations in times between the first and third queries as you rerun them will show the margin of error to be large enough for them to have equivalent times.

As for your slow procedure, have you tried updating the statistics and recompiling the procedure (and any child procedures)? Have any indexes been dropped recently?

--Jonathan

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 6

Thanks again Jonathan for your help,

2 quick questions ...

1) how do I measure in milliseconds, till now i've been getting the time displayed in the status bar in QA.

2) the modified EXISTS gives me the following error

An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

any thoughts?

-- Francisco

Jonathan SSC-Insane Points: 20427 More actions · Answer 7

1)

DECLARE @t datetime

SET @t = GETDATE()

--first query here

PRINT DATEDIFF(ms,@t,GETDATE())

SET @t = GETDATE()

--second query here

PRINT DATEDIFF(ms,@t,GETDATE())

2) I should study some basic SQL, your method is about the only way to make that work. (But I wouldn't do it.)

--Jonathan

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 8

1) ah, ok, I thought there was a setting in QA to change the output time display.

2) fair enough... Is my table just set up wrong? the table takes inputs, would I need to modify it w/ a trigger update that if a new install date is added, to automatically move the older date out of the way? The table serves to provide a history of everywhere the machine has been installed at.

So... the Max Installdate that i'm searching for here serves to help find the LAST location it was installed at...

-- Francisco

Jonathan SSC-Insane Points: 20427 More actions · Answer 9

Is my table just set up wrong?

No, I just meant I wouldn't use EXISTS in a case like this. This is not an unusual type of query, and your other two methods are fine (other than I would use = rather than IN in the first). In some cases the best performance can be found with TOP 1 rather than MAX(), e.g.:

SELECT MIo.MachInstallID AS CurrentInstallID,

MIo.ActiveServiceDlrID AS ServicedByID,

MIo.InstallDate AS CurrentInstallDate,

MIo.MachineID,

MIo.entryTIME

FROM dbo.tbl_MachineInstalls AS MIo

WHERE MIo.entryTIME =

(SELECT TOP 1 MIi.entryTime

FROM dbo.tbl_MachineInstalls AS MIi

WHERE MIi.MachineID = MIo.MachineID

ORDER BY MIi.entryTime DESC)

--Jonathan

Peter Evans Ten Centuries Points: 1187 More actions · Answer 10

Francisco I think the query below will fix your broken exists query so that you can do the benchmark.

SELECT

MIo.MachInstallID AS CurrentInstallID,

MIo.ActiveServiceDlrID AS ServicedByID,

MIo.InstallDate AS CurrentInstallDate,

MIo.MachineID,

MIo.entryTIME

FROM

dbo.tbl_MachineInstalls AS MIo WITH (NOLOCK)

WHERE MIo.entryTime =

(

SELECT Max(MIi.entryTime)

FROM dbo.tbl_MachineInstalls AS MIi WITH (NOLOCK)

WHERE MIi.MachineID = MIo.MachineID

)

Peter Evans (__PETER Peter_)

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 11

This seems to have a longer run time than the previous ones, I keep yielding an even 8 seconds or 8000ms average run time as compared to even any of the other ones...

at this point I re-organized the FROM table joins from the most outer query and this speed up the query for 2 of the 4 that were having problems.

-- Francisco

gljjr SSChampion Points: 10080 More actions · Answer 12

Francisco,

What is the reasoning behind using the NOLOCK hint? Do you really need it?

BTW: You can have QA give you the elapsed time and cpu time for queries by using something like the following.

SET STATISTICS TIME ON

SELECT * FROM sysobjects

SET STATISTICS TIME OFF

Shows:

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

(423 row(s) affected)

SQL Server Execution Times:

CPU time = 17 ms, elapsed time = 17 ms.

Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer

This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.

SeeCoolGuy SSCrazy Points: 2902 More actions · Answer 13

I've found that if I use WITH(NOLOCK), my Selects return that much faster... SQL BOL states

READUNCOMMITTED

Specifies that dirty reads are allowed. This means that no shared locks are issued and no exclusive locks are honored. Allowing dirty reads can result in higher concurrency, but at the cost of lower consistency. If READUNCOMMITTED is specified, it is possible to read an uncommitted transaction or to read a set of pages rolled back in the middle of the read; therefore, error messages may result. For more information about isolation levels, see SET TRANSACTION ISOLATION LEVEL.

So these are dirty reads but that's fine for this process.

-- Francisco