February 28, 2005 at 5:40 am
Given this table:
CREATE TABLE #T_PEOPLE (
SSN char (9),
BIRTHDAY datetime
)
insert #T_PEOPLE(ssn, birthday) values('123456789', '1/1/60')
insert #T_PEOPLE(ssn, birthday) values('111223333', '1/1/50')
Which query is faster to find the ssn of the older person?
This query:
select top 1 SSN
from #T_PEOPLE
order by birthday
or this one:
select SSN
from #T_PEOPLE
where birthday=
(select min(birthday) from #T_PEOPLE)
February 28, 2005 at 6:02 am
According to the Query Plan (From toolbar in QA select Query.. Show Execution Plan) Query 1 is more efficient (doesn't perform a TOP).
However, the TOP cost is 0% it is just an additional step.
I tried the following query as well (Like the structure better)
SELECT T1.SSN
FROM #T_PEOPLE T1
INNER JOIN (SELECT MIN(birthday) birthday FROM #T_PEOPLE) T2
ON T1.birthday = T2.birthday
It gave same plan as #2.
I guess you just need to test each query with a full set of data and make sure that you properly INDEX the table....
Good Hunting!
AJ Ahrens
webmaster@kritter.net
March 1, 2005 at 7:30 pm
I expected little difference between these two queries, performance-wise. But, being the curious soul I am, I decided to try it out, with the following code:
DROP TABLE #T_PEOPLE GO CREATE TABLE #T_PEOPLE ( SSN char (9), BIRTHDAY datetime ) SET NOCOUNT ON DECLARE @counter int SET @counter = 100000000 WHILE @counter < 101000000 BEGIN INSERT INTO #T_PEOPLE (SSN, BIRTHDAY) VALUES (@counter, DATEADD(dd, CEILING(36525 * RAND()), '1905-01-01')) IF (@counter % 1000 = 0) BEGIN PRINT CONVERT(varchar,@counter) + ' - ' + CONVERT(varchar,getdate()) END SET @counter = @counter + 1 END SET NOCOUNT OFF GO
Then I ran the two queries through as a single batch, with Show Execution Plan turned on.
The "order by" query accounted for 96% of execution; the "select min" plan only 4%.
Out of curiosity, I added an index (right before SET NOCOUNT OFF above:
CREATE INDEX [people_birthday] ON #T_PEOPLE (BIRTHDAY) WITH FILLFACTOR = 90
This time, "order by" only took 66% of the execution, and "select min" 34%.
The "select min" approach would seem to be a clear winner.
And, yes, I have no life.
R David Francis
March 1, 2005 at 10:32 pm
Forgot to add this when I recreated my post (first attempt disappeared when I hit "Preview", probably because of some sort of time-out thing).
You need "TOP 1" in the "select min" statement, in case there is more than one record with that birthdate; In my 1,000,000 row test table, there were several.
R David Francis
March 2, 2005 at 6:12 am
The percentages within a plan is only valid in the context of that particular plan. You can't compare 'order by 75%' from one plan to 'min 25%' from another plan and draw any conclusions from that. What the figures say is just that in the first plan, the 'order by' stood for 75% of that entire plan, and the 'min' stood for 25% of the other plan's total.
They can't be compared to each other as percentages because they have no correlation between plans. To give some hint about efficiency, you have to drill into the particular statements in each plan and compare the metrics for each, (ie pages, i/o, rows etc)
/Kenneth
March 2, 2005 at 8:15 am
I ran the two statements together as a batch. Therefore, the plan included both statements. When I say that the "order by" statement had a percentage of 96%, I mean that all the elements of the statement, within the overall plan, added up to 96%. Similarly, all the elements of the "select min" statement added up to 4% of the same plan. Otherwise, it would be a barely credible coincidence that, taken together, they added up to 100%.
I do agree that there are caveats involved. In particular, comparing the percentages in the first plan (both statements, no index) to those in the second plan (both statements, birthday indexed) is not directly applicable. The first statement took about 2 seconds to execute; the second was virtually instantaneous. Some might take the change in percentage on the "select min" statement to mean that it actually runs less effectively with an index in place. Since we don't have a solid benchmark for comparison, we can't say for certain that it's faster, the same, or slower with the index. We can say that the two statements combined are noticeably faster with the index in place. We can say that it is *likely* that the "select min" statement is not dramatically faster, and thus that the higher percentage in the indexed batch represents the dramatic improvement in the "order by" statement.
R David Francis
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply