What is the optimiser thinking?

Question

Post reply

What is the optimiser thinking?

win

SSCommitted

Points: 1569
More actions
October 25, 2004 at 12:14 am

#152303

I have recently done some testing on various crosstab queries which follow a set of templateS, such as the one below:
1) Inner temporary view creating a number of aggregated virtual fields in the details table
2) Outer select joining the aggregated virtual fields to the main table with or without an explicit sort
I found that the addition of an explicit order by in the outer select can cut the overal process time in half.
Example statement:
SELECT H.HOLEID,H.PROJECTCODE,X.DEPTH,X.GRDE,X.DENL,X.DENB FROM GEOPHYSHEADERID H
INNER JOIN
(select LASGID,DEPTH,
MIN(CASE WHEN NAME='GRDE' THEN VALUE ELSE NULL END) GRDE
,MIN(CASE WHEN NAME='DENL' THEN VALUE ELSE NULL END) DENL
,MIN(CASE WHEN NAME='DENB' THEN VALUE ELSE NULL END) DENB
FROM GEOPHYSLOGDETAILSID1
WHERE NAME IN ('GRDE','DENL','DENB')
GROUP BY LASGID,DEPTH) X
ON H.LASGID=X.LASGID
WHERE HOLEID LIKE 'D%'
--ORDER BY H.HOLEID,H.PROJECTCODE,X.DEPTH
Execution plan:
The statement with the final ORDER BY is substantially faster (50 - 100% with more virtual fields) than without it. There is no discernable difference in
the execution plan apart from the final sort as shown below. Statistics IO output is also identical.
(1)sort                    -- final explicit order by, if used
(2) merge join
    (3) sort
      (4) idx seek on GEOPHYSHEADERID (unique index field HOLEID)
    (3) str aggr
      (4) clust scan on GEOPHYSLOGDETAILSID1 (primary key field NAME)
PKs and indices are:
GEOPHYSHEADERID clustered PK: LASGID (Identity field); Unique index (5 fields starting with HOLEID)
GEOPHYSLOGDETAILSID1 clustered PK: LASGID, DEPTH, NAME; index on LASGID & index on NAME
Profiler trace:
EventClass         EventSubClass   ObjectName   CPU     Reads     Duration MyComments
---------------------------------------------------------------------------------------------
SQLTransaction     Begin           sort_init                                               explicit order by
                          Begin           sort_init
                          commit         sort_init                                  45640
                          commit         sort_init                                 60596
SQL:BatchCompleted                                     44578   640868    61076
......
SQLTransaction     Begin           sort_init                                                NO order by
                          commit         sort_init                                  113513
SQL:BatchCompleted                                     51703   640868    114563
The statement with the explicit order by generates 2 sort_init processes. The process time of the inner sort_init becomes part of the outer sort_init.
The statement without the order by generates one long lasting sort_init process. The CPU works slightly harder (15%). The logical reads remain the same.
Can anybody please explain what is going on under hood. Why doesn't the optimiser include an implicit final sort to speed up the query?
Thanks,
Win

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Clay_G Default port Points: 1454 More actions · Answer 1

If you ran non-orderby version first then order-by version, the data could have been buffered, making it faster. Run:

DBCC DropCleanBuffers

before each run.

Though you said physical reads were the same. Count read-aheads too.

It would not be too suprising if order-by changed the query plan and accidentally improved it, but with the same QP plus added sort, its mysterious.

win SSCommitted Points: 1569 More actions · Answer 2

Clay,

I run a 2-line script with dbcc dropcleanbuffers and freeproccache before every benchmark test. Read - ahead count is also the same.

cheers,

Win