Agregate join

  • Hi Craig,

    I've played around with your scripts and here are my thoughts and findings:

    I see why you've created index on UpdateDate in DESC order rather than ASC so in this case record with the latest UpdateDate value can be read first when doing CROSS APPLY. But looks like that for the server there is no big difference in reading first or last record from the index and the test I did with ASC/DESC clearly proves it.

    -- ORDER BY UpdateDate DESC

    Table 'PriceHistoryTable'. Scan count 50003, logical reads 373098, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1735 ms, elapsed time = 919 ms.

    (50000 row(s) affected)

    -- ORDER BY UpdateDate ASC

    Table 'PriceHistoryTable'. Scan count 50003, logical reads 372987, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1733 ms, elapsed time = 883 ms.

    (50000 row(s) affected)

    And this is the only difference in the plans:

    |--Index Seek(OBJECT: ([Test].[dbo].[PriceHistoryTable].[idx_TestData2] AS [td2]), SEEK: ([td2].[ItemID]=[Test].[dbo].[PriceHistoryTable].[ItemID])

    ORDERED FORWARD)

    |--Index Seek(OBJECT: ([Test].[dbo].[PriceHistoryTable].[idx_TestData2] AS [td2]), SEEK: ([td2].[ItemID]=[Test].[dbo].[PriceHistoryTable].[ItemID])

    ORDERED BACKWARD)


    Here is something even more interesting. I found a way to spoil CROSS APPLY query by just creating one more index and not touching anything else.

    CREATE NONCLUSTERED INDEX idx_TestData3 ON PriceHistoryTable (ItemID)

    This made a big difference (5 times faster) when I run it with cold cache. (DBCC DROPCLEANBUFFERS before each test)

    -- JOIN

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    Table 'PriceHistoryTable'. Scan count 50003, logical reads 347903, physical reads 1006, read-ahead reads 39163, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2733 ms, elapsed time = 3897 ms.

    (50000 row(s) affected)

    -- CROSS APPLY

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    Table 'PriceHistoryTable'. Scan count 50003, logical reads 372631, physical reads 5402, read-ahead reads 22347, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2423 ms, elapsed time = 21576 ms.

    (50000 row(s) affected)

    The difference is that when there is only one index available then server does an index scan on it which results in very efficient read-ahead reads from HDD and only after that it performs Index Seek on the same index (which is cached already). But when there is a better index available to get DISTINCT ItemID it uses idx_TestData3 instead. When it comes to Index Seek on idx_TestData2, it's not cached at all so very slow page by page reads are done on the physical level. Which can be confirmed by comparing physical reads and read-ahead reads. Moreover the whole idx_TestData2 index will be read from the disk anyway which makes no sense of doing Index Seek on it because Scan should be faster.

    With INNER JOIN the only seek we have is Clustered Index Seek but this one has to be done no matter what type of JOIN do we use, that's why it outperforms CROSS APPLY so drastically.

    All tests are done with 10M records and unique count of ItemID 50K.


    Alex Suprun

  • Sorry it took a bit to get back to this, pesky Thanksgiving...

    Alexander Suprun (11/24/2011)


    Hi Craig,

    I've played around with your scripts and here are my thoughts and findings:

    I see why you've created index on UpdateDate in DESC order rather than ASC so in this case record with the latest UpdateDate value can be read first when doing CROSS APPLY. But looks like that for the server there is no big difference in reading first or last record from the index and the test I did with ASC/DESC clearly proves it.

    Agreed, I was simply being anal. You're also looking at a 'working copy', not a finalized script. It's got everything from first draft notes to current conclusions in there so not everything's gospel.

    Here is something even more interesting. I found a way to spoil CROSS APPLY query by just creating one more index and not touching anything else.

    CREATE NONCLUSTERED INDEX idx_TestData3 ON PriceHistoryTable (ItemID)

    This made a big difference (5 times faster) when I run it with cold cache. (DBCC DROPCLEANBUFFERS before each test)

    Now, that IS interesting.

    The difference is that when there is only one index available then server does an index scan on it which results in very efficient read-ahead reads from HDD and only after that it performs Index Seek on the same index (which is cached already). But when there is a better index available to get DISTINCT ItemID it uses idx_TestData3 instead. When it comes to Index Seek on idx_TestData2, it's not cached at all so very slow page by page reads are done on the physical level. Which can be confirmed by comparing physical reads and read-ahead reads. Moreover the whole idx_TestData2 index will be read from the disk anyway which makes no sense of doing Index Seek on it because Scan should be faster.

    Excellent analysis. However, not necessarily on the scan. It depends on if the seek for the segment-top can be gotten from the tree instead of the leaf. I need to go re-read Paul White's blog and articles on that.

    I think. 😉 I'm going to have to dig around on that item and see if I can avoid the distincted list, but I'm not sure that's available.

    With INNER JOIN the only seek we have is Clustered Index Seek but this one has to be done no matter what type of JOIN do we use, that's why it outperforms CROSS APPLY so drastically.

    I'm going to have to take a dig at that and confirm I agree with your findings. Nice review either way, thank you. I'm wondering though if that ItemID index will still perform better on the selective ItemID components than the 'view the world' results.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 2 posts - 16 through 16 (of 16 total)

You must be logged in to reply to this topic. Login to reply