Clustered Index Scan vs Seek

  • Two databases on same server. Query uses one table in each.

    Update DB2..IT

    Set TTI = tds.TI

    From DB2..IT it

    Join TDS On it.TR = tds.DSR And it.DS = tds.DS

    Schema of DB1..TDS is

    TID uniqueidentifier (PK)

    DS int

    DSR varchar(50)

    Schema of DB2..IT is

    ITI uniqueidentifier (PK)

    TR varchar(50)

    DS int

    TTI uniqueidentifier

    Index on TDS's DS and DSR

    Index on IT's DS and TR

    The query does a Clustered Index Scan on the clustered index PK of IT (ITI). I can't see why that makes more sense than using my indexes.

    I've rebuilt the indexes and updated stats.

    Help?

  • You're updating all rows, no reason to seek. Need to scan to hit all the rows.

  • I'm only updating rows where TR matches DSR and DS matches DS - there are a great many other rows in IT that are not being updated. So I still don't see why it doesn't seek to see which rows are to be updated?

  • Brian McGee-355525 (10/18/2009)


    Two databases on same server. Query uses one table in each.

    Update DB2..IT

    Set TTI = tds.TI

    From DB2..IT it

    Join TDS On it.TR = tds.DSR And it.DS = tds.DS

    There's no filter condition on that update. One of those tables has to do a scan, there's no other way. On table will get picked for a scan and then, indexes and row counts permitting, seeks will be done against the other table to get the matching rows. There's no way both of those can seek.

    You haven't said what the rest of the exec plan looks like, so that's all I can really offer, except this:

    http://sqlinthewild.co.za/index.php/2009/07/29/is-a-scan-a-bad-thing/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • True, obviously one of the tables has to do a scan, but I wouldn't have thought both would.

    I've dug a little deeper and it turns out that out of 2m rows, the join hit 1.6m of them, so a scan is probably fine. But how can SQL Server know that there isn't only one row matching, when it's creating the execution plan? If there were a small number of matching rows, I still feel a scan on one table and seek on the other would be better?

  • Brian McGee-355525 (10/19/2009)


    I've dug a little deeper and it turns out that out of 2m rows, the join hit 1.6m of them, so a scan is probably fine. But how can SQL Server know that there isn't only one row matching, when it's creating the execution plan?

    SQL has column statistics that it will use to estimate the number of rows affected.

    If you feel that the exec plan is sub-optimal, please post it so that we can take a look. See this article for posting an exec plan from SQL 2000. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Statistics are the big thing, but it also depends on the size of the table. For example, if you have a table with 100 rows, it probably will just scan. At a 2M row table, I'm not sure what the threshhold will be, but you could experiement. Move the table to another server/db, remove some data that will be updated. I'd be curious to see if you had 400k rows, or 100k rows if it will seek.

  • I've run sp_updatestats, one table has 1.5 million rows, the other two million. Still does scan for both tables. No idea why - surely it should use the index and seek one at least one of the two tables...

  • Because seeks are not always optimal, especially when the actual data row is required (as in an update or a non-covering nonclustered index) and the rows affected is a significant (> 1%) portion of the table. A scan of the cluster is far cheaper than a seek followed by a million or so lookups.

    Again, if you think this exec plan is not optimal, please post it. Discussing the reason for exec plans without looking at them is hard.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply