Clustered Index Scan vs Seek

Question

Post reply

Clustered Index Scan vs Seek

Atat Rowcount

Default port

Points: 1453
More actions
October 18, 2009 at 2:04 pm

#385602

Two databases on same server. Query uses one table in each.
Update DB2..IT
Set TTI = tds.TI
From DB2..IT it
Join TDS On it.TR = tds.DSR And it.DS = tds.DS
Schema of DB1..TDS is
TID uniqueidentifier (PK)
DS int
DSR varchar(50)
Schema of DB2..IT is
ITI uniqueidentifier (PK)
TR varchar(50)
DS int
TTI uniqueidentifier
Index on TDS's DS and DSR
Index on IT's DS and TR
The query does a Clustered Index Scan on the clustered index PK of IT (ITI). I can't see why that makes more sense than using my indexes.
I've rebuilt the indexes and updated stats.
Help?

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 736212 More actions · Answer 1

You're updating all rows, no reason to seek. Need to scan to hit all the rows.

Atat Rowcount Default port Points: 1453 More actions · Answer 2

I'm only updating rows where TR matches DSR and DS matches DS - there are a great many other rows in IT that are not being updated. So I still don't see why it doesn't seek to see which rows are to be updated?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Brian McGee-355525 (10/18/2009)
Two databases on same server. Query uses one table in each.
Update DB2..IT
Set TTI = tds.TI
From DB2..IT it
Join TDS On it.TR = tds.DSR And it.DS = tds.DS

There's no filter condition on that update. One of those tables has to do a scan, there's no other way. On table will get picked for a scan and then, indexes and row counts permitting, seeks will be done against the other table to get the matching rows. There's no way both of those can seek.

You haven't said what the rest of the exec plan looks like, so that's all I can really offer, except this:

http://sqlinthewild.co.za/index.php/2009/07/29/is-a-scan-a-bad-thing/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Atat Rowcount Default port Points: 1453 More actions · Answer 4

True, obviously one of the tables has to do a scan, but I wouldn't have thought both would.

I've dug a little deeper and it turns out that out of 2m rows, the join hit 1.6m of them, so a scan is probably fine. But how can SQL Server know that there isn't only one row matching, when it's creating the execution plan? If there were a small number of matching rows, I still feel a scan on one table and seek on the other would be better?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Brian McGee-355525 (10/19/2009)
I've dug a little deeper and it turns out that out of 2m rows, the join hit 1.6m of them, so a scan is probably fine. But how can SQL Server know that there isn't only one row matching, when it's creating the execution plan?

SQL has column statistics that it will use to estimate the number of rows affected.

If you feel that the exec plan is sub-optimal, please post it so that we can take a look. See this article for posting an exec plan from SQL 2000. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Steve Jones - SSC Editor SSC Guru Points: 736212 More actions · Answer 6

Statistics are the big thing, but it also depends on the size of the table. For example, if you have a table with 100 rows, it probably will just scan. At a 2M row table, I'm not sure what the threshhold will be, but you could experiement. Move the table to another server/db, remove some data that will be updated. I'd be curious to see if you had 400k rows, or 100k rows if it will seek.

Atat Rowcount Default port Points: 1453 More actions · Answer 7

I've run sp_updatestats, one table has 1.5 million rows, the other two million. Still does scan for both tables. No idea why - surely it should use the index and seek one at least one of the two tables...

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

Because seeks are not always optimal, especially when the actual data row is required (as in an update or a non-covering nonclustered index) and the rows affected is a significant (> 1%) portion of the table. A scan of the cluster is far cheaper than a seek followed by a million or so lookups.

Again, if you think this exec plan is not optimal, please post it. Discussing the reason for exec plans without looking at them is hard.

http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass