Why are some SELECTs very slow?

  • I have the following in my MS-SQL 2000 DB:

    Table Name:  PCAPostWord

    Columns:  ID, WordID, ThreadID, PostWordSortOrder (all of type "int")

    ID/WordID is a clustered index; ThreadID is indexed, PostWordSortOrder is indexed.

    The table has about 9,000,000 rows.

    From SQL Query Analyzer, if I do:

    SELECT PostWordSortOrder FROM PCAPostWord WHERE PostWordSortOrder = 2

    it takes under a second to return 25,000 rows.

    If I do:

    SELECT WordID FROM PCAPostWord WHERE PostWordSortOrder = 2

    it takes under a second to return 25,000 rows.

    But if I do:

    SELECT ThreadID FROM PCAPostWord WHERE PostWordSortOrder = 2

    it takes about 30 seconds to return 25,000 rows

    Question:  Why would it take so much longer to SELECT the ThreadID than the other columns?

    Thanks.

    Robert

     

  • Because for each of the 25000 rows that have PostWordSortOrder = 2 a bookmark lookup is needed. The results for the first 2 queries can be created and returned by doing an index seek in the PostWordSortOrder index to find the matching rows. The data to output is then found right there, in the first case it is the same column as is indexed, in the second it is part of the clustered index key which always exists in the nonclustered index rows.

    But for the third query you are referencing a column that is not found in the nonclustered index PostWordSortOrder. So for each of the 25000 hits a bookmakr lookup (in this case a clustered index seek) needs to be done, making the query much slower.

  • Chris,

    Thanks very much for your quick response.  After I saw your response, I looked at the execution plan, and indeed for the slow search a clustered index scan is being done, which accounts for the slowness.

    So the next question is:  What are the various alternatives for making the slow SELECT faster?

    Thanks.

    Robert

  • Ok, then I was slightly incorrect in stating that 25000 bookmark lookups where done. Since so many lookups would be necessary SQL Server instead chose to do a clustered index scan directly. With the existing indexes that is probably the fastest plan for the existing data. If you need to make that query execute fast you could add a covering index on (PostWordSortOrder, ThreadID). But remember that too many indexes might have an overall negative impact on your system. It all depends on where the priorities are and how everything is used.

  • So, If your query is run once a month, at 3 in the morning, maybe you can afford to have it take 30 seconds and avoid another index.  But if it's critical to users, and they run it all day long, then maybe you need the index.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply