Index confusion

Question

Index confusion

PHXHoward

SSCrazy Eights

Points: 8389
More actions
March 11, 2013 at 12:29 pm

#275146

Hi everyone.
I have a table with a few million rows in it. There is a process as part of a stored procedure that checks if data exists. It checks by selecting top 1 from the table prior to moving on.
The developer told me that it gets slower and slower as more data is added to the table so we looked at ways to speed things up.
They are running a query like this:
SELECT TOP 1 * FROM TABLEA WHERE COLUMN4 = 'xxxx' and COLUMN6 = 'xxxx' and COLUMN9 = 'xxxx'
There were no good indexes that would help us so I created a non clustered that includes COLUMN4, COLUMN6, and COLUMN9. After creating this index, the execution plan shows that the query is using the new index.
My question is, why does the above query have the same execution plan as the below query?
SELECT TOP 1 COLUMN4 FROM TABLEA WHERE COLUMN4 = 'xxxx' and COLUMN6 = 'xxxx' and COLUMN9 = 'xxxx'
The second query can get all of its output from the index and should not need to visit the table at all right? The top query is requesting all rows and should need to get the unindexed rows from the actual table right?
Why would the two selects have the same execution plan?
Thanks for helping me understand.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

PHXHoward SSCrazy Eights Points: 8389 More actions · Answer 1

I think I found the problem.

There was another index out there that included 32 of the 35 columns which it was referencing. That seems like a really big index.

5 columns indexed and 29 "included" columns.

Sean Lange SSC Guru Points: 286590 More actions · Answer 2

A better approach then using "select top 1" as way to check if data exists is to use...EXISTS.

IF EXISTS(SELECT * FROM TABLEA WHERE COLUMN4 = 'xxxx' and COLUMN6 = 'xxxx' and COLUMN9 = 'xxxx')

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 3

PHXHoward (3/11/2013)
I think I found the problem.
There was another index out there that included 32 of the 35 columns which it was referencing. That seems like a really big index.
5 columns indexed and 29 "included" columns.

Yikes. Sounds like creating an Indexed View might have been worth looking into at that point.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

PHXHoward SSCrazy Eights Points: 8389 More actions · Answer 4

I have done a lot of reading over the last couple of days and I understand what is going on now.

The table has 35 columns. The developer was using the below query in a stored procedure to check if the record exists before trying to insert it into another table.

select top 1 * from table

where col1 = 'xxxx' and col2 = 'xxx' and col3 = 'xxx'

The table has millions of rows that have a clustered index on an id field but no non clustered so the query was doing a scan.

The DTA suggested a non clustered index using the three rows in the where and all the others as includes to cover the select * query.

This sped up the processing quite a bit but today I told the developer that instead, the record check could be either

If exists(select 1 from table where....) or even select top 1 col1 from table where....

This would allow the NC index not to need to hold the include columns which are useless for this process.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 5

If exists(select 1 from table where....) or even select top 1 col1 from table where....

"EXISTS" will be faster anyway, regardless what you really select inside of it, as this bit is ignored when using EXISTS. It just checks if at least one row exists which satisfies given condition and returns True or False.

"SELECT TOP 1 Col1" will still need to return something from a table ... - so it does take a bit more time.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]