September 5, 2010 at 10:58 pm
I have a table of about 15 GB. It has one clustered index with no fragmentation at all. I have a SELECT query which pulls about 1000 records, and the WHERE clause is on a column that has a nonclustered index. The query is taking more than 50 secs, and I see it doing a 100% index scan. I thought I could make the column in the WHERE clause part of the clustered index, but I am not able to.
September 6, 2010 at 2:33 am
Can you post the table structure, index definition and the select query?
Rohit
September 6, 2010 at 4:37 am
Is your nonclustered index UNIQUE? I don't think so.
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
September 6, 2010 at 4:41 am
Index scan or clustered index scan?
Several reasons why it might be doing a clustered index scan:
* The where clause predicate is not SARGable
* The where clause predicate does not refer to a left-based subset of the index key
* The query could seek on a nonclustered index, but there are too many rows returned and the nonclustered index is not covering
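As an illustration of the first point (column names here are hypothetical, not from the poster's table), wrapping the indexed column in a function makes the predicate non-SARGable and forces a scan:

```sql
-- Non-SARGable: the function applied to the column prevents an index seek
SELECT * FROM dbo.test WHERE UPPER(test_name) = 'ABC';

-- SARGable rewrite: the bare column allows a seek on an index on test_name
SELECT * FROM dbo.test WHERE test_name = 'abc';
```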
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 6, 2010 at 10:09 am
I ran DTA against the query and it created a nonclustered index with an INCLUDE on all columns. What does this mean?
September 6, 2010 at 10:24 am
It means it's suggesting that you essentially duplicate the entire table. Usually a very bad idea. DTA is not perfect (or even good) a lot of the time.
Why don't you post the query, exec plan and table definitions here so that we can have a look at them?
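Roughly, a DTA suggestion of that shape looks like this (index and column names are assumed for illustration):

```sql
-- DTA-style suggestion: key on the filter column, INCLUDE every other column.
-- The leaf level of this index holds a full copy of every row,
-- which is why it effectively duplicates the table.
CREATE NONCLUSTERED INDEX IX_test_regid_covering
ON dbo.test (test_regid)
INCLUDE (col1, col2, col3 /* ...and every remaining column... */);
```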
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 6, 2010 at 5:45 pm
GilaMonster (9/6/2010)
It means it's suggesting that you essentially duplicate the entire table. Usually a very bad idea. DTA is not perfect (or even good) a lot of the time. Why don't you post the query, exec plan and table definitions here so that we can have a look at them?
Gail, I am attaching the table definition and the execution plan. The table definition has the new index suggested by DTA, and the execution plan is from before the new index was created. Thanks
query: select * from test where test_regid = 12345
thanks
September 6, 2010 at 11:54 pm
Hi,
Just something to consider. Try forgetting for a minute about tuning the query or getting the best possible results. Instead, let's take some time and go back to basics. Look at the data stored in the table, then look at the definition and ask the questions below. The reason is that proper table design weighs heavily on your performance.
1) What does the data in each of the following columns look like?
a) To find out, write a simple query that returns the min, max and average stored lengths for each varchar column. We can also go one step further and list every distinct length with its respective count. This gives the distribution and helps determine whether the varchar(4000)s are really needed; perhaps a length of only 200, or even 10, would be appropriate.
If all of these 4000 lengths are indeed needed and only some of the columns are used most of the time, then consider splitting the table into two or more tables, i.e. Table A holds the heavily used columns and Table B holds the columns that are rarely, if ever, used.
b) Do any of these columns store only integer or numeric based values without any alpha or special characters?
If they do then change the data type to match the actual data stored.
c) If the answer to b) is unknown because you receive a feed from an outside source, then contact that person or persons and ask them the questions above.
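A sketch of the length check described in 1a (table and column names are placeholders, not from the attached definition):

```sql
-- Min, max and average stored length for one varchar column;
-- repeat per column, or generate the statements from INFORMATION_SCHEMA.COLUMNS.
SELECT MIN(LEN(some_varchar_col)) AS min_len,
       MAX(LEN(some_varchar_col)) AS max_len,
       AVG(LEN(some_varchar_col)) AS avg_len
FROM dbo.test;

-- Full distribution: each distinct length with its row count
SELECT LEN(some_varchar_col) AS stored_len, COUNT(*) AS row_count
FROM dbo.test
GROUP BY LEN(some_varchar_col)
ORDER BY stored_len;
```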
2) How is the table used? Is it highly transactional with many inserts, updates, deletes? Or is it a daily load, trickle batch load, etc...?
There is currently an identity column with a fill factor of 90. Since identity values are ever increasing, it might help to change the fill factor to 0. Try it and gather some stats to see if that helps. Before running the stats, though, compare apples to apples: do all of the appropriate table and index maintenance first to get a clean benchmark.
If it is a highly transactional table, then consider a stepped-up table maintenance schedule, as many columns with a 4000 length increase the likelihood of heavy fragmentation.
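Changing the fill factor as suggested in 2) might look like this (the index name is hypothetical; FILLFACTOR = 0 and 100 both mean fully packed pages):

```sql
-- Rebuild the clustered index with full pages; reasonable for an
-- ever-increasing identity key, since new rows always go to the end.
ALTER INDEX PK_test ON dbo.test
REBUILD WITH (FILLFACTOR = 100);
```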
3) Under some scenarios, column placement can also have an impact on performance, especially if the row spans multiple pages. (This is database specific; column-based databases and MPP systems don't adhere to this rule.)
4) Take the time to review the data and find out whether NULLs are ever, or will ever be, used. If a NULL will never happen, then change the column to NOT NULL. This carries a lot of weight when creating indexes on such columns, e.g. Test_RegID. If NULLs only happen when there is a data-quality problem, then keep the column NOT NULL and let the error be thrown. This is a good way to find problems and address them.
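The NOT NULL change in 4) is a one-line ALTER, assuming no NULLs are currently stored (the column's data type shown here is an assumption):

```sql
-- Fails if any existing row holds a NULL in this column,
-- which is itself a useful data-quality check.
ALTER TABLE dbo.test
ALTER COLUMN test_regid INT NOT NULL;
```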
Lastly, look at the query you are testing: "select * from test where test_regid = 12345". SELECT * returns every column in the table, and dragging all those wide columns back can increase the time it takes to return the results. Instead, run a query shaped like the one that will actually be used, selecting only the columns you need; to measure the lookup on its own, try "select 1 from test where test_regid = 12345".
There's a lot more that can be done to tune this from a table design and database configuration standpoint, although this should be good for now.
Another side note, and this has nothing to do with performance: try to make the queries as neat as possible and follow ANSI SQL standards.
[font="Courier New"]
--* Comment about what the query does
SELECT
t.Test_Id AS Test_Id,
.
.
.
t.col_x AS col_x
FROM
Test AS t
WHERE
t.Test_RegID = 12345
[/font]
As with any suggestion, it is always best to take before-and-after stats, as some suggestions work in some scenarios and not in others. Keep these stats saved in a spreadsheet for future use, because as the data changes the query might have to be modified to accommodate it. This will also help you remember what did and didn't work for a particular scenario.
Good luck,
-Jeff
September 7, 2010 at 12:40 am
iqtedar (9/6/2010)
query:select * from test where test_regid= 12345
Answer 3 then.
The query could seek on a nonclustered index, but there are too many rows returned and the nonclustered index is not covering
Is that SELECT * really necessary? Do you absolutely need every single column in that table?
How long does that query take?
Is that too slow? (no point in optimising something that's performing acceptably)
How many rows does it return?
How many rows in the table?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 7, 2010 at 8:18 am
Thanks Gail. The query takes more than 50 secs and will definitely time out from the front end. With the index suggested by DTA, the query runs in less than 1 sec. What do you say?
September 7, 2010 at 8:31 am
If you can afford to duplicate the entire 15 GB table (which is what an index with all columns included will do), and don't care about the impact of that on space, backup space, backup and restore time, reindex time, etc, go right ahead.
Otherwise please answer the rest of the questions that I asked.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 7, 2010 at 9:45 am
Also keep in mind that every insert/update will take twice as long with an index like that (or longer, if any page splits/moves/etc occur).
September 7, 2010 at 1:10 pm
Gail is right.
First, SELECT * should be avoided. Only the required columns should be selected in a query, not a column more.
Creating an index that includes each and every column in the table doesn't make any sense (a result of using SELECT *).
Index recommendations from DTA should not be implemented blindly. There is not much use in creating overlapping or duplicate indexes; instead they should be eliminated from the tables.
Creating covering indexes with included columns is a good practice, but never with SELECT * queries.
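For instance, if the front end really needs only a handful of columns, a narrow covering index like this (column names assumed for illustration) serves the seek without duplicating the whole table:

```sql
-- Covers: SELECT col1, col2 FROM dbo.test WHERE test_regid = @id
-- Only the key column plus the two included columns are copied
-- into the index leaf, not every column in the table.
CREATE NONCLUSTERED INDEX IX_test_regid
ON dbo.test (test_regid)
INCLUDE (col1, col2);
```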
Thank You,
Best Regards,
SQLBuddy
September 7, 2010 at 1:31 pm
GilaMonster (9/7/2010)
iqtedar (9/6/2010)
query: select * from test where test_regid = 12345
Answer 3 then.
The query could seek on a nonclustered index, but there are too many rows returned and the nonclustered index is not covering
Is that SELECT * really necessary? Do you absolutely need every single column in that table?
How long does that query take? --- 50+ secs
Is that too slow? (no point in optimising something that's performing acceptably) --- Yes, will time out
How many rows does it return? --- 534
How many rows in the table? --- More than 7 million
thanks
September 7, 2010 at 1:59 pm
Great. 4 questions down, most important two to go.
Also, please post the query's execution plan as a .sqlplan file. The excerpt of text you posted previously does not contain anywhere close to the amount of info that's available in the plan. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability