Best practice when to use an index

Question

Best practice when to use an index

Frank Kalis

SSC Guru

Points: 111183
More actions
June 24, 2003 at 1:50 am

#64783

Hi all,
I see the advantages to implement index(es) when there are many rows in the table, but I have many say parameter tables with no more than 10 - 20 rows. Does it make sense to create an index for such tables or isn't a table scan quicker?
Cheers,
Frank
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Viewing 15 posts - 1 through 15 (of 54 total)

You must be logged in to reply to this topic. Login to reply

crosspatch Hall of Fame Points: 3168 More actions · Answer 1

Interesting question. I guess the answer is six of one and half a dozen of the other.

With regard to parameter tables in particular I always tend to give these an Identity column as a Primary Key. Why? well firstly it's safer to be very clear about which parameter you want to use and an identity is potentially a lot clearer and simpler than say a Varchar. Secondly, my coroporation's standards often put parameter tables on linked servers plus parameters can be updatable by applications, we've had some probs with DML between linked servers where there isn't a unique key on a table.

I have found it is possible to please all of the people all of the time if you do exactly what they want. Harold Macmillan 1961

Frank Kalis SSC Guru Points: 111183 More actions · Answer 2

Hi crosspatch,

quote:
Interesting question. I guess the answer is six of one and half a dozen of the other.

agreed, this will only be suggestions, which to follow or not to follow.

quote:
With regard to parameter tables in particular I always tend to give these an Identity column as a Primary Key. Why? well firstly it's safer to be very clear about which parameter you want to use and an identity is potentially a lot clearer and simpler than say a Varchar.

That's what I'm doing right now.

Name it id, choose int data type, make it PrimaryKey and Identity and save. But is it really 'best' practice?

Cheers,

Frank

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

stax68 SSChampion Points: 11761 More actions · Answer 3

I'd say it depends how wide the tables are. If they are just: surrogate key(PK ID int column), Param name (%char), Param value (%char), datatype, datalen, precision, then there's not much point in indexing.

If there are many columns which don't usually need to be accessed, then a covering index on a heap table might be worth considering.

A clustered index on the SARG column(s) would be a good idea, but if there are frequent inserts you need to leave enough padding to prevent excessive page splitting.

In any case, you need to approach it by thinking how many pages are going to be traversed to get to the record you're after. this will depend on the width of the records, and on whether you use clustered or nonclustered index, and if there's a clustered index, whether it is on the SARG(s).

But for a table with only 10-20 reasonably narrow rows, I wouldn't expect much difference in performance, whatever your indexing strategy - but as always the only reliable answer will come from experimentation.

Tim

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

stax68 SSChampion Points: 11761 More actions · Answer 4

Reminder:

When using a surrogate key (ID column), don't forget to put a unique constraint on the real (conceptual) primary key. That's the value you're going to need to search on, unless you are going to follow the exceptionally *bad* practice of hardcoding meaningless key values.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Frank Kalis SSC Guru Points: 111183 More actions · Answer 5

Hi Tim,

quote:
I'd say it depends how wide the tables are. If they are just: surrogate key(PK ID int column), Param name (%char), Param value (%char), datatype, datalen, precision, then there's not much point in indexing.

that's what the tables are. 2, maybe 3 columns.

quote:
A clustered index on the SARG column(s) would be a good idea, but if there are frequent inserts you need to leave enough padding to prevent excessive page splitting.

the data inserted hardly never changes. From year to year I guess.

quote:
But for a table with only 10-20 reasonably narrow rows, I wouldn't expect much difference in performance, whatever your indexing strategy - but as always the only reliable answer will come from experimentation.

Yup, but any recommendations when an index starts to make sense?

Cheers,

Frank

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Jeremy Kemp SSCertifiable Points: 6754 More actions · Answer 6

If I understand things correctly, using an index is at least 2 reads - one (or more) for the index and one for the page containing the data.

If this is right, then it is only worth having an index if the whole table is more than 2 data pages. The actual number of rows will depend on how wide each row is.

I have a number of small tables and had the same question. I used the execution plans in QA to test out whether an index is better and it always suggested that using an index was better than doing a table scan even when there were only a few rows in the table. It doesn't seem right but unless the execution plan is wrong that is what it says and I tended to put an index on the table.

Life is probably a lot more complicated than I believe because of SQL Server caching and RAID caching so what the query plan says might not always be what is really going on.

At the end of the day, I guess you just have to try it an see which works best.

Jeremy

Frank Kalis SSC Guru Points: 111183 More actions · Answer 7

Hi Jeremy,

quote:
If I understand things correctly, using an index is at least 2 reads - one (or more) for the index and one for the page containing the data.

only if you speak of nonclustered indexes.

Cheers,

Frank

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

stax68 SSChampion Points: 11761 More actions · Answer 8

Jeremy -

quote:
unless the execution plan is wrong

It often is, luckily for us.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

David.Poole SSC Guru Points: 76356 More actions · Answer 9

I would tend to index the table on principle but if it is virtually static I would make sure that the index has a very high fill factor.

As you will only be doing look-ups on the table the issue of splitting indices and adding leaves doesn't really apply.

I would say that indexing a ten row table would not normally be expected to do much but it depends on the demands being placed on your server. If its an internet app then not very much x huge audience = major headache.

LinkedIn Profile

Frank Kalis SSC Guru Points: 111183 More actions · Answer 10

Hi David,

quote:
I would tend to index the table on principle but if it is virtually static I would make sure that the index has a very high fill factor.

good point, bringing the fill factor in. But from my understanding indexes are also placed into 8KB pages. Now when the fill factor is less than 100% you might be wasting space, when it is 100% you're in risk of getting fragmented indexes which would affect performance. I've decided to drop any indexes from such tables. If performance or anything else suffers, well, then I#m building them again.

quote:
.. If its an internet app then not very much x huge audience = major headache.

Not sure if I get the meaning

Cheers,

Frank

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

stax68 SSChampion Points: 11761 More actions · Answer 11

Frank,

A clustered index will always have a root node page, so I think Jeremy is right that an index always means at least two reads. That means (I guess) that a clustered index ON THE SARG is worth having if the mean number of reads for a table scan is more than two: i.e. the table is spread over four pages or more.

BUT - this only applies if the clustered index key is a small data value like int - not a wide character column - otherwise the clustered index and any other indexes will become large and slow.

Also we are here assuming that you are looking for a single value, rather than a range.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

stax68 SSChampion Points: 11761 More actions · Answer 12

Frank,

quote:
Now when the fill factor is less than 100% you might be wasting space, when it is 100% you're in risk of getting fragmented indexes which would affect performance.

You will only get fragmentation if the indexed column is updated, or (I suppose) if you get page splitting/a record moving in the data pages.

With static key column data and fixed-width data in teh rest of the table, you can't get any page splitting in the index. So fragmentation wouldn't be a problem. You would need to reindex when new rows are added and deleted or the key column amended though, of course....

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

stax68 SSChampion Points: 11761 More actions · Answer 13

quote:
quote:
If its an internet app then not very much x huge audience = major headache.

I didn't get this at first, either, but it's:

If an internet app, then:

(small performance hit)*(huge audience) = (major headache)

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Frank Kalis SSC Guru Points: 111183 More actions · Answer 14

Hi Tim,

quote:
A clustered index will always have a root node page, so I think Jeremy is right that an index always means at least two reads. That means (I guess) that a clustered index ON THE SARG is worth having if the mean number of reads for a table scan is more than two: i.e. the table is spread over four pages or more.

Jeremy mentioned two reads -one (or more) for the index and one for the page containing the data. That is the description of a nonclustered index. If you're looking for say id=10 you'll find it in the index along with a pointer to the data page with the rest. In a clustered index id=10 takes one read less, because when you have found it in the index you also have the rest.

quote:
BUT - this only applies if the clustered index key is a small data value like int - not a wide character column - otherwise the clustered index and any other indexes will become large and slow.

Yes, int and nothing else.

quote:
Also we are here assuming that you are looking for a single value, rather than a range.

Yes again, doesn't make sense to me searching parameter values for a range

BTW, what means ON THE SARG

Cheers,

Frank

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]