Performance Tuning Big Table

  • Will converting varchar(8) to char(8) help? What is the average length of the code values?

    What is the exact scenario that exhibits poor performance on that table?

  • stolbovoy (4/2/2010)


    Will converting varchar(8) to char(8) help? What is the average length of the code values?

    I mentioned that at 1117 yesterday. 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Sorry, I missed it.

    Then just, "what is the exact query that performs poorly?"

  • The OP has left the building...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Grant Fritchey (4/2/2010)


    ...

    But seriously, I was thinking, and I'm assuming Lutz was thinking, of adding a number or substituting a number for the char field. Idle chatter really.

    Your assumption is correct. 🙂

    @TheSQLGuru

    It might be possible that the OP's requirement to use a character/number combination will be re-evaluated based on the information provided in this thread.

    So far it's unclear whether this requirement is a "have to have" or "that's how we're currently doing it" type...

    Either way, using a number as a clustered index doesn't have much to do with whether or not (s)he will use the CI directly or add the char field.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Everyone else seems to have had a go, so here are my thoughts:

    1. This table probably does not benefit from having a clustered index. If the insertions are essentially random, it will be hurting you a lot. If rows are never deleted, consider turning it into a heap. (Updates can never make a row wider so forwarded records will not be a problem).

    2. Use CHAR(8) instead of VARCHAR(8) to save on storage and fit more rows per page.

    3. Give the Code column an explicit binary collation like Latin1_General_BIN, unless case-insensitivity is required, in which case use a SQL collation like SQL_Latin1_General_CP1_CI_AI.

    4. Split the table up. You should be running Enterprise Edition anyway, but if not, you can still use local partitioned views. Partitioning will decrease the index depth and make maintenance easier.

    5. Consider storing State = 0 rows separately from State = 1 rows.

    6. Consider adding a hash index: Add a computed column AS CHECKSUM(Code), do not persist it, but do index it. This will store a hash value in four bytes per row and allow you to check if a code exists faster. Since the column is not persisted, it will take no storage in the table itself. Code your routines carefully to use the hash index.

    7. Ensure your system has sufficient RAM to keep the hash index in memory.

    8. Use the READUNCOMMITTED isolation level when checking for code existence to reduce locking overhead.

    9. Maintain the hash index regularly (ONLINE if necessary) to keep fragmentation at acceptable levels. This will promote large-size I/O read-ahead when necessary. This is another vote for Enterprise Edition, which supports single I/O requests up to 1MB for read-ahead.
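    A minimal sketch of the hash-index idea from point 6, assuming an illustrative table named dbo.Codes with the Code/State columns discussed in this thread (the table, index, and variable names are hypothetical, not from the OP's schema). CHECKSUM is deterministic, so a non-persisted computed column on it can still be indexed; the equality recheck on Code is needed because CHECKSUM values can collide:

    ```sql
    -- Illustrative table: char(8) code plus tinyint state, per the thread
    CREATE TABLE dbo.Codes
    (
        Code     char(8) NOT NULL,
        State    tinyint NOT NULL,
        CodeHash AS CHECKSUM(Code)   -- not persisted: no storage in the base table
    );

    -- Index the computed column; include Code so the collision
    -- recheck can be answered from the index alone
    CREATE NONCLUSTERED INDEX IX_Codes_CodeHash
        ON dbo.Codes (CodeHash)
        INCLUDE (Code);

    -- Existence check: seek on the 4-byte hash, then verify the code
    -- itself to rule out CHECKSUM collisions
    DECLARE @Code char(8) = 'ABC12345';

    SELECT TOP (1) 1
    FROM dbo.Codes
    WHERE CodeHash = CHECKSUM(@Code)
      AND Code = @Code;
    ```

    The seek predicate on CodeHash is what keeps the lookup narrow; without the explicit CHECKSUM(@Code) comparison the optimizer has no reason to use the hash index at all, which is why Paul says to "code your routines carefully."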

    Paul

    edit: removed huge quoted text

  • Paul White NZ (4/3/2010)


    Everyone else seems to have had a go, so here are my thoughts:

    1. This table probably does not benefit from having a clustered index. If the insertions are essentially random, it will be hurting you a lot. If rows are never deleted, consider turning it into a heap. (Updates can never make a row wider so forwarded records will not be a problem).

    ...

    Why would it help? Reads based on a non-clustered index will result in an extra page read. Considering that a RID is bigger than a tinyint, the structures of clustered and non-clustered index pages are similar except at the leaf level, and an insert will require the same amount of I/O plus a write to a heap page. Storing such a small row in a table with only a non-clustered index looks like a total waste of I/O and disk space, and an inefficient use of cache. Or am I missing something?

  • stolbovoy (4/4/2010)


    Why would it help?

    The record size of each row as the table is currently defined is 20 bytes.

    Changing to CHAR(8) reduces the record size to 16 bytes. (20% smaller).

    A non-clustered index on the hash value has a record size of 13 bytes (35% smaller).

    Integer values also tend to be faster to compare than strings, and may compress better too.
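    Record sizes like these can be verified on a live table rather than computed by hand. A sketch using sys.dm_db_index_physical_stats in DETAILED mode (the dbo.Codes table name is illustrative, not from the OP's schema):

    ```sql
    -- Sketch: report min/avg/max record size per index level for one table
    SELECT i.name AS index_name,
           ps.index_level,
           ps.min_record_size_in_bytes,
           ps.avg_record_size_in_bytes,
           ps.max_record_size_in_bytes
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Codes'),
                                        NULL, NULL, 'DETAILED') AS ps
    JOIN sys.indexes AS i
      ON i.object_id = ps.object_id
     AND i.index_id  = ps.index_id;
    ```

    Level 0 rows are the leaf records, so that is where the 20- vs 16- vs 13-byte comparison above would show up.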

    Anyway, it's probably worth clearing something up here: My previous post represents a list of ideas to explore, which was the reason that just about every point featured the word 'Consider...'. 😉

    There is not enough detail in the thread to make any firm recommendations, so please don't misread my comments as being intended as in any way definitive. I simply intended to promote constructive discussion.
