April 2, 2010 at 8:26 am
Will converting varchar(8) to char(8) help? What is average length of the code values?
What is exact scenario that exhibits poor performance on that table?
April 2, 2010 at 9:05 am
stolbovoy (4/2/2010)
Will converting varchar(8) to char(8) help? What is average length of the code values?
I mentioned that at 1117 yesterday. 😎
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
April 2, 2010 at 9:11 am
Sorry, I missed it.
Then just, "what is exact query that performs poorly?"
April 2, 2010 at 6:08 pm
The OP has left the bulding...
--Jeff Moden
Change is inevitable... Change for the better is not.
April 3, 2010 at 3:37 am
Grant Fritchey (4/2/2010)
...But seriously, I was thinking, and I'm assuming Lutz was thinking, adding a number or or substituting a number for the char field. Idle chatter really.
Your assumption is correct. 🙂
It might be possible that the OP's requirement to use a character/number combination will be re-evaluated based on the information provided in this thread.
So far it's unclear whether this requirement is a "have to have" or "that's how we're currently doing it" type...
Either way, using a number as a clustered index has not that much to do with whether or not (s)he will use the CI directly or add the char field.
April 3, 2010 at 8:41 am
Everyone else seems to have had a go, so here are my thoughts:
1. This table probably does not benefit from having a clustered index. If the insertions are essentially random, it will be hurting you a lot. If rows are never deleted, consider turning it into a heap. (Updates can never make a row wider so forwarded records will not be a problem).
2. Use CHAR(8) instead of VARCHAR(8) to save on storage and fit more rows per page.
3. Give the Code column an explicit binary collation like Latin1_General_BIN, unless case-insensitivity is required, in which case use a SQL collation like SQL_Latin1_General_CP1_CI_AI
4. Split the table up. You should be running Enterprise Edition anyway, but if not, you can still use local partitioned views. Partitioning will decrease the index depth and make maintenance easier.
5. Consider storing State = 0 rows separately from State = 1 rows.
6. Consider adding a hash index: Add a computed column AS CHECKSUM(Code), do not persist it, but do index it. This will store a hash value in four bytes per row and allow you to check if a code exists faster. Since the column is not persisted, it will take no storage in the table itself. Code your routines carefully to use the hash index.
7. Ensure your system has sufficient RAM to keep the hash index in memory.
8. Use the READUNCOMMITTED isolation level when checking for code existence to reduce locking overhead.
9. Maintain the hash index regularly (ONLINE if necessary) to keep fragmentation at acceptable levels. This will promote large-size I/O read-ahead when necessary. This is another vote for Enterprise Edition, which supports single I/O requests up to 1MB for read-ahead.
Paul
edit: removed huge quoted text
April 4, 2010 at 5:33 am
Paul White NZ (4/3/2010)
Everyone else seems to have had a go, so here are my thoughts:1. This table probably does not benefit from having a clustered index. If the insertions are essentially random, it will be hurting you a lot. If rows are never deleted, consider turning it into a heap. (Updates can never make a row wider so forwarded records will not be a problem).
...
Why would it help? Reads based on non-clustered index will result in extra page read. Considering that RID is bigger than tinyint, structures of clustered and non-clustered index pages are similar, except leafs, insert will require the same amount of I/O plus writing to a heap page. Storing that small row in a table with only non-clustered index looks like total waste of I/O, disc space and inefficient use of cache. Or am I missing something?
April 4, 2010 at 6:33 am
stolbovoy (4/4/2010)
Why would it help?
The record size of each row as the table is currently defined is 20 bytes.
Changing to CHAR(8) reduces the record size to 16 bytes. (20% smaller).
A non-clustered index on the hash value has a record size of 13 bytes (35% smaller).
Integer values also tend to be faster to compare than strings, and may compress better too.
Anyway, it's probably worth clearing something up here: My previous post represents a list of ideas to explore, which was the reason that just about every point featured the word 'Consider...'. 😉
There is not enough detail in the thread to make any firm recommendations, so please don't misread my comments as being intended as in any way definitive. I simply intended to promote constructive discussion.
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply