May 20, 2019 at 7:03 pm
We have some TXT file imports into SQL tables. As a default, the columns are set to varchar(500) to match the incoming definition. However, the data will never be 500.
Is there a penalty to indexing a VARCHAR(500) column, if the max length will be 50, compared to defining a VARCHAR(50) column and migrating the data to that, and indexing the VARCHAR(50) column ?
May 20, 2019 at 7:20 pm
Yes, there is a penalty, because SQL bases its memory allocation on the column lengths. If you think about it, that actually makes perfect sense, as presumably longer columns would tend to use more memory.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 21, 2019 at 3:44 am
Also, there is a 900 byte max key length that you have to stay within.
The simple fact is, it's going to be harder to index 500 characters than 50 because it's longer. Yes, 420 characters will be more selective than 42, but, your histogram is going to have to cover a lot more territory accurately to make the index reflect reality. Data skew is certainly likely with much larger text fields which can lead to bad plans as the row count estimates are wrong. As narrow as possible on the keys for indexes is a good general rule. It's not carved in a stone tablet, but I do try to follow it. However, if you need to index 500 characters, and testing shows it helps, index 500 characters.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2019 at 2:58 pm
What I'm wondering is if the data is only 50 characters, the remaining 450 characters will be blank. So is an index on basically the first 50 characters of a 500 field worse than an index on a 50 character field ?
May 21, 2019 at 3:03 pm
Well, then it won't matter. If the data will ALWAYS be less than 50 characters then indexing a VARCHAR(50) or a VARCHAR(500) will be the same. It's completely dependent on the data. However, you can't guarantee that the data will be less than 50 because you're allowing 500.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2019 at 3:46 pm
Thanks for the replies.
May 30, 2019 at 8:33 am
"Also, there is a 900 byte max key length that you have to stay within"
In SQL Server 2016 this has been increased to 1700 bytes for non-clustered indexes, see https://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/04/26/increased-nonclustered-index-key-size-with-sql-server-2016/
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply